AI Models & Infra
vLLM: Easy, Fast, and Cheap LLM Serving for Everyone
vLLM is a fast and easy-to-use library for LLM inference and serving. In this talk, I will briefly introduce the evolution of the vLLM project and the open-source community behind it, and highlight some features that are of broad interest to users.
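To give a flavor of the "easy-to-use" claim, here is a minimal sketch of vLLM's offline inference API; the model name and sampling settings are illustrative placeholders, not recommendations from the talk.

```python
from vllm import LLM, SamplingParams

# Load a model once; vLLM manages KV-cache memory (PagedAttention)
# and batches requests for high-throughput generation.
llm = LLM(model="facebook/opt-125m")  # model name chosen only as an example

# Sampling settings are hypothetical defaults for the sketch.
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

prompts = [
    "The capital of France is",
    "vLLM makes LLM serving",
]

# generate() returns one RequestOutput per prompt.
outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.prompt, "->", output.outputs[0].text)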
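```

For online serving, the same models can be exposed through vLLM's OpenAI-compatible HTTP server (for example, `vllm serve <model>`), so existing OpenAI client code can point at it with only a base-URL change.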