Next Generation Media & Device

Open-Sourced Text-to-Video Model: CogVideoX

Speakers

Yuxuan Zhang

Date / Time

2024-10-17

14:00

Presentation Slides

Presentation Video

YouTube

We introduce CogVideoX, a large-scale diffusion transformer model that generates videos from text prompts. Results show that CogVideoX achieves state-of-the-art performance on multiple machine metrics and in human evaluations. The model weights of CogVideoX are publicly available at https://github.com/THUDM/CogVideo.