MoCha is a pioneering model for Dialogue-driven Movie Shot Generation.
All talking characters are generated solely from Speech + Text. Click βΆοΈ to bring them to life.
The full set with prompt is available at π€Our-Results-on-MoChaBenchβ.
We released a benchmark π€MoChaBenchβ tailored for Dialogue-driven Movie Shot Generation.
When combined with a TTS model, MoCha can also achieve Text β Speech + Video generation, similar to the Veo 3 setting.
All videos presented in this project are solely for research demonstration purposes and have no commercial use.