MoCha

Towards Movie-Grade Talking Character Synthesis
¹GenAI, Meta  ²University of Waterloo
*Work done during the first author’s internship at GenAI, Meta
Project Lead

MoCha is a model for Dialogue-driven Movie Shot Generation.

All talking characters are generated solely from Speech and Text.
The full set of results is available in 🤗Our-Results-on-MoChaBench☕.
We have released 🤗MoChaBench☕, a benchmark tailored to Dialogue-driven Movie Shot Generation.
All videos presented in this project are solely for research demonstration purposes and are not intended for commercial use.



Emotion Control




Action Control




Multi-Characters




Multi-character Conversation with Turn-based Dialogue




Portrait Talking Characters




BibTeX

If you find this project useful for your research, please cite the following:

@article{wei2025mocha,
  title={MoCha: Towards Movie-Grade Talking Character Synthesis},
  author={Wei, Cong and Sun, Bo and Ma, Haoyu and Hou, Ji and Juefei-Xu, Felix and He, Zecheng and Dai, Xiaoliang and Zhang, Luxin and Li, Kunpeng and Hou, Tingbo and others},
  journal={arXiv preprint arXiv:2503.23307},
  year={2025}
}