Cong Wei

University of Waterloo; Vector Institute

I am a Ph.D. student at the Cheriton School of Computer Science, University of Waterloo, supervised by Prof. Wenhu Chen. In 2022, I interned at ModiFace, working with Dr. Brendan Duke and Irene. Before starting my Ph.D., I received my master’s degree in Computer Science from the University of Toronto, where I was fortunate to be advised by Prof. Florian Shkurti. I also earned my bachelor’s degree in CS from UofT, where I had the privilege of being advised by Prof. David Duvenaud.

My research interests include:

Generative Models: Designing more controllable methods for image, video, and 3D generation and editing.

Multimodal Learning: Integrating world knowledge into multimodal LLMs for complex reasoning and retrieval-augmented generation.

My Email: cong.wei [at] uwaterloo [dot] ca

news

Dec 6, 2023 Invited talk on UniIR multimodal retrieval at Objective, Inc.
Nov 5, 2023 🔥 Check out our recent preprint on Multimodal Retrieval: UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
Nov 3, 2023 Invited talk on efficient vision transformers at the Vector Institute 2023 Endless Summer School (ESS).
Nov 1, 2023 Check out our new multimodal benchmark MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Feb 28, 2023 One paper on efficient vision transformers accepted to CVPR 2023.

selected publications

  1. arXiv
    UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
    Cong Wei, Yang Chen, Haonan Chen, and 5 more authors
    arXiv:2311.17136, 2023
  2. arXiv
    MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
    Xiang Yue, Yuansheng Ni, Kai Zhang, and 9 more authors
    arXiv:2311.16502, 2023
  3. TMLR
    DreamEdit: Subject-driven Image Editing
    Tianle Li, Max Ku*, Cong Wei*, and 1 more author
    Transactions on Machine Learning Research (TMLR), 2023
  4. CVPR
    Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers
    Cong Wei*, Brendan Duke*, Ruowei Jiang, and 3 more authors
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023