I am a PhD student at the Gaoling School of Artificial Intelligence (GSAI), Renmin University of China (RUC), advised by Prof. Ruihua Song. Before that, I studied at the School of Information and the Gaoling School of Artificial Intelligence, Renmin University of China, where I obtained my bachelor’s and master’s degrees in 2020 and 2023, respectively.
Research Interests
Various Text Generation problems, especially in Multimodal and Creative scenarios;
Large Language Models (LLMs) & their Applications
News
Oct 10, 2024
Our work "Understanding the Roles of Visual Modality in Multimodal Dialogue: An Empirical Study" was accepted to MMM 2025!
Sep 21, 2024
Our work "BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain" was accepted to EMNLP 2024 Findings!
Jul 16, 2024
Our work "See or Guess: Counterfactually Regularized Image Captioning" was accepted to ACM MM 2024!
Jun 28, 2024
We published YuLan-Base-12B & YuLan-Chat-3-12B, a series of LLMs trained from scratch! See our report.
Sep 3, 2023
Started a new journey as a PhD student at GSAI, RUC.
Jun 26, 2023
Received my master’s degree! Happy graduation!
Publications
MMM 2025
Understanding the Roles of Visual Modality in Multimodal Dialogue: An Empirical Study
Qian Cao, Ruihua Song, and Xu Chen
In Proceedings of the 31st International Conference on Multimedia Modeling, Jan 2025
@inproceedings{Cao2024BSharedRAG,title={Understanding the Roles of Visual Modality in Multimodal Dialogue: An Empirical Study},author={Cao, Qian and Song, Ruihua and Chen, Xu},booktitle={Proceedings of the 31st International Conference on Multimedia Modeling},year={2025},month=jan,doi={10.1007/978-981-96-2071-5_20},}
@inproceedings{Guan2024BSharedRAG,title={BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain},author={Guan, Kaisi and Cao, Qian and Sun, Yuchong and Wang, Xiting and Song, Ruihua},booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},year={2024},month=nov,doi={10.18653/v1/2024.findings-emnlp.62},}
@inproceedings{Cao2024SeeOrGuess,title={See or Guess: Counterfactually Regularized Image Captioning},author={Cao, Qian and Chen, Xu and Song, Ruihua and Wang, Xiting and Huang, Xinting and Ren, Yuchen},booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},year={2024},month=oct,doi={10.1145/3664647.3681458},}
@article{zhu2024yulan,title={YuLan: An Open-source Large Language Model},author={Zhu, Yutao and Zhou, Kun and Mao, Kelong and Chen, Wentong and Sun, Yiding and Chen, Zhipeng and Cao, Qian and Wu, Yihan and Chen, Yushuo and Wang, Feng and others},journal={arXiv preprint arXiv:2406.19853},year={2024},}
@inproceedings{Cao2022MultiModalEI,title={Multi-Modal Experience Inspired AI Creation},author={Cao, Qian and Chen, Xu and Song, Ruihua and Jiang, Hao and Yang, Guang and Cao, Zhao},booktitle={Proceedings of the 30th ACM International Conference on Multimedia},year={2022},month=oct,doi={10.1145/3503161.3548189},}
Experiences
2022.6 - 2023.3.6
NLP Center @ AI Lab
Research Intern
2020.9 - Now
AIMind Group @ Beijing Key Lab of Big Data Management and Analysis, RUC
Research Assistant
2021.6 - 2022.2
Bing powered Xiaoice @ Software Technology Center at Asia