Qian Cao

I am a PhD student at the Gaoling School of Artificial Intelligence (GSAI), Renmin University of China (RUC), advised by Prof. Ruihua Song. Prior to this, I studied at the School of Information and the Gaoling School of Artificial Intelligence at Renmin University of China, where I obtained my bachelor’s and master’s degrees in 2020 and 2023, respectively.

My research centers on building intelligent systems that integrate multimodal understanding, creative reasoning, and open-domain generation, with a focus on advancing both the controllability and applicability of AI in real-world scenarios.

Main Research Interests:

Creative AI: Developing and evaluating LLMs for controllable and human-like creative text generation.
Multimodal Understanding & Interaction: Exploring vision-language interplay and building multimodal LLMs with stronger cross-modal reasoning abilities.
Applied LLM Systems: Constructing efficient and domain-specific LLM systems, from foundational models to knowledge-augmented applications.

My long-term goal is to develop AI systems that seamlessly unify perception, knowledge, and creativity, enabling machines not only to understand the world, but also to generate meaningful and creative content across modalities.

News

Jan 15, 2026	Our new work DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing is now available! Check it out on [arXiv]!
Jun 9, 2025	Joined the foundation LLM team at Kuaishou Technology as a research intern!
May 25, 2025	Our new work Evaluating Text Creativity across Diverse Domains: A Dataset and Large Language Model Evaluator is now available! Check it out on [arXiv] to explore our dataset [CreataSet] and our creativity evaluator [CrEval]!
Oct 10, 2024	Our work Understanding the Roles of Visual Modality in Multimodal Dialogue: An Empirical Study was accepted by MMM 2025!
Sep 21, 2024	Our work BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain was accepted by EMNLP 2024 Findings!
Jul 16, 2024	Our work See or Guess: Counterfactually Regularized Image Captioning was accepted by ACM MM 2024!

Selected Publications

arXiv

DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing

Qian Cao, Yahui Liu, Wei Bi, Yi Zhao, Ruihua Song, Xiting Wang, Ruiming Tang, Guorui Zhou, and Han Li

arXiv preprint arXiv:2601.09609 2026

@article{cao2026dpwriter,
  title = {DPWriter: Reinforcement Learning with Diverse Planning Branching for Creative Writing},
  author = {Cao, Qian and Liu, Yahui and Bi, Wei and Zhao, Yi and Song, Ruihua and Wang, Xiting and Tang, Ruiming and Zhou, Guorui and Li, Han},
  journal = {arXiv preprint arXiv:2601.09609},
  year = {2026},
}

arXiv

Evaluating Text Creativity across Diverse Domains: A Dataset and Large Language Model Evaluator

Qian Cao, Xiting Wang, Yuzhuo Yuan, Yahui Liu, Fang Luo, and Ruihua Song

arXiv preprint arXiv:2505.19236 2025

@article{cao2025evaluating,
  title = {Evaluating Text Creativity across Diverse Domains: A Dataset and Large Language Model Evaluator},
  author = {Cao, Qian and Wang, Xiting and Yuan, Yuzhuo and Liu, Yahui and Luo, Fang and Song, Ruihua},
  journal = {arXiv preprint arXiv:2505.19236},
  year = {2025},
}

EMNLP 2024

BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain

Kaisi Guan, Qian Cao, Yuchong Sun, Xiting Wang, and Ruihua Song

In Findings of the Association for Computational Linguistics: EMNLP 2024, Nov 2024

@inproceedings{Guan2024BSharedRAG,
  title = {BSharedRAG: Backbone Shared Retrieval-Augmented Generation for the E-commerce Domain},
  author = {Guan, Kaisi and Cao, Qian and Sun, Yuchong and Wang, Xiting and Song, Ruihua},
  booktitle = {Findings of the Association for Computational Linguistics: EMNLP 2024},
  year = {2024},
  month = nov,
  doi = {10.18653/v1/2024.findings-emnlp.62},
}

ACM MM 2024

See or Guess: Counterfactually Regularized Image Captioning

Qian Cao, Xu Chen, Ruihua Song, Xiting Wang, Xinting Huang, and Yuchen Ren

In Proceedings of the 32nd ACM International Conference on Multimedia, Oct 2024

@inproceedings{Cao2024SeeOrGuess,
  title = {See or Guess: Counterfactually Regularized Image Captioning},
  author = {Cao, Qian and Chen, Xu and Song, Ruihua and Wang, Xiting and Huang, Xinting and Ren, Yuchen},
  booktitle = {Proceedings of the 32nd ACM International Conference on Multimedia},
  year = {2024},
  month = oct,
  doi = {10.1145/3664647.3681458},
}

arXiv

YuLan: An Open-source Large Language Model

Yutao Zhu, Kun Zhou, Kelong Mao, Wentong Chen, Yiding Sun, Zhipeng Chen, Qian Cao, Yihan Wu, Yushuo Chen, Feng Wang, and others

arXiv preprint arXiv:2406.19853 Oct 2024

@article{zhu2024yulan,
  title = {YuLan: An Open-source Large Language Model},
  author = {Zhu, Yutao and Zhou, Kun and Mao, Kelong and Chen, Wentong and Sun, Yiding and Chen, Zhipeng and Cao, Qian and Wu, Yihan and Chen, Yushuo and Wang, Feng and others},
  journal = {arXiv preprint arXiv:2406.19853},
  year = {2024},
}

ACM MM 2022

Multi-Modal Experience Inspired AI Creation

Qian Cao, Xu Chen, Ruihua Song, Hao Jiang, Guang Yang, and Zhao Cao

In Proceedings of the 30th ACM International Conference on Multimedia, Oct 2022

@inproceedings{Cao2022MultiModalEI,
  title = {Multi-Modal Experience Inspired AI Creation},
  author = {Cao, Qian and Chen, Xu and Song, Ruihua and Jiang, Hao and Yang, Guang and Cao, Zhao},
  booktitle = {Proceedings of the 30th ACM International Conference on Multimedia},
  year = {2022},
  month = oct,
  doi = {10.1145/3503161.3548189},
}

Experiences

2025.6 - 2026.1	Foundation LLM Team @ Kuaishou	Research Intern
2022.6 - 2023.3.6	AI Lab NLP Center @ Tencent	Research Intern
2020.9 - Now	AIMind Group @ Beijing Key Lab of Big Data Management and Analysis, RUC	Research Assistant
2021.6 - 2022.2	Bing powered Xiaoice, Software Technology Center at Asia @ Microsoft	Research Intern
2021.9 - 2021.12	Gaoling School of Artificial Intelligence, RUC	Teaching Assistant

Services

Conference Reviewer: ICLR'25, MM'24-25, ACL ARR'24-25, EMNLP'23, COLING'25
Journal Reviewer: Discover Computing