About Me

I’m Keqing, currently working on the Meituan LLM Team. My professional experience includes reasoning models (like o1), mixture of experts (MoE), and LLM alignment. Before that, I worked on dialogue systems, including end-to-end dialogue systems and dialogue pre-training.

My research interests focus on LLMs, including:

  • complex reasoning: Complex reasoning abilities are a key milestone in the development of LLMs, and the rise of reasoning models has rapidly advanced the field. My focus is on the evolution of foundational models and the optimization of Long-COT RL. For reasoning models, we need to build new technical pipelines—innovating from pre-training to post-training, from data to algorithms—to push the boundaries of what’s possible.
  • reinforcement learning in real-world settings: While LLMs have made impressive strides in reasoning tasks like code and math, they’ve yet to translate into real-world productivity. LLM-driven end-to-end agent systems — such as DeepResearch, GUI Agent, and Embodied Agent — offer an exciting and imaginative path forward. My core interest lies in reinforcement learning in real-world settings, pushing the limits of intelligence through interaction with dynamic environments.
  • LLM alignment: Alignment is an essential process when working with LLMs to ensure that models align with human values. My primary focus is on scalable alignment learning, including data evaluation and optimization, as well as preference learning algorithms. This work is crucial for shaping models that are not only powerful but also ethically sound and aligned with our goals.

News

  • 2025.1: We have one paper accepted by ICLR2025
  • 2024.6: We have two papers accepted by EMNLP2024
  • 2024.2: We have two papers accepted by COLING2024, one paper accepted by ACL2024
  • 2023.12: We have one paper accepted by ICLR2024
  • 2023.10: We have four papers accepted by EMNLP2023
  • 2023.5: We have four papers accepted by ACL2023
  • 2022.12: We have four papers accepted by EMNLP2022
  • 2022.8: We have three papers accepted by COLING2022
  • 2022.8: We have one paper accepted by CIKM2022
  • 2022.4: We have two papers accepted by NAACL2022
  • 2022.3: We have one paper accepted by SIGIR2022
  • 2022.2: We have one paper accepted by ACL2022

Experience

  1. Full-time employee in Meituan LLM Group, Mar 2023 - Now:
    • Research areas: reasoning models (like o1), mixture of experts (MoE), and LLM alignment.
  2. Full-time employee in Meituan NLP Group, Jun 2021 - Mar 2023:
    • Research areas: dialogue systems and dialogue pre-training.
  3. Research Intern in Alibaba DAMO, Jun 2020 - Oct 2020:
    • Research areas: recommendation systems.
  4. Research Intern in Tencent WeChat AI Lab, Mar 2020 - Jun 2020:
    • Research areas: zero-shot learning and slot filling.
  5. Research Intern in Meituan NLP Group, Oct 2019 - Mar 2020:
    • Research areas: GCN and dialogue systems.

Education

  • 2018-2021, Master in Artificial Intelligence, BEIJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS

  • 2014-2018, Bachelor in Communication Engineering, BEIJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS

Publications

Please see the full paper list on Semantic Scholar

  1. SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild, arXiv
    • Weihao Zeng, Yuzhen Huang, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He
    • paper, code
  2. AgentRefine: Enhancing Agent Generalization through Refinement Tuning, ICLR2025
    • Dayuan Fu*, Keqing He*, Wei Wang, Jingang Wang, Xunliang Cai, Weiran Xu, et al.
    • paper, code
  3. DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning, ACL2024
    • Yejie Wang*, Keqing He*, Mengdi Zhang, Jingang Wang, Xunliang Cai, Weiran Xu, et al.
    • paper
  4. How Do Your Code LLMs Perform? Empowering Code Instruction Tuning with Really Good Data, EMNLP2024
    • Yejie Wang*, Keqing He*, Jingang Wang, Mengdi Zhang, Xunliang Cai, Weiran Xu, et al.
    • paper
  5. Scaling Laws Across Model Architectures: A Comparative Analysis of Dense and MoE Models in Large Language Models, EMNLP2024
    • Siqi Wang, Zhengyu Chen, Bei Li, Keqing He, Min Zhang, Jingang Wang
    • paper
  6. What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning, ICLR2024
    • Wei Liu*, Weihao Zeng*, Keqing He, Yong Jiang, Junxian He
    • paper, code
  7. Large Language Models Meet Open-World Intent Discovery and Recognition: An Evaluation of ChatGPT, EMNLP2023
    • Xiaoshuai Song*, Keqing He*, Pei Wang, Guanting Dong, Yutao Mou, Jingang Wang, Yunsen Xian, Xunliang Cai, Weiran Xu
    • paper, code
  8. DemoNSF: A Multi-task Demonstration-based Generative Framework for Noisy Slot Filling Task, EMNLP2023 Findings
    • Guanting Dong*, Tingfeng Hui*, Zhuoma GongQue, Jinxu Zhao, Daichi Guo, Gang Zhao, Keqing He, Weiran Xu
    • paper, code
  9. Continual Generalized Intent Discovery: Marching Towards Dynamic and Open-world Intent Recognition, EMNLP2023 Findings
    • Xiaoshuai Song*, Yutao Mou*, Keqing He*, Yueyan Qiu, Jinxu Zhao, Pei Wang, Weiran Xu
    • paper, code
  10. APP: Adaptive Prototypical Pseudo-Labeling for Few-shot OOD Detection, EMNLP2023 Findings
    • Pei Wang*, Keqing He*, Yutao Mou*, Xiaoshuai Song, Yanan Wu, Jingang Wang, Yunsen Xian, Xunliang Cai, Weiran Xu
    • paper, code
  11. Decoupling Pseudo Label Disambiguation and Representation Learning for Generalized Intent Discovery, ACL2023
    • Yutao Mou*, Xiaoshuai Song*, Keqing He*, Chen Zeng, Pei Wang, Jingang Wang, Yunsen Xian, Weiran Xu
    • paper, code
  12. FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for Task-Oriented Dialogue, ACL2023
    • Weihao Zeng*, Keqing He*, Yejie Wang, Chen Zeng, Jingang Wang, Yunsen Xian, Weiran Xu
    • paper, code
  13. Seen to Unseen: Exploring Compositional Generalization of Multi-Attribute Controllable Dialogue Generation, ACL2023
    • Weihao Zeng*, Lulu Zhao*, Keqing He, Ruotong Geng, Jingang Wang, Wei Wu, Weiran Xu
    • paper
  14. Generative Zero-Shot Prompt Learning for Cross-Domain Slot Filling with Inverse Prompting, ACL2023 Findings
    • Xuefeng Li*, Liwen Wang*, Guanting Dong*, Keqing He, Jinzheng Zhao, Hao Lei, Jiachi Liu, Weiran Xu
    • paper, code
  15. UniNL: Aligning Representation Learning with Scoring Function for OOD Detection via Unified Neighborhood Learning, EMNLP2022
    • Yutao Mou*, Pei Wang*, Keqing He*, Yanan Wu, Jingang Wang, Wei Wu, Weiran Xu
    • paper, code
  16. Watch the Neighbors: A Unified K-Nearest Neighbor Contrastive Learning Framework for OOD Intent Discovery, EMNLP2022 oral
    • Yutao Mou*, Keqing He*, Pei Wang, Yanan Wu, Jingang Wang, Wei Wu, Weiran Xu
    • paper, code
  17. Semi-Supervised Knowledge-Grounded Pre-training for Task-Oriented Dialog Systems, EMNLP2022 SereTOD Workshop (Track II champion)
    • Weihao Zeng*, Keqing He*, Zechen Wang*, Dayuan Fu, Guanting Dong, Ruotong Geng, Pei Wang, Jingang Wang, Chaobo Sun, Wei Wu, Weiran Xu
    • paper, code
  18. Disentangling Confidence Score Distribution for Out-of-Domain Intent Detection with Energy-Based Learning, EMNLP2022 SereTOD Workshop
    • Yanan Wu*, Zhiyuan Zeng*, Keqing He*, Yutao Mou, Pei Wang, Yuanmeng Yan, Weiran Xu
    • paper, code
  19. Distribution Calibration for Out-of-Domain Detection with Bayesian Approximation, COLING2022
    • Yanan Wu*, Zhiyuan Zeng*, Keqing He*, Yutao Mou, Pei Wang, Weiran Xu
    • paper, code
  20. PSSAT: A Perturbed Semantic Structure Awareness Transferring Method for Perturbation-Robust Slot Filling, COLING2022
    • Guanting Dong*, Daichi Guo*, Liwen Wang*, Xuefeng Li*, Zechen Wang, Chen Zeng, Keqing He, Jinzheng Zhao, Hao Lei, Xinyue Cui, Yi Huang, Junlan Feng, Weiran Xu
    • paper
  21. Unified Knowledge Prompt Pretraining for Customer Service Dialogues, CIKM2022
    • Keqing He, Jingang Wang, Chaobo Sun, Wei Wu
    • paper
  22. Domain-Oriented Prefix-Tuning: Towards Efficient and Generalizable Fine-tuning for Zero-Shot Dialogue Summarization, NAACL2022 oral
    • Lulu Zhao*, Fujia Zheng*, Weihao Zeng, Keqing He, Weiran Xu, Huixing Jiang, Wei Wu, Yanan Wu
    • paper, code
  23. Revisit Overconfidence for OOD Detection: Reassigned Contrastive Learning with Adaptive Class-dependent Threshold, NAACL2022
    • Yanan Wu*, Keqing He*, Yuanmeng Yan, Qixiang Gao, Zhiyuan Zeng, Fujia Zheng, Lulu Zhao, Huixing Jiang, Wei Wu, Weiran Xu
    • paper, code
  24. ADPL: Adversarial Prompt-based Domain Adaptation for Dialogue Summarization with Knowledge Disentanglement, SIGIR2022
    • Lulu Zhao*, Fujia Zheng*, Weihao Zeng, Keqing He, Ruotong Geng, Huixing Jiang, Wei Wu, Weiran Xu
    • paper
  25. Disentangled Knowledge Transfer for OOD Intent Discovery with Unified Contrastive Learning, ACL2022
    • Yutao Mou*, Keqing He*, Yanan Wu*, Zhiyuan Zeng, Hong Xu, Huixing Jiang, Wei Wu, Weiran Xu
    • paper, code
  26. Bridge to Target Domain by Prototypical Contrastive Learning and Label Confusion: Re-explore Zero-Shot Learning for Slot Filling, EMNLP2021 oral
    • Liwen Wang*, Xuefeng Li*, Jiachi Liu, Keqing He, Yuanmeng Yan, Weiran Xu
    • paper, code
  27. A Finer-grain Universal Dialogue Semantic Structures based Model For Abstractive Dialogue Summarization, EMNLP2021 Findings
    • Yuejie Lei*, Fujia Zheng*, Yuanmeng Yan, Keqing He, Weiran Xu
    • paper, code
  28. Novel Slot Detection: A Benchmark for Discovering Unknown Slot Types in the Task-Oriented Dialogue System, ACL2021 oral
    • Yanan Wu*, Zhiyuan Zeng*, Keqing He*, Hong Xu, Yuanmeng Yan, Huixing Jiang and Weiran Xu
    • paper, code
  29. Modeling Discriminative Representations for Out-of-Domain Detection with Supervised Contrastive Learning, ACL2021
    • Zhiyuan Zeng*, Keqing He*, Yuanmeng Yan, Zijun Liu, Yanan Wu, Hong Xu, Huixing Jiang and Weiran Xu
    • paper, code
  30. Scheduled Dialog Policy Learning: An Automatic Curriculum Learning Framework for Task-oriented Dialog System, ACL2021 Findings
    • Sihong Liu, Jinchao Zhang, Keqing He, Weiran Xu and Jie Zhou
    • paper
  31. Adversarial Self-Supervised Learning for Out-of-Domain Detection, NAACL2021 oral
    • Zhiyuan Zeng, Keqing He, Yuanmeng Yan, Hong Xu, Weiran Xu
    • paper, code
  32. Dynamically Disentangling Social Bias from Task-Oriented Representations with Adversarial Attack, NAACL2021
    • Liwen Wang*, Yuanmeng Yan*, Keqing He, Yanan Wu, Weiran Xu
    • paper, code
  33. Contrastive Zero-Shot Learning for Cross-Domain Slot Filling with Adversarial Attack, COLING2020 oral
    • Keqing He, Jinchao Zhang, Yuanmeng Yan, Weiran Xu, Cheng Niu, Jie Zhou
    • paper
  34. Syntactic Graph Convolution Network for Spoken Language Understanding, COLING2020
    • Keqing He*, Shuyu Lei*, Jiangnan Xia, Yushu Yang, Huixing Jiang, Zhongyuan Wang
    • paper
  35. A Deep Generative Distance-Based Classifier for Out-of-Domain Detection with Mahalanobis Space, COLING2020 oral
    • Hong Xu, Keqing He, Yuanmeng Yan, Sihong Liu, Zijun Liu, Weiran Xu
    • paper, code
  36. Adversarial Semantic Decoupling for Recognizing Open-Vocabulary Slots, EMNLP2020 oral
    • Yuanmeng Yan*, Keqing He*, Hong Xu, Sihong Liu, Fanyu Meng, Min Hu, Weiran Xu
    • paper, code
  37. Learning to Tag OOV Tokens by Integrating Contextual Representation and Background Knowledge, ACL2020
    • Keqing He, Yuanmeng Yan, Hong Xu, Sihong Liu, Weiran Xu
    • paper

Contact

  • Address: Beijing, China
  • Email: helicbupt@gmail.com
  • Blog: https://helicqin.github.io