I am a Zhiyuan Honor Program Ph.D. student at Shanghai Jiao Tong University (SJTU), working in the X-LANCE Lab, advised by Prof. Kai Yu and co-advised by Prof. Shinji Watanabe.

My research focuses on Speech Large Language Models (Speech LLMs), with an emphasis on building well-aligned speech understanding systems that are robust to domain shift and multi-speaker conditions.


Research Interests

Broadly, I focus on Speech Large Language Models (Speech LLMs) for speech understanding and reasoning:

  • Multimodal alignment between speech and text for instruction-following speech systems (a minimal sketch follows this list)
  • Efficient adaptation for low-resource / cross-domain settings
  • Speaker-attributed ASR (SA-ASR) and multi-speaker understanding
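
For concreteness, here is a minimal sketch of the alignment idea behind the first bullet: a lightweight projector maps features from a frozen speech encoder into the embedding space of a text LLM, so speech can be consumed alongside text instructions. The module name, dimensions, and downsampling scheme below are illustrative assumptions, not code from any of my papers.

```python
import torch
import torch.nn as nn

class SpeechProjector(nn.Module):
    """Maps speech-encoder features into an LLM's embedding space.

    Hypothetical sketch: a frame-stacking projector, a common way to
    connect a frozen speech encoder to a frozen or LoRA-tuned LLM.
    """

    def __init__(self, speech_dim=1024, llm_dim=4096, stack=4):
        super().__init__()
        self.stack = stack  # concatenate `stack` frames to shorten the sequence
        self.proj = nn.Sequential(
            nn.Linear(speech_dim * stack, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim),
        )

    def forward(self, feats):  # feats: (batch, frames, speech_dim)
        b, t, d = feats.shape
        t = t - t % self.stack  # drop ragged tail frames
        feats = feats[:, :t, :].reshape(b, t // self.stack, d * self.stack)
        return self.proj(feats)  # (batch, frames // stack, llm_dim)

# The projected "soft tokens" are concatenated with text-instruction
# embeddings and fed to the LLM; typically only the projector (and
# optionally LoRA adapters) is trained on paired speech-text data.
speech_feats = torch.randn(2, 100, 1024)  # e.g. from a frozen encoder
soft_tokens = SpeechProjector()(speech_feats)
print(soft_tokens.shape)  # torch.Size([2, 25, 4096])
```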

Research Experience

My recent work spans both academic and industry research labs:

  • Speech LLMs for Speech Understanding (AISpeech, Suzhou, Jiangsu)
    I work on ASR and multimodal alignment methods that connect speech representations with language model reasoning and instruction following.

  • SA-ASR with Speech LLMs (Shenzhen Research Institute of Big Data, Remote)
    I explore Speech LLM-based frameworks for speaker-attributed transcription, aiming to improve speaker consistency and controllability in multi-speaker scenarios (see the sketch after this list).

  • Speaker Discrimination on Omni/SLM (Hi Lab, Xiaohongshu, Shanghai)
    I study explicit speaker discrimination and implicit speaker selection strategies for multi-speaker understanding, with an eye toward robust speaker identity modeling under real-world conditions.
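
To illustrate the SA-ASR setting from the second item above, one common formulation serializes a multi-speaker conversation into a single token stream with speaker tags, which a Speech LLM decoder is trained to emit so that transcription and speaker attribution are predicted jointly. The tag format and helper below are illustrative assumptions, not the exact scheme used in my work.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    speaker: str   # speaker label, e.g. "alice"
    start: float   # segment start time in seconds
    text: str

def serialize(segments, tag="<spk{}>"):
    """Serialize speaker-attributed segments into one target string.

    Hypothetical format: segments sorted by start time, each prefixed
    with a speaker tag assigned in order of first appearance.
    """
    speakers, parts = {}, []
    for seg in sorted(segments, key=lambda s: s.start):
        idx = speakers.setdefault(seg.speaker, len(speakers) + 1)
        parts.append(f"{tag.format(idx)} {seg.text}")
    return " ".join(parts)

segments = [
    Segment("alice", 0.0, "how was the demo"),
    Segment("bob", 2.1, "it went well"),
    Segment("alice", 3.8, "great"),
]
print(serialize(segments))
# <spk1> how was the demo <spk2> it went well <spk1> great
```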


Publications (Selected)

The full list is available on the Publications page.

(* indicates equal contribution.)
  • A Survey on Speech Large Language Models for Understanding
    Jing Peng, Y. Wang, Y. Fang, Y. Xi, X. Li, X. Zhang, K. Yu.
    arXiv:2410.18908. Accepted by IEEE JSTSP.
    https://arxiv.org/abs/2410.18908

  • TASU: Text-Only Alignment for Speech Understanding
    Jing Peng, Y. Yang, X. Li, Y. Xi, Q. Tang, Y. Fang, J. Li, K. Yu.
    arXiv:2511.03310. Accepted by ICASSP 2026.
    https://arxiv.org/abs/2511.03310

  • Low-Resource Domain Adaptation for Speech LLMs via Text-Only Fine-Tuning
    Y. Fang, Jing Peng, X. Li, Y. Xi, C. Zhang, G. Zhong, K. Yu.
    arXiv:2506.05671. Accepted by ASRU 2025.
    https://arxiv.org/abs/2506.05671

  • MOSA: Mixtures of Simple Adapters Outperform Monolithic Approaches in LLM-based Multilingual ASR
    J. Li, Jing Peng, Y. Fang, S. Wang, K. Yu.
    arXiv:2508.18998. Accepted by ICASSP 2026.
    https://arxiv.org/abs/2508.18998


Contact Information

I am always happy to chat and collaborate on the topics above. You can reach me via: