Publications

* indicates equal contribution.

Speech LLM Survey & Benchmark

  • ISA-Bench: Benchmarking Instruction Sensitivity for Large Audio Language Models
    B. Li, W. Huang, Y. Qiu, Y. Guo, H. Wang, Z. Li, Jing Peng, Z. Ma, X. Chen, K. Yu
    ICASSP 2026 (accepted), 2026
  • A Survey on Speech Large Language Models for Understanding
    Jing Peng*, Y. Wang*, Y. Fang, Y. Xi, X. Li, X. Zhang, K. Yu
    IEEE JSTSP (accepted), 2024

Speech LLM Alignment

  • TASU: Text-Only Alignment for Speech Understanding
    Jing Peng, Y. Yang, X. Li, Y. Xi, Q. Tang, Y. Fang, J. Li, K. Yu
    ICASSP 2026 (accepted), 2026

Speech LLM Domain Adaptation

  • Low-Resource Domain Adaptation for Speech LLMs via Text-Only Fine-Tuning
    Y. Fang*, Jing Peng*, X. Li, Y. Xi, C. Zhang, G. Zhong, K. Yu
    ASRU 2025 (accepted), 2025

Speech LLM Modular Adaptation

  • MOSA: Mixtures of Simple Adapters Outperform Monolithic Approaches in LLM-based Multilingual ASR
    J. Li, Jing Peng, Y. Fang, S. Wang, K. Yu
    ICASSP 2026 (accepted), 2026

LLM-based ASR Post-processing

  • Fewer Hallucinations, More Verification: A Three-Stage LLM-Based Framework for ASR Error Correction
    Y. Fang, B. Cheng, Jing Peng, X. Li, Y. Xi, C. Zhang, G. Zhong
    ASRU 2025 (accepted), 2025

Controllable / Contextual ASR

  • Joint Decoding Method for Controllable Contextual Speech Recognition Based on Speech LLM
    Y. Fang*, Jing Peng*, Y. Xi, X. Li, H. Li, C. Zhang, G. Zhong, K. Yu
    arXiv preprint, 2025