I am a Research Scientist at Apple MLR, where I work on building more capable LLMs through retrieval augmentation and scalable data curation.
I completed my PhD in Computer Science at the University of Waterloo, with a dissertation titled "Novel Methods for Natural Language Modeling and Pretraining", under the supervision of Professor Ming Li. Prior to my doctoral studies, I obtained a Master's degree from the Chinese Academy of Sciences, where I began my NLP research journey under the mentorship of Professor Chengqing Zong.
Preprint, 2026
*equal contribution
ICLR 2026
Preprint, 2025
IEEE ASRU 2025 (Best Demo)
*equal contribution
ACL 2024
EMNLP 2024
ACL 2024 Workshop (KnowLLM)
Dissertation: Novel Methods for Natural Language Modeling and Pretraining
Findings of EMNLP 2022