About

I am a Machine Learning Researcher at Apple MLR, specializing in Natural Language Processing (NLP) and Speech Technologies.

Academic Background

I completed my PhD in Computer Science at the University of Waterloo in 2023, with a dissertation titled “Novel Methods for Natural Language Modeling and Pretraining”, under the supervision of Professor Ming Li.
Prior to my doctoral studies, I obtained a Master’s degree from the Chinese Academy of Sciences, where I began my NLP research journey under the mentorship of Professor Chengqing Zong.

Academic Service

I serve as an Area Chair for ACL Rolling Review and have been an active reviewer for top-tier NLP and ML conferences since 2019, including ACL, EMNLP, ICML, NeurIPS, and ICLR. My contributions were recognized with an Outstanding Reviewer Award at ICML 2022. Most recently, I organized a challenge track at the Embodied AI Workshop at CVPR 2024.

Research Interests

My research spans a broad range of Natural Language Processing domains, including language model pretraining, text-speech joint modeling, sentence representation learning, text summarization, machine translation, spoken language understanding, multilingual NLP, and information retrieval.

Currently, I focus on audio and text sequence modeling, developing new approaches at the intersection of speech processing and language modeling.

Selected Publications

He Bai*, Tatiana Likhomanenko*, Ruixiang Zhang, Zijin Gu, Zakaria Aldeneh, Navdeep Jaitly. dMel: Speech Tokenization Made Simple. Preprint. [pdf] (*equal contribution)

Yizhe Zhang*, He Bai*, Ruixiang Zhang*, Jiatao Gu, Shuangfei Zhai, Josh Susskind, Navdeep Jaitly. How Far Are We from Intelligent Visual Deductive Reasoning? COLM 2024 [pdf] [code] (*equal contribution)

He Bai*, Renjie Zheng*, Junkun Chen, Xintong Li, Mingbo Ma, Liang Huang. A3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing. ICML 2022 [pdf] [code] (*equal contribution)

He Bai, Tong Wang, Alessandro Sordoni, Peng Shi. Better Language Model with Hypernym Class Prediction. ACL 2022 [pdf] [code]

He Bai, Peng Shi, Jimmy Lin, Yuqing Xie, Luchen Tan, Kun Xiong, Wen Gao, Ming Li. Segatron: Segment-Aware Transformer for Language Modeling and Understanding. AAAI 2021 (full paper) [pdf] [code]

He Bai, Peng Shi, Jimmy Lin, Luchen Tan, Kun Xiong, Wen Gao, Jie Liu, Ming Li. Semantics of the Unwritten: The Effect of End of Paragraph and Sequence Tokens on Text Generation. ACL-SRW 2021 (workshop paper) [pdf] [code]

He Bai, Yu Zhou, Jiajun Zhang, Chengqing Zong. Memory Consolidation for Contextual Spoken Language Understanding with Dialogue Logistic Inference. ACL 2019 [pdf] [code]

He Bai, Yu Zhou, Jiajun Zhang, Liang Zhao, Mei-Yuh Hwang, Chengqing Zong. Source Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language. COLING 2018 [pdf]