Mu Yang

About Me

Hi there! My name is Mu Yang. I’m a third-year Ph.D. student in Electrical and Computer Engineering (ECE) at the University of Texas at Dallas. My advisor is Dr. John H. L. Hansen. I’m a member of the UTD Center for Robust Speech Systems (UTD CRSS).

My research interests include Speech Recognition and Speech Synthesis. My recent work has focused on accented (non-native) speech assessment and multilingual Automatic Speech Recognition. I also have prior experience in Spoken Language Understanding (Intent Classification) and Natural Language Processing (Event and Event Temporal Relation Extraction).

Interests

Speech Recognition
Speech Synthesis
Natural/Spoken Language Processing

Education

Ph.D. in Electrical and Computer Engineering, 2021-present
University of Texas at Dallas, USA

Ph.D. in Computer Science, 2020-2021 (withdrew)
Texas A&M University, USA

M.S. in Electrical Engineering, 2017-2019
University of Southern California, USA

B.Eng. in Communication Engineering, 2013-2017
Chongqing University, China

Exchange Student, 2016
National Sun Yat-sen University, Taiwan

Publications

Mu Yang, Bowen Shi, Matthew Le, Wei-Ning Hsu, Andros Tjandra (2024). Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation. arXiv preprint.

PDF Audio Samples

Mu Yang, Naoyuki Kanda, Xiaofei Wang, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li, Takuya Yoshioka (2024). DiariST: Streaming Speech Translation with Speaker Diarization. ICASSP 2024.

PDF

Mu Yang, Ram C. M. C. Shekar, Okim Kang, John H. L. Hansen (2023). What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model. Interspeech 2023.

PDF

Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, Ozlem Kalinli (2023). Learning ASR Pathways: A Sparse Multilingual ASR Model. ICASSP 2023.

PDF

Mu Yang, Kevin Hirschi, Stephen D. Looney, Okim Kang, John H. L. Hansen (2022). Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment. Interspeech 2022 (Oral).

PDF Audio Samples

Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang (2022). Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis. ICASSP 2022.

PDF Code Audio Samples

Mu Yang, Darpit Dave, Madhav Erraguntla, Gerard L. Cote, Ricardo Gutierrez-Osuna (2022). Joint Hypoglycemia Prediction and Glucose Forecasting via Deep Multi-task Learning. ICASSP 2022.

PDF

Mingyu Derek Ma, Jiao Sun, Mu Yang, Kung-Hsiang Huang, Nuan Wen, Shikhar Singh, Rujun Han, Nanyun Peng (2021). EventPlus: A Temporal Event Understanding Pipeline. NAACL 2021 (Demonstrations).

PDF Code Demo

Mu Yang, Karolina Nurzynska, Ann E. Walts, Arkadiusz Gertych (2020). A CNN-based Active Learning Framework to Identify Mycobacteria in Digitized Ziehl-Neelsen Stained Human Tissues. Computerized Medical Imaging and Graphics 2020.

PDF

Kung-Hsiang Huang, Mu Yang, Nanyun Peng (2020). Biomedical Event Extraction with Hierarchical Knowledge Graphs. EMNLP 2020 (Findings).

PDF Code

Rujun Han, I-Hung Hsu, Mu Yang, Aram Galstyan, Ralph Weischedel, Nanyun Peng (2019). Deep Structured Neural Network for Event Temporal Relation Extraction. CoNLL 2019.

PDF Code

Prashanth G. Shivakumar, Mu Yang, Panayiotis Georgiou (2019). Spoken Language Intent Detection using Confusion2Vec. Interspeech 2019.

PDF Dataset

Experience

Research Intern

Meta AI

May 2024 – Aug 2024 New York City, NY, USA

Mentors: Bowen Shi, Matthew Le, Wei-Ning Hsu. Manager: Andros Tjandra (FAIR Audiobox team).

Text-to-audio generation.

Research Intern

Microsoft

May 2023 – Aug 2023 Redmond, WA, USA

Mentors: Naoyuki Kanda, Xiaofei Wang. Manager: Takuya Yoshioka (Cognitive Services Research Speech team).

Speech Translation.

Research Intern

Meta AI

May 2022 – Aug 2022 New York City, NY, USA

Mentors: Andros Tjandra, Chunxi Liu, David Zhang. Managers: Duc Le, Ozlem Kalinli (AI Speech team).

Developed multilingual ASR technologies for on-device scenarios.

Research Assistant

USC ISI

Jul 2019 – Jul 2020 Los Angeles, CA, USA

Supervisor: Nanyun Peng.

NLP projects (Information Extraction).

Selected Projects

Mu Yang

July 2021 Project

Mispronunciation Detection based on Phoneme Recognition

A mispronunciation detection system with word-level aligned phoneme predictions.

Mu Yang, James Bunning, Shiyu Mou, Sharada Murali, Yixin Yang

December 2018 USC course EE599: Deep Learning Lab for Speech Processing

Synthing: A WaveNet-based Singing Voice Synthesizer

Final project for USC course EE599: Deep Learning Lab for Speech Processing: a WaveNet-based singing voice synthesizer. This is a partial implementation of the paper "A Neural Parametric Singing Synthesizer Modeling Timbre and Expression from Natural Songs".

Mu Yang, Tao Chen, Chang Su, Zhe Yang

November 2018 USC course CSCI544: Applied Natural Language Processing

Collection and Classification of Lyrics

A web crawler that collects lyrics and their corresponding music genres. Several baseline classifiers, including Naive Bayes, SVM, and a neural approach (LSTM), are applied to identify the genre of a song from its lyrics.

Mu Yang

April 2018 USC course EE522: Immersive Audio Processing

Digital Room Correction using Parallel Second-order Filter-based Equalizer

A parallel second-order filter-based equalizer for room impulse response calibration.

Misc.

I love music! I play guitar (a lot), bass (a little), drums (a little), and violin (more than 10 years ago). I play and sing in a few (amateur) bands. We cover songs by our favorite artists. Check out some of our videos!

I sing and play guitar
I sing and play guitar (forgive the poor video quality…)
A campus event, 1 (forgive the poor video quality…)
A campus event, 2 (forgive the poor video quality…)