Publications | Mu Yang's Website

Mu Yang, Bowen Shi, Matthew Le, Wei-Ning Hsu, Andros Tjandra (2024). Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation. arXiv preprint.

Mu Yang, Naoyuki Kanda, Xiaofei Wang, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li, Takuya Yoshioka (2024). DiariST: Streaming Speech Translation with Speaker Diarization. ICASSP 2024.

Mu Yang, Ram C. M. C. Shekar, Okim Kang, John H. L. Hansen (2023). What Can an Accent Identifier Learn? Probing Phonetic and Prosodic Information in a Wav2vec2-based Accent Identification Model. Interspeech 2023.

Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, Ozlem Kalinli (2023). Learning ASR Pathways: A Sparse Multilingual ASR Model. ICASSP 2023.

Mu Yang, Kevin Hirschi, Stephen D. Looney, Okim Kang, John H. L. Hansen (2022). Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment. Interspeech 2022 (Oral).

Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang (2022). Towards Lifelong Learning of Multilingual Text-To-Speech Synthesis. ICASSP 2022.

Mu Yang, Darpit Dave, Madhav Erraguntla, Gerard L. Cote, Ricardo Gutierrez-Osuna (2022). Joint Hypoglycemia Prediction and Glucose Forecasting via Deep Multi-task Learning. ICASSP 2022.

Mu Yang (2021). Mis-pronunciation Detection based on Phoneme Recognition. Project.

Mingyu Derek Ma, Jiao Sun, Mu Yang, Kung-Hsiang Huang, Nuan Wen, Shikhar Singh, Rujun Han, Nanyun Peng (2021). EventPlus: A Temporal Event Understanding Pipeline. NAACL 2021 (Demonstrations).

Mu Yang, Karolina Nurzynska, Ann E. Walts, Arkadiusz Gertych (2020). A CNN-based Active Learning Framework to Identify Mycobacteria in Digitized Ziehl-Neelsen Stained Human Tissues. Computerized Medical Imaging and Graphics 2020.

Kung-Hsiang Huang, Mu Yang, Nanyun Peng (2020). Biomedical Event Extraction with Hierarchical Knowledge Graphs. EMNLP 2020 (Findings).

Rujun Han, I-Hung Hsu, Mu Yang, Aram Galstyan, Ralph Weischedel, Nanyun Peng (2019). Deep Structured Neural Network for Event Temporal Relation Extraction. CoNLL 2019.

Prashanth G. Shivakumar, Mu Yang, Panayiotis Georgiou (2019). Spoken Language Intent Detection using Confusion2Vec. Interspeech 2019.

Mu Yang, James Bunning, Shiyu Mou, Sharada Murali, Yixin Yang (2018). Synthing: A WaveNet-based Singing Voice Synthesizer. USC course EE599: Deep Learning Lab for Speech Processing.

Mu Yang, Tao Chen, Chang Su, Zhe Yang (2018). Collection and Classification of Lyrics. USC course CSCI544: Applied Natural Language Processing.

Mu Yang (2018). Digital Room Correction using Parallel Second-order Filter-based Equalizer. USC course EE522: Immersive Audio Processing.

Mu Yang (2017). Faster-RCNN for Pedestrian Detection in Videos. Graduation Project for Undergraduates at Chongqing University.
