Mu Yang's Website
Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation
Current leading Text-To-Audio (TTA) generation models suffer from degraded performance on zero-shot and few-shot settings. It is often …
Mu Yang, Bowen Shi, Matthew Le, Wei-Ning Hsu, Andros Tjandra
PDF
Audio Samples
Improving Mispronunciation Detection with Wav2vec2-based Momentum Pseudo-Labeling for Accentedness and Intelligibility Assessment
Current leading mispronunciation detection and diagnosis (MDD) systems achieve promising performance via end-to-end phoneme …
Mu Yang, Kevin Hirschi, Stephen D. Looney, Okim Kang, John H. L. Hansen
PDF
Audio Samples