proj

Mis-pronunciation Detection based on Phoneme Recognition

A mis-pronunciation detection system with word-level aligned phoneme predictions.
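
As a rough illustration of what word-level phoneme alignment can look like, the sketch below aligns the phonemes expected for one word against the phonemes a recognizer predicted, using plain edit distance, and reports the positions that differ. This is an assumption about one possible approach, not the project's actual method; the function names `align_phonemes` and `mispronounced` and the ARPAbet-style symbols are hypothetical.

```python
def align_phonemes(expected, predicted):
    """Align two phoneme sequences by edit distance.

    Returns a list of (expected, predicted) pairs; None marks a gap
    (a deleted or inserted phoneme).
    """
    n, m = len(expected), len(predicted)
    # dp[i][j] = edit distance between expected[:i] and predicted[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = i
    for j in range(m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if expected[i - 1] == predicted[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j - 1] + cost,  # match or substitution
                dp[i - 1][j] + 1,         # expected phoneme was dropped
                dp[i][j - 1] + 1,         # extra phoneme was inserted
            )

    # Walk back from the bottom-right corner to recover the alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if expected[i - 1] == predicted[j - 1] else 1
            if dp[i][j] == dp[i - 1][j - 1] + cost:
                pairs.append((expected[i - 1], predicted[j - 1]))
                i -= 1
                j -= 1
                continue
        if i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            pairs.append((expected[i - 1], None))
            i -= 1
        else:
            pairs.append((None, predicted[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def mispronounced(expected, predicted):
    """Return only the aligned pairs where the phonemes disagree."""
    return [(e, p) for e, p in align_phonemes(expected, predicted) if e != p]


# Hypothetical example: the word "think" with its first phoneme replaced.
errors = mispronounced(["TH", "IH", "NG", "K"], ["S", "IH", "NG", "K"])
```

Running the alignment once per word keeps each reported error tied to the word it occurred in.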

Synthing: A WaveNet-based Singing Voice Synthesizer

Final project for USC course EE599: Deep Learning Lab for Speech Processing - a WaveNet-based singing voice synthesizer. This is a partial implementation of the paper [A Neural Parametric Singing Synthesizer Modeling Timbre and Expression from Natural Songs](https://www.mdpi.com/2076-3417/7/12/1313).

Collection and Classification of Lyrics

A web crawler that collects lyrics and their corresponding music genres. Several baseline classifiers, including Naive Bayes, SVM, and a neural approach (LSTM), are applied to identify a song's genre from its lyrics.
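
To give a feel for the simplest of these baselines, here is a minimal multinomial Naive Bayes sketch in plain Python with add-one smoothing. It is not the project's code: the class name `NaiveBayesGenre`, the whitespace tokenizer, and the two toy training strings are all made up for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesGenre:
    """Multinomial Naive Bayes over whitespace-separated lyric tokens."""

    def fit(self, lyrics, genres):
        self.class_counts = Counter(genres)      # songs per genre
        self.word_counts = defaultdict(Counter)  # token counts per genre
        self.vocab = set()
        for text, genre in zip(lyrics, genres):
            tokens = text.lower().split()
            self.word_counts[genre].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_songs = sum(self.class_counts.values())
        best_genre, best_score = None, -math.inf
        for genre, count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(count / total_songs)
            denom = sum(self.word_counts[genre].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[genre][tok] + 1) / denom)
            if score > best_score:
                best_genre, best_score = genre, score
        return best_genre


# Toy, made-up training data; a real run would use the crawled lyrics.
model = NaiveBayesGenre().fit(
    ["truck road whiskey", "beat club dance"],
    ["country", "pop"],
)
guess = model.predict("dance club")
```

Working in log space avoids numeric underflow when a song has many tokens, and add-one smoothing keeps unseen words from zeroing out a genre.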

Digital Room Correction using Parallel Second-order Filter-based Equalizer

A parallel second-order filter-based equalizer for room impulse response calibration.
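
The parallel structure itself can be sketched in a few lines: the input is fed to every second-order section, and the section outputs are summed together with an optional direct path. The code below is only a sketch of that structure under assumed coefficients; it does not show how the project chooses pole positions or fits the sections to a measured room response, and the function names are hypothetical.

```python
def second_order_section(x, b0, b1, a1, a2):
    """Filter x with H(z) = (b0 + b1 z^-1) / (1 + a1 z^-1 + a2 z^-2)."""
    y = []
    x1 = 0.0        # previous input sample
    y1 = y2 = 0.0   # previous two output samples
    for xn in x:
        yn = b0 * xn + b1 * x1 - a1 * y1 - a2 * y2
        y.append(yn)
        x1 = xn
        y2, y1 = y1, yn
    return y


def parallel_equalizer(x, sections, direct_gain=0.0):
    """Sum the outputs of all sections plus a scaled copy of the input."""
    out = [direct_gain * xn for xn in x]
    for b0, b1, a1, a2 in sections:
        branch = second_order_section(x, b0, b1, a1, a2)
        out = [o + v for o, v in zip(out, branch)]
    return out


# Placeholder coefficients chosen only to exercise the code.
impulse = [1.0, 0.0, 0.0]
sections = [(1.0, 0.0, -0.5, 0.0), (1.0, 0.0, -0.5, 0.0)]
response = parallel_equalizer(impulse, sections, direct_gain=1.0)
```

Because the branches are independent, each section can be evaluated separately and the results added at the end.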

Faster-RCNN for Pedestrian Detection in Videos

Train and deploy a Faster-RCNN framework to perform pedestrian detection in videos.