This assignment implements an audio denoiser based on an auto-encoder neural network. The training data mixes female speech with cafeteria noise, while the test data mixes male speech with rain noise. The auto-encoder tries to reconstruct the clean audio from the mixed audio. Mel-frequency spectrum features are used to train the network.
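To make the mel-frequency features concrete, here is a minimal sketch of a triangular mel filterbank in plain Python. It is not the code used in this assignment: the HTK-style mel formula, the function names, and the example settings (40 bands, 512-point FFT, 16 kHz) are all assumptions chosen for illustration.

```python
import math

def hz_to_mel(hz):
    # HTK-style mel scale (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Return n_mels triangular filters, each a list of n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 edge frequencies, evenly spaced on the mel scale from 0 to Nyquist.
    edges = [mel_to_hz(mel_max * i / (n_mels + 1)) for i in range(n_mels + 2)]
    # Center frequency in Hz of each FFT bin.
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_mels + 1):
        left, center, right = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if left < f <= center:
                weight = (f - left) / (center - left)    # rising slope
            elif center < f < right:
                weight = (right - f) / (right - center)  # falling slope
            else:
                weight = 0.0
            row.append(weight)
        bank.append(row)
    return bank

def mel_spectrum(power_spectrum, bank):
    # Weighted sum of one frame's power spectrum under each triangular filter.
    return [sum(w * p for w, p in zip(row, power_spectrum)) for row in bank]

# Hypothetical settings: 40 mel bands, 512-point FFT, 16 kHz audio.
bank = mel_filterbank(n_mels=40, n_fft=512, sample_rate=16000)
frame = [1.0] * (512 // 2 + 1)  # stand-in for one frame's power spectrum
features = mel_spectrum(frame, bank)
```

Under these assumptions, each frame of the mixed audio would be reduced to a short vector like `features`, and the auto-encoder would be trained to map it to the corresponding vector computed from the clean audio.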
For more details, take a look at the GitHub page.
The audio was downloaded from YouTube. SoX is used for audio pre-processing (e.g. downsampling and trimming).
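As a rough illustration of that pre-processing step, the sketch below builds a SoX command that resamples a file and keeps only a fixed-length excerpt. The file names, the 16 kHz target rate, and the 30-second excerpt are hypothetical; the actual settings used for this assignment are not stated here.

```python
import subprocess

def sox_command(src, dst, sample_rate=16000, start=0, duration=30):
    """Build a SoX argument list that resamples src and trims it to an excerpt."""
    return [
        "sox", src,
        "-r", str(sample_rate),              # output sample rate (downsampling)
        dst,
        "trim", str(start), str(duration),   # keep `duration` seconds from `start`
    ]

def preprocess(src, dst, **kwargs):
    # Runs SoX; requires the `sox` binary to be installed and on PATH.
    subprocess.run(sox_command(src, dst, **kwargs), check=True)

# Hypothetical file names, shown only to illustrate the resulting command.
cmd = sox_command("speech_raw.wav", "speech_16k.wav")
```

Calling `preprocess("speech_raw.wav", "speech_16k.wav")` would run the equivalent of `sox speech_raw.wav -r 16000 speech_16k.wav trim 0 30`.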
Listen to some of the audio samples below!