Auto-Encoder-based Audio Denoiser

Waveform

Abstract

This assignment implements an Auto-Encoder Neural Network based Audio Denoiser. Training speech is a female speech and training noise is from cafeteria noise, while test speech is a male speech and test noise is from raining noise. The Auto-Encoder trys to reconstruct the clean audio from the mixed audio. Mel-frequency Spectrum features are used to train the Neural Network.

Publication
USC course EE599: Deep Learning Lab for Speech Processing

Method

Mel-frequency Spectrum features are used to train the Auto-encoder Network.

For more details, take a look at the GitHub page.

Dataset

Audio downloaded from YouTube. Sox is used for audio pre-processing(e.g. downsampling and trimming)

Results(Audio Samples)

Listen to some of the audio samples below!