A parameter mask is proposed and analyzed in this paper for speech enhancement in robust automatic speech recognition (ASR). Within the framework of computational auditory scene analysis (CASA), the ideal binary mask (IBM) is used to improve the signal-to-noise ratio (SNR), but it does not yield a corresponding improvement in ASR performance; the gap between the SNR improvement and the ASR improvement is large. For a conventional ASR system, the main goal is to provide an energy distribution similar to that of the clean target speech, regardless of whether the energy comes from the speech or the noise. We use the SNR in each time-frequency (T-F) unit to generate a parameter mask (PM), which is used to estimate the clean speech energy from the mixture signal. Experimental results show that the proposed method achieves higher ASR performance than the IBM, with only a very small decrease in SNR performance.
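The contrast between a binary mask and an SNR-driven soft mask can be illustrated with a minimal sketch. The abstract does not give the exact form of the PM, so the soft mask below is a stand-in: a common ratio mask, SNR / (1 + SNR), which recovers the speech energy under the assumption that speech and noise energies add in each T-F unit. The function names, the 0 dB local criterion, and the toy energy values are all hypothetical and chosen only for illustration.

```python
import math


def ideal_binary_mask(speech_energy, noise_energy, lc_db=0.0):
    """Return 1.0 for T-F units whose local SNR exceeds lc_db, else 0.0.

    lc_db (the local criterion) is an assumed threshold, not a value
    taken from the paper.
    """
    mask = []
    for s, n in zip(speech_energy, noise_energy):
        if s <= 0.0:
            snr_db = -math.inf  # no speech energy in this unit
        elif n <= 0.0:
            snr_db = math.inf   # no noise energy in this unit
        else:
            snr_db = 10.0 * math.log10(s / n)
        mask.append(1.0 if snr_db > lc_db else 0.0)
    return mask


def soft_snr_mask(speech_energy, noise_energy):
    """Stand-in soft mask: SNR / (1 + SNR), written as s / (s + n).

    This is NOT claimed to be the paper's parameter mask; it only shows
    how a per-unit SNR can scale the mixture energy.
    """
    mask = []
    for s, n in zip(speech_energy, noise_energy):
        total = s + n
        mask.append(s / total if total > 0.0 else 0.0)
    return mask


# Toy energies for four T-F units (hypothetical values).
speech = [4.0, 0.5, 1.0, 0.0]
noise = [1.0, 1.0, 3.0, 2.0]
# Assumption: speech and noise energies are additive in the mixture.
mixture = [s + n for s, n in zip(speech, noise)]

ibm = ideal_binary_mask(speech, noise)
soft = soft_snr_mask(speech, noise)

# Apply each mask to the mixture energy to estimate the speech energy.
ibm_estimate = [m * x for m, x in zip(ibm, mixture)]
soft_estimate = [m * x for m, x in zip(soft, mixture)]
```

In this toy case the binary mask keeps the full mixture energy in the one speech-dominated unit and discards the speech energy in the noise-dominated units, while the soft mask yields an energy distribution close to that of the clean speech, which is the property the abstract argues matters for a conventional ASR system.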