Zero-shot test-time adaptation via knowledge distillation for personalized speech denoising and dereverberation

2024
We propose a personalization framework that adapts compact models to test-time environments and improves their speech enhancement performance in noisy and reverberant conditions. The target use case is an end-user device that encounters only one or a few speakers and noise types, which tend to recur in its specific acoustic environment. We therefore postulate that a small personalized model suffices to handle this focused subset of the original universal speech enhancement problem. The study addresses a major data shortage issue: although the goal is to learn from a specific user's speech signals and test-time environment, the target clean speech is unavailable for model training because of privacy concerns and the technical difficulty of recording voice signals free of noise and reverberation. The proposed zero-shot personalization method uses no clean speech target. Instead, it employs a knowledge distillation framework in which the higher-quality denoising results of a much larger teacher model serve as pseudo-targets for training a small student model. Evaluation under various test-time conditions suggests that the proposed personalization approach can significantly enhance the compact student model's test-time performance. Personalized models outperform larger non-personalized baseline models, demonstrating that personalization achieves model compression with no loss in dereverberation and denoising performance.
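The distillation idea can be illustrated with a toy example. The sketch below is not the paper's implementation. It assumes a hypothetical "teacher" (a wide moving-average filter) and a hypothetical "student" (a 3-tap filter trained by gradient descent), both stand-ins for the neural networks the abstract describes. All function names, the synthetic signal, and the hyperparameters are illustrative assumptions. The point it shows is that the student is fitted only to the teacher's output on noisy test-time input, never to clean speech.

```python
import math
import random


def teacher_denoise(noisy, width=9):
    """Hypothetical large teacher: a wide centered moving average."""
    half = width // 2
    out = []
    for i in range(len(noisy)):
        window = noisy[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def student_denoise(weights, noisy):
    """Hypothetical compact student: a short centered FIR filter."""
    half = len(weights) // 2
    out = []
    for i in range(len(noisy)):
        acc = 0.0
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < len(noisy):
                acc += w * noisy[j]
        out.append(acc)
    return out


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distill(noisy, taps=3, lr=0.1, epochs=300):
    """Fit the student to the teacher's output; no clean target is used."""
    pseudo_target = teacher_denoise(noisy)
    half = taps // 2
    weights = [0.0] * taps
    weights[half] = 1.0  # start from the identity filter
    n = len(noisy)
    for _ in range(epochs):
        pred = student_denoise(weights, noisy)
        grads = [0.0] * taps
        for i in range(n):
            err = pred[i] - pseudo_target[i]
            for k in range(taps):
                j = i + k - half
                if 0 <= j < n:
                    grads[k] += 2.0 * err * noisy[j] / n
        weights = [w - lr * g for w, g in zip(weights, grads)]
    return weights, pseudo_target


random.seed(0)
# The clean signal is used only to synthesize the noisy input, not for training.
clean = [math.sin(2 * math.pi * t / 50) for t in range(400)]
noisy = [c + random.gauss(0.0, 0.3) for c in clean]

weights, pseudo_target = distill(noisy)
loss_before = mse(noisy, pseudo_target)  # identity student vs. teacher
loss_after = mse(student_denoise(weights, noisy), pseudo_target)
print(f"distillation loss: {loss_before:.4f} -> {loss_after:.4f}")
```

In this toy setting the training loss is measured against the teacher's pseudo-target only, which mirrors the zero-shot constraint described above. A real system would replace both filters with neural networks and the sine wave with recorded test-time audio.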