Two-pass endpoint detection for speech recognition

Anirudh Raju; Aparna Khare; Di He; Ilya Sklyar; Long Chen; Sam Alptekin; Viet Anh Tranh; Zhe Zhang; Colin Vaz; Venkatesh Ravichandran; Roland Maas; Ariya Rastrow

2023

Endpoint (EP) detection is a key component of far-field speech recognition systems that assist the user through voice commands. The endpoint detector has to trade off accuracy against latency, since waiting longer reduces the number of users who are cut off early but delays the system's response. We propose a novel two-pass solution for endpointing, where the utterance endpoint detected by a first-pass endpointer is verified by a second-pass model termed the EP Arbitrator. Our method improves the trade-off between early cut-offs and latency over a baseline endpointer, as tested on datasets including voice-assistant transactional queries, conversational speech, and the public SLURP corpus. We demonstrate that our method shows improvements regardless of the first-pass EP model used.
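The control flow of a two-pass endpointer can be illustrated with a minimal sketch. The abstract does not describe the actual models, so everything below is an assumption made for illustration: the first pass is a simple trailing-silence counter over per-frame speech/non-speech decisions, the arbitrator is an arbitrary callable standing in for the learned EP Arbitrator, and the names `two_pass_endpoint`, `min_trailing_silence`, and `arbitrator` are hypothetical.

```python
from typing import Callable, Optional, Sequence


def two_pass_endpoint(
    is_speech: Sequence[bool],
    min_trailing_silence: int,
    arbitrator: Callable[[Sequence[bool]], bool],
) -> Optional[int]:
    """Return the frame index of the accepted endpoint, or None.

    First pass (assumed, not the paper's model): propose a candidate
    endpoint once `min_trailing_silence` consecutive non-speech frames
    follow some speech.
    Second pass (assumed stand-in for the EP Arbitrator): a callable that
    sees the frames up to the candidate and accepts or rejects it.
    """
    seen_speech = False
    silence_run = 0
    for t, speech in enumerate(is_speech):
        if speech:
            seen_speech = True
            silence_run = 0
            continue
        silence_run += 1
        if seen_speech and silence_run >= min_trailing_silence:
            # Candidate endpoint from the first pass: ask the second pass.
            if arbitrator(is_speech[: t + 1]):
                return t
            # Rejected: keep listening and wait for a new candidate.
            silence_run = 0
    return None


# Toy utterance with a short mid-utterance pause (frames 2-3).
frames = [True, True, False, False, True, True, True, False, False, False]

# First pass alone (arbitrator accepts everything) cuts the user off early.
first_pass_only = two_pass_endpoint(frames, 2, lambda prefix: True)

# Toy arbitrator: accept only after at least 4 speech frames were heard.
with_arbitrator = two_pass_endpoint(frames, 2, lambda prefix: sum(prefix) >= 4)

print(first_pass_only, with_arbitrator)
```

In this toy run the first pass alone fires during the mid-utterance pause, while the arbitrated version rejects that candidate and endpoints only after the trailing silence. This mirrors the early cut-off versus latency trade-off described above, but says nothing about how the paper's models make their decisions.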
