-
Interspeech 2019. Recent works on end-to-end trainable neural network based approaches have demonstrated state-of-the-art results on dialogue state tracking. The best performing approaches estimate a probability distribution over all possible slot values. However, these approaches do not scale for large value sets commonly present in real-life applications and are not ideal for tracking slot values that were not observed
-
SiPS 2019. The Audio Front-End (AFE) is a key component in mitigating acoustic environmental challenges for far-field automatic speech recognition (ASR) on the Amazon Echo family of products. A critical component of the AFE is the Beam Selector, which identifies which beam points to the target user. In this paper, we propose a new SIR beam selector that utilizes subband-based signal-to-interference ratios to learn the
-
Interspeech 2019. Neural language models (NLM) have been shown to outperform conventional n-gram language models by a substantial margin in Automatic Speech Recognition (ASR) and other tasks. There are, however, a number of challenges that need to be addressed for an NLM to be used in a practical large-scale ASR system. In this paper, we present solutions to some of the challenges, including training NLM from heterogeneous
-
Interspeech 2019. Automatic bandwidth extension (restoring high-frequency information from low sample rate audio) has a number of applications in speech processing. We introduce an end-to-end deep learning based system for speech bandwidth extension for use in a downstream automatic speech recognition (ASR) system. Specifically, we propose a conditional generative adversarial network enriched with ASR-specific loss functions
-
ACL 2019. Studies on emotion recognition (ER) show that combining lexical and acoustic information results in more robust and accurate models. The majority of the studies focus on settings where both modalities are available in training and evaluation. However, in practice, this is not always the case; getting ASR output may represent a bottleneck in a deployment pipeline due to computational complexity or privacy
Related content
-
July 09, 2021: The conference’s mission is to bring together stakeholders working toward improving the truthfulness and trustworthiness of online communications.
-
July 08, 2021: Amazon Visiting Academic Barbara Poblete helps to build safer, more-diverse online communities and to aid disaster response.
-
July 02, 2021: Giving a neural generation model “control knobs” enables modulation of the content of generated language.
-
July 01, 2021: Methods share a two-stage training process in which a model learns a representation from audio data, then learns to predict that representation from text.
-
June 24, 2021: The organization focuses on furthering the state of the art on discourse- and dialogue-related technologies.
-
June 17, 2021: Combining classic signal processing with deep learning makes the method efficient enough to run on a phone.