- AAAI 2018: The Alexa Meaning Representation Language (AMRL) is a compositional graph-based semantic representation that includes fine-grained types, properties, actions, and roles and can represent a wide variety of spoken language. AMRL increases the ability of virtual assistants to represent more complex requests, including logical and conditional statements as well as ones with nested clauses. Due to this representational
- ACL 2018: In this paper, we explore the task of mapping spoken language utterances to one of thousands of natural language understanding domains in intelligent personal digital assistants (IPDAs). This scenario is observed for many mainstream IPDAs in industry that allow third parties to develop thousands of new domains to augment built-in ones to rapidly increase domain coverage and overall IPDA capabilities. We
- ICASSP 2018: We present an end-of-utterance detector for real-time automatic speech recognition in far-field scenarios. The proposed system consists of three components: a long short-term memory (LSTM) neural network trained on acoustic features, an LSTM trained on 1-best recognition hypotheses of the automatic speech recognition (ASR) decoder, and a feedforward deep neural network (DNN) combining embeddings derived
- SLT 2018: Despite rapid advances in speech recognition, current models remain brittle to superficial perturbations to their inputs. Small amounts of noise can destroy the performance of an otherwise state-of-the-art model. To harden models against background noise, practitioners often perform data augmentation, adding artificially-noised examples to the training set, carrying over the original label. In this paper
- Interspeech 2018: We propose a simple recurrent model for detecting rare sound events, when the time boundaries of events are available for training. Our model optimizes the combination of an utterance-level loss, which classifies whether an event occurs in an utterance, and a frame-level loss, which classifies whether each frame corresponds to the event when it does occur. The two losses make use of a shared vectorial representation
Related content
- December 18, 2020: Researchers propose a method to automatically generate training data for Alexa by identifying cases in which customers rephrase unsuccessful requests.
- December 14, 2020: Parallel speech recognizers, language ID, and translation models geared to conversational speech are among the modifications that make Live Translation possible.
- December 03, 2020: Scientists are recognized for their contributions to conversational understanding systems.
- December 03, 2020: Determining the optimal architectural parameters reduces network size by 84% while improving performance on natural-language-understanding tasks.
- November 25, 2020: Method significantly reduces bias while maintaining comparable performance on machine learning tasks.
- November 24, 2020: Newly named IEEE Fellow discusses his experience in the field of conversational AI, and the ways he and his team are working to make Alexa more intelligent.