Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

Towards Universal Dialogue Act Tagging for Task-Oriented Dialogues

Shachi Paul, Rahul Goel, Dilek Hakkani-Tür

Interspeech 2019

2019

Machine learning approaches for building task-oriented dialogue systems require large conversational datasets with labels to train on. We are interested in building task-oriented dialogue systems from human-human conversations, which may be available in ample amounts in existing customer care center logs or can be collected from crowd workers. Annotating these datasets can be prohibitively expensive. Recently

Related: New Alexa Research on Task-Oriented Dialogue Systems

Conversational AI
Improving ASR confidence scores for Alexa using acoustic and hypothesis embeddings

Prakhar Swarup, Roland Maas, Sri Garimella, Sri Harish Mallidi, Björn Hoffmeister

Interspeech 2019

2019

In automatic speech recognition, confidence measures provide a quantitative representation used to assess the reliability of generated hypothesis text. For personal assistant devices like Alexa, speech recognition errors are inevitable due to the growing number of applications. Hence, confidence scores provide an important metric to downstream consumers to gauge the correctness of ASR hypothesis text and

Conversational AI
Joint multiple intent detection and slot labeling for goal-oriented dialog

Rashmi Gangadharaiah

NAACL 2019

2019

Neural network models have recently gained traction for sentence-level intent classification and token-based slot-label identification. In many real-world scenarios, users have multiple intents in the same utterance, and a token-level slot label can belong to more than one intent. We investigate an attention-based neural network model that performs multi-label classification for identifying multiple intents

Machine learning
Neural Text Normalization with Subword Units

Courtney Mansfield, Ming Sun, Yuzong Liu, Ankur Gandhe, Björn Hoffmeister

NAACL 2019

2019

Text normalization (TN) is an important step in conversational systems. It converts written text to its spoken form to facilitate speech recognition, natural language understanding and text-to-speech synthesis. Finite state transducers (FSTs) are commonly used to build grammars that handle text normalization (Sproat, 1996; Roark et al., 2012). However, translating linguistic knowledge into grammars requires

Related: Should Alexa read “2/3” as “two-thirds” or “February Third”?: The science of text normalization

Conversational AI
Realizing Petabyte Scale Acoustic Modeling

Sree Hari Krishnan Parthasarathi, Pranav Ladkat, Nitin Sivakrishnan, Nikko Ström

IEEE Journal on Emerging and Selected Topics in Circuits and Systems (JETCAS)

2019

Large-scale machine learning (ML) systems such as the Alexa automatic speech recognition (ASR) system continue to improve with increasing amounts of manually transcribed training data. Instead of scaling manual transcription to impractical levels, we utilize semi-supervised learning (SSL) to learn acoustic models (AM) from the vast firehose of untranscribed audio data. Learning an AM from 1 million hours

Conversational AI

Leveraging unannotated data to bootstrap Alexa functions more quickly

Anuj Goyal

January 22, 2019

Developing a new natural-language-understanding system usually requires training it on thousands of sample utterances, which can be costly and time-consuming to collect and annotate. That’s particularly burdensome for small developers, like many who have contributed to the library of more than 70,000 third-party skills now available for Alexa.

Conversational AI
Projection image adapted from Michael Horvath under the CC BY-SA 4.0 license

New method for compressing neural networks better preserves accuracy

Anish Acharya, Rahul Goel

January 15, 2019

Neural networks have been responsible for most of the top-performing AI systems of the past decade, but they tend to be big, which means they tend to be slow. That’s a problem for systems like Alexa, which depend on neural networks to process spoken requests in real time.

Conversational AI
How Alexa may learn to retrieve stored "memories"

Rasool Fakoor

December 21, 2018

In May 2018, Amazon launched Alexa’s Remember This feature, which enables customers to store “memories” (“Alexa, remember that I took Ben’s watch to the repair store”) and recall them later by asking open-ended questions (“Alexa, where is Ben’s watch?”).

Search and information retrieval
How Alexa knows “peanut butter” is one shopping-list item, not two

Sanchit Agarwal

December 18, 2018

At a recent press event on Alexa's latest features, Alexa’s head scientist, Rohit Prasad, mentioned multistep requests in one shot, a capability that allows you to ask Alexa to do multiple things at once. For example, you might say, “Alexa, add bananas, peanut butter, and paper towels to my shopping list.” Alexa should intelligently figure out that “peanut butter” and “paper towels” name two items, not four, and that bananas are a separate item.

Conversational AI
With New Data Representation Scheme, Alexa Can Better Match Skills to Customer Requests

Young-Bum Kim

December 17, 2018

In recent years, data representation has emerged as an important research topic within machine learning.

Conversational AI
New Approach to Language Modeling Reduces Speech Recognition Errors by Up to 15%

Ankur Gandhe

December 13, 2018

Language models are a key component of automatic speech recognition systems, which convert speech into text. A language model captures the statistical likelihood of any particular string of words, so it can help decide between different interpretations of the same sequence of sounds.
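
As a rough illustration of how a language model can help choose between interpretations of the same sounds, here is a minimal sketch of a bigram model with add-one smoothing. It is not the approach described in the post; the toy corpus, the function names (`train_bigram_model`, `log_prob`, `pick_best`), and the candidate transcriptions are all assumptions made up for this example.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count bigrams and their left-hand contexts in a toy corpus (hypothetical helper)."""
    contexts = Counter()
    bigrams = Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])                  # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # adjacent word pairs
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Log-likelihood of a word string under the add-one-smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, word in zip(tokens[:-1], tokens[1:]):
        # Add-one smoothing keeps unseen word pairs from getting zero probability.
        total += math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
    return total


def pick_best(hypotheses, model):
    """Return the candidate transcription the model considers most likely."""
    return max(hypotheses, key=lambda h: log_prob(h, model))


# Toy training text (an assumption for illustration only).
corpus = [
    "how to recognize speech",
    "recognize speech with a language model",
    "speech is hard to recognize",
]
model = train_bigram_model(corpus)

# Two candidate transcriptions of similar-sounding audio.
best = pick_best(["recognize speech", "wreck a nice beach"], model)
```

In a real recognizer the language-model score would be combined with an acoustic score rather than used alone, and the model would be trained on far more text than this toy corpus.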
