- NAACL 2019: The task of Natural Language Inference (NLI) is widely modeled as supervised sentence-pair classification. While there has been a lot of recent work on generating explanations of the predictions of classifiers on a single piece of text, there have been no attempts to generate explanations of classifiers operating on pairs of sentences. In this paper, we show that it is possible to generate token-level…
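As a rough illustration of token-level explanation for a sentence pair, here is a minimal occlusion-style sketch: each hypothesis token is masked in turn, and the drop in the classifier's score is taken as that token's attribution. The toy overlap scorer is an assumption standing in for a trained NLI model, not the paper's method.

```python
def entailment_score(premise_tokens, hypothesis_tokens):
    """Toy stand-in for a trained NLI model's entailment score:
    fraction of hypothesis tokens that also appear in the premise."""
    if not hypothesis_tokens:
        return 0.0
    overlap = sum(1 for t in hypothesis_tokens if t in premise_tokens)
    return overlap / len(hypothesis_tokens)

def token_attributions(premise, hypothesis):
    """Occlusion attribution: mask each hypothesis token and measure
    how much the classifier's score drops without it."""
    p, h = premise.lower().split(), hypothesis.lower().split()
    base = entailment_score(p, h)
    scores = {}
    for i, tok in enumerate(h):
        masked = h[:i] + h[i + 1:]
        scores[tok] = base - entailment_score(p, masked)
    return scores

print(token_attributions("a man is playing a guitar", "a man plays a guitar"))
```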
- ACL 2019: This paper proposes a novel method to inject custom terminology into neural machine translation at run time. Previous work has mainly proposed modifications to the decoding algorithm to constrain the output to include run-time-provided target terms. While effective, these constrained decoding methods add significant computational overhead to the inference step and, as we show…
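One family of alternatives to constrained decoding is source-side annotation: matched source terms are tagged inline with the required target term, and the model is trained to copy the annotated form. A minimal sketch, where the tag format and the term entry are illustrative assumptions rather than the paper's exact scheme:

```python
# Inline source-side terminology annotation: tag each matched source term
# with its run-time-provided target translation so the model can copy it.

def annotate_source(tokens, term_dict):
    """Insert target-side terms next to matched source terms."""
    out = []
    for tok in tokens:
        if tok.lower() in term_dict:
            out.extend([tok, "<trans>", term_dict[tok.lower()], "</trans>"])
        else:
            out.append(tok)
    return out

src = "the patient shows signs of anaphylaxis".split()
terms = {"anaphylaxis": "Anaphylaxie"}  # hypothetical EN->DE term entry
print(" ".join(annotate_source(src, terms)))
```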
- Interspeech 2019: Grapheme-to-phoneme (G2P) models are a key component in Automatic Speech Recognition (ASR) systems, such as the ASR system in Alexa, as they are used to generate pronunciations for out-of-vocabulary words that do not exist in the pronunciation lexicons (mappings like "e c h o" → "E k oU"). Most G2P systems are monolingual and based on traditional joint-sequence-based n-gram models. As an alternative, we…
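To make the G2P task itself concrete, here is a minimal greedy rule-based sketch of grapheme-to-phoneme conversion. The tiny rule table is a toy assumption built around the "echo" example above, standing in for a trained joint-sequence or neural model:

```python
# Toy grapheme-chunk -> phoneme rules (illustrative only, not a real
# phone inventory beyond the "echo" example from the abstract).
RULES = [("ch", "k"), ("o", "oU"), ("e", "E")]

def g2p(word):
    """Greedy longest-match conversion of a word to a phoneme sequence."""
    phones, i = [], 0
    while i < len(word):
        for graph, phone in RULES:
            if word.startswith(graph, i):
                phones.append(phone)
                i += len(graph)
                break
        else:
            phones.append(word[i])  # pass unknown graphemes through
            i += 1
    return phones

print(g2p("echo"))  # ['E', 'k', 'oU']
```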
- SIGDIAL 2019: In a spoken-dialogue system, dialogue state tracker (DST) components track the state of the conversation by updating a distribution over the values associated with each tracked slot at the current user turn, using the interactions up to that point. Much previous work has relied on modeling the natural order of the conversation, using distance-based offsets as an approximation of time. In this…
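A minimal sketch of the per-slot belief update a DST performs at each user turn: the slot's previous distribution is blended with turn-level evidence and renormalized. The interpolation rule here is an illustrative assumption, not any specific published tracker:

```python
def update_belief(prior, turn_scores, alpha=0.7):
    """Blend the slot's previous value distribution with this turn's
    value scores, then renormalize to a proper distribution."""
    values = set(prior) | set(turn_scores)
    raw = {v: (1 - alpha) * prior.get(v, 0.0) + alpha * turn_scores.get(v, 0.0)
           for v in values}
    total = sum(raw.values()) or 1.0
    return {v: s / total for v, s in raw.items()}

belief = {"italian": 0.6, "thai": 0.4}   # slot "cuisine" after turn 1
turn = {"thai": 0.9, "italian": 0.1}     # evidence extracted from turn 2
print(update_belief(belief, turn))
```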
- ICASSP 2019: The use of spatial information from multiple microphones can improve far-field automatic speech recognition (ASR) accuracy. However, conventional microphone-array techniques degrade speech enhancement performance when there is an array-geometry mismatch between design and test conditions. Moreover, such speech enhancement techniques do not always yield ASR accuracy improvements, due to the difference between…
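To show how spatial information is exploited at all, here is a minimal delay-and-sum beamforming sketch: each channel is time-aligned toward the target direction and the channels are averaged, so the target adds coherently. The integer-sample delays stand in for the geometry-dependent steering a real system would compute:

```python
import numpy as np

def delay_and_sum(channels, delays):
    """channels: (num_mics, num_samples); delays: per-mic sample shifts
    toward the target direction. Returns the aligned channel average."""
    num_mics, n = channels.shape
    out = np.zeros(n)
    for mic in range(num_mics):
        out += np.roll(channels[mic], -delays[mic])
    return out / num_mics

# Two mics hearing the same sinusoid, one delayed by 3 samples.
t = np.arange(160)
sig = np.sin(2 * np.pi * t / 16.0)
x = np.stack([sig, np.roll(sig, 3)])
enhanced = delay_and_sum(x, delays=[0, 3])
print(float(np.abs(enhanced - sig).max()))  # 0.0: channels add coherently
```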
Related content
- June 11, 2019: As Alexa expands into new countries, she usually has to be trained on new languages. But sometimes, she has to be re-trained on languages she’s already learned. British English, American English, and Indian English, for instance, are different enough that for each of them, we trained a new machine learning model from scratch.
- June 06, 2019: New approach to reference resolution rewrites queries to clarify ambiguous references.
- June 05, 2019: Today, customer exchanges with Alexa are generally either one-shot requests, like “Alexa, what’s the weather?”, or interactions that require multiple requests to complete more complex tasks.
- May 21, 2019: A person’s tone of voice can tell you a lot about how they’re feeling. Not surprisingly, emotion recognition is an increasingly popular conversational-AI research topic.
- May 16, 2019: Text normalization is an important process in conversational AI. If an Alexa customer says, “Book me a table at 5:00 p.m.”, the automatic speech recognizer will transcribe the time as “five p m”. Before a skill can handle this request, “five p m” will need to be converted to “5:00PM”. Once Alexa has processed the request, it needs to synthesize the response — say, “Is 6:30 p.m. okay?” Here, “6:30 p.m.” will be converted to “six thirty p m” for the text-to-speech synthesizer. We call the process of converting “5:00PM” to “five p m” text normalization, and its counterpart — converting “five p m” to “5:00PM” — inverse text normalization.
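A minimal rule-based sketch of inverse text normalization for times, covering the two examples above. The lookup tables and regex are illustrative assumptions; production systems use far larger grammars or trained models:

```python
import re

# Spoken-form hours and minutes (toy coverage for illustration).
HOURS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
         "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
         "twelve": 12}
MINUTES = {"fifteen": "15", "thirty": "30", "forty five": "45"}

def inverse_normalize_time(text):
    """Rewrite 'five p m' -> '5:00PM' and 'six thirty p m' -> '6:30PM'."""
    pattern = re.compile(
        r"\b(%s)(?: (%s))? ([ap]) m\b" % ("|".join(HOURS), "|".join(MINUTES)))
    def repl(m):
        minute = MINUTES[m.group(2)] if m.group(2) else "00"
        return "%d:%s%sM" % (HOURS[m.group(1)], minute, m.group(3).upper())
    return pattern.sub(repl, text)

print(inverse_normalize_time("book me a table at five p m"))  # ... 5:00PM
print(inverse_normalize_time("is six thirty p m okay"))       # ... 6:30PM
```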
- May 13, 2019: Recently, we published a paper showing that training a neural network to do language processing in English, then retraining it in German, drastically reduces the amount of German-language training data required to achieve a given level of performance.
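A minimal sketch of that warm-start recipe on synthetic data: a toy logistic-regression “model” is pretrained on plentiful “English” data, and its weights initialize training on a much smaller “German” set. This illustrates the training recipe only, not the paper's network or data:

```python
import numpy as np

rng = np.random.default_rng(0)

def train(X, y, w=None, lr=0.5, steps=300):
    """Gradient-descent logistic regression, optionally warm-started at w."""
    w = np.zeros(X.shape[1]) if w is None else w.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Shared feature space; the two "languages" share the same structure.
w_true = np.array([2.0, -1.0, 0.5, 0.0])
X_en = rng.normal(size=(2000, 4))
y_en = (X_en @ w_true + rng.normal(scale=0.5, size=2000) > 0).astype(float)
X_de = rng.normal(size=(50, 4))  # far less "German" data
y_de = (X_de @ w_true + rng.normal(scale=0.5, size=50) > 0).astype(float)

w_en = train(X_en, y_en)                # pretrain on English
w_scratch = train(X_de, y_de)           # German from scratch
w_transfer = train(X_de, y_de, w=w_en)  # German warm-started from English

X_test = rng.normal(size=(5000, 4))
y_test = (X_test @ w_true > 0).astype(float)
for name, w in [("scratch", w_scratch), ("transfer", w_transfer)]:
    acc = (((X_test @ w) > 0).astype(float) == y_test).mean()
    print(name, round(float(acc), 3))
```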