Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

Evaluating robustness to input perturbations for neural machine translation

Xing Niu, Prashant Mathur, Georgiana Dinu, Yaser Al-Onaizan

ACL 2020

2020

Neural Machine Translation (NMT) models are sensitive to small perturbations in the input. Robustness to such perturbations is typically measured using translation quality metrics such as BLEU on the noisy input. This paper proposes additional metrics which measure the relative degradation and changes in translation when small perturbations are added to the input. We focus on a class of models employing
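The idea of measuring relative degradation, rather than absolute quality on noisy input, can be sketched as follows. This is only an illustration under stated assumptions, not the paper's definitions: `unigram_f1` is a toy stand-in for a quality metric such as BLEU, and the names `relative_degradation` and `output_change` are hypothetical.

```python
from collections import Counter


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Toy quality score (assumption): unigram F1, standing in for BLEU."""
    hyp, ref = hypothesis.split(), reference.split()
    if not hyp or not ref:
        return 0.0
    # Clipped count of tokens shared by hypothesis and reference.
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def relative_degradation(clean_out: str, noisy_out: str, reference: str,
                         quality=unigram_f1) -> float:
    """Hypothetical metric: quality on perturbed input divided by quality
    on clean input. A value of 1.0 means no loss; lower means more loss."""
    clean_q = quality(clean_out, reference)
    if clean_q == 0.0:
        raise ValueError("clean translation has zero quality; ratio undefined")
    return quality(noisy_out, reference) / clean_q


def output_change(clean_out: str, noisy_out: str, quality=unigram_f1) -> float:
    """Hypothetical metric: how far the translation moved when the input was
    perturbed, measured without any reference (0.0 means identical output)."""
    return 1.0 - quality(clean_out, noisy_out)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    clean = "the cat sat on the mat"   # model output for the clean input
    noisy = "the cat sat on mat"       # model output for the perturbed input
    print(relative_degradation(clean, noisy, reference))
    print(output_change(clean, noisy))
```

Reporting a ratio separates how much a perturbation hurts from how good the model was to begin with, which a single score on noisy input cannot show.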

Conversational AI
Multiresolution and multimodal speech recognition with transformers

Georgios Paraskevopoulos, Srinivas Parthasarathy, Aparna Khare, Shiva Sundaram

ACL 2020

2020

This paper presents an audio-visual automatic speech recognition (AV-ASR) system using a Transformer-based architecture. We focus in particular on the scene context provided by the visual information to ground the ASR. We extract representations for audio features in the encoder layers of the transformer and fuse video features using an additional crossmodal multihead attention layer. Additionally, we incorporate

Conversational AI
schuBERT: Optimizing elements of BERT

Ashish Khetan, Zohar Karnin

ACL 2020

2020

Transformers (Vaswani et al., 2017) have gradually become a key component for many state-of-the-art natural-language-representation models. A recent Transformer-based model, BERT (Devlin et al., 2018), achieved state-of-the-art results on various natural-language-processing tasks, including GLUE, SQuAD v1.1, and SQuAD v2.0. This model, however, is computationally prohibitive and has a huge number of parameters

Conversational AI
The Cascade Transformer: An application for efficient answer sentence selection

Luca Soldaini, Alessandro Moschitti

ACL 2020

2020

Large transformer-based language models have been shown to be very effective in many classification tasks. However, their computational complexity prevents their use in applications requiring the classification of a large set of candidates. While previous works have investigated approaches to reduce model size, relatively little attention has been paid to techniques to improve batch throughput during inference

Conversational AI
A multitask learning approach for diacritic restoration

Sawsan Alqahtani, Ajay Mishra, Mona Diab

ACL 2020

2020

In many languages, such as Arabic, diacritics are used to specify pronunciations as well as meanings. Such diacritics are often omitted in written text, increasing the number of possible pronunciations and meanings for a word. This makes the text more ambiguous and its computational processing more difficult. Diacritic restoration is the task of restoring missing diacritics in the written text

Conversational AI

Amazon scientists use transfer learning to accelerate development of new Alexa capabilities

Angeliki Metallinou

May 24, 2018

Amazon scientists are continuously expanding Alexa’s natural-language-understanding (NLU) capabilities to make Alexa smarter, more useful, and more engaging.

Conversational AI
Amazon Scientist Outlines Multilayer System For Smart Speaker Echo Cancellation And Voice Enhancement

Jun Yang

May 11, 2018

Smart speakers, such as the Amazon Echo family of products, are growing in popularity among consumer and business audiences. To improve the automatic speech recognition (ASR) and full-duplex voice communication (FDVC) performance of these smart speakers, acoustic echo cancellation (AEC) and noise reduction systems are required. These systems reduce the noise and echoes that can interfere with operation, such as an Echo device's ability to accurately hear the wake word “Alexa.”

Conversational AI
Amazon and University of Sheffield researchers make large-scale fact extraction and verification dataset publicly available

Arpit Mittal

May 04, 2018

In recent years, the amount of textual information produced daily has increased exponentially. This information explosion has been accelerated by the ease with which data can be shared across the web. Most of this textual information is generated as free-form text, and only a small fraction is available in a structured format (Wikidata, Freebase, etc.) that can be processed and analyzed directly by machines.

Search and information retrieval
Making Alexa more friction-free

Ruhi Sarikaya

April 25, 2018

This morning, I am delivering a keynote talk at the World Wide Web Conference in Lyon, France, titled “Conversational AI for Interacting with the Digital and Physical World.”

Conversational AI
Alexa scientists present two new techniques that improve wake word performance

Minhua Wu

April 12, 2018

The Amazon Echo is a hands-free smart home speaker you control with your voice. The first important step in enabling a delightful customer experience with an Echo or other Alexa-enabled device is wake word detection, so accurate detection of “Alexa” or substitute wake words is critical. It is challenging to build a wake word system with low error rates when the device has limited computational resources and operates in the presence of background noise such as speech or music.

Conversational AI
Alexa scientists address challenges of end-pointing

Roland Maas

April 10, 2018

Just as Alexa can wake up without the need to press a button, she also automatically detects when a user finishes her query and expects a response. This task is often called “end-of-utterance detection,” “end-of-query detection,” “end-of-turn detection,” or simply “end-pointing.”
