Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

Measuring and mitigating dialog-to-API constraint violations of in-context learning

Shufan Wang, Sebastien Jean, Sailik Sengupta, James Gung, Nikolaos Pappas, Yi Zhang

EMNLP 2023

In executable task-oriented semantic parsing, the system aims to translate users’ utterances in natural language to machine-interpretable programs (API calls) that can be executed according to pre-defined API specifications. With the popularity of Large Language Models (LLMs), in-context learning offers a strong baseline for such scenarios, especially in data-limited regimes (Hu et al., 2022; Shin et al

CESAR: Automatic induction of compositional instructions for multi-turn dialogs

Taha Aksu, Devamanyu Hazarika, Shikib Mehri, Seokhwan Kim, Dilek Hakkani-Tür, Yang Liu, Mahdi Namazifar

EMNLP 2023

Instruction-based multitasking has played a critical role in the success of large language models (LLMs) in multi-turn dialog applications. While publicly available LLMs have shown promising performance, when exposed to complex instructions with multiple constraints, they lag against state-of-the-art models like Chat-GPT. In this work, we hypothesize that the availability of large-scale complex demonstrations

A multi-modal multilingual benchmark for document image classification

Yoshinari Fujinuma, Siddharth Varia, Nishant Sankaran, Bonan Min, Srikar Appalaraju, Yogarshi Vyas

EMNLP 2023

Document image classification is different from plain-text document classification and consists of classifying a document by understanding the content and structure of documents such as forms, emails, and other such documents. We show that the only existing dataset for this task (Lewis et al., 2006) has several limitations and we introduce two newly curated multilingual datasets (WIKI-DOC and MULTIEURLEX

Plan, verify and switch: Integrated reasoning with diverse x-of-thoughts

Tengxiao Liu, Qipeng Guo, Yuqing Yang, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

EMNLP 2023

Large language models (LLMs) have shown effectiveness with different prompting methods, such as Chain of Thought and Program of Thought, and we find that these methods complement each other well on math reasoning tasks. In this work, we propose XoT, an integrated problem-solving framework that prompts LLMs with diverse reasoning thoughts. For each question, XoT always begins with selecting

Semantic matching for text classification with complex class descriptions

Brian de Silva, Kuan-Wen Huang, Gwang Lee, Karen Hovsepian, Yan Xu, Mingwei Shen

EMNLP 2023

Text classifiers are an indispensable tool for machine learning practitioners, but adapting them to new classes is expensive. To reduce the cost of new classes, previous work exploits class descriptions and/or labels from existing classes. However, these approaches leave a gap in the model development cycle as they support either zero- or few-shot learning but not both. Existing classifiers either do not

“Alexa, Turn Down the Lights and Play Music”: The Science of Handling Compound Requests

Rahul Goel

May 02, 2019

Traditionally, Alexa has interpreted customer requests according to their intents and slots. If you say, “Alexa, play ‘What’s Going On?’ by Marvin Gaye,” the intent should be PlayMusic, and “‘What’s Going On?’” and “Marvin Gaye” should fill the slots SongName and ArtistName.
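
As a rough illustration of the intent-and-slot representation described above, the example request could be modeled as a small record. This is a minimal sketch, not Alexa's actual data format: the `Interpretation` class and its field names are hypothetical, and only the intent and slot names (PlayMusic, SongName, ArtistName) come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Interpretation:
    """Hypothetical container for one interpreted request (illustrative only)."""
    intent: str                                            # what the customer wants done
    slots: Dict[str, str] = field(default_factory=dict)    # slot name -> value from the utterance


# "Alexa, play 'What's Going On?' by Marvin Gaye"
example = Interpretation(
    intent="PlayMusic",
    slots={"SongName": "What's Going On?", "ArtistName": "Marvin Gaye"},
)

print(example.intent)               # the single intent assigned to the request
print(sorted(example.slots))        # the slot names that were filled
```

A representation like this assigns exactly one intent per request, which is why a compound request such as the one in the title needs additional handling.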

Training Speech Synthesizers on Data from Multiple Speakers

Jakub Lachowicz

April 25, 2019

When a customer asks Alexa to play “Hey Jude,” and Alexa responds, “Playing ‘Hey Jude’ by the Beatles,” that response is generated by a text-to-speech (TTS) system, which converts textual inputs into synthetic-speech outputs...

Using wake word acoustics to filter out background speech improves speech recognition by 15%

Xing Fan

April 22, 2019

One of the ways that we’re always trying to improve Alexa’s performance is by teaching her to ignore speech that isn’t intended for her. At this year’s International Conference on Acoustics, Speech, and Signal Processing, my colleagues and I will present a new technique for doing this, which could complement the techniques that Alexa already uses.

Two new papers discuss how Alexa recognizes sounds

Ming Sun

April 18, 2019

Last year, Amazon announced the beta release of Alexa Guard, a new service that lets customers who are leaving the house instruct their Echo devices to listen for glass breaking or smoke and carbon monoxide alarms going off. At this year’s International Conference on Acoustics, Speech, and Signal Processing, our team is presenting several papers on sound detection. I wrote about one of them a few weeks ago, a new method for doing machine learning with unbalanced data sets.

Signal processor improves Echo’s bass response, loudness, and speech recognition accuracy

Jun Yang

April 11, 2019

Multiband dynamics processing, which separately modifies volume in different frequency bands of an audio signal, is known to improve listeners’ audio experiences. But in the context of voice-controlled systems like the Amazon Echo family of products, it can also improve automatic speech recognition by making echo cancellation easier.

Cross-lingual transfer learning for bootstrapping AI systems reduces new-language data requirements

Quynh Ngoc Thi Do, Judith Gaspers

April 08, 2019

Transfer learning is the technique of adapting a machine learning model trained on abundant data to a new context in which training data is sparse. On the Alexa team, we’ve explored transfer learning as a way to bootstrap new functions and to add new classification categories to existing machine learning systems.
