International Conference on Neural Information Processing (ICONIP), 2023. In this work, we introduce a framework for cross-lingual speech synthesis, which involves an upstream Voice Conversion (VC) model and a downstream Text-To-Speech (TTS) model. The proposed framework consists of 4 stages. In the first two stages, we use a VC model to convert utterances in the target locale to the voice of the target speaker. In the third stage, the converted data is combined with the linguistic …
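The abstract above only sketches the pipeline, so the following is a hypothetical illustration of how stages 1-3 might fit together; every function and variable name here is an assumption, not something taken from the paper.

```python
# Hypothetical sketch of the staged pipeline described above; none of these
# functions come from the paper, they only illustrate the data flow.

def build_tts_training_set(target_locale_utterances, target_speaker_reference,
                           target_speaker_recordings, vc_model):
    """Stages 1-2: convert target-locale speech into the target speaker's
    voice. Stage 3: pool the converted data with the speaker's own recordings
    (the abstract is truncated, so the exact combination step is unknown)."""
    converted = [vc_model.convert(utt, target_speaker_reference)
                 for utt in target_locale_utterances]
    return converted + list(target_speaker_recordings)
```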
INLG, 2023. Prior art investigating task-oriented dialog and the automatic generation of such dialogs has focused on single-user dialogs between a single user and an agent. However, there is limited study on adapting such AI agents to multi-user conversations (involving multiple users and an agent). Multi-user conversations are richer than single-user conversations, containing social banter and collaborative decision making …
SIGDIAL, 2023. The bulk of work adapting transformer models to open-domain dialogue represents dialogue context as the concatenated set of turns in natural language. However, it is unclear if this is the best approach. In this work, we investigate this question by means of an empirical controlled experiment varying the dialogue context format from text-only formats (all recent utterances, summaries, selected utterances …
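As a rough illustration of the kinds of text-only context formats the abstract lists, the sketch below builds a single context string in three different ways; the `summarizer` and `selector` callables are assumptions for illustration, not components from the paper.

```python
# Illustrative only: three text-only ways to encode dialogue history as the
# context string fed to a transformer, mirroring the formats named above.

def format_context(turns, mode="recent", k=5, summarizer=None, selector=None):
    if mode == "recent":
        # Baseline: concatenate the k most recent utterances verbatim.
        return " ".join(turns[-k:])
    if mode == "summary":
        # Replace the history with a generated summary (summarizer assumed).
        return summarizer(turns)
    if mode == "selected":
        # Keep only utterances judged relevant by some selection function.
        return " ".join(t for t in turns if selector(t))
    raise ValueError(f"unknown context format: {mode}")

print(format_context(["Hi!", "Hey, how are you?", "Good. Seen any films lately?"],
                     mode="recent", k=2))
```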
ESEC/FSE, 2023. ML-powered code generation aims to help developers write code more productively by intelligently generating code blocks based on natural language prompts. Recently, large pretrained deep learning models have pushed the boundary of code generation and achieved impressive performance. However, the huge number of model parameters poses a significant challenge to their adoption in a typical …
Robotics and Automation Letters, 2023. Complex manipulation tasks often require robots with complementary capabilities to collaborate. We introduce a benchmark for LanguagE-Conditioned Multi-robot MAnipulation (LEMMA) focused on task allocation and long-horizon object manipulation based on human language instructions in a tabletop setting. LEMMA features 8 types of procedurally generated tasks with varying degrees of complexity, some of which …
Related content
June 06, 2019. New approach to reference resolution rewrites queries to clarify ambiguous references.
June 05, 2019. Today, customer exchanges with Alexa are generally either one-shot requests, like “Alexa, what’s the weather?”, or interactions that require multiple requests to complete more complex tasks.
May 21, 2019. A person’s tone of voice can tell you a lot about how they’re feeling. Not surprisingly, emotion recognition is an increasingly popular conversational-AI research topic.
May 16, 2019. Text normalization is an important process in conversational AI. If an Alexa customer says, “book me a table at 5:00 p.m.”, the automatic speech recognizer will transcribe the time as “five p m”. Before a skill can handle this request, “five p m” will need to be converted to “5:00PM”. Once Alexa has processed the request, it needs to synthesize the response — say, “Is 6:30 p.m. okay?” Here, “6:30PM” will be converted to “six thirty p m” for the text-to-speech synthesizer. We call the process of converting “5:00PM” to “five p m” text normalization and its counterpart — converting “five p m” to “5:00PM” — inverse text normalization.
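To make the two directions concrete, here is a toy, rule-based sketch covering only the examples in the passage above; real normalization systems use much richer grammars or learned models.

```python
import re

# Toy rule-based sketch of the two directions described above. It covers only
# the examples in the text; production systems are far more general.

HOUR_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5, "six": 6,
              "seven": 7, "eight": 8, "nine": 9, "ten": 10, "eleven": 11,
              "twelve": 12}
WORDS_FOR_HOUR = {v: k for k, v in HOUR_WORDS.items()}
MINUTE_WORDS = {0: "", 30: "thirty"}

def inverse_text_normalize(spoken):
    """Spoken form from ASR -> written form, e.g. 'five p m' -> '5:00PM'."""
    m = re.fullmatch(r"(\w+)(?: (thirty))? ([ap]) m", spoken.strip().lower())
    if not m or m.group(1) not in HOUR_WORDS:
        return spoken  # leave anything we cannot parse untouched
    minute = 30 if m.group(2) else 0
    return f"{HOUR_WORDS[m.group(1)]}:{minute:02d}{m.group(3).upper()}M"

def text_normalize(written):
    """Written form -> spoken form for TTS, e.g. '6:30PM' -> 'six thirty p m'."""
    m = re.fullmatch(r"(\d{1,2}):(\d{2})\s*([AP])M", written.strip(), flags=re.I)
    if (not m or int(m.group(1)) not in WORDS_FOR_HOUR
            or int(m.group(2)) not in MINUTE_WORDS):
        return written
    hour = WORDS_FOR_HOUR[int(m.group(1))]
    minute = MINUTE_WORDS[int(m.group(2))]
    parts = [hour] + ([minute] if minute else []) + [m.group(3).lower(), "m"]
    return " ".join(parts)

print(inverse_text_normalize("five p m"))   # 5:00PM
print(text_normalize("6:30PM"))             # six thirty p m
```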
May 13, 2019. Recently, we published a paper showing that training a neural network to do language processing in English, then retraining it in German, drastically reduces the amount of German-language training data required to achieve a given level of performance.
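A minimal sketch of that transfer recipe, using a toy classifier and randomly generated stand-in datasets (this is not the paper's model or data): the same weights trained on plentiful English data become the starting point for a much smaller German set.

```python
import torch
from torch import nn

def train(model, dataset, epochs, lr=1e-3):
    """Simple supervised training loop over (features, label) batches."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for features, labels in dataset:
            optimizer.zero_grad()
            loss_fn(model(features), labels).backward()
            optimizer.step()
    return model

# Toy stand-ins: a linear classifier over feature vectors, a large "English"
# dataset and a much smaller "German" one.
model = nn.Linear(64, 8)
english_data = [(torch.randn(32, 64), torch.randint(0, 8, (32,))) for _ in range(100)]
german_data = [(torch.randn(8, 64), torch.randint(0, 8, (8,))) for _ in range(5)]

model = train(model, english_data, epochs=3)   # pretrain on English
model = train(model, german_data, epochs=2)    # retrain (fine-tune) on German
```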
May 03, 2019. Using cosine similarity rather than dot product to compare vectors helps prevent "catastrophic forgetting".
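For illustration, a few lines of NumPy show the distinction being exploited: the dot product grows with vector magnitude, while cosine similarity depends only on direction.

```python
import numpy as np

def dot_score(u, v):
    # Raw dot product: sensitive to how large the vectors are.
    return float(np.dot(u, v))

def cosine_score(u, v):
    # Cosine similarity: dot product of length-normalized vectors, in [-1, 1].
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])   # same direction as a, twice the magnitude

print(dot_score(a, b))      # 28.0 -- grows with magnitude
print(cosine_score(a, b))   # 1.0  -- direction only
```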