- ICASSP 2020: We present BOFFIN TTS (Bayesian Optimization For FIne-tuning Neural Text To Speech), a novel approach for few-shot speaker adaptation. Here, the task is to fine-tune a pre-trained TTS model to mimic a new speaker using a small corpus of target utterances. We demonstrate that there does not exist a one-size-fits-all adaptation strategy, with convincing synthesis requiring a corpus-specific configuration …
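The Bayesian-optimization idea behind this kind of corpus-specific tuning can be sketched in miniature. Below, a cheap analytic function stands in for the expensive "fine-tune and measure adaptation loss" step, and a small Gaussian-process surrogate with expected improvement picks the next hyperparameter to try. Everything here (the objective, kernel length scale, grid, iteration counts) is illustrative, not taken from the paper:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
Phi = np.vectorize(lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2))))  # standard normal CDF
phi = lambda z: np.exp(-0.5 * z**2) / math.sqrt(2 * math.pi)            # standard normal PDF

# Hypothetical stand-in for the expensive objective: adaptation loss of the
# fine-tuned model as a function of one scaled hyperparameter in [0, 1].
# In the real setting, each call would mean a full fine-tuning run.
def adaptation_loss(x):
    return (x - 0.3) ** 2 + 0.05 * np.sin(20 * x)

def rbf(a, b, ls=0.1):
    # Squared-exponential kernel with unit signal variance.
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

grid = np.linspace(0.0, 1.0, 201)        # candidate configurations
X = list(rng.uniform(0, 1, 3))           # a few random initial evaluations
y = [float(adaptation_loss(x)) for x in X]

for _ in range(10):
    Xa, ya = np.array(X), np.array(y)
    K = rbf(Xa, Xa) + 1e-6 * np.eye(len(Xa))   # jitter keeps K well conditioned
    Ks = rbf(Xa, grid)
    Kinv = np.linalg.inv(K)
    mu = Ks.T @ Kinv @ ya                       # GP posterior mean on the grid
    var = np.maximum(1.0 - np.einsum("ij,ik,kj->j", Ks, Kinv, Ks), 1e-12)
    sd = np.sqrt(var)
    imp = ya.min() - mu                         # improvement over the best loss so far
    z = imp / sd
    ei = imp * Phi(z) + sd * phi(z)             # expected improvement acquisition
    x_next = grid[int(np.argmax(ei))]           # evaluate where EI is largest
    X.append(float(x_next))
    y.append(float(adaptation_loss(x_next)))

best = min(y)  # best configuration's loss after 3 random + 10 BO evaluations
```

The point of the sketch is the loop structure: fit a surrogate to all evaluations so far, then spend the next expensive evaluation where the acquisition function balances low predicted loss against high uncertainty.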
- The Web Conference 2020: Community Question Answering (CQA) websites, such as Stack Exchange or Quora, allow users to freely ask questions and obtain answers from other users, i.e., the community. Personal assistants, such as Amazon Alexa or Google Home, can also exploit CQA data to answer a broader range of questions and increase customers’ engagement. However, voice-based interaction poses new challenges to the Question Answering …
- ICASSP 2020: Supervised deep learning has gained significant attention for speech enhancement recently. The state-of-the-art deep-learning methods perform the task by learning a ratio/binary mask that is applied to the mixture in the time-frequency domain to produce the clean speech. Despite the great performance in the single-channel setting, these frameworks lag in performance in the multichannel setting, as the majority …
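The ratio-mask idea this abstract describes can be illustrated on toy signals. Here an oracle ideal ratio mask, computed from the known clean and noise components purely for illustration (a learned system would predict it from the mixture alone), is applied per time-frequency bin of the mixture; all signals and frame settings are made up for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy signals (illustrative only): a sinusoid stands in for clean speech,
# white noise for the interference.
n, frame = 4096, 256
t = np.arange(n)
clean = np.sin(2 * np.pi * 0.03 * t)
noise = 0.5 * rng.standard_normal(n)
mixture = clean + noise

# Non-overlapping rectangular frames keep this toy transform exactly invertible;
# a real system would use windowed, overlapping frames.
def stft(x):
    return np.fft.rfft(x.reshape(-1, frame), axis=1)

def istft(X):
    return np.fft.irfft(X, n=frame, axis=1).reshape(-1)

S, N, Y = stft(clean), stft(noise), stft(mixture)

# Ideal ratio mask: per time-frequency bin, the fraction of magnitude
# attributable to speech.
irm = np.abs(S) / (np.abs(S) + np.abs(N) + 1e-8)
enhanced = istft(irm * Y)

def snr_db(ref, est):
    return 10 * np.log10(np.sum(ref**2) / np.sum((ref - est) ** 2))

# The masked signal should sit closer to the clean reference than the raw mixture.
snr_before, snr_after = snr_db(clean, mixture), snr_db(clean, enhanced)
```

Bins dominated by noise get a mask value near 0 and are attenuated, while speech-dominated bins pass through nearly unchanged, which is why the masked output lands closer to the clean signal than the mixture does.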
- The Web Conference 2020: Virtual assistants such as Amazon Alexa, Apple Siri, and Google Assistant often rely on a semantic parsing component to understand which action(s) to execute for an utterance spoken by its users. Traditionally, rule-based or statistical slot-filling systems have been used to parse “simple” queries — that is, queries that contain a single action and can be decomposed into a set of non-overlapping entities …
- ICASSP 2020: In this work, we investigated the teacher-student training paradigm to train a fully learnable multi-channel acoustic model for far-field automatic speech recognition (ASR). Using a large offline teacher model trained on beamformed audio, we trained a simpler multi-channel student acoustic model used in the speech recognition system. For the student, both multi-channel feature extraction layers and the …
Related content
- January 21, 2020: Self-learning system uses customers’ rephrased requests as implicit error signals.
- January 16, 2020: According to listener tests, whispers produced by a new machine learning model sound as natural as vocoded human whispers.
- December 11, 2019: Related data selection techniques yield benefits for both speech recognition and natural-language understanding.
- November 6, 2019: Today is the fifth anniversary of the launch of the Amazon Echo, so in a talk I gave yesterday at the Web Summit in Lisbon, I looked at how far Alexa has come and where we’re heading next.
- October 28, 2019: In a paper we’re presenting at this year’s Conference on Empirical Methods in Natural Language Processing, we describe experiments with a new data selection technique.
- October 17, 2019: This year at EMNLP, we will cohost the Second Workshop on Fact Extraction and Verification — or FEVER — which will explore techniques for automatically assessing the veracity of factual assertions online.