- 2023 ISCA SPSC Symposium: Federated Learning (FL) offers a privacy-preserving approach to model training, allowing edge devices to learn collaboratively without sharing data. Edge devices like Alexa and Siri are prospective sources of unlabeled audio data that can be tapped to learn robust audio representations. In this work, we bring Self-supervised Learning (SSL) and FL together to learn representations for Automatic Speech Recognition…
- Voice conversion for Lombard speaking style with implicit and explicit acoustic feature conditioning (Interspeech 2023 Workshop on Machine Learning Challenges for Hearing Aids): Text-to-Speech (TTS) systems in Lombard speaking style can improve the overall intelligibility of speech, useful for hearing loss and noisy conditions. However, training those models requires a large amount of data, and the Lombard effect is challenging to record due to speaker and noise variability and tiring recording conditions. Voice conversion (VC) has been shown to be a useful augmentation technique…
- CIKM 2023: In e-commerce sites, customer questions on the product detail page express the customers’ information needs about the product. The answers to these questions often provide the necessary information. In this work, we present and address the novel task of generating product insights from community questions and answers (Q&A). These insights can be presented to customers to assist them in their shopping journey…
- SIGIR 2023: Conversation disentanglement aims to identify and group utterances from a conversation into separate threads. Existing methods in the literature primarily focus on disentangling multi-party conversations involving three or more speakers, which enables their models to explicitly or implicitly incorporate speaker-related feature signals while disentangling. Most existing models require a large amount of human…
- ACL 2023: Methods to generate text from structured data have advanced significantly in recent years, primarily due to fine-tuning of pre-trained language models on large datasets. However, such models can fail to produce output faithful to the input data, particularly on out-of-domain data. Sufficient annotated data is often not available for specific domains, leading us to seek an unsupervised approach to improve…
Related content
- May 12, 2020: Users find speech with transferred expression 9% more natural than standard synthesized speech.
- May 07, 2020: Watch the recording of Manoj Sindhwani's live interview with Alexa evangelist Jeff Blankenburg.
- May 06, 2020: Leveraging semantic content improves the performance of an acoustic-only model for detecting device-directed speech.
- May 04, 2020: Alexa scientist Ariya Rastrow on the blurring boundaries between acoustic processing and language understanding.
- April 30, 2020: Letting a machine learning system label its own examples improves performance.
- April 27, 2020: An end-to-end deep-learning-based solution circumvents the “permutation problem”.