- ICASSP 2019: Conventional models for emotion recognition from the speech signal are trained in supervised fashion using speech utterances with emotion labels. In this study we hypothesize that the speech signal depends on multiple latent variables, including the emotional state, age, gender, and speech content. We propose an Adversarial Autoencoder (AAE) to perform variational inference over the latent variables and reconstruct … (a minimal AAE training sketch follows this list)
- Interspeech 2019: We present a neural text-to-speech system for fine-grained prosody transfer from one speaker to another. Conventional approaches for end-to-end prosody transfer typically use either a fixed-dimensional or a variable-length prosody embedding via a secondary attention to encode the reference signal. However, when trained on a single-speaker dataset, conventional prosody transfer systems are not robust enough …
- NAACL 2019: In this paper, we consider advancing web-scale knowledge extraction and alignment by integrating OpenIE extractions in the form of (subject, predicate, object) triples with knowledge bases (KB). Traditional techniques from universal schema and from schema mapping fall into two extremes: either they perform instance-level inference relying on embeddings for (subject, object) pairs, and thus cannot handle pairs absent …
- NAACL 2019: This work explores cross-lingual transfer learning (TL) for named entity recognition, focusing on bootstrapping Japanese from English. A deep neural network model is adopted and the best combination of weights to transfer is extensively investigated. Moreover, a novel approach is presented that overcomes linguistic differences between this language pair by romanizing a portion of the Japanese input. Experiments … (a toy romanization sketch also follows this list)
- Interspeech 2019: This paper explores the potential universality of neural vocoders. We train a WaveRNN-based vocoder on 74 speakers coming from 17 languages. This vocoder is shown to be capable of generating speech of consistently good quality (98% relative mean MUSHRA when compared to natural speech) regardless of whether the input spectrogram comes from a speaker or style seen during training or from an out-of-domain …
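The ICASSP 2019 entry above describes an Adversarial Autoencoder that performs variational inference over latent factors of speech. Below is a minimal, hypothetical PyTorch sketch of the generic AAE training loop only: reconstruct the input features and adversarially match the latent code to a Gaussian prior. The feature and latent dimensions, network sizes, and the single unstructured latent code are illustrative assumptions, not the paper's architecture, which factorizes the latent space over emotion, speaker attributes, and content.

```python
# Minimal AAE sketch (illustrative, not the paper's implementation).
import torch
import torch.nn as nn

FEAT_DIM, LATENT_DIM = 128, 16  # assumed sizes for illustration

class Encoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FEAT_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, LATENT_DIM))
    def forward(self, x):
        return self.net(x)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM, 256), nn.ReLU(),
                                 nn.Linear(256, FEAT_DIM))
    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, z):
        return self.net(z)

enc, dec, disc = Encoder(), Decoder(), Discriminator()
opt_ae = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
opt_disc = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(x):
    """One AAE step on a batch of speech feature vectors x of shape (batch, FEAT_DIM)."""
    ones = torch.ones(x.size(0), 1)
    zeros = torch.zeros(x.size(0), 1)

    # 1) Reconstruction: encoder + decoder minimize the feature reconstruction error.
    opt_ae.zero_grad()
    recon_loss = nn.functional.mse_loss(dec(enc(x)), x)
    recon_loss.backward()
    opt_ae.step()

    # 2) Discriminator: tell samples from the Gaussian prior apart from encoded codes.
    opt_disc.zero_grad()
    z_fake = enc(x).detach()
    z_prior = torch.randn_like(z_fake)
    disc_loss = bce(disc(z_prior), ones) + bce(disc(z_fake), zeros)
    disc_loss.backward()
    opt_disc.step()

    # 3) Regularization: encoder tries to make its codes look like prior samples.
    opt_ae.zero_grad()
    gen_loss = bce(disc(enc(x)), ones)
    gen_loss.backward()
    opt_ae.step()

    return recon_loss.item(), disc_loss.item(), gen_loss.item()

# Example usage with random vectors standing in for real spectral features.
batch = torch.randn(32, FEAT_DIM)
print(train_step(batch))
```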
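The cross-lingual NER entry above mentions romanizing part of the Japanese input so that it shares a script with English. A toy sketch of that preprocessing idea follows, using the pykakasi library; the library choice and the example sentence are assumptions of this sketch, not details from the paper.

```python
# Toy romanization step: map Japanese text to Hepburn romaji before it is fed
# to a model whose (sub)word vocabulary is shared with English.
import pykakasi

kks = pykakasi.kakasi()

def romanize(text: str) -> str:
    # convert() segments the input; each segment dict exposes a 'hepburn' key.
    return " ".join(seg["hepburn"] for seg in kks.convert(text))

print(romanize("田中さんはシアトルで働いています"))  # prints a space-separated romanized form
```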
Related content
- January 24, 2019: Machine learning systems often act on “features” extracted from input data. In a natural-language-understanding system, for instance, the features might include words’ parts of speech, as assessed by an automatic syntactic parser, or whether a sentence is in the active or passive voice. (A small feature-extraction example follows this list.)
- January 22, 2019: Developing a new natural-language-understanding system usually requires training it on thousands of sample utterances, which can be costly and time-consuming to collect and annotate. That’s particularly burdensome for small developers, like many who have contributed to the library of more than 70,000 third-party skills now available for Alexa.
- January 15, 2019: Neural networks have been responsible for most of the top-performing AI systems of the past decade, but they tend to be big, which means they tend to be slow. That’s a problem for systems like Alexa, which depend on neural networks to process spoken requests in real time. (Projection image adapted from Michael Horvath under the CC BY-SA 4.0 license.)
- December 21, 2018: In May 2018, Amazon launched Alexa’s Remember This feature, which enables customers to store “memories” (“Alexa, remember that I took Ben’s watch to the repair store”) and recall them later by asking open-ended questions (“Alexa, where is Ben’s watch?”).
- December 18, 2018: At a recent press event on Alexa’s latest features, Alexa’s head scientist, Rohit Prasad, mentioned multistep requests in one shot, a capability that allows you to ask Alexa to do multiple things at once. For example, you might say, “Alexa, add bananas, peanut butter, and paper towels to my shopping list.” Alexa should intelligently figure out that “peanut butter” and “paper towels” name two items, not four, and that bananas are a separate item.
- December 17, 2018: In recent years, data representation has emerged as an important research topic within machine learning.
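The first item above gives parts of speech and active/passive voice as examples of features. Here is a small, hypothetical illustration of extracting such features with NLTK; the blog post does not name a toolkit, and the passive-voice check below is a crude heuristic of this sketch, not the described system.

```python
# Hypothetical feature extraction: part-of-speech tags plus a rough
# active/passive-voice flag, using NLTK.
import nltk

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

def extract_features(sentence):
    tokens = nltk.word_tokenize(sentence)
    pos_tags = nltk.pos_tag(tokens)  # e.g. [('The', 'DT'), ('report', 'NN'), ...]
    words = [w.lower() for w, _ in pos_tags]
    tags = [t for _, t in pos_tags]
    # Crude heuristic: a form of "be" followed within two tokens by a past participle (VBN).
    be_forms = {"am", "is", "are", "was", "were", "be", "been", "being"}
    passive = any(w in be_forms and "VBN" in tags[i + 1:i + 3]
                  for i, w in enumerate(words))
    return {"pos_tags": pos_tags, "passive_voice": passive}

print(extract_features("The report was written by the committee."))  # passive_voice: True
print(extract_features("The committee wrote the report."))           # passive_voice: False
```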