Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

AutoEval-ToD: Automated evaluation of task-oriented dialog systems

Arihant Jain, Purav Aggarwal, Rishav Sahay, Chaosheng Dong, Anoop S V K K Saladi

NAACL 2025

2025

Task-oriented Dialog systems (ToD) are essential in automating user interactions, but their complex design and dynamic nature make evaluation particularly challenging. Current evaluation methodologies heavily depend on human annotators, which can be inefficient, subjective, and expensive to scale. To advance the field, there is a pressing need for a reliable, scalable, and systematic evaluation framework

Conversational AI
Constrained decoding with speculative lookaheads

Nishanth Sridhar Nakshatri, Shamik Roy, Rajarshi (Raj) Das, Un Ch, Leo Boytsov, Rashmi Gangadharaiah

NAACL 2025

2025

Constrained decoding with lookahead heuristics (CDLH) is a highly effective method for aligning LLM generations to human preferences. However, the extensive lookahead rollout operations for each generated token makes CDLH prohibitively expensive, resulting in low adoption in practice. In contrast, common decoding strategies such as greedy decoding are extremely efficient, but achieve very low constraint

Conversational AI
On localizing and deleting toxic memories in large language models

Anubrata Das, Manoj Kumar, Ninareh Mehrabi, Anil Ramakrishna, Anna Rumshisky, Kai-Wei Chang, Aram Galstyan, Morteza Ziyadi, Rahul Gupta

NAACL 2025

2025

Ensuring that large language models (LLMs) do not generate harmful text is critical for their safe deployment. A common failure mode involves producing toxic responses to otherwise innocuous prompts. While various detoxification methods have been proposed, the underlying mechanisms that drive toxic generation in LLMs are not yet fully understood. Our work aims to provide a mechanistic understanding of toxic

Conversational AI
Trust dynamics in AI-assisted development: Definitions, factors, and implications

Sadra Sabouri, Philipp Eibl, Xinyi Zhou, Morteza Ziyadi, Nenad Medvidovic, Lars Lindemann, Souti Chattopadhyay

ICSE 2025

2025

Software developers increasingly rely on AI code generation utilities. To ensure that “good” code is accepted into the code base and “bad” code is rejected, developers must know when to trust an AI suggestion. Understanding how developers build this intuition is crucial to enhancing developer-AI collabo-rative programming. In this paper, we seek to understand how developers (1) define and (2) evaluate the

Conversational AI
InfoPO: On mutual information maximization for large language model alignment

Teng Xiao, Zhen Ge, Sujay Sanghavi, Tian Wang, Julian Katz-Samuels, Skylar Versage, Qingjun Cui, Trishul Chilimbi

NAACL 2025

2025

We study the post-training of large language models (LLMs) with human preference data. Recently, direct preference optimization and its variants have shown considerable promise in aligning language models, eliminating the need for reward models and online sampling. Despite these benefits, these methods rely on explicit assumptions about the Bradley-Terry (BT) model, which makes them prone to over-fitting

Conversational AI

How to Make Neural Language Models Practical for Speech Recognition

Anirudh Raju

August 29, 2019

An automatic-speech-recognition system — such as Alexa’s — converts speech into text, and one of its key components is its language model. Given a sequence of words, the language model computes the probability that any given word is the next one. For instance, a language model would predict that a sentence that begins “Toni Morrison won the Nobel” is more likely to conclude “Prize” than “dries”. Language models can thus help decide between competing interpretations of the same acoustic information.

Conversational AI
Neural TTS Makes Speech Synthesizers More Versatile

Jaime Lorenzo Trueba, Viacheslav Klimkov

August 22, 2019

A text-to-speech system, which converts written text into synthesized speech, is what allows Alexa to respond verbally to requests or commands...

Conversational AI
Animation by Nick Little

New AI system helps accelerate Alexa skill development

Boya Yu

August 15, 2019

Embedding entity names from diverse skills in a shared representations space enables system to suggest neglected entity names with 88.5% accuracy.

Conversational AI
More-Efficient Machine Learning Models for On-Device Operation

Chieh-Chi Kao, Ming Sun, Bowen Shi

August 13, 2019

Neural networks are responsible for most recent advances in artificial intelligence, including many of Alexa’s latest capabilities. But neural networks tend to be large and unwieldy, and in recent years, the Alexa team has been investigating techniques for making them efficient enough to run on-device.

Conversational AI
Representing Data at Three Levels of Generality Improves Multitask Machine Learning

Mengwen Liu

August 8, 2019

Alexa currently has more than 90,000 skills, or abilities contributed by third-party developers — the Uber ride-sharing skill, the Jeopardy! trivia game skill, the Starbucks drink-ordering skill, and so on.

Conversational AI
Who’s on First? How Alexa Is Learning to Resolve Referring Terms

Chetan Naik, Pushpendre Rastogi

August 7, 2019

This year, at the Association for Computational Linguistics’ Workshop on Natural-Language Processing for Conversational AI, my colleagues and I won one of two best-paper awards for our work on slot carryover.

Conversational AI

Conversational AI

Publications

Related content

Work with us