Conversational AI

Building software and systems that help people communicate with computers naturally, as if communicating with family and friends.

Paralinguistics-enhanced large language modeling of spoken dialogue

GUAN-TING LIN, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yi Gu, Shalini Ghosh, Andreas Stolcke, Hung-yi Lee, Ivan Bulyko

ICASSP 2024

2024

Large Language Models (LLMs) have demonstrated superior abilities in tasks such as chatting, reasoning, and question-answering. However, standard LLMs may ignore crucial paralinguistic information, such as sentiment, emotion, and speaking style, which are essential for achieving natural, human-like spoken conversation, especially when such information is conveyed by acoustic cues. We therefore propose Paralinguistics-enhanced

Conversational AI
How robust are LLMs to in-context majority label bias?

Karan Gupta, Sumegh Roychowdhury, Siva Rajesh Kasa, Santhosh Kasa, Anish Bhanushali, Nikhil Pattisapu, Prasanna Srinivasa Murthy, Alok Chandra

AAAI 2024 Workshop on Responsible Language Models

2024

In the In-Context Learning (ICL) setup, various forms of label biases can manifest. One such manifestation is majority label bias, which arises when the distribution of labeled examples in the in-context samples is skewed towards one or more specific classes making Large Language Models (LLMs) more prone to predict those labels. Such discrepancies can arise from various factors, including logistical constraints

Conversational AI
Noise-free audio signal processing in noisy environment: A hardware and algorithm solution

Yarong Feng, Zongyi Liu, Shunyan Luo, Yuan Ling, Shujing Dong, Shuyi Wang, Bruce Ferry

NeurIPS 2023 Workshop on Robustness of Zero/Few-shot Learning in Foundation Models (R0-FoMo)

2024

Dealing with background noise is a challenging task in audio signal processing, negatively impacting algorithm performance and system robustness. In this paper, we propose a simple solution that combines recording hardware modification and algorithm improvement to tackle the challenge. The proposed solution could produce clean and noise-free high-quality audio recording even in noisy recording environment

Conversational AI
MIVC: Multiple instance visual component for visual-language models

Wenyi Wu, Qi Li, Wenliang Zhong, Junzhou Huang

WACV 2024

2024

Vision-language models have been widely explored across a wide range of tasks and achieve satisfactory performance. However, it’s under-explored how to consolidate entity understanding through a varying number of images and to align it with the pre-trained language models for generative tasks. In this paper, we propose MIVC, a general multiple instance visual component to bridge the gap between various

Related: Vision-language models that can handle multi-image inputs

Computer vision
VCC: Scaling transformers to 128K tokens or more by prioritizing important tokens

Zhanpeng Zeng, Cole Hawkins, Mingyi Hong, Aston Zhang, Nikolaos Pappas, Vikas Singh, Shuai Zheng

NeurIPS 2023

2023

Transformers are central in modern natural language processing and computer vision applications. Despite recent works devoted to reducing the quadratic cost of such models (as a function of the sequence length), dealing with ultra long sequences (e.g., with more than 16K tokens) remains challenging. Applications such as answering questions based on a book or summarizing a scientific article are inefficient

Conversational AI

Amazon-sponsored workshop advances deep learning for code

Staff writer

May 02, 2023

ICLR workshop sponsored by Amazon CodeWhisperer features Amazon papers on a novel contrastive-learning framework for causal language models and a way to gauge the robustness of code generation models.

Conversational AI
How Amazon scientists are driving success for Alexa in the car

John Roach

April 12, 2023

From noisy cars to unreliable signals, researchers have worked to extend the Alexa experience to vehicles on the move.

Conversational AI
Five finalists selected for inaugural Alexa Prize SimBot Challenge

Alexa Prize team

April 06, 2023

University teams are competing to help advance the science of conversational embodied AI and robust human AI interaction.

Conversational AI
How Amazon Chime SDK’s voice tone analysis works

Masahito Togami, Mike Goodwin

April 03, 2023

Combining acoustic and lexical information improves real-time voice sentiment analysis.

Conversational AI
Amazon, MIT research symposium focused on cutting-edge technology

Staff writer

March 31, 2023

Attendees explored new avenues of research in areas including robotics and conversational AI via roundtables moderated by researchers from Amazon.

Conversational AI
IIT Bombay

Amazon and IIT Bombay launch multiyear collaboration

Staff writer

March 27, 2023

Initiative will advance artificial intelligence and machine learning research within speech, language, and multimodal-AI domains.

Conversational AI

Conversational AI

Publications

Related content

Work with us