- EMNLP 2023: Studies in bias and fairness in natural language processing have primarily examined social biases within a single language and/or across few attributes (e.g., gender, race). However, biases can manifest differently across various languages for individual attributes. As a result, it is critical to examine biases within each language and attribute. Of equal importance is to study how these biases compare across …
- EMNLP 2023, Eighth Conference on Machine Translation (WMT23): Neural metrics trained on human evaluations of MT tend to correlate well with human judgments, but their behavior is not fully understood. In this paper, we perform a controlled experiment and compare a baseline metric that has not been trained on human evaluations (Prism) to a trained version of the same metric (Prism+FT). Surprisingly, we find that Prism+FT becomes more robust to machine-translated references …
- NeurIPS 2023: This study focuses on the evaluation of the Open Question Answering (Open-QA) task, which can directly estimate the factuality of large language models (LLMs). Current automatic evaluation methods have shown limitations, indicating that human evaluation still remains the most reliable approach. We introduce a new task, Evaluating QA Evaluation (QA-Eval), and the corresponding dataset EVOUNA, designed to …
- NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following: Learning from human feedback is a prominent technique to align the output of large language models (LLMs) with human expectations. Reinforcement learning from human feedback (RLHF) leverages human preference signals, in the form of rankings of response pairs, to perform this alignment. However, human preference on LLM outputs can come in much richer forms, including natural language, which may provide …
- NeurIPS 2023: Spoken language understanding (SLU) systems often exhibit suboptimal performance in processing atypical speech, typically caused by neurological conditions and motor impairments. Recent advancements in Text-to-Speech (TTS) synthesis-based augmentation for fairer SLU have struggled to accurately capture the unique vocal characteristics of atypical speakers, largely due to insufficient data. To address …
Related content
- May 02, 2023: ICLR workshop sponsored by Amazon CodeWhisperer features Amazon papers on a novel contrastive-learning framework for causal language models and a way to gauge the robustness of code generation models.
- April 12, 2023: From noisy cars to unreliable signals, researchers have worked to extend the Alexa experience to vehicles on the move.
- April 06, 2023: University teams are competing to help advance the science of conversational embodied AI and robust human-AI interaction.
- April 03, 2023: Combining acoustic and lexical information improves real-time voice sentiment analysis.
- March 31, 2023: Attendees explored new avenues of research in areas including robotics and conversational AI via roundtables moderated by researchers from Amazon.
- March 27, 2023: Initiative will advance artificial intelligence and machine learning research within speech, language, and multimodal-AI domains.