Customer-obsessed science


How task decomposition and smaller LLMs can make AI more affordable

September 19, 2024

“Agentic workflows” that use multiple, fine-tuned smaller LLMs — rather than one large one — can improve efficiency.

Machine learning

Accounting for cognitive bias in human evaluation of large language models

September 16, 2024

A position paper presented at ACL proposes a framework for more-accurate human evaluation of LLMs.

Conversational AI

Better-performing “25519” elliptic-curve cryptography

September 10, 2024

Automated reasoning and optimizations specific to CPU microarchitectures improve both performance and assurance of correct implementation.

Automated reasoning

Conference calendar
- ECCV 2024
  
  Computer vision
  
  September 29 - October 4, 2024
- IROS 2024
  
  Robotics
  
  October 14 - 18, 2024
- EMNLP 2024
  
  Conversational AI
  
  November 12 - 16, 2024


Amazon Research Awards issues fall 2024 call for proposals

Amazon Research Awards team

September 25, 2024

Open until November 6, Amazon Research Awards is seeking proposals in the following research areas: AI for Information Security, Automated Reasoning, AWS AI, AWS Cryptography, and Sustainability.

NoLACE: Improving low-complexity speech codec enhancement through adaptive temporal shaping

Jan Buethe, Ahmed Mustafa, Jean-Marc Valin, Karim Helwani, Mike Goodwin

ICASSP 2024

2024

Speech codec enhancement methods are designed to remove distortions added by speech codecs. While classical methods are very low in complexity and add zero delay, their effectiveness is rather limited. Compared to that, DNN-based methods deliver higher quality but they are typically high in complexity and/or require delay. The recently proposed Linear Adaptive Coding Enhancer (LACE) addresses this problem

Machine learning

S2E: Towards an end-to-end entity resolution solution from acoustic signal

Kangrui Ruan, Cynthia He, Jiyang Wang, Xiaozhou Joey Zhou, Helian Feng, Ali Kebarighotbi

ICASSP 2024

2024

The traditional cascading Entity Resolution (ER) pipeline suffers from propagated errors from upstream tasks. We address this issue by formulating a new end-to-end (E2E) ER problem, Signal-to-Entity (S2E), resolving query entity mentions to actionable entities in textual catalogs directly from audio queries instead of audio transcriptions in raw or parsed format. Additionally, we extend the E2E Spoken Language

Conversational AI

Post-training embedding alignment for decoupling enrollment and runtime speaker recognition models

Chenyang Gao, Brecht Desplanques, Chelsea J.-T. Ju, Aman Chadha, Andreas Stolcke

ICASSP 2024

2024

Automated speaker identification (SID) is a crucial step for the personalization of a wide range of speech-enabled services. Typical SID systems use a symmetric enrollment-verification framework with a single model to derive embeddings both offline for voice profiles extracted from enrollment utterances, and online from runtime utterances. Due to the distinct circumstances of enrollment and runtime, such

Machine learning

Learning action embeddings for off-policy evaluation

Matej Cief, Jacek Golebiowski, Philipp Schmidt, Ziawasch Abedjan, Artur Bekasov

ECIR 2024

2024

Off-policy evaluation (OPE) methods allow us to compute the expected reward of a policy by using the logged data collected by a different policy. However, when the number of actions is large, or certain actions are under-explored by the logging policy, existing estimators based on inverse-propensity scoring (IPS) can have a high or even infinite variance. Saito and Joachims [13] propose marginalized IPS
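
As a rough illustration of the inverse-propensity scoring (IPS) idea mentioned in the abstract above, here is a minimal Python sketch of the standard (vanilla) IPS estimator. It is not the marginalized variant and not the paper's method. The function name `ips_estimate`, its argument names, and the sample numbers are all hypothetical and chosen only for illustration.

```python
def ips_estimate(rewards, logging_probs, target_probs):
    """Vanilla IPS estimate of a target policy's expected reward.

    rewards[i]:       reward observed for the i-th logged action
    logging_probs[i]: probability the logging policy assigned to that action
    target_probs[i]:  probability the target policy assigns to that action
    (All names are illustrative assumptions, not from the paper.)
    """
    n = len(rewards)
    if n == 0:
        raise ValueError("need at least one logged sample")
    if not (len(logging_probs) == n and len(target_probs) == n):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for reward, p_log, p_tgt in zip(rewards, logging_probs, target_probs):
        if p_log <= 0:
            # IPS is undefined when the logging policy never takes the action.
            raise ValueError("logging probability must be positive")
        # Importance weight: how much more (or less) likely the target
        # policy is to take this action than the logging policy was.
        total += (p_tgt / p_log) * reward
    return total / n


# Made-up logged data: four interactions.
rewards = [1.0, 0.0, 1.0, 1.0]
logging_probs = [0.5, 0.5, 0.25, 0.25]
target_probs = [0.5, 0.5, 0.5, 0.5]
estimate = ips_estimate(rewards, logging_probs, target_probs)
print(estimate)  # 1.25
```

Each weight is the ratio of target to logging probability, so an action the logging policy rarely took (say, probability 0.001) receives a weight in the hundreds or thousands. That is the source of the high variance the abstract refers to.
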

Machine learning

Hot-fixing wake word recognition for end-to-end ASR via neural model reprogramming

Pin-Jui Ku, I-Fan Chen, Huck Yang, Anirudh Raju, Pranav Dheram, Pegah Ghahremani, Brian King, Jing Liu, Roger Ren, Phani Nidadavolu

ICASSP 2024

2024

This paper proposes two novel variants of neural reprogramming to enhance wake word recognition in streaming end-to-end ASR models without updating model weights. The first, “trigger-frame reprogramming”, prepends the input speech feature sequence with the learned trigger-frames of the target wake word to adjust the ASR model’s hidden states for improved wake word recognition. The second, “predictor-state initialization

Machine learning

Career opportunities

We look for talent from around the world for applied scientists, data scientists, economists, research scientists, scholars, academics, PhDs, and interns.

Academic collaborations

We collaborate with leading academic organizations to drive innovation and to ensure that research is creating solutions whose benefits are shared broadly.

Awards and recognitions

Learn more about the awards and recognitions that Amazon researchers from around the world have been honored with during their tenure.
