Customer-obsessed science

Amazon Science Fulfillment Center OAK4 in Tracy, CA

How task decomposition and smaller LLMs can make AI more affordable

September 19, 2024

“Agentic workflows” that use multiple, fine-tuned smaller LLMs — rather than one large one — can improve efficiency.

Machine learning
Accounting for cognitive bias in human evaluation of large language models

September 16, 2024

A position paper presented at ACL proposes a framework for more-accurate human evaluation of LLMs.

Conversational AI
Better-performing “25519” elliptic-curve cryptography

September 10, 2024

Automated reasoning and optimizations specific to CPU microarchitectures improve both performance and assurance of correct implementation.

Automated reasoning
Conference calendar
- ECCV 2024
  
  Computer vision
  
  September 29 - October 4, 2024
- IROS 2024
  
  Robotics
  
  October 14 - 18, 2024
- EMNLP 2024
  
  Conversational AI
  
  November 12 - 16, 2024

AmazonScience_ARA_Fall2024_092424_MC_Fall 2024.jpg

Amazon Research Awards issues fall 2024 call for proposals

Amazon Research Awards team

September 25, 2024

Now open until November 6, Amazon Research Awards will be seeking proposals in the following research areas: AI for Information Security, Automated Reasoning, AWS AI, AWS Cryptography, and Sustainability.

Noise-robust DSP-assisted neural pitch estimation with very low complexity

Krishna Subramani, Jean-Marc Valin, Jan Buethe, Paris Smaragdis, Mike Goodwin

ICASSP 2024

2024

Pitch estimation is an essential step of many speech processing algorithms, including speech coding, synthesis, and enhancement. Recently, pitch estimators based on deep neural networks (DNNs) have been outperforming well-established DSP-based techniques. Unfortunately, these new estimators can be impractical to deploy in real-time systems, both because of their relatively high complexity, and the fact

Machine learning
Denoising and selecting pseudo-heatmaps for semi-supervised human pose estimation

Zhuoran Yu, Manchen Wang, Yanbei Chen, Paolo Favaro, Davide Modolo

WACV 2024

2024

We propose a new semi-supervised learning design for human pose estimation that revisits the popular dual-student framework and enhances it two ways. First, we introduce a denoising scheme to generate reliable pseudo-heatmaps as targets for learning from unlabeled data. This uses multi-view augmentations and a threshold-and-refine procedure to produce a pool of pseudo-heatmaps. Second, we select the learning

Computer vision
Maximum-entropy adversarial audio augmentation for keyword spotting

Zuzhao Ye, Gregory Ciccarelli, Brian Kulis

ICASSP 2024

2024

Data augmentation is a key tool for improving the performance of deep networks, particularly when there is limited labeled data. In some fields, such as computer vision, augmentation methods have been extensively studied; however, for speech and audio data, there are relatively fewer methods developed. Using adversarial learning as a starting point, we develop a simple and effective augmentation strategy

Machine learning
STORYANALOGY: Deriving story-level analogies from large language models to unlock analogical understanding

Cheng Jiayang, Lin Qiu, Tsz Ho Chan, Tianqing Fang, Weiqi Wang, Chunkit Chan, Dongyu Ru, Qipeng Guo, Hongming Zhang, Yangqiu Song, Yue Zhang, Zheng Zhang

EMNLP 2023

2024

Analogy-making between narratives is crucial for human reasoning. In this paper, we evaluate the ability to identify and generate analogies by constructing a first-of-its-kind large-scale story-level analogy corpus, STORYANALOGY, which contains 24K story pairs from diverse domains with human annotations on two similarities from the extended Structure-Mapping Theory. We design a set of tests on STORYANALOGY

Conversational AI
Promptformer: Prompted conformer transducer for ASR

Sergio Duarte Torres, Arunasish Sen, Aman Rana, Lukas Drude, Alejandro Gomez Alanis, Andreas Schwarz, Leif Rādel, Volker Leutnant

ICASSP 2024

2024

Context cues carry information which can improve multiturn interactions in automatic speech recognition (ASR) systems. In this paper, we introduce a novel mechanism inspired by hyper-prompting to fuse textual context with acoustic representations in the attention mechanism. Results on a test set with multi-turn interactions show that our method achieves 5.9% relative word error rate reduction (rWERR) over

Machine learning

Career opportunities

We look for talent from around the world for applied scientists, data scientists, economists, research scientists, scholars, academics, PhDs, and interns.
Academic collaborations

We collaborate with leading academic organizations to drive innovation and to ensure that research is creating solutions whose benefits are shared broadly.
Photo by Zak Brickett

Awards and recognitions

Learn more about the awards and recognitions that Amazon researches from around the world have been honored with during their tenure.

Customer-obsessed science

Conference calendar

Publications

Resources

Work with us