Customer-obsessed science


How task decomposition and smaller LLMs can make AI more affordable

September 19, 2024

“Agentic workflows” that use multiple, fine-tuned smaller LLMs — rather than one large one — can improve efficiency.

Machine learning
Accounting for cognitive bias in human evaluation of large language models

September 16, 2024

A position paper presented at ACL proposes a framework for more-accurate human evaluation of LLMs.

Conversational AI
Better-performing “25519” elliptic-curve cryptography

September 10, 2024

Automated reasoning and optimizations specific to CPU microarchitectures improve both performance and assurance of correct implementation.

Automated reasoning
Conference calendar
- ECCV 2024
  
  Computer vision
  
  September 29 - October 4, 2024
- IROS 2024
  
  Robotics
  
  October 14 - 18, 2024
- EMNLP 2024
  
  Conversational AI
  
  November 12 - 16, 2024


Amazon Research Awards issues fall 2024 call for proposals

Amazon Research Awards team

September 25, 2024

Open until November 6, the Amazon Research Awards call seeks proposals in the following research areas: AI for Information Security, Automated Reasoning, AWS AI, AWS Cryptography, and Sustainability.

De-noised vision-language fusion guided by visual cues for e-commerce product search

Zhizhang Hu, Shasha Li, Ming Du, Arnab Dhua, Doug Gray

CVPR 2024 Workshop on Multimodal Learning and Applications

2024

In e-commerce applications, vision-language multimodal transformer models play a pivotal role in product search. The key to successfully training a multimodal model lies in the alignment quality of image-text pairs in the dataset. However, the data in practice is often automatically collected with minimal manual intervention. Hence the alignment of image-text pairs is far from ideal. In e-commerce, this

Computer vision
Benchmarking zero-shot recognition with vision-language models: Challenges on granularity and specificity

Zhenlin Xu, Yi Zhu, Tiffany Deng, Abhay Mittal, Yanbei Chen, Manchen Wang, Paolo Favaro, Joe Tighe, Davide Modolo

CVPR 2024 Workshop on "What is Next in Multimodal Foundation Models?"

2024

This paper presents novel benchmarks for evaluating vision-language models (VLMs) in zero-shot recognition, focusing on granularity and specificity. Although VLMs excel in tasks like image captioning, they face challenges in open-world settings. Our benchmarks test VLMs’ consistency in understanding concepts across semantic granularity levels and their response to varying text specificity. Findings show

Computer vision
A simple strategy for body estimation from partial-view images

Yafei Mao, Xuelu Li, Brandon Smith, JinJin Li, Raja Bala

CVPR 2024 Workshop on Computer Vision for Fashion, Art, and Design

2024

Virtual try-on and product personalization have become increasingly important in modern online shopping, highlighting the need for accurate body measurement estimation. Although previous research has advanced in estimating 3D body shapes from RGB images, the task is inherently ambiguous as the observed scale of human subjects in the images depends on two unknown factors: capture distance and body dimensions

Computer vision
Adapting uni-modal language models for dense multi-modal co-reference resolution using parameter augmentation

Sam Osebe, Prashan Wanigasekara, Thomas Gueudre, Thanh Tran

ICLR 2024 Workshop on LLM Agents, LREC-COLING 2024 Workshop on e-Commerce and NLP

2024

The context of modern smart voice assistants is often multi-modal, where images, audio and video content are consumed by users simultaneously. In such a setup, co-reference resolution is especially challenging, and runs across modalities and dialogue turns. We explore the problem of multi-modal co-reference resolution in multi-turn dialogues and quantify the performance of multi-modal LLMs on a specially

Computer vision
Hallucination detection in LLM-enriched product listings

Ling Jiang, Keer Jiang, Xiaoyu Chu, Saaransh Gulati, Pulkit Garg

EMNLP 2024 Workshop on e-Commerce and NLP

2024

E-commerce faces persistent challenges with data quality issues in product listings. Recent advances in Large Language Models (LLMs) offer a promising avenue for automated product listing enrichment. However, LLMs are prone to hallucinations, which we define as the generation of content that is unfaithful to the source input. This poses significant risks in customer-facing applications. Hallucination detection

Conversational AI

Career opportunities

We look for talent from around the world for applied scientists, data scientists, economists, research scientists, scholars, academics, PhDs, and interns.
Academic collaborations

We collaborate with leading academic organizations to drive innovation and to ensure that research is creating solutions whose benefits are shared broadly.
Photo by Zak Brickett

Awards and recognitions

Learn more about the awards and recognitions that Amazon researchers from around the world have been honored with during their tenure.
