Customer-obsessed science

The life of a prescription at Amazon Pharmacy

September 30, 2024

From pricing estimation and regulatory compliance to inventory management and chatbot assistants, machine learning models help Amazon Pharmacy customers stay healthy and save time and money.

Conversational AI
How task decomposition and smaller LLMs can make AI more affordable

September 19, 2024

“Agentic workflows” that use multiple, fine-tuned smaller LLMs — rather than one large one — can improve efficiency.

Machine learning
Accounting for cognitive bias in human evaluation of large language models

September 16, 2024

A position paper presented at ACL proposes a framework for more-accurate human evaluation of LLMs.

Conversational AI

Conference calendar
- ECCV 2024
  
  Computer vision
  
  September 29 - October 4, 2024
- IROS 2024
  
  Robotics
  
  October 14 - 18, 2024
- CIKM 2024
  
  Information and knowledge management
  
  October 21 - 25, 2024

Amazon Research Awards issues fall 2024 call for proposals

Amazon Research Awards team

September 25, 2024

Open until November 6, the Amazon Research Awards call seeks proposals in the following research areas: AI for Information Security, Automated Reasoning, AWS AI, AWS Cryptography, and Sustainability.

Noise-free audio signal processing in noisy environment: A hardware and algorithm solution

Yarong Feng, Zongyi Liu, Shunyan Luo, Yuan Ling, Shujing Dong, Shuyi Wang, Bruce Ferry

NeurIPS 2023 Workshop on Robustness of Zero/Few-shot Learning in Foundation Models (R0-FoMo)

2024

Dealing with background noise is a challenging task in audio signal processing, negatively impacting algorithm performance and system robustness. In this paper, we propose a simple solution that combines recording hardware modification and algorithm improvement to tackle the challenge. The proposed solution could produce clean and noise-free high-quality audio recording even in noisy recording environment

Conversational AI
ReCLIP: Refine contrastive language image pre-training with source free domain adaptation

Xuefeng Hu, Ke Zhang, Lu Xia, Albert Chen, Jiajia Luo, Yuyin Sun, Ken Wang, Nan Qiao, Xiao Zeng, Min Sun, Cheng-Hao Kuo, Ram Nevatia

WACV 2024

2024

Large-scale pre-trained vision-language models (VLM) such as CLIP have demonstrated noteworthy zero-shot classification capability, achieving 76.3% top-1 accuracy on ImageNet without seeing any examples. However, while applying CLIP to a downstream target domain, the presence of visual and text domain gaps and cross-modality misalignment can greatly impact the model performance. To address such challenges

Computer vision
MIVC: Multiple instance visual component for visual-language models

Wenyi Wu, Qi Li, Wenliang Zhong, Junzhou Huang

WACV 2024

2024

Vision-language models have been widely explored across a wide range of tasks and achieve satisfactory performance. However, it’s under-explored how to consolidate entity understanding through a varying number of images and to align it with the pre-trained language models for generative tasks. In this paper, we propose MIVC, a general multiple instance visual component to bridge the gap between various

Related: Vision-language models that can handle multi-image inputs

Computer vision
DocFormerv2: Local features for document understanding

Srikar Appalaraju, Peng Tang, Qi Dong, Nishant Sankaran, Yichu Zhou, R. Manmatha

AAAI 2024

2024

We propose DocFormerv2, a multi-modal transformer for Visual Document Understanding (VDU). The VDU domain entails understanding documents (beyond mere OCR predictions) e.g., extracting information from a form, VQA for documents and other tasks. VDU is challenging as it needs a model to make sense of multiple modalities (visual, language and spatial) to make a prediction. Our approach, termed DocFormerv2

Related: New pretraining tasks enable better document understanding

Computer vision
LightLT: a lightweight representation quantization framework for long-tail data

Haoyu Wang, Ruirui Li, Zhengyang Wang, Xianfeng Tang, Danni (Danqing) Zhang, Monica Cheng, Bing Yin, Jasha Droppo, Suhang Wang, Jing Gao

ICDE 2024

2023

Search tasks require finding items similar to a given query, making it a crucial aspect of various applications. However, storing and computing similarity for millions or billions of item representations can be computationally expensive. To address this, quantization-based hash methods present memory and inference-efficient solutions by converting continuous representations into non-negative integer codes

Information and knowledge management

Career opportunities

We look for talent from around the world: applied scientists, data scientists, economists, research scientists, scholars, academics, PhDs, and interns.

Academic collaborations

We collaborate with leading academic organizations to drive innovation and to ensure that research is creating solutions whose benefits are shared broadly.

Awards and recognitions

Learn more about the awards and recognitions that Amazon researchers from around the world have been honored with during their tenure.
