Customer-obsessed science

Amazon Science Fulfillment Center OAK4 in Tracy, CA

The life of a prescription at Amazon Pharmacy

September 30, 2024

From pricing estimation and regulatory compliance to inventory management and chatbot assistants, machine learning models help Amazon Pharmacy customers stay healthy and save time and money.

Conversational AI
How task decomposition and smaller LLMs can make AI more affordable

September 19, 2024

“Agentic workflows” that use multiple, fine-tuned smaller LLMs — rather than one large one — can improve efficiency.

Machine learning
Accounting for cognitive bias in human evaluation of large language models

September 16, 2024

A position paper presented at ACL proposes a framework for more-accurate human evaluation of LLMs.

Conversational AI
Conference calendar
- ECCV 2024
  
  Computer vision
  
  September 29 - October 4, 2024
- IROS 2024
  
  Robotics
  
  October 14 - 18, 2024
- CIKM 2024
  
  Information and knowledge management
  
  October 21 - 25, 2024

AmazonScience_ARA_Fall2024_092424_MC_Fall 2024.jpg

Amazon Research Awards issues fall 2024 call for proposals

Amazon Research Awards team

September 25, 2024

Now open until November 6, Amazon Research Awards will be seeking proposals in the following research areas: AI for Information Security, Automated Reasoning, AWS AI, AWS Cryptography, and Sustainability.

How do multimodal LLMs really fare in classical vision few-shot challenges? A deep dive

Qing Guo, Prashan Wanigasekara, Skyler Zheng, Jacob Zhiyuan Fang, Xinwei Deng, Chenyang Tao

NeurIPS 2023 Workshop on Robustness of Zero/Few-shot Learning in Foundation Models (R0-FoMo)

2023

Recent advances in multimodal foundational models have demonstrated marvelous in-context learning capabilities for diverse vision-language tasks. However, existing literature have mainly focused on few-shot learning tasks similar to their NLP counterparts. It is unclear whether these foundation models can also address classical vision challenges such as few-shot classification, which in some settings (e.g

Machine learning
Evaluating Open-QA evaluation

Cunxiang Wang, Sirui Cheng, Qipeng Guo, Yuanhao Yue, Bowen Ding, Zhikun Xu, Yidong Wang, Xiangkun Hu, Zheng Zhang, Yue Zhang

NeurIPS 2023

2023

This study focuses on the evaluation of the Open Question Answering (Open-QA) task, which can directly estimate the factuality of large language models (LLMs). Current automatic evaluation methods have shown limitations, indicating that human evaluation still remains the most reliable approach. We introduce a new task, Evaluating QA Evaluation (QA-Eval) and the corresponding dataset EVOUNA, designed to

Conversational AI
Data-efficient alignment of large language models with human feedback through natural language

Di Jin, Shikib Mehri, Devamanyu Hazarika, Aishwarya Padmakumar, Sungjin Lee, Yang Liu, Mahdi Namazifar

NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following

2023

Learning from human feedback is a prominent technique to align the output of large language models (LLMs) with human expectations. Reinforcement learning from human feedback (RLHF) leverages human preference signals that are in the form of ranking of response pairs to perform this alignment. However, human preference on LLM outputs can come in much richer forms including natural language, which may provide

Conversational AI
Membership inference attack on diffusion models via quantile regression

Zhiwei Steven Wu, Shuai Tang, Sergul Aydore, Michael Kearns, Aaron Roth

NeurIPS 2023 Workshop on SyntheticData4ML

2023

Recently, diffusion models have demonstrated great potential for image synthesis due to their ability to generate high-quality synthetic data. However, when applied to sensitive data, privacy concerns have been raised about these models. In this paper, we evaluate the privacy risks of diffusion models through a membership inference (MI) attack, which aims to identify whether a target example is in the training

Security, privacy, and abuse prevention
Finite-time logarithmic Bayes regret upper bounds

Alexia Atsidakou, Branislav Kveton, Sumeet Katariya, Constantine Caramanis, Sujay Sanghavi

NeurIPS 2023

2023

We derive the first finite-time logarithmic Bayes regret upper bounds for Bayesian bandits. In Gaussian bandits, we obtain O(cΔ log n) and O(ch log2n) bounds for an upper confidence bound algorithm, where ch and cΔ are constants depending on the prior distribution and the gaps of random bandit instances sampled from it, respectively. The latter bound asymptotically matches the lower bound of Lai (1987).

Machine learning

Career opportunities

We look for talent from around the world for applied scientists, data scientists, economists, research scientists, scholars, academics, PhDs, and interns.
Academic collaborations

We collaborate with leading academic organizations to drive innovation and to ensure that research is creating solutions whose benefits are shared broadly.
Photo by Zak Brickett

Awards and recognitions

Learn more about the awards and recognitions that Amazon researches from around the world have been honored with during their tenure.

Customer-obsessed science

Conference calendar

Publications

Resources

Work with us