Amazon Science homepage

The Amazon Nova family of models: Technical report and model card

Training infrastructure, benchmarks, responsible-AI methodology, and more.

David Chang/Getty Images/iStockphoto

Amazon opens new AI lab in San Francisco focused on long-term research bets

The Amazon AGI SF Lab will focus on developing new foundational capabilities for enabling useful AI agents.

Economics Nobelist on causal inference

In a keynote address at the latest Amazon Machine Learning Conference, Amazon academic research consultant, Stanford professor, and recent Nobel laureate Guido Imbens offered insights on the estimation of causal effects in “panel data” settings.

Information and knowledge management

Machine learning

Operations research and optimization

Quantum technologies

Robotics

Search and information retrieval

Security, privacy, and abuse prevention

Sustainability

From the blog

The latest research from Amazon scientists.

Training code generation models to debug their own outputs

February 20, 2025

Using large language models to generate training data and updating models through both fine-tuning and reinforcement learning improves the success rate of code generation by 39%.

Conversational AI

Benchmarking tool for graph-centric predictive modeling on databases

February 14, 2025

Cloud and systems

Lightweight LLM for converting text to structured data

February 06, 2025

Conversational AI

QKD and authentication: Separating facts from myths

January 14, 2025

Quantum technologies

The 10 most viewed publications of 2024

December 24, 2024

Amazon AGI SF Lab

Led by David Luan and Pieter Abbeel, the lab will focus on developing new foundational capabilities for enabling useful AI agents.

Amazon Nova

The company's new state-of-the-art foundation models deliver frontier intelligence and industry-leading price performance.

Preskill wins prize for work on learning and quantum computing

Caltech professor and Amazon Scholar John Preskill wins Bell Prize for applying both classical and quantum computing to the problem of learning from quantum experiments.

Know when to fold: Futility-aware early termination in online experiments

Yu Liu, Runzhe Wan, Yian Huang, James McQueen, Doug Hains, Jinxiang Gu, Rui Song

The Web Conference 2025

2025

As the demand for online A/B testing continues to rise for tech companies, the opportunity cost of conducting these experiments becomes increasingly significant. Consequently, there is a rising need for an efficient continuous monitoring system capable of terminating experiments early when necessary. Existing literature and tools primarily focus on early termination of experiments with evidently significant

Machine learning

IHEval: Evaluating language models on following the instruction hierarchy

Zhihan Zhang, Shiyang Li, Zixuan Zhang, Xin Liu, Haoming Jiang, Xianfeng Tang, Yifan Gao, Zheng Li, Haodong Wang, Zhaoxuan Tan, Yichuan Li, Qingyu Yin, Bing Yin, Meng Jiang

NAACL 2025

2025

The instruction hierarchy, which establishes a priority order from system messages to user messages, conversation history, and tool outputs, is essential for ensuring consistent and safe behavior in language models (LMs). Despite its importance, this topic receives limited attention, and there is a lack of comprehensive benchmarks for evaluating models’ ability to follow the instruction hierarchy. We bridge

Conversational AI

Leveraging structural information in tree ensembles for table representation learning

Nikhil Pattisapu, Siva Rajesh Kasa, Sumegh Roychowdhury, Karan Gupta, Anish Bhanushali, Prasanna Srinivasa Murthy

The Web Conference 2025

2025

Tabular data is one of the most common data formats found on the web and is used in domains such as finance, banking, e-commerce, and medicine. Although deep neural networks (DNNs) have demonstrated outstanding performance on homogeneous data such as visual, audio, and textual data, tree ensemble methods such as Gradient Boosted Decision Trees (GBDTs) are often the go-to choice for supervised machine learning problems

Machine learning

Towards knowledge checking in retrieval-augmented generation: A representation perspective

Shenglai Zeng, Jiankun Zhang, Bingheng Li, Yuping Lin, Tianqi Zheng, Dante Everaert, Hanqing Lu, Hui Liu, Yue Xing, Monica Cheng, Jiliang Tang

NAACL 2025

2025

Retrieval-Augmented Generation (RAG) systems have shown promise in enhancing the performance of Large Language Models (LLMs). However, these systems face challenges in effectively integrating external knowledge with the LLM’s internal knowledge, often leading to issues with misleading or unhelpful information. This work aims to provide a systematic study on knowledge checking in RAG systems. We conduct

Conversational AI

Token pruning optimization for efficient multi-vector dense retrieval

Shanxiu He, Mutasem Al-Darabsah, Suraj Nair, Jonathan May, Tarun Agarwal, Tao Yang, Choon Hui Teo

ECIR 2025

2025

Multi-vector dense retrieval with ColBERT has been shown to be effective in striking a good relevance and efficiency tradeoff for both in-domain and out-of-domain datasets through late interaction between queries and documents. However, the efficiency of ColBERT for a large-scale retrieval dataset is still constrained by its large memory footprint, as one embedding is stored per token; thus, previous work

Search and information retrieval

AAAI 2025

February 25 - March 4, 2025

Philadelphia, Pennsylvania

Machine learning

WACV 2025

February 28 - March 4, 2025

Tucson, Arizona

Computer vision

WSDM 2025

March 10 - 14, 2025

Hannover, Germany

Search and information retrieval

ICLR 2025

April 24 - 28, 2025

Singapore

Machine learning

The Web Conference 2025

April 28 - May 2, 2025

Sydney, Australia

Information and knowledge management

NAACL 2025

April 29 - May 4, 2025

Albuquerque, New Mexico

KDD 2025

August 3 - 7, 2025

Toronto, Ontario

Information and knowledge management

Amazon Research Awards

The program offers unrestricted funds and other resources to support research at academic institutions and non-profit organizations in areas that align with our mission.

Amazon Trusted AI Challenge

A global university competition to drive secure innovation in generative AI technology, which focuses on responsible AI and LLM coding security.

Research collaborations

We partner with particular academic organizations across the world for deep and sustained collaborations in multiple research areas of mutual interest.

Academics at Amazon

We hire world-class academics to work on large-scale technical challenges, while they continue to teach and conduct research at their universities.
