Customer-obsessed science

How task decomposition and smaller LLMs can make AI more affordable

September 19, 2024

“Agentic workflows” that use multiple, fine-tuned smaller LLMs — rather than one large one — can improve efficiency.

Machine learning

Accounting for cognitive bias in human evaluation of large language models

September 16, 2024

A position paper presented at ACL proposes a framework for more-accurate human evaluation of LLMs.

Conversational AI

Better-performing “25519” elliptic-curve cryptography

September 10, 2024

Automated reasoning and optimizations specific to CPU microarchitectures improve both performance and assurance of correct implementation.

Automated reasoning

Conference calendar

- ECCV 2024
  
  Computer vision
  
  September 29 - October 4, 2024
- IROS 2024
  
  Robotics
  
  October 14 - 18, 2024
- CIKM 2024
  
  Information and knowledge management
  
  October 21 - 25, 2024

Amazon Research Awards issues fall 2024 call for proposals

Amazon Research Awards team

September 25, 2024

Amazon Research Awards is accepting proposals until November 6 in the following research areas: AI for Information Security, Automated Reasoning, AWS AI, AWS Cryptography, and Sustainability.

Testing the limits of unified seq2seq LLM pretraining on diverse table data tasks

Soumajyoti Sarkar, Leonard Lausen

NeurIPS 2023 Workshop on Table Representation Learning

2023

Tables stored in databases and tables which are present in web pages and articles account for a large part of semi-structured data that is available on the internet. It motivates the need to develop a modeling approach with large language models (LLMs) which can be used to solve diverse table tasks such as semantic parsing, question answering as well as classification problems. Traditionally, there existed

Machine learning

Automated few-shot classification with instruction-finetuned language models

Rami Aly, Xingjian Shi, Kaixiang Lin, Aston Zhang, Andrew Wilson

EMNLP 2023

2023

A particularly successful class of approaches for few-shot learning combines language models with prompts — handcrafted task descriptions that complement data samples. However, designing prompts by hand for each task commonly requires domain knowledge and substantial guesswork. We observe, in the context of classification tasks, that instruction-finetuned language models are remarkably robust towards some

Conversational AI

Protege: Prompt-based diverse question generation from web articles

Vinayak Puranik, Anirban Majumder, Vineet Chaoji

EMNLP 2023

2023

Rich and diverse knowledge-bases (KB) are foundational building blocks for online knowledge-sharing communities such as StackOverflow and Quora and applications such as conversational assistants (aka chatbots). A popular format for knowledge bases is question-answer pairs (or FAQs), where questions are designed to accurately match a multitude of queries. In this paper, we address the problem of automatic

Conversational AI

MultiCoNER v2: A large multilingual dataset for fine-grained and noisy named entity recognition

Besnik Fetahu, Zhiyu Chen, Sudipta Kar, Oleg Rokhlenko, Shervin Malmasi

EMNLP 2023

2023

We present MultiCoNER V2, a dataset for fine-grained Named Entity Recognition covering 33 entity classes across 12 languages, in both monolingual and multilingual settings. This dataset aims to tackle the following practical challenges in NER: (i) effective handling of fine-grained classes that include complex entities like movie titles, and (ii) performance degradation due to noise generated from typing

Conversational AI

Retrieve and copy: Scaling ASR personalization to large catalogs

Sai Muralidhar Jayanthi, Devang Kulshreshtha, Saket Dingliwal, Srikanth Ronanki, Sravan Bodapati

EMNLP 2023

2023

Personalization of automatic speech recognition (ASR) models is a widely studied topic because of its many practical applications. Most recently, attention-based contextual biasing techniques are used to improve the recognition of rare words and/or domain-specific entities. However, due to performance constraints, the biasing is often limited to a few thousand entities, restricting real-world usability.

Conversational AI

Career opportunities

We look for talent from around the world, including applied scientists, data scientists, economists, research scientists, scholars, academics, PhDs, and interns.

Academic collaborations

We collaborate with leading academic organizations to drive innovation and to ensure that research is creating solutions whose benefits are shared broadly.

Awards and recognitions

Learn more about the awards and recognitions that Amazon researchers from around the world have been honored with during their tenure.
