A quick guide to Amazon's 65-plus papers at this year's ACL

Familiar topics such as question answering and natural-language understanding remain well represented, but a new concentration on language modeling and multimodal models reflects the spread of generative AI.

Between the main conference and the recently inaugurated ACL Findings, Amazon researchers have more than 65 papers at this year's meeting of the Association for Computational Linguistics (ACL).

Automatic speech recognition

Masked audio text encoders are effective multi-modal rescorers*
Jason Cai, Monica Sunkara, Xilai Li, Anshu Bhatia, Xiao Pan, Sravan Bodapati

Code generation

A static evaluation of code completion by large language models
Hantian Ding, Varun Kumar, Yuchen Tian, Zijian Wang, Rob Kwiatkowski, Xiaopeng Li, Murali Krishna Ramanathan, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang

Multitask pretraining with structured knowledge for text-to-SQL generation
Robert Giaquinto, Dejiao Zhang, Benjamin Kleiner, Yang Li, Ming Tan, Parminder Bhatia, Ramesh Nallapati, Xiaofei Ma

Code switching

Code-switched text synthesis in unseen language pairs*
I-Hung Hsu, Avik Ray, Shubham Garg, Nanyun Peng, Jing Huang

CoMix: Guide transformers to code-mix using POS structure and phonetics*
Gaurav Arora, Srujana Merugu, Vivek Sembium

Continual learning

Characterizing and measuring linguistic dataset drift
Tyler A. Chang, Kishaloy Halder, Neha Anna John, Yogarshi Vyas, Yassine Benajiba, Miguel Ballesteros, Dan Roth

Data-/table-to-text applications

An inner table retriever for robust table question answering
Weizhe Lin, Rexhina Blloshmi, Bill Byrne, Adrià de Gispert, Gonzalo Iglesias

Few-shot data-to-text generation via unified representation and multi-source learning
Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang

Improving cross-task generalization of unified table-to-text models with compositional task configurations*
Jifan Chen, Yuhao Zhang, Lan Liu, Rui Dong, Xinchi Chen, Patrick Ng, William Wang, Zhiheng Huang

LI-RAGE: Late interaction retrieval augmented generation with explicit signals for open-domain table question answering
Weizhe Lin, Rexhina Blloshmi, Bill Byrne, Adrià de Gispert, Gonzalo Iglesias

Dialogue

Diable: Efficient dialogue state tracking as operations on tables*
Pietro Lesci, Yoshinari Fujinuma, Momchil Hardalov, Chao Shang, Lluis Marquez

NatCS: Eliciting natural customer support dialogues
James Gung, Emily Moeng, Wesley Rose, Arshit Gupta, Yi Zhang, Saab Mansour

Schema-guided user satisfaction modeling for task-oriented dialogues
Yue Feng, Yunlong Jiao, Animesh Prasad, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai

Toward more accurate and generalizable evaluation metrics for task-oriented dialogs
Abi Komma, Nagesh Panyam, Timothy Leffel, Anuj Goyal, Angeliki Metallinou, Spyros Matsoukas, Aram Galstyan

Explainable AI

Efficient Shapley values estimation by amortization for text classification
Alan Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang

Few shot rationale generation using self-training with dual teachers*
Aditya Srikanth Veerubhotla, Lahari Poddar, Jun Yin, Gyuri Szarvas, Sharanya Eswaran

Information extraction

An AMR-based link prediction approach for document-level event argument extraction
Yuqing Yang, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

AVEN-GR: Attribute value extraction and normalization using product graphs
Donato Crisostomi, Thomas Ricatte

Large scale generative multimodal attribute extraction for e-commerce attributes
Anant Khandelwal, Happy Mittal, Shreyas Sunil Kulkarni, Deepak Gupta

ParaAMR: A large-scale syntactically diverse paraphrase dataset by AMR back-translation
Kuan-Hao Huang, Varun Iyer, I-Hung Hsu, Anoop Kumar, Kai-Wei Chang, Aram Galstyan

Weakly supervised hierarchical multi-task classification of customer questions
Jitenkumar Rana, Promod Yenigalla, Chetan Aggarwal, Sandeep Mukku, Manan Soni, Rashmi Patange

WebIE: Faithful and robust information extraction on the web
Chenxi Whitehouse, Clara Vania, Alham Fikri Aji, Christos Christodoulopoulos, Andrea Pierleoni

Information retrieval

CUPID: Curriculum learning based real-time prediction using distillation
Arindam Bhattacharya, Ankith M S, Ankit Gandhi, Vijay Huddar, Atul Saroop, Rahul Bhagat

Direct fact retrieval from knowledge graphs without entity linking
Jinheon Baek, Alham Fikri Aji, Jens Lehmann, Sung Ju Hwang

Language modeling

Adaptation approaches for nearest neighbor language models*
Rishabh Bhardwaj, George Polovets, Monica Sunkara

CONTRACLM: Contrastive learning for causal language model
Nihal Jain, Dejiao Zhang, Wasi Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Ramesh Nallapati, Bing Xiang

Controlled text generation with hidden representation transformations*
Vaibhav Kumar, Hana Koorehdavoudi, Masud Moshtaghi, Amita Misra, Ankit Chadha, Emilio Ferrara

KILM: Knowledge injection into encoder-decoder language models
Yan Xu, Mahdi Namazifar, Devamanyu Hazarika, Aishwarya Padmakumar, Yang Liu, Dilek Hakkani-Tür

ReAugKD: Retrieval-augmented knowledge distillation for pre-trained language models
Jianyi Zhang, Aashiq Muhamed, Aditya Anantharaman, Guoyin Wang, Changyou Chen, Kai Zhong, Qingjun Cui, Yi Xu, Belinda Zeng, Trishul Chilimbi, Yiran Chen

Recipes for sequential pre-training of multilingual encoder and seq2seq models*
Saleh Soltan, Andy Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza

Rethinking the role of scale for in-context learning: An interpretability-based case study at 66 billion scale
Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth

Machine learning

Mitigating the burden of redundant datasets via batch-wise unique samples and frequency-aware losses
Donato Crisostomi, Andrea Caciolai, Alessandro Pedrani, Alessandro Manzotti, Enrico Palumbo, Kay Rottmann, Davide Bernardi

Machine translation

RAMP: Retrieval and attribute-marking enhanced prompting for attribute-controlled translation
Gabriele Sarti, Phu Mon Htut, Xing Niu, Benjamin Hsu, Anna Currey, Georgiana Dinu, Maria Nădejde

Multimodal models

Benchmarking diverse-modal entity linking with generative models*
Sijia Wang, Alexander Li, Henry Zhu, Sheng Zhang, Pramuditha Perera, Chung-Wei Hang, Jie Ma, William Wang, Zhiguo Wang, Vittorio Castelli, Bing Xiang, Patrick Ng

Generate then select: Open-ended visual question answering guided by world knowledge*
Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henry Zhu, Yuhao Zhang, Alexander Hanbo Li, William Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang

KG-FLIP: Knowledge-guided fashion-domain language-image pre-training for e-commerce
Qinjin Jia, Yang Liu, Shaoyuan Xu, Huidong Liu, Daoping Wu, Jinmiao Fu, Roland Vollgraf, Bryan Wang

Resolving ambiguities in text-to-image generative models
Ninareh Mehrabi, Palash Goyal, Apurv Verma, Jwala Dhamala, Varun Kumar, Qian Hu, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Rahul Gupta

Translation-enhanced multilingual text-to-image generation
Yaoyiran Li, Ching-Yun (Frannie) Chang, Stephen Rawls, Ivan Vulić, Anna Korhonen

Unsupervised melody-to-lyric generation
Yufei Tian, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Chenyang Tao, Gunnar Sigurdsson, Wenbo Zhao, Tagyoung Chung, Jing Huang, Violet Peng

Natural-language processing

Multi-VALUE: A framework for cross-dialectal English NLP
Caleb Ziems, William Held, Jingfeng Yang, Jwala Dhamala, Rahul Gupta, Diyi Yang

vONTSS: vMF based semi-supervised neural topic modeling with optimal transport*
Weijie Xu, Xiaoyu Jiang, Srinivasan "SHS" Sengamedu, Francis Iannacci, Jinjin Zhao

Natural-language understanding

ECG-QALM: Entity-controlled synthetic text generation using contextual Q&A for NER*
Karan Aggarwal, Henry Jin, Aitzaz Ahmad

Entity contrastive learning in a large-scale virtual assistant system
Jonathan Rubin, Jason Crowley, George Leung, Morteza Ziyadi, Maria Minakova

EPIC: Multi-perspective annotation of a corpus of irony
Simona Frenda, Alessandro Pedrani, Valerio Basile, Soda Marem Lo, Alessandra Teresa Cignarella, Raffaella Panizzon, Cristina Marco, Bianca Scarlini, Viviana Patti, Cristina Bosco, Davide Bernardi

Measuring and mitigating local instability in deep neural networks*
Arghya Datta, Subhrangshu Nandi, Jingcheng Xu, Greg Ver Steeg, He Xie, Anoop Kumar, Aram Galstyan

Reducing cohort bias in natural language understanding systems with targeted self-training scheme
Thu Le, Gabriela Cortes Hernandez, Bei Chen, Melanie Bradford

Privacy

Controlling the extraction of memorized data from large language models via prompt-tuning
Mustafa Ozdayi, Charith Peris, Jack G. M. FitzGerald, Christophe Dupuy, Jimit Majmudar, Haidar Khan, Rahil Parikh, Rahul Gupta

Query rewriting

Context-aware query rewriting for improving users’ search experience on e-commerce websites
Simiao Zuo, Qingyu Yin, Haoming Jiang, Shaohui Xi, Bing Yin, Chao Zhang, Tuo Zhao

Unified contextual query rewriting
Yingxue Zhou, Jie Hao, Mukund Rungta, Yang Liu, Eunah Cho, Xing Fan, Yanbin Lu, Vishal Vasudevan, Kellen Gillespie, Zeynab Raeesy, Sawyer Shen, Edward Guo, Gokhan Tur

Question answering

Accurate training of web-based question answering systems with feedback from ranked users
Liang Wang, Ivano Lauriola, Alessandro Moschitti

Context-aware transformer pre-training for answer sentence selection
Luca Di Liello, Siddhant Garg, Alessandro Moschitti

Cross-lingual knowledge distillation for answer sentence selection in low-resource languages*
Shivanshu Gupta, Yoshitomo Matsubara, Ankit Chadha, Alessandro Moschitti

Exploiting abstract meaning representation for open-domain question answering*
Cunxiang Wang, Zhikun Xu, Qipeng Guo, Xiangkun Hu, Xuefeng Bai, Zheng Zhang, Yue Zhang

Hybrid hierarchical retrieval for open-domain question answering*
Manoj Ghuhan Arivazhagan, Lan Liu, Peng Qi, Xinchi Chen, William Wang, Zhiheng Huang

Learning answer generation using supervision from automatic question answering evaluators
Matteo Gabburo, Siddhant Garg, Rik Koncel-Kedziorski, Alessandro Moschitti

RobustQA: Benchmarking the robustness of domain adaptation for open-domain question answering*
Rujun Han, Peng Qi, Yuhao Zhang, Lan Liu, Juliette Burger, William Wang, Zhiheng Huang, Bing Xiang, Dan Roth

Reasoning

FolkScope: Intention knowledge graph construction for e-commerce commonsense discovery*
Changlong Yu, Weiqi Wang, Xin Liu, Jiaxin Bai, Yangqiu Song, Zheng Li, Yifan Gao, Tianyu Cao, Bing Yin

SCOTT: Self-consistent chain-of-thought distillation
Peifeng Wang, Zhengyang Wang, Zheng Li, Yifan Gao, Bing Yin, Xiang Ren

Self-learning

Constrained policy optimization for controlled self-learning in conversational AI systems
Mohammad Kachuee, Sungjin Lee

Scalable and safe remediation of defective actions in self-learning conversational systems
Sarthak Ahuja, Mohammad Kachuee, Fateme Sheikholeslami, Weiqing Liu, Jae Do

Semantic parsing

An empirical analysis of leveraging knowledge for low-resource task-oriented semantic parsing*
Mayank Kulkarni, Aoxiao Zhong, Nicolas Guenon Des Mesnards, Sahar Movaghati, Mukund Harakere, He Xie, Jianhua Lu

XSEMPLR: Cross-lingual semantic parsing in multiple natural languages and meaning representations
Yusen Zhang, Jun Wang, Zhiguo Wang, Rui Zhang

Spoken-language understanding

Regression-free model updates for spoken language understanding
Andrea Caciolai, Verena Weber, Tobias Falke, Alessandro Pedrani, Davide Bernardi

Sharing encoder representations across languages, domains and tasks in large-scale spoken language understanding
Jonathan Hueser, Judith Gaspers, Thomas Gueudre, Chandana Satya Prakash, Jin Cao, Daniil Sorokin, Quynh Do, Nicolas Anastassacos, Tobias Falke, Turan Gojayev, Mariusz Momotko, Denis Romasanta Rodriguez, Austin Doolittle, Kartik Balasubramaniam, Wael Hamza, Fabian Triefenbach, Patrick Lehnen

Toxic-language classification

QCon at SemEval-2023 Task 10: Data augmentation and model ensembling for detection of online sexism
Wes Feely, Prabhakar Gupta, Manas Mohanty, Tim Chon, Tuhin Kundu, Vijit Singh, Sandeep Atluri, Tanya Roosta, Viviane Ghaderi, Peter Schulam, Heba Elfardy

Towards building a robust toxicity predictor
Dmitriy Bespalov, Sourav Bhabesh, Yi Xiang, Yanjun (Jane) Qi

*Accepted to ACL Findings

Research areas

Related content

US, NY, New York
We are seeking an Applied Scientist to lead the development of evaluation frameworks and data collection protocols for robotic capabilities. In this role, you will focus on designing how we measure, stress-test, and improve robot behavior across a wide range of real-world tasks. Your work will play a critical role in shaping how policies are validated and how high-quality datasets are generated to accelerate system performance. You will operate at the intersection of robotics, machine learning, and human-in-the-loop systems, building the infrastructure and methodologies that connect teleoperation, evaluation, and learning. This includes developing evaluation policies, defining task structures, and contributing to operator-facing interfaces that enable scalable and reliable data collection. The ideal candidate is highly experimental, systems-oriented, and comfortable working across software, robotics, and data pipelines, with a strong focus on turning ambiguous capability goals into measurable and actionable evaluation systems. 
Key job responsibilities - Design and implement evaluation frameworks to measure robot capabilities across structured tasks, edge cases, and real-world scenarios - Develop task definitions, success criteria, and benchmarking methodologies that enable consistent and reproducible evaluation of policies - Create and refine data collection protocols that generate high-quality, task-relevant datasets aligned with model development needs - Build and iterate on teleoperation workflows and operator interfaces to support efficient, reliable, and scalable data collection - Analyze evaluation results and collected data to identify performance gaps, failure modes, and opportunities for targeted data collection - Collaborate with engineering teams to integrate evaluation tooling, logging systems, and data pipelines into the broader robotics stack - Stay current with advances in robotics, evaluation methodologies, and human-in-the-loop learning to continuously improve internal approaches - Lead technical projects from conception through production deployment - Mentor junior scientists and engineers
US, WA, Seattle
Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and programming from Prime Video subscriptions such as Apple TV+, HBO Max, Peacock, Crunchyroll and MGM+. All customers, regardless of whether they have a Prime membership or not, can rent or buy titles via the Prime Video Store, and can enjoy even more content for free with ads. Are you interested in shaping the future of entertainment? Prime Video's technology teams are creating best-in-class digital video experience. As a Prime Video team member, you’ll have end-to-end ownership of the product, user experience, design, and technology required to deliver state-of-the-art experiences for our customers. You’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people. We’ll look for you to bring your diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. With global opportunities for talented technologists, you can decide where a career Prime Video Tech takes you! As an Applied Scientist, you will apply state of the art natural language processing and computer vision research to video centric digital media. We are looking for scientists with expertise in vision-language models/multimodal LLMs and long-form content understanding (full movies/episode vs. short clips). You will be dealing with architectures that handle long-context understanding and causal reasoning across extended temporal sequences. Key job responsibilities Our team builds multi-modal machine learning technologies to enrich and understand video content. 
We aim not only to understand individual components within the content itself, but also their relationships to each other to provide a holistic and broader contextual understanding. This powers the next generation of video understanding and search capabilities for Prime Video. About the team Prime Video's Content Localization, Understanding & Enrichment organization is responsible for 1) enabling Prime Video to "see" and "understand" video content including characters, scenes, dialogue, events & visual elements and 2) delivering localized, accessible content that meets a consistent cinematic quality standard at scale. This team's mission is to deeply understand all content and empower all customers with relevant language options, innovative accessibility assists, and rich title-information across all their content-experiences on Prime Video. We create and publish content on-time that's meaningful, accurate, and accessible to every customer globally. We delight our customers by pushing the boundaries of content understanding and enrichment. Through inclusion and innovation, we do the most fulfilling work of our career.
US, WA, Seattle
Amazon Seller Assistant is our flagship GenAI-first, multi-agent system that reimagines Seller experience. Our vision is to provide each seller with a proactive, autonomous, agentic assistant that understands their business and helps them navigate the complexities of selling by anticipating their needs, surfacing insights, resolving issues, taking actions on their behalf, and helping them grow. Amazon Seller Assistant helps millions of sellers on Amazon serve billions of customers worldwide. We are seeking a world-class Senior Data Scientist to help define and build the next generation of Amazon Seller Assistant. You will partner with top-tier scientist, engineers and product teams to launch production-grade agentic capabilities at Amazon's scale — owning your problem space end-to-end, from a crisp customer insight to a shipped product that millions of sellers rely on. Key job responsibilities • Own the science vision, strategy, and roadmap for a key Seller Assistant capability area. • Define and ship agentic experiences — sub-agent onboarding, tool onboarding, evaluations— that solve hard seller problems at scale. • Partner with scientists and engineers to translate frontier AI research into production-grade features sellers trust and depend on. • Design rigorous evaluation frameworks — automated and human-in-the-loop — to measure agent quality, accuracy, and business impact. • Deep-dive into seller data, identify unmet needs, and write compelling PRFAQs that set the direction for your team. • Drive cross-functional alignment across science, engineering, UX, and business teams to deliver with speed and quality. About the team Amazon Seller Assistant team operates at the very frontier of agentic AI and agentic commerce — not as a research group, but as a team shipping production-grade, multi-agent systems used by millions of sellers worldwide. 
We move with the urgency of a startup and the resources of the world's most customer-obsessed company, the latest breakthroughs in science and engineering into capabilities that sellers rely on every day.
US, NY, New York
MULTIPLE POSITIONS AVAILABLE Employer: Amazon Development Center U.S., Inc. Offered Position: Applied Scientist III - AMZ007408 Job Location: New York, NY Position Responsibilities: Participate in the design, development, evaluation, deployment, and updating of formal reasoning systems for security, privacy, and data protection applications. Drive technical and scientific innovation in security automation, data protection, and privacy-preserving technologies, with a focus on developing scalable solutions for cloud environments. Develop and/or apply formal verification techniques and automated theorem proving methods for different applications in cloud security and privacy. Collaborate with internal and external users to understand requirements and enhance formal verification and automated reasoning capabilities. Lead research and development efforts in AI security, specifically evaluate emerging threats and opportunities, including securing Generative AI systems and designing robust safeguards. Proactively identify and explore new opportunities for deploying and leveraging formal reasoning solutions across various domains.
US, CA, San Francisco
The Amazon Center for Quantum Computing (CQC) is seeking to hire an Applied Science Manager to lead a team of scientists in the physical design and simulation of superconducting quantum processors. In this role, you will use advanced modeling, simulation, and experimental design to drive improvements in scaling and performance. You will partner with other physics and engineering teams to advance the development of fault-tolerant quantum computers. Key job responsibilities - Hire Applied Scientists from diverse technical backgrounds to design quantum processors and improve the design process - Develop scientific talent through goal setting, feedback, collaborative work, and coaching - Collaborate with other science teams in designing experiments to overcome scaling and performance limitations - Influence engineering team development priorities in enabling systematic processor design and simulation workflows - Manage tactical and strategic initiatives with scientific projects pursued within team - Enable creative and innovative experimentation while striving for operational excellence About the team The Amazon Center for Quantum Computing (CQC) is a multi-disciplinary team of scientists, engineers, and technicians, on a mission to develop a fault-tolerant quantum computer. Inclusive Team Culture Here at Amazon, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon conferences, inspire us to never stop embracing our uniqueness. Diverse Experiences Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. 
Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Export Control Requirement Due to applicable export control laws and regulations, candidates must be either a U.S. citizen or national, U.S. permanent resident (i.e., current Green Card holder), or lawfully admitted into the U.S. as a refugee or granted asylum, or be able to obtain a US export license. If you are unsure if you meet these requirements, please apply and Amazon will review your application for eligibility.
GB, London
The Agentic Automated Reasoning Group is building the next generation of software verification tools combining advances in artificial intelligence, the computational capacity of the cloud, and our deep expertise in the domain. Join us if you want to be a part of this transformational endeavor. The Strata team (https://github.com/strata-org) is seeking an applied scientist with broad interest and expertise in model checking, interactive theorem proving, programming language semantics, and generative AI. You will combine your expertise with that of your coworkers to build new tools that solve code analysis problems previously considered beyond reach. Our application areas span all the way from Infrastructure as Code to high-performance cryptography written in assembly code, while our methods span from interactive theorem proving to automated test generation. Each day, hundreds of thousands of developers make billions of transactions worldwide on AWS. They harness the power of the cloud to enable innovative applications, websites, and businesses. Using automated reasoning technology and mathematical proofs, AWS allows customers to answer questions about security, availability, durability, and functional correctness. We call this provable security, absolute assurance in security of the cloud and in the cloud. https://aws.amazon.com/security/provable-security/ Key job responsibilities Work with customer teams to understand the nature of their software and the properties they need to establish of it. Identify tools and methods capable of addressing the verification needs of customers, including any novel analysis capabilities required. Use techniques spanning property-based testing to model checkers, and interactive theorem provers to establish program properties. Explore generative AI techniques to help customers formalize their requirements, find revealing tests, generate required boiler plate for testing and model checking, and find and repair program proofs. 
About the team The Agentic Automated Reasoning Group at AWS develops and applies state of the art formal methods and automated reasoning techniques to ensure the security, reliability, and correctness of AWS services and customer applications, with a strong focus on AI based agents. Our work innovates tools and services to perform verification at scale and apply them to build safe and secure systems at AWS. We are also pioneering the use of formal verification and automated reasoning to develop agentic systems, ensuring AI agents operate within defined safety boundaries.
US, CA, San Francisco
Join the next revolution in robotics at Amazon's Frontier AI & Robotics team, where you'll work alongside world-renowned AI pioneers to lead key initiatives in robotic intelligence. As a Member of Technical Staff, you'll spearhead the development of breakthrough foundation models that enable robots to perceive, understand, and interact with the world in unprecedented ways. You'll drive technical excellence in areas such as perception, manipulation, science understanding, sim2real transfer, multi-modal foundation models, and multi-task learning, designing novel algorithms that bridge the gap between state-of-the-art research and real-world deployment at Amazon scale. In this role, you'll combine hands-on technical work with scientific leadership, ensuring your team delivers robust solutions for dynamic real-world environments. You'll leverage Amazon's vast computational resources to tackle ambitious problems in areas like very large multi-modal robotic foundation models and efficient, promptable model architectures that can scale across diverse robotic applications. 
Key job responsibilities - Lead technical initiatives in robotics foundation models, driving breakthrough approaches through hands-on research and development in areas like open-vocabulary panoptic scene understanding, scaling up multi-modal LLMs, sim2real/real2sim techniques, end-to-end vision-language-action models, efficient model inference, video tokenization - Design and implement novel deep learning architectures that push the boundaries of what robots can understand and accomplish - Guide technical direction for specific research initiatives, ensuring robust performance in production environments - Mentor and support fellow scientists while maintaining strong individual technical contributions - Collaborate with engineering teams to optimize and scale models for real-world applications - Influence technical decisions and implementation strategies within your area of focus A day in the life - Develop and implement novel foundation model architectures, working hands-on with our extensive compute infrastructure - Guide and support fellow scientists in solving complex technical challenges, from sim2real transfer to efficient multi-task learning - Lead focused technical initiatives from conception through deployment, ensuring successful integration with production systems - Drive technical discussions within your team and with key stakeholders - Conduct experiments and prototype new ideas using our massive compute cluster - Mentor team members while maintaining significant hands-on contribution to technical solutions Amazon offers a full range of benefits that support you and eligible family members, including domestic partners and their children. Benefits can vary by location, the number of regularly scheduled hours you work, length of employment, and job status such as seasonal or temporary employment. The benefits that generally apply to regular, full-time employees include: 1. Medical, Dental, and Vision Coverage 2. Maternity and Parental Leave Options 3. 
Paid Time Off (PTO) 4. 401(k) Plan If you are not sure that every qualification on the list above describes you exactly, we'd still love to hear from you! At Amazon, we value people with unique backgrounds, experiences, and skillsets. If you’re passionate about this role and want to make an impact on a global scale, please apply! About the team At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through ground breaking foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's massive computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.