A quick guide to Amazon's 65-plus papers at this year's ACL

Familiar topics such as question answering and natural-language understanding remain well represented, but a new concentration on language modeling and multimodal models reflects the spread of generative AI.

Between the main conference and the recently inaugurated ACL Findings, Amazon researchers have more than 65 papers at this year's meeting of the Association for Computational Linguistics (ACL).

Automatic speech recognition

Masked audio text encoders are effective multi-modal rescorers*
Jason Cai, Monica Sunkara, Xilai Li, Anshu Bhatia, Xiao Pan, Sravan Bodapati

Code generation

A static evaluation of code completion by large language models
Hantian Ding, Varun Kumar, Yuchen Tian, Zijian Wang, Rob Kwiatkowski, Xiaopeng Li, Murali Krishna Ramanathan, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang

Multitask pretraining with structured knowledge for text-to-SQL generation
Robert Giaquinto, Dejiao Zhang, Benjamin Kleiner, Yang Li, Ming Tan, Parminder Bhatia, Ramesh Nallapati, Xiaofei Ma

Code switching

Code-switched text synthesis in unseen language pairs*
I-Hung Hsu, Avik Ray, Shubham Garg, Nanyun Peng, Jing Huang

CoMix: Guide transformers to code-mix using POS structure and phonetics*
Gaurav Arora, Srujana Merugu, Vivek Sembium

Continual learning

Characterizing and measuring linguistic dataset drift
Tyler A. Chang, Kishaloy Halder, Neha Anna John, Yogarshi Vyas, Yassine Benajiba, Miguel Ballesteros, Dan Roth

Data-/table-to-text applications

An inner table retriever for robust table question answering
Weizhe Lin, Rexhina Blloshmi, Bill Byrne, Adrià de Gispert, Gonzalo Iglesias

Few-shot data-to-text generation via unified representation and multi-source learning
Alexander Hanbo Li, Mingyue Shang, Evangelia Spiliopoulou, Jie Ma, Patrick Ng, Zhiguo Wang, Bonan Min, William Wang, Kathleen McKeown, Vittorio Castelli, Dan Roth, Bing Xiang

Improving cross-task generalization of unified table-to-text models with compositional task configurations*
Jifan Chen, Yuhao Zhang, Lan Liu, Rui Dong, Xinchi Chen, Patrick Ng, William Wang, Zhiheng Huang

LI-RAGE: Late interaction retrieval augmented generation with explicit signals for open-domain table question answering
Weizhe Lin, Rexhina Blloshmi, Bill Byrne, Adrià de Gispert, Gonzalo Iglesias

Dialogue

Diable: Efficient dialogue state tracking as operations on tables*
Pietro Lesci, Yoshinari Fujinuma, Momchil Hardalov, Chao Shang, Lluis Marquez

NatCS: Eliciting natural customer support dialogues
James Gung, Emily Moeng, Wesley Rose, Arshit Gupta, Yi Zhang, Saab Mansour

Schema-guided user satisfaction modeling for task-oriented dialogues
Yue Feng, Yunlong Jiao, Animesh Prasad, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai

Toward more accurate and generalizable evaluation metrics for task-oriented dialogs
Abi Komma, Nagesh Panyam, Timothy Leffel, Anuj Goyal, Angeliki Metallinou, Spyros Matsoukas, Aram Galstyan

Explainable AI

Efficient Shapley values estimation by amortization for text classification
Alan Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, Bing Xiang

Few shot rationale generation using self-training with dual teachers*
Aditya Srikanth Veerubhotla, Lahari Poddar, Jun Yin, Gyuri Szarvas, Sharanya Eswaran

Information extraction

An AMR-based link prediction approach for document-level event argument extraction
Yuqing Yang, Qipeng Guo, Xiangkun Hu, Yue Zhang, Xipeng Qiu, Zheng Zhang

AVEN-GR: Attribute value extraction and normalization using product graphs
Donato Crisostomi, Thomas Ricatte

Large scale generative multimodal attribute extraction for e-commerce attributes
Anant Khandelwal, Happy Mittal, Shreyas Sunil Kulkarni, Deepak Gupta

ParaAMR: A large-scale syntactically diverse paraphrase dataset by AMR back-translation
Kuan-Hao Huang, Varun Iyer, I-Hung Hsu, Anoop Kumar, Kai-Wei Chang, Aram Galstyan

Weakly supervised hierarchical multi-task classification of customer questions
Jitenkumar Rana, Promod Yenigalla, Chetan Aggarwal, Sandeep Mukku, Manan Soni, Rashmi Patange

WebIE: Faithful and robust information extraction on the web
Chenxi Whitehouse, Clara Vania, Alham Fikri Aji, Christos Christodoulopoulos, Andrea Pierleoni

Information retrieval

CUPID: Curriculum learning based real-time prediction using distillation
Arindam Bhattacharya, Ankith M S, Ankit Gandhi, Vijay Huddar, Atul Saroop, Rahul Bhagat

Direct fact retrieval from knowledge graphs without entity linking
Jinheon Baek, Alham Fikri Aji, Jens Lehmann, Sung Ju Hwang

Language modeling

Adaptation approaches for nearest neighbor language models*
Rishabh Bhardwaj, George Polovets, Monica Sunkara

CONTRACLM: Contrastive learning for causal language model
Nihal Jain, Dejiao Zhang, Wasi Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Ramesh Nallapati, Bing Xiang

Controlled text generation with hidden representation transformations*
Vaibhav Kumar, Hana Koorehdavoudi, Masud Moshtaghi, Amita Misra, Ankit Chadha, Emilio Ferrara

KILM: Knowledge injection into encoder-decoder language models
Yan Xu, Mahdi Namazifar, Devamanyu Hazarika, Aishwarya Padmakumar, Yang Liu, Dilek Hakkani-Tür

ReAugKD: Retrieval-augmented knowledge distillation for pre-trained language models
Jianyi Zhang, Aashiq Muhamed, Aditya Anantharaman, Guoyin Wang, Changyou Chen, Kai Zhong, Qingjun Cui, Yi Xu, Belinda Zeng, Trishul Chilimbi, Yiran Chen

Recipes for sequential pre-training of multilingual encoder and seq2seq models*
Saleh Soltan, Andy Rosenbaum, Tobias Falke, Qin Lu, Anna Rumshisky, Wael Hamza

Rethinking the role of scale for in-context learning: An interpretability-based case study at 66 billion scale
Hritik Bansal, Karthik Gopalakrishnan, Saket Dingliwal, Sravan Bodapati, Katrin Kirchhoff, Dan Roth

Machine learning

Mitigating the burden of redundant datasets via batch-wise unique samples and frequency-aware losses
Donato Crisostomi, Andrea Caciolai, Alessandro Pedrani, Alessandro Manzotti, Enrico Palumbo, Kay Rottmann, Davide Bernardi

Machine translation

RAMP: Retrieval and attribute-marking enhanced prompting for attribute-controlled translation
Gabriele Sarti, Phu Mon Htut, Xing Niu, Benjamin Hsu, Anna Currey, Georgiana Dinu, Maria Nădejde

Multimodal models

Benchmarking diverse-modal entity linking with generative models*
Sijia Wang, Alexander Li, Henry Zhu, Sheng Zhang, Pramuditha Perera, Chung-Wei Hang, Jie Ma, William Wang, Zhiguo Wang, Vittorio Castelli, Bing Xiang, Patrick Ng

Generate then select: Open-ended visual question answering guided by world knowledge*
Xingyu Fu, Sheng Zhang, Gukyeong Kwon, Pramuditha Perera, Henry Zhu, Yuhao Zhang, Alexander Hanbo Li, William Wang, Zhiguo Wang, Vittorio Castelli, Patrick Ng, Dan Roth, Bing Xiang

KG-FLIP: Knowledge-guided fashion-domain language-image pre-training for e-commerce
Qinjin Jia, Yang Liu, Shaoyuan Xu, Huidong Liu, Daoping Wu, Jinmiao Fu, Roland Vollgraf, Bryan Wang

Resolving ambiguities in text-to-image generative models
Ninareh Mehrabi, Palash Goyal, Apurv Verma, Jwala Dhamala, Varun Kumar, Qian Hu, Kai-Wei Chang, Richard Zemel, Aram Galstyan, Rahul Gupta

Translation-enhanced multilingual text-to-image generation
Yaoyiran Li, Ching-Yun (Frannie) Chang, Stephen Rawls, Ivan Vulić, Anna Korhonen

Unsupervised melody-to-lyric generation
Yufei Tian, Anjali Narayan-Chen, Shereen Oraby, Alessandra Cervone, Chenyang Tao, Gunnar Sigurdsson, Wenbo Zhao, Tagyoung Chung, Jing Huang, Violet Peng

Natural-language processing

Multi-VALUE: A framework for cross-dialectal English NLP
Caleb Ziems, William Held, Jingfeng Yang, Jwala Dhamala, Rahul Gupta, Diyi Yang

vONTSS: vMF based semi-supervised neural topic modeling with optimal transport*
Weijie Xu, Xiaoyu Jiang, Srinivasan "SHS" Sengamedu, Francis Iannacci, Jinjin Zhao

Natural-language understanding

ECG-QALM: Entity-controlled synthetic text generation using contextual Q&A for NER*
Karan Aggarwal, Henry Jin, Aitzaz Ahmad

Entity contrastive learning in a large-scale virtual assistant system
Jonathan Rubin, Jason Crowley, George Leung, Morteza Ziyadi, Maria Minakova

EPIC: Multi-perspective annotation of a corpus of irony
Simona Frenda, Alessandro Pedrani, Valerio Basile, Soda Marem Lo, Alessandra Teresa Cignarella, Raffaella Panizzon, Cristina Marco, Bianca Scarlini, Viviana Patti, Cristina Bosco, Davide Bernardi

Measuring and mitigating local instability in deep neural networks*
Arghya Datta, Subhrangshu Nandi, Jingcheng Xu, Greg Ver Steeg, He Xie, Anoop Kumar, Aram Galstyan

Reducing cohort bias in natural language understanding systems with targeted self-training scheme
Thu Le, Gabriela Cortes Hernandez, Bei Chen, Melanie Bradford

Privacy

Controlling the extraction of memorized data from large language models via prompt-tuning
Mustafa Ozdayi, Charith Peris, Jack G. M. FitzGerald, Christophe Dupuy, Jimit Majmudar, Haidar Khan, Rahil Parikh, Rahul Gupta

Query rewriting

Context-aware query rewriting for improving users’ search experience on e-commerce websites
Simiao Zuo, Qingyu Yin, Haoming Jiang, Shaohui Xi, Bing Yin, Chao Zhang, Tuo Zhao

Unified contextual query rewriting
Yingxue Zhou, Jie Hao, Mukund Rungta, Yang Liu, Eunah Cho, Xing Fan, Yanbin Lu, Vishal Vasudevan, Kellen Gillespie, Zeynab Raeesy, Sawyer Shen, Edward Guo, Gokhan Tur

Question answering

Accurate training of web-based question answering systems with feedback from ranked users
Liang Wang, Ivano Lauriola, Alessandro Moschitti

Context-aware transformer pre-training for answer sentence selection
Luca Di Liello, Siddhant Garg, Alessandro Moschitti

Cross-lingual knowledge distillation for answer sentence selection in low-resource languages*
Shivanshu Gupta, Yoshitomo Matsubara, Ankit Chadha, Alessandro Moschitti

Exploiting abstract meaning representation for open-domain question answering*
Cunxiang Wang, Zhikun Xu, Qipeng Guo, Xiangkun Hu, Xuefeng Bai, Zheng Zhang, Yue Zhang

Hybrid hierarchical retrieval for open-domain question answering*
Manoj Ghuhan Arivazhagan, Lan Liu, Peng Qi, Xinchi Chen, William Wang, Zhiheng Huang

Learning answer generation using supervision from automatic question answering evaluators
Matteo Gabburo, Siddhant Garg, Rik Koncel-Kedziorski, Alessandro Moschitti

RobustQA: Benchmarking the robustness of domain adaptation for open-domain question answering*
Rujun Han, Peng Qi, Yuhao Zhang, Lan Liu, Juliette Burger, William Wang, Zhiheng Huang, Bing Xiang, Dan Roth

Reasoning

FolkScope: Intention knowledge graph construction for e-commerce commonsense discovery*
Changlong Yu, Weiqi Wang, Xin Liu, Jiaxin Bai, Yangqiu Song, Zheng Li, Yifan Gao, Tianyu Cao, Bing Yin

SCOTT: Self-consistent chain-of-thought distillation
Peifeng Wang, Zhengyang Wang, Zheng Li, Yifan Gao, Bing Yin, Xiang Ren

Self-learning

Constrained policy optimization for controlled self-learning in conversational AI systems
Mohammad Kachuee, Sungjin Lee

Scalable and safe remediation of defective actions in self-learning conversational systems
Sarthak Ahuja, Mohammad Kachuee, Fateme Sheikholeslami, Weiqing Liu, Jae Do

Semantic parsing

An empirical analysis of leveraging knowledge for low-resource task-oriented semantic parsing*
Mayank Kulkarni, Aoxiao Zhong, Nicolas Guenon Des Mesnards, Sahar Movaghati, Mukund Harakere, He Xie, Jianhua Lu

XSEMPLR: Cross-lingual semantic parsing in multiple natural languages and meaning representations
Yusen Zhang, Jun Wang, Zhiguo Wang, Rui Zhang

Spoken-language understanding

Regression-free model updates for spoken language understanding
Andrea Caciolai, Verena Weber, Tobias Falke, Alessandro Pedrani, Davide Bernardi

Sharing encoder representations across languages, domains and tasks in large-scale spoken language understanding
Jonathan Hueser, Judith Gaspers, Thomas Gueudre, Chandana Satya Prakash, Jin Cao, Daniil Sorokin, Quynh Do, Nicolas Anastassacos, Tobias Falke, Turan Gojayev, Mariusz Momotko, Denis Romasanta Rodriguez, Austin Doolittle, Kartik Balasubramaniam, Wael Hamza, Fabian Triefenbach, Patrick Lehnen

Toxic-language classification

QCon at SemEval-2023 Task 10: Data augmentation and model ensembling for detection of online sexism
Wes Feely, Prabhakar Gupta, Manas Mohanty, Tim Chon, Tuhin Kundu, Vijit Singh, Sandeep Atluri, Tanya Roosta, Viviane Ghaderi, Peter Schulam, Heba Elfardy

Towards building a robust toxicity predictor
Dmitriy Bespalov, Sourav Bhabesh, Yi Xiang, Yanjun (Jane) Qi

*Accepted to ACL Findings

Research areas

Related content

  • Amazon Research Awards team
    May 27, 2026
    Awardees represent more than 49 universities in 11 countries. Recipients have access to Amazon public datasets, along with AWS AI/ML services and tools.
  • Meiqi Sun
    April 20, 2026
    Large language models today can solve algebra, pass academic benchmarks, and generate highly structured chain-of-thought explanations. In text-only settings, they often feel startlingly intelligent — methodical, articulate, even strategic. But place those models inside an interactive environment — ask them to click buttons, scroll pages, fill out forms, and submit answers — and their behavior changes. Their careful reasoning falters. They guess where they once deduced. They adhere to templates and produce limited procedural narration: stating what they see and what they will click next, without first forming a structured plan and acting in accordance with plan. It’s as if part of their intelligence has quietly gone offline the moment the cursor appears.
    Machine learning
  • By focusing on specific failure points and suggesting targeted solutions, a new automated prompt-engineering framework improves prompt performance without compromising existing functionality.
IN, KA, Bengaluru
Have you ever wondered how that Amazon box with the smile arrives so quickly, where it came from, and how much it cost Amazon to deliver? The WW Amazon Logistics, Business Analytics team manages the delivery of tens of millions of products every week to Amazon's customers, achieving on-time delivery in a cost-effective manner. We are seeking an enthusiastic, customer-obsessed Manager Research Science with strong analytical skills to join our team. This role is crucial in optimizing Amazon's vast delivery network and will have significant impact on the customer experience, particularly in the final phase of delivery. As a Manager Research Science, you will: 1. Address business challenges through building compelling cases and using data to influence change across the organization 2. Develop input and assumptions based on preexisting models to estimate costs and savings opportunities associated with varying levels of network growth and operations 3. Create metrics to measure business performance, identify root causes and trends, and prescribe action plans 4. Manage multiple high-impact projects simultaneously 5. Work with technology teams and product managers to develop new tools and systems supporting business growth 6. Communicate with and support various internal stakeholders and external audiences 7. Implement scheduling solutions, improve metrics, and develop scalable processes and tools The ideal candidate will have: - Extensive experience in operations research and data-driven decision making - Strong analytical and problem-solving skills - Robust program management and research science skills - Ability to work with a team and make independent decisions in ambiguous environments - Customer-obsessed mindset with a focus on improving the Amazon delivery experience This role offers the autonomy to think strategically and make data-driven decisions from day one. 
Join us in shaping the future of e-commerce delivery and addressing the core challenges in our world-class operations space! Key job responsibilities 1. Advanced Modeling and Algorithm Development: - Design and implement sophisticated machine learning models for logistics optimization - Develop complex time series forecasting algorithms for demand prediction and resource allocation 2. AI and Machine Learning Integration: - Architect and deploy AI-powered systems to enhance decision-making in logistics operations - Implement deep learning techniques for image recognition in package sorting and handling - Develop reinforcement learning algorithms for adaptive scheduling and resource management 3. Big Data Analytics and Processing: - Design and implement distributed computing solutions for processing massive logistics datasets - Utilize cloud computing platforms (e.g., AWS) for scalable data processing and analysis 4. AI-Driven Workflow Optimization: - Design and implement AI agents for autonomous decision-making in logistics processes - Create machine learning models for customer behavior analysis and personalized delivery options 5. Software Development and System Architecture: - Write efficient, scalable code in languages such as Python, Java, or C++ - Develop and maintain complex software systems for logistics optimization - Stay at the forefront of AI and ML research - Publish research findings in top-tier conferences and journals About the team We are Amazon's Last Mile Science and Analytics team, dedicated to improving e-commerce delivery. We work to optimize our vast network, forecast demand using machine learning, and enhance route efficiency. Our efforts focus on developing innovative delivery methods, applying AI to solve complex problems, and conducting geospatial analysis. We create simulations to refine processes and plan capacity effectively. Operating globally, we strive to develop adaptable solutions for diverse markets. 
We aim to advance logistics science, continually improving speed, efficiency, and customer satisfaction, in support of Amazon's mission to be Earth's most customer-centric company.
US, WA, Seattle
Ever wish you could use your quantitative and critical thinking skills to influence business decisions? Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems. As part of the Content Discovery and Experimentation Science team within Prime Video, you will leverage your expertise in causal inference and experimental design to make Prime Video the best-in-class digital video experience. Key job responsibilities - Build causal models and metrics that capture trade-off decisions when business and customer outcomes do not align - Partner with data scientists and product managers to integrate these metrics into Prime Video's experimentation tooling - Work with finance partners to ensure that the team's product metrics contribute to Prime Video's strategic business and financial objectives - Contribute to technical and business documents to communicate ideas and proposals to various audiences - Educate and advocate for best practices in experimentation and how to use it for decision-making
US, CA, Sunnyvale
MULTIPLE POSITIONS AVAILABLE Employer: AMAZON.COM SERVICES LLC Offered Position: Manager III, Economist Job Location: Sunnyvale, California Job Number: AMZ9803624 Position Responsibilities: Independently manage a team of economists and/or scientists in developing strategic economic analyses and demand estimation models. Translate business questions into econometric methodologies and causal inference analyses. Communicate economic insights to non-technical audiences to guide strategic-level, high-impact business decisions. Scale economic models through cross-functional collaboration with engineering teams. Establish scientific quality standards and research priorities. Drive operational efficiency and research excellence across the team. 40 hours / week, 8:00am-5:00pm, Salary Range: $201,300/year to $272,400/year. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, visit: https://www.aboutamazon.com/workplace/employee-benefits. Amazon.com is an Equal Opportunity-Affirmative Action Employer – Minority / Female / Disability / Veteran / Gender Identity / Sexual Orientation.#0000
AU, VIC, Melbourne
Are you excited about leveraging state-of-the-art Computer Vision algorithms and large datasets to solve real-world problems? Join Amazon as an Applied Scientist Intern and be at the forefront of AI innovation! As an Applied Scientist Intern, you'll work in a fast-paced, cross-disciplinary team of pioneering researchers. You'll tackle complex problems, developing solutions that either build on existing academic and industrial research or stem from your own innovative thinking. Your work may even find its way into customer-facing products, making a real-world impact. Please note: This internship is a duration of 6 months full time with a start date in Jan-March 2027. The successful intern is required to be based in Melbourne and relocation allowance will be provided if you are based outside of Melbourne. Key job responsibilities - Develop novel solutions and build prototypes - Work on complex problems in Computer Vision and Machine Learning - Contribute to research that could significantly impact Amazon's operations - Collaborate with a diverse team of experts in a fast-paced environment - Collaborate with scientists on writing and submitting papers to Tier-1 conferences (e.g., CVPR, ICCV, NeurIPS, ICML) - Present your research findings to both technical and non-technical audiences Key Opportunities - Collaborate with leading machine learning researchers - Access Amazon tools and hardware (large GPU clusters) - Address challenges at an unparalleled scale - Become a disruptor, innovator, and problem solver in the field of computer vision - Potentially deliver solutions to production in customer-facing applications - Opportunities to become an FTE after the internship Join us in shaping the future of AI at Amazon. Apply now and turn your research into real-world solutions!
IN, KA, Bengaluru
Are you passionate about solving complex business problems at scale through Generative AI? Do you want to help build intelligent systems that reason, act, and learn from minimal supervision? If so, we have an exciting opportunity for you on Amazon's Trustworthy Shopping Experience (TSE) team. At TSE, our vision is to guarantee customers a worry-free shopping experience by earning their trust that the products they buy are safe, authentic, and compliant with regulations and policy. We do this in close partnership with our selling partners, empowering them with best-in-class tools and expertise to offer a high-quality, compliant selection that customers trust. As an Applied Scientist I, you will bring subject matter expertise in at least one relevant discipline (e.g., NLP, computer vision, representation learning, agentic architecture) to contribute to next-generation agentic AI solutions that automate complex manual investigation processes at Amazon scale. Working alongside senior scientists, you will map business goals—such as reducing cost-of-serving while maintaining trust and safety standards—to well-defined scientific problems and metrics. You will invent, refine, and experiment with solutions spanning agentic reasoning, self-supervised representation learning, few-shot adaptation, multimodal understanding, and model compression. With guidance from senior scientists, you will stay current on research trends and benchmark your results against the state of the art. You will help design and execute experiments to identify optimal solutions, initiating the development and implementation of small components with team guidance. You will write secure, stable, testable, and well-documented production code at the level of an SDE I, rigorously evaluating models and quantifying performance. You will handle data in accordance with Amazon policies, troubleshoot issues to root cause, and ensure your work does not put the company at risk. 
Your scope of influence will typically be at the self-level, with the possibility of mentoring interns. You will participate in team design and prioritization discussions, learn the business context behind TSE's products, and escalate problems with proposed solutions. You will publish internal technical reports and may contribute to peer-reviewed publications and external review activities when aligned with business needs. This role offers a unique opportunity to contribute to end-to-end AI development—from research through production—with your contributions serving hundreds of millions of customers within months, not years. Key job responsibilities • Contribute to the design and development of agentic AI systems with multi-step reasoning, autonomous task execution, and multimodal intelligence, including feedback and memory mechanisms, leveraging reinforcement learning techniques for agent decision-making and policy optimization, with input and guidance from senior scientists • Help productionize models built on top of SFT (Supervised Fine-tuning) and RFT (Reinforced Fine-tuning) approaches, as well as few-shot approaches based on multimodal datasets spanning text, images, and structured data, applying mathematical optimization techniques to improve efficiency, resource allocation, and decision-making in complex workflows, working alongside senior scientists to identify optimal solutions • Contribute to building production-ready deep learning and conventional ML solutions, including multimodal fusion and cross-modal alignment techniques that seamlessly connect visual, textual, and relational understanding, to support automation requirements within your team's scope • Help identify customer and business problems; use reasonable assumptions, data, and customer requirements to solve well-defined scientific problems involving multimodal inputs such as unstructured text, documents, product images, and relational data, developing representations that capture 
complementary signals across modalities and mapping business goals to scientific metrics • May co-author research papers for peer-reviewed internal and/or external venues, including contributions in areas such as multimodal representation learning and vision-language modeling, and contribute to the wider scientific community by reviewing research submissions, when aligned with business needs • Prototype rapidly, iterate based on feedback, and deliver small components at SDE I level—including multimodal data pipelines and inference modules—that integrate into production-scale systems • Write secure, stable, testable, maintainable, and well-documented code, balancing model capability, deployment cost, and resource usage across multimodal architectures while understanding state-of-the-art data structures, algorithms, and performance tradeoffs • Rigorously test code and evaluate models across individual and combined modalities, quantifying their performance; troubleshoot issues, research root causes, and thoroughly resolve defects, leaving systems more maintainable • Participate in team design, scoping, and prioritization discussions through clear verbal and written communication; seek to learn the business context, science, and engineering behind your team's products, including how multimodal signals contribute to trust and safety decisions • Participate in engineering best practices with peer reviews; clearly document approaches and communicate design decisions; publish internal technical reports to institutionalize scientific learning • Help train and mentor scientist interns; identify and escalate problems with proposed solutions, taking ownership or ensuring clear hand-off to the right owner About the team Trustworthy Shopping Experience Product team in TSE is responsible for the human-in-the-loop products and technology used in the risk investigations at Amazon. 
The team is also responsible for reducing the cost of performing the investigations, by automating wherever possible and optimizing the experience where manual interventions are needed. The team leverages state-of-the art technology and GenAI to deliver the products and associated goals.
IN, KA, Bengaluru
The Trust CX Innovations team is looking for an Applied Scientist with strong background in Generative AI space to build solutions that help in upholding customer trust for Alexa+. As an Applied Scientist in Trust CX innovations, you will be at the forefront of developing innovative solutions to critical challenges in AI trust and privacy. You'll lead research in trust-preserving machine learning techniques. We are working on revolutionizing the way Amazonians work and collaborate. You will help us achieve new heights of productivity through the power of advanced generative AI technologies. Key job responsibilities - Lead research initiatives in generative AI, focusing on LLMs, multimodal models, and frontier AI capabilities - Develop innovative approaches for model optimization, including prompt engineering, few-shot learning, and efficient fine-tuning - Pioneer new methods for AI safety, alignment, and responsible AI development - Design and execute sophisticated experiments to evaluate model performance and behavior - Lead the development of production-ready AI solutions that scale efficiently - Collaborate with product teams to translate research innovations into practical applications - Guide engineering teams in implementing AI models and systems at scale - Author technical papers for top-tier conferences - File patents for novel AI technologies and applications A day in the life You will be working with a group of talented scientists on researching algorithm and running experiments to test scientific proposal/solutions to improve our trust-preserving experiences. This will involve collaboration with partner teams including engineering, PMs, data annotators, and other scientists to discuss data quality, policy, and model development. You work closely with partner teams across Alexa to deliver platform features that require cross-team leadership. 
About the team Who We Are: Trust CX Innovations is a strategic innovation team within Amazon Devices & Services that focuses on advancing AI technology while prioritizing customer trust and experience. Our team operates at the intersection of artificial intelligence, privacy engineering and customer-centric design. Our Mission: To pioneer trustworthy AI innovations that delight customers while setting new standards for privacy and responsible technology development. We aim to transform how Amazon builds AI products by creating solutions that balance innovation with customer trust.
US, CA, Pasadena
The Amazon Web Services (AWS) Center for Quantum Computing in Pasadena, CA, is looking to hire a Research Scientist with experience in semiconductor process development who will aid in AWS’s effort to bring cloud quantum computing services to its worldwide customer base. You will join a multi-disciplinary team of scientists, and hardware and software engineers working at the forefront of quantum computing. Through your work inside and outside of the cleanroom environment in the fabrication research and development group, you will solve problems related to developing next-generation quantum processors. Candidates must have a demonstrated background in sound scientific and engineering principles, and must have excellent data analysis, bias for action, problem solving, and communication skills, and be highly motivated and curious to research and learn new technical topics as needed. As a research scientist you will be expected to work on new ideas and stay abreast of novel approaches in fabricating and packaging superconducting quantum processors. Working effectively within a team environment is critical. Key job responsibilities Responsibilities include developing novel processes to fabricate high-coherence superconducting qubits; developing advanced 3DI interconnect and routing technologies for integrating superconducting quantum technologies; analyzing inline metrology and electrical test data; writing production standard operating procedures to transfer newly-developed processes to production teams; interacting with project leads to provide feedback that continuously improves different processes. A day in the life The candidate will develop novel technologies using micro-/nano-fabrication techniques inside the cleanroom (independently or in collaboration with other scientists and engineers) for next-generation quantum computing. Outside the cleanroom, the candidate will plan experiments, analyze data, and conceive future innovations. 