Automated reasoning's scientific frontiers

Distributing proof search, reasoning about distributed systems, and automating regulatory compliance are just three fruitful research areas.

Automated reasoning is the algorithmic search through the infinite set of theorems in mathematical logic. We can use automated reasoning to answer questions about what systems such as biological models and computer programs can and cannot do in the wild.

In the 1990s, AMD, IBM, Intel, and other companies invested in automated reasoning for circuit and microprocessor design, leading to today’s widely used, industry-standard hardware formal-verification tools (e.g., JasperGold). In the 2000s, automated reasoning expanded to niche software domains such as device drivers (e.g., Static Driver Verifier) and transportation systems (e.g., Prover technology). In the 2010s, we saw automated reasoning increasingly applied to our foundational computing infrastructure, such as cryptography, networking, storage, and virtualization.


With recently launched cloud services such as IAM Access Analyzer and VPC Network Access Analyzer, automated reasoning is now beginning to change how computer systems built on top of the cloud are developed and operated.

All these applications of automated reasoning rest on a common foundation: automated and semi-automated mechanical theorem provers. ACL2, CVC5, HOL Light’s MESON_TAC, MiniSat, and Vampire are a few examples, but there are many more we could name. They are all, in outline, working on the same problem: the search for proofs in mathematical logic.

Over the past 30 years, slowly but surely, a virtuous cycle has formed: automated reasoning in specific and critical application areas drives more investment in foundational tools, while improvements in the foundational tools drive further applications. Around and around.

SAT graph comparison.png
The propositional-satisfiability problem (a.k.a. SAT) is NP-complete, and in the case of unstructured decision graphs (left), the problem instances can be prohibitively time consuming to solve. But when the decision graphs have some inherent structure (right), automatic solvers can exploit that structure to find solutions efficiently.
Visualizations produced by Carsten Sinz, using his 3DVis visualization tool

The increasingly difficult benchmarks driving the development of these tools present new science opportunities. International competitions such as CASC, SAT-COMP, SMT-COMP, SV-COMP, and the Termination competition have accelerated this virtuous cycle. On the application side, with increasing power from the tools come new research opportunities in the design of customer-intuitive tools (such as models of cellular signaling pathways or Amazon's abstraction of control policies for cloud computing).

As an example of the virtuous cycle at work, consider the following graph, which shows the results for all of the winners of SAT-COMP from 2002 to 2021, compared apples-to-apples in a competition with the same hardware and same benchmarks:

Winners 2021.png

This graph plots the number of benchmarks that each solver can solve within a given time limit (200 seconds, 400 seconds, and so on). The higher the line, the more benchmarks the solver could solve. From the plot we can see, for example, that the 2010 winner (cryptominisat) solved approximately 50 benchmarks within the allotted 1,000 seconds, whereas the 2021 winner (kissat) solved nearly four times as many benchmarks in the same time, on the same hardware. Why did the tools get better? Because members of the scientific community working on applications submitted benchmarks to the competitions, which helped tool developers take the tools to new heights of performance and scale.

At Amazon we see the velocity of the virtuous cycle dramatically increasing. Our automated-reasoning tools are now called billions of times daily, with growth rates exceeding 100% year-over-year. For example, AWS customers now have access to automated-reasoning-based features such as IAM Access Analyzer, S3 Block Public Access, and VPC Reachability Analyzer. We also see Amazon development teams using tools such as Dafny, P, and SAW.


What’s most exciting to me as an automated-reasoning scientist is that our research area seems to be entering a golden era. I think we are beginning to witness a transformation in automated reasoning that is similar to what happened in virtualized computing as the cloud’s virtuous cycle spun up. As described in Werner Vogels’s 2019 re:Invent keynote, AWS’s EC2 team was driven by unprecedented customer adoption to reinvent its hypervisor, microprocessor, and networking stack, capturing significant improvements in security, cost, and team agility made possible by economies of scale.

There are parallels in automated reasoning today. Dramatic new infrastructure is needed for viable business reasons, putting a spotlight on research questions that were previously obscure and unsolved. Below I outline three examples of open research areas driven by the increasing scale of automated-reasoning tools and our underlying computing infrastructure.

Example: Distributed proof search

For over two decades the automated-reasoning scientific community has postulated that distributed-systems-based proof search could be faster than sequential proof search. But we didn’t have the economic scale to justify serious investigation of the question.

At Amazon, with our increased reliance on automated reasoning, we now have that kind of scale. For example, we sponsored the new cloud-based-tool tracks in several international competitions.

Compare the mallob-mono solver, the winner of SAT-COMP’s new cloud-solver track, to the single-microprocessor solvers:

2 Mallob-mono.png

Mallob-mono is now, by a wide margin, the most powerful SAT solver on the planet. And like the sequential solvers, the distributed solvers are improving.

As described in Kuhn’s seminal book The Structure of Scientific Revolutions, major perspective shifts like this tend to trigger scientific revolutions. The success of distributed proof search raises the possibility of similar revolutions. For example, we may need to re-evaluate our assumptions about when to use eager vs. lazy reduction techniques when converting between formalisms.


Here at Amazon, we recently reconsidered the PhD dissertation of University of California, Berkeley, professor Sanjit Seshia in light of mallob-mono and were able to quickly (in about 2,000 lines of Rust) develop a new eager-reduction-based solver that outperforms today’s leading lazy-reduction tools on the notoriously difficult SMT-COMP bcnscheduling and job_shop benchmarks. These benchmarks are satisfiability problems that go beyond Booleans to involve integers, real numbers, strings, or functions. This setting is known as satisfiability modulo theories, or SMT.
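To make the eager/lazy distinction concrete, here is a minimal sketch of an eager reduction. It is not the Seshia-style solver itself: it translates difference constraints of the form x - y <= c over a small bounded integer domain into CNF up front, using a one-hot encoding, and then hands the whole CNF to a SAT back end. The function names, the one-hot encoding, and the brute-force back end are illustrative assumptions; a real eager solver would use a far more compact encoding and a back end such as Kissat or mallob-mono.

```python
from itertools import product

def encode(variables, domain, constraints):
    """Eagerly translate difference constraints into CNF.

    variables: list of names; domain: number of values (0..domain-1);
    constraints: list of (x, y, c) meaning x - y <= c.
    Returns (clauses, index), where index[(name, value)] is a positive
    DIMACS-style Boolean variable meaning "name == value".
    """
    index = {}
    for name in variables:
        for value in range(domain):
            index[(name, value)] = len(index) + 1
    clauses = []
    for name in variables:
        lits = [index[(name, v)] for v in range(domain)]
        clauses.append(lits)  # each integer takes at least one value
        for i in range(domain):
            for j in range(i + 1, domain):
                clauses.append([-lits[i], -lits[j]])  # and at most one
    for x, y, c in constraints:
        for a, b in product(range(domain), repeat=2):
            if a - b > c:  # forbid every pair of values that violates x - y <= c
                clauses.append([-index[(x, a)], -index[(y, b)]])
    return clauses, index

def brute_force_sat(clauses, num_vars):
    """Stand-in SAT back end (assumption): try every assignment."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause)
               for clause in clauses):
            return bits
    return None

def solve(variables, domain, constraints):
    """Encode once, solve once, then decode the Boolean model to integers."""
    clauses, index = encode(variables, domain, constraints)
    model = brute_force_sat(clauses, len(index))
    if model is None:
        return None
    return {name: value for (name, value), var in index.items() if model[var - 1]}
```

For instance, `solve(["x", "y", "z"], 3, [("x", "y", -1), ("y", "z", -1)])` asks for x < y < z with values in 0..2. The point of the sketch is that all theory reasoning happens in `encode`, before the SAT solver runs, whereas a lazy solver would interleave SAT search with calls to a theory solver.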

In the graph below we compare the performance of leading lazy SMT solvers CVC5 and Z3 to a Seshia-style eager solver based on the SAT solvers Kissat and mallob-mono on those benchmarks:

Solver performance.png

We’ve published the code for our Seshia-style eager solver on GitHub.

There are many other open questions driven by distributed proof search. For example, is there an effective lookahead-solver strategy for SMT that would facilitate cube-and-conquer? Or, as the Zoncolan service does when analyzing programs for security vulnerabilities, can we memoize intermediate lemmas in a cloud database and reuse them, rather than recomputing them for each query? Can Monte Carlo tree search in the cloud on past proofs be used to synthesize more-effective proof search strategies?
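For readers unfamiliar with cube-and-conquer, the following is a minimal sketch of the idea for plain SAT, under simplifying assumptions. The formula is split into "cubes" by fixing a few variables in every possible way, and each cube is then solved independently, which is what makes the approach attractive for distributed search. Here the splitting variables are supplied by the caller and the cubes are solved one after another with a tiny DPLL procedure; a real implementation would choose the splits with a lookahead heuristic and dispatch the cubes to separate workers. All function names are hypothetical.

```python
from itertools import product

def simplify(clauses, literal):
    """Assign `literal` true: drop satisfied clauses and shrink the rest.
    Returns None if some clause becomes empty (a conflict)."""
    result = []
    for clause in clauses:
        if literal in clause:
            continue
        reduced = [l for l in clause if l != -literal]
        if not reduced:
            return None
        result.append(reduced)
    return result

def dpll(clauses, assignment):
    """Tiny sequential SAT solver; returns a list of true literals or None."""
    if clauses is None:
        return None
    if not clauses:
        return assignment
    # Prefer a unit clause if one exists; otherwise branch on the first literal.
    unit = next((c[0] for c in clauses if len(c) == 1), None)
    if unit is not None:
        return dpll(simplify(clauses, unit), assignment + [unit])
    literal = clauses[0][0]
    return (dpll(simplify(clauses, literal), assignment + [literal])
            or dpll(simplify(clauses, -literal), assignment + [-literal]))

def cube_and_conquer(clauses, split_vars):
    """Split on `split_vars`, then solve each resulting cube independently."""
    for signs in product([1, -1], repeat=len(split_vars)):
        cube = [s * v for s, v in zip(signs, split_vars)]
        sub = clauses
        for literal in cube:
            sub = simplify(sub, literal) if sub is not None else None
        model = dpll(sub, cube)  # each call could run on a separate worker
        if model is not None:
            return model
    return None
```

Clauses are lists of nonzero integers in the DIMACS convention (a negative number is a negated variable). The formula is unsatisfiable exactly when every cube is refuted, so the per-cube results can be combined without any communication between workers.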

Another example: Reasoning about distributed systems

Recent examples of formal reasoning within AWS at the level of distributed-protocol design include a proof of S3’s recently announced strong consistency and the protocol-level proof of secrecy in AWS's KMS service. The problem with these proofs is that they apply to the protocols that power the distributed services, not necessarily to the code running on the servers that use those protocols.


Here at Amazon, we believe that automated reasoning at the level of protocol design has the greatest long-term value when the investment cost is amortized and protected via continuous integration/continuous delivery (CI/CD) integrations with the code that implements the protocols. That is, the benefit of upfront effort is often seen later, when protocol compliance proofs fail on buggy changes to implementation source code. The code doesn’t make it to production until the developers have fixed it.

Again: major perspective shifts like those resulting from successful proofs about S3 and KMS could trigger a revolution, à la Kuhn. For years, we have had tools for reasoning about distributed systems, such as TLA+ and P. But with the success of the work with S3 and KMS, it’s now clear that protocol design should be a first-class concept for engineering, with tools that support it, proactively finding errors and proving properties.

These tools should also connect to the source code that speaks the protocols by (i) constructing specifications that can be proved with existing code-level tools and (ii) synthesizing implementation code in languages such as C, Go, Rust, or Java. They should also integrate with our CI/CD, code review, and ticketing systems and allow service teams to (iii) synthesize “runtime monitors” that exploit enterprise-level operations strength by providing telemetry about a service’s conformance to a proved protocol.
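As an illustration of what a runtime monitor could look like, here is a minimal sketch that checks a stream of observed events against a protocol written as a finite-state machine and records every deviation as telemetry. The class, the toy protocol, and the event names are all hypothetical; they do not correspond to any AWS protocol or tool, and a monitor synthesized from a proved specification would be derived from that specification rather than written by hand.

```python
class ProtocolMonitor:
    """Tracks a service's observed events against a protocol state machine."""

    def __init__(self, transitions, initial):
        self.transitions = transitions  # {(state, event): next_state}
        self.state = initial
        self.violations = []  # telemetry: (state, event) pairs that were not allowed

    def observe(self, event):
        """Advance on a permitted event; otherwise record a violation."""
        key = (self.state, event)
        if key in self.transitions:
            self.state = self.transitions[key]
            return True
        self.violations.append(key)
        return False


# Hypothetical toy protocol: a write must be acknowledged before the next read.
TOY_PROTOCOL = {
    ("idle", "write"): "pending",
    ("pending", "ack"): "idle",
    ("idle", "read"): "idle",
}

monitor = ProtocolMonitor(TOY_PROTOCOL, "idle")
for event in ["write", "ack", "read", "ack"]:
    monitor.observe(event)
```

In this run the final `ack` arrives while the monitor is in the `idle` state, so it is recorded in `monitor.violations` instead of advancing the state.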

Final example: Automating regulatory compliance

At the recent Computer-Aided Verification (CAV ’21) workshop called Formal Approaches to Certifying Compliance (talks recorded and available), we heard from NIST, Coalfire, Collins Aerospace, DARPA, and Amazon about the use of automated reasoning to lower the cost and the time-to-market added by regulatory compliance.

Karthik Amrutesh of the AWS security assurance team reported that automated reasoning enabled a 91% reduction in the time it took for our third-party auditor to produce evidence for checking controls. For perhaps the first time in the more than 2,500-year history of mathematical logic, we see a business use case that exploits the difference between finding proofs and checking proofs. What's the difference? Finding is usually the hard part, the creative part, the part that requires sophisticated algorithms. Finding is usually undecidable or NP-complete, depending on the context.

Meanwhile, not only is checking proofs decidable in most cases, but it’s often linear in the size of the proof. To check proofs, compliance auditors can use well-understood and trusted small solvers such as HOL Light.
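The gap between finding and checking can be seen in a small sketch of a proof checker for propositional unsatisfiability. It uses reverse unit propagation (RUP), the core check behind clausal proof formats: each lemma in the proof is accepted only if assuming its negation leads to a conflict by unit propagation alone, and the proof is complete once the empty clause has been derived. The function names are hypothetical, and this naive propagation loop is polynomial rather than linear; production checkers use more careful data structures.

```python
def propagates_to_conflict(clauses, assumed):
    """Unit-propagate from the literals in `assumed`.
    Returns True if some clause ends up with every literal false."""
    true_lits = set(assumed)
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(l in true_lits for l in clause):
                continue  # clause already satisfied
            open_lits = [l for l in clause if -l not in true_lits]
            if not open_lits:
                return True  # every literal is false: conflict
            if len(open_lits) == 1:
                true_lits.add(open_lits[0])  # forced (unit) literal
                changed = True
    return False

def check_rup_proof(formula, lemmas):
    """Accept the proof if each lemma follows by unit propagation and the
    last lemma is the empty clause (so the formula is unsatisfiable)."""
    known = [list(c) for c in formula]
    for lemma in lemmas:
        if not propagates_to_conflict(known, [-l for l in lemma]):
            return False
        known.append(list(lemma))
    return bool(lemmas) and lemmas[-1] == []
```

For the formula `[[1, 2], [1, -2], [-1, 2], [-1, -2]]`, the two-step proof `[[1], []]` is accepted. The checker performs no search at all: the creative work of choosing which lemmas to derive was done by whoever produced the proof.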

Using cloud-scale automation to find the proofs lowers cost. That lets the auditor offer its services for less, saving the customer money. It also reduces the latency of audits, a major pain point for developers looking to go to market quickly.

An audit check involves constraints on the form that valid text strings can take. The set of constraints is known as a string theory, and the imposition of that theory means that audit checks are SMT problems.

From the perspective of automated-reasoning science, it becomes important to build string theory solvers that can efficiently construct easily checkable proof artifacts. In the realm of propositional satisfiability (SAT), the DRAT proof format, together with checkers such as DRAT-trim, is now the standard way to communicate proofs. But in SMT, no such standard exists. What would a general-purpose, theory-agnostic SMT proof format and checker look like?

Conclusion

We've come a long way from the days when automated reasoning was the exclusive domain of circuit designers or aerospace engineers. Success in these early domains kicked off a virtuous cycle for the makers of the theorem provers that power automated reasoning. With mainstream applications such as cloud computing, the automated-reasoning virtuous cycle is now radically accelerating. After 2,500 years of mathematical-logic research and 70+ years of automated-reasoning science, we live in a heady time. With wider adoption of and investment in automated reasoning, we are seeing economies of scale where what we can do now would have been unimaginable even two or three years ago. Welcome to the future!

Research areas

Related content

US, VA, Arlington
We are seeking an exceptional Data Scientist to join our team in PXT Central Science. The ideal candidate will thrive in a dynamic, multifaceted role where you'll translate complex business challenges into rigorous quantitative frameworks, extract actionable insights from structured and unstructured datasets, and architect science-backed, scalable solutions that elevate the experience of our 1 million+ employees worldwide. If you're energized by the opportunity to apply data science to our mission of making Amazon Earth's Best Employer, we want to hear from you. Key job responsibilities • Own the design, development, and maintenance of scalable models and prototypes leveraging statistical, machine learning, or GenAI methodologies to enhance employee experience. • Partner with scientists, engineers, and product leaders to solve for employee experience defects using scientific approaches, building new services and tools that deliverable measurable impact. • Author and maintain detailed technical documentation related to the projects you drive. • Communicate results to diverse audiences of varying technical background with effective writing, visualizations, and presentations • Stay current with emerging methods and technologies, and implement them strategically to amplify the team’s impact. About the team The Central Science Team within Amazon’s People Experience and Technology org (PXTCS) uses economics, behavioral science, statistics, machine learning, and Generative AI to proactively identify mechanisms and process improvements which simultaneously improve Amazon and the lives, well-being, and the value of work to Amazonians. We are an interdisciplinary team, which combines the talents of science, engineering, and UX to develop and deliver solutions that measurably achieve this goal.
US, WA, Bellevue
The Amazon Fulfillment Technologies (AFT) Science team is looking for an exceptional Applied Scientist, with strong optimization and analytical skills, to develop production solutions for one of the most complex systems in the world: Amazon’s Fulfillment Network. At AFT Science, we design, build and deploy optimization, simulation, and machine learning solutions to power the production systems running at world wide Amazon Fulfillment Centers. We solve a wide range of problems that are encountered in the network, including labor planning and staffing, demand prioritization, pick assignment and scheduling, and flow process optimization. We are tasked to develop innovative, scalable, and reliable science-driven solutions that are beyond the published state of art in order to run frequently (ranging from every few minutes to every few hours per use case) and continuously in our large scale network. Key job responsibilities As an Applied Scientist, you will work with other scientists, software engineers, product managers, and operations leaders to develop scientific solutions and analytics using a variety of tools and observe direct impact to process efficiency and associate experience in the fulfillment network. 
Key responsibilities include: * Develop an understanding and domain knowledge of operational processes, system architecture and functions, and business requirements * Deep dive into data and code to identify opportunities for continuous improvement and/or disruptive new approach * Develop scalable mathematical models for production systems to derive optimal or near-optimal solutions for existing and new challenges * Create prototypes and simulations for agile experimentation of devised solutions * Advocate technical solutions to business stakeholders, engineering teams, and senior leadership * Partner with engineers to integrate prototypes into production systems * Design experiment to test new or incremental solutions launched in production and build metrics to track performance About the team Amazon Fulfillment Technology (AFT) designs, develops and operates the end-to-end fulfillment technology solutions for all Amazon Fulfillment Centers (FC). We harmonize the physical and virtual world so Amazon customers can get what they want, when they want it. The AFT Science team has expertise in operations research, optimization, scheduling, planning, simulation, and machine learning. We also have domain expertise in the operational processes within the FCs and their defects. We prioritize advancements that support AFT tech teams and focus areas rather than specific fields of research or individual business partners. We influence each stage of innovation from inception to deployment which includes both developing novel solutions or improving existing approaches. Resulting production systems rely on a diverse set of technologies, our teams therefore invest in multiple specialties as the needs of each focus area evolves.
US, WA, Seattle
We are seeking an exceptional Data Scientist to join our team in PXT Central Science. The ideal candidate will thrive in a dynamic, multifaceted role where you'll translate complex business challenges into rigorous quantitative frameworks, extract actionable insights from structured and unstructured datasets, and architect science-backed, scalable solutions that elevate the experience of our 1 million+ employees worldwide. If you're energized by the opportunity to apply data science to our mission of making Amazon Earth's Best Employer, we want to hear from you. Key job responsibilities • Own the design, development, and maintenance of scalable models and prototypes leveraging statistical, machine learning, or GenAI methodologies to enhance employee experience. • Partner with scientists, engineers, and product leaders to solve for employee experience defects using scientific approaches, building new services and tools that deliverable measurable impact. • Author and maintain detailed technical documentation related to the projects you drive. • Communicate results to diverse audiences of varying technical background with effective writing, visualizations, and presentations • Stay current with emerging methods and technologies, and implement them strategically to amplify the team’s impact. About the team The Central Science Team within Amazon’s People Experience and Technology org (PXTCS) uses economics, behavioral science, statistics, machine learning, and Generative AI to proactively identify mechanisms and process improvements which simultaneously improve Amazon and the lives, well-being, and the value of work to Amazonians. We are an interdisciplinary team, which combines the talents of science, engineering, and UX to develop and deliver solutions that measurably achieve this goal.
US, WA, Bellevue
Alexa International is looking for a passionate, talented, and inventive Applied Scientist to help build industry-leading technology with Large Language Models (LLMs) and multimodal systems, requiring strong deep learning and generative models knowledge. You will contribute to developing novel solutions and deliver high-quality results that impact Alexa's international products and services. Key job responsibilities As an Applied Scientist with the Alexa International team, you will work with talented peers to develop novel algorithms and modeling techniques to advance the state of the art with LLMs. Your work will directly impact our international customers in the form of products and services that make use of digital assistant technology. You will leverage Amazon's heterogeneous data sources, unique and diverse international customer nuances and large-scale computing resources to accelerate advances in text, voice, and vision domains in a multimodal setup. The ideal candidate possesses a solid understanding of machine learning, natural language understanding, modern LLM architectures, LLM evaluation & tooling, and a passion for pushing boundaries in this vast and quickly evolving field. They thrive in fast-paced environments to tackle complex challenges, excel at swiftly delivering impactful solutions while iterating based on user feedback, and collaborate effectively with cross-functional teams. A day in the life * Analyze, understand, and model customer behavior and the customer experience based on large-scale data. * Build novel online & offline evaluation metrics and methodologies for multimodal personal digital assistants. * Fine-tune/post-train LLMs using techniques like SFT, DPO, RLHF, and RLAIF. * Set up experimentation frameworks for agile model analysis and A/B testing. * Collaborate with partner teams on LLM evaluation frameworks and post-training methodologies. 
* Contribute to end-to-end delivery of solutions from research to production, including reusable science components. * Communicate solutions clearly to partners and stakeholders. * Contribute to the scientific community through publications and community engagement.
US, WA, Bellevue
Amazon’s Last Mile Team is looking for a passionate individual with strong optimization and analytical skills to join its Last Mile Science team in the endeavor of designing and improving the most complex planning of delivery network in the world. Last Mile builds global solutions that enable Amazon to attract an elastic supply of drivers, companies, and assets needed to deliver Amazon's and other shippers' volumes at the lowest cost and with the best customer delivery experience. Last Mile Science team owns the core decision models in the space of jurisdiction planning, delivery channel and modes network design, capacity planning for on the road and at delivery stations, routing inputs estimation and optimization. Our research has direct impact on customer experience, driver and station associate experience, Delivery Service Partner (DSP)’s success and the sustainable growth of Amazon. Optimizing the last mile delivery requires deep understanding of transportation, supply chain management, pricing strategies and forecasting. Only through innovative and strategic thinking, we will make the right capital investments in technology, assets and infrastructures that allows for long-term success. Our team members have an opportunity to be on the forefront of supply chain thought leadership by working on some of the most difficult problems in the industry with some of the best product managers, scientists, and software engineers in the industry. Key job responsibilities Candidates will be responsible for developing solutions to better manage and optimize delivery capacity in the last mile network. The successful candidate should have solid research experience in one or more technical areas of Operations Research or Machine Learning. These positions will focus on identifying and analyzing opportunities to improve existing algorithms and also on optimizing the system policies across the management of external delivery service providers and internal planning strategies. 
They require superior logical thinkers who are able to quickly approach large ambiguous problems, turn high-level business requirements into mathematical models, identify the right solution approach, and contribute to the software development for production systems. To support their proposals, candidates should be able to independently mine and analyze data, and be able to use any necessary programming and statistical analysis software to do so. Successful candidates must thrive in fast-paced environments, which encourage collaborative and creative problem solving, be able to measure and estimate risks, constructively critique peer research, and align research focuses with the Amazon's strategic needs.
US, WA, Bellevue
Alexa International is looking for a passionate, talented, and inventive Applied Scientist to help build industry-leading technology with Large Language Models (LLMs) and multimodal systems, requiring strong deep learning and generative models knowledge. You will contribute to developing novel solutions and deliver high-quality results that impact Alexa's international products and services. Key job responsibilities As an Applied Scientist with the Alexa International team, you will work with talented peers to develop novel algorithms and modeling techniques to advance the state of the art with LLMs. Your work will directly impact our international customers in the form of products and services that make use of digital assistant technology. You will leverage Amazon's heterogeneous data sources, unique and diverse international customer nuances and large-scale computing resources to accelerate advances in text, voice, and vision domains in a multimodal setup. The ideal candidate possesses a solid understanding of machine learning, natural language understanding, modern LLM architectures, LLM evaluation & tooling, and a passion for pushing boundaries in this vast and quickly evolving field. They thrive in fast-paced environments to tackle complex challenges, excel at swiftly delivering impactful solutions while iterating based on user feedback, and collaborate effectively with cross-functional teams. A day in the life * Analyze, understand, and model customer behavior and the customer experience based on large-scale data. * Build novel online & offline evaluation metrics and methodologies for multimodal personal digital assistants. * Fine-tune/post-train LLMs using techniques like SFT, DPO, RLHF, and RLAIF. * Set up experimentation frameworks for agile model analysis and A/B testing. * Collaborate with partner teams on LLM evaluation frameworks and post-training methodologies. 
* Contribute to end-to-end delivery of solutions from research to production, including reusable science components. * Communicate solutions clearly to partners and stakeholders. * Contribute to the scientific community through publications and community engagement.
US, CA, Pasadena
The Amazon Web Services (AWS) Center for Quantum Computing (CQC) is a multi-disciplinary team of theoretical and experimental physicists, materials scientists, and hardware and software engineers on a mission to develop a fault-tolerant quantum computer. Throughout your internship journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated. Join us at the forefront of applied science, where your contributions will shape the future of Quantum Computing and propel humanity forward. Seize this extraordinary opportunity to learn, grow, and leave an indelible mark on the world of technology. Amazon has positions available for Quantum Research Science and Applied Science Internships in Santa Clara, CA and Pasadena, CA. We are particularly interested in candidates with expertise in any of the following areas: superconducting qubits, cavity/circuit QED, quantum optics, open quantum systems, superconductivity, electromagnetic simulations of superconducting circuits, microwave engineering, benchmarking, quantum error correction, fabrication, etc. Key job responsibilities In this role, you will work alongside global experts to develop and implement novel, scalable solutions that advance the state-of-the-art in the areas of quantum computing. You will tackle challenging, groundbreaking research problems, work with leading edge technology, focus on highly targeted customer use-cases, and launch products that solve problems for Amazon customers. The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. 
A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail and the ability to thrive in a fast-paced, ever-changing environment. About the team Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Hybrid Work We value innovation and recognize this sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. 
Our team affords employees options to work in the office every day or in a flexible, hybrid work model near one of our U.S. Amazon offices.
US, WA, Bellevue
Amazon is seeking a Language Data Scientist to join the Alexa International science team as domain expert. This role focuses on expanding analysis and evaluation of conversational interaction data deliverables. The Language Data Scientist is an expert in conversation assessment processes, working closely with a team of skilled machine learning scientists and engineers, and is a key member in developing new conventions for relevant annotation workflows. The Language Data Scientist will be own unique data analysis and research requests that support the training and evaluation of LLMs and machine learning models, and the overall processing of a data collection. Key job responsibilities To be successful in this role, you must have a passion for data, efficiency, and accuracy. Specifically, you will: - Own data analyses for customer-facing features, including launch go/no-go metrics for new features and accuracy metrics for existing features - Handle unique data analysis requests from a range of stakeholders, including quantitative and qualitative analyses to elevate customer experience with speech interfaces - Lead and evaluate changing dialog evaluation conventions, test tooling developments, and pilot processes to support expansion to new data areas - Continuously evaluate workflow tools and processes and offer solutions to ensure they are efficient, high quality, and scalable - Provide expert support for a large and growing team of data analysts - Provide support for ongoing and new data collection efforts as a subject matter expert on conventions and use of the data - Conduct research studies to understand speech and customer-Alexa interactions - Collaborate with scientists and product managers, and other stakeholders in defining and validating customer experience metrics