A gentle introduction to automated reasoning

Meet Amazon Science’s newest research area.

This week, Amazon Science added automated reasoning to its list of research areas. We made this change because of the impact that automated reasoning is having here at Amazon. For example, Amazon Web Services’ customers now have direct access to automated-reasoning-based features such as IAM Access Analyzer, S3 Block Public Access, or VPC Reachability Analyzer. We also see Amazon development teams integrating automated-reasoning tools into their development processes, raising the bar on the security, durability, availability, and quality of our products.

The goal of this article is to provide a gentle introduction to automated reasoning for the industry professional who knows nothing about the area but is curious to learn more. All you will need to make sense of this article is to be able to read a few small C and Python code fragments. I will refer to a few specialist concepts along the way, but only with the goal of introducing them in an informal manner. I close with links to some of our favorite publicly available tools, videos, books, and articles for those looking to go more in-depth.

Let’s start with a simple example. Consider the following C function:

bool f(unsigned int x, unsigned int y) {
   return (x+y == y+x);
}

Take a few moments to answer the question “Could f ever return false?” This is not a trick question: I’ve purposefully used a simple example to make a point.

To check the answer with exhaustive testing, we could try executing the following doubly nested test loop, which calls f on all possible pairs of values of the type unsigned int:

#include <stdio.h>
#include <stdbool.h>
#include <limits.h>

bool f(unsigned int x, unsigned int y) {
   return (x+y == y+x);
}

int main(void) {
   for (unsigned int x=0;1;x++) {
      for (unsigned int y=0;1;y++) {
         if (!f(x,y)) printf("Error!\n");
         if (y==UINT_MAX) break;   /* stop before y wraps back to 0 */
      }
      if (x==UINT_MAX) break;      /* stop before x wraps back to 0 */
   }
   return 0;
}

Unfortunately, even on modern hardware, this doubly nested loop will run for a very long time. I compiled it and ran it on a 2.6 GHz Intel processor for over 48 hours before giving up.

Why does testing take so long? Because UINT_MAX is typically 4,294,967,295, each of x and y ranges over 4,294,967,296 values (including zero), which means there are 18,446,744,073,709,551,616 separate f calls to consider. On my 2.6 GHz machine, the compiled test loop called f approximately 430 million times a second. But to test all 18 quintillion cases at this rate, we would need over 1,360 years.
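If you want to check that estimate yourself, the arithmetic is easy to verify; here is a quick back-of-the-envelope calculation in Python, using the call rate measured above:

pairs = (2**32) ** 2                   # every (x, y) pair of 32-bit unsigned ints
calls_per_second = 430_000_000         # approximate rate observed on the 2.6 GHz machine
years = pairs / calls_per_second / (60 * 60 * 24 * 365)
print(f"{pairs:,} calls, roughly {years:,.0f} years")   # about 1,360 years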

When we show the above code to industry professionals, they almost immediately work out that f can't return false as long as the underlying compiler/interpreter and hardware are correct. How do they do that? They reason about it. They remember from their school days that x + y can be rewritten as y + x and conclude that f always returns true.

re:Invent 2021 keynote address by Peter DeSantis, senior vice president for utility computing at Amazon Web Services
Skip to 15:49 for a discussion of Amazon Web Services' work on automated reasoning.

An automated reasoning tool does this work for us: it attempts to answer questions about a program (or a logic formula) by using known techniques from mathematics. In this case, the tool would use algebra to deduce that x + y == y + x can be replaced with the simple expression true.
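To make that concrete, here is a minimal sketch of how we might pose the same question to an off-the-shelf solver ourselves. It uses the Z3 SMT solver's Python bindings (my choice of tool purely for illustration) and models C's unsigned int as a 32-bit bit-vector, whose arithmetic wraps exactly as the C code's does:

from z3 import BitVec, Solver, unsat

x = BitVec('x', 32)        # model C's unsigned int as a 32-bit bit-vector
y = BitVec('y', 32)

s = Solver()
s.add(x + y != y + x)      # ask the solver for a counterexample to f's property

print(s.check() == unsat)  # True: no counterexample exists, so f can never return false

On a laptop, this check completes in milliseconds, rather than the 1,360 years our exhaustive test loop would need.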

Automated-reasoning tools can be incredibly fast, even when the domains are infinite (e.g., unbounded mathematical integers rather than finite C ints). Unfortunately, the tools may answer “Don’t know” in some instances. We'll see a famous example of that below.

The science of automated reasoning is essentially focused on driving the frequency of these “Don’t know” answers down as far as possible: the less often the tools report "Don't know" (or time out while trying), the more useful they are.

Today’s tools are able to give answers for programs and queries where yesterday’s tools could not, and tomorrow’s tools will be even more powerful. We are seeing rapid progress in this field, which is why we at Amazon are getting increasing value from it. In fact, we see automated reasoning forming its own Amazon-style virtuous cycle: more input problems for our tools drive improvements to the tools, which encourages more use of the tools.

A slightly more complex example

Now that we know the rough outlines of what automated reasoning is, the next small example gives a slightly more realistic taste of the sort of complexity that the tools are managing for us.

void g(int x, int y) {
   if (y > 0)
      while (x > y)
         x = x - y;
}

Alternatively, consider a similar Python program over unbounded integers:

def g(x, y):
   assert isinstance(x, int) and isinstance(y, int)
   if y > 0:
      while x > y:
         x = x - y

Try to answer this question: “Does g always eventually return control back to its caller?”

When we show this program to industry professionals, they usually figure out the right answer quickly. A few, especially those who are aware of results in theoretical computer science, sometimes mistakenly think that we can't answer this question, with the rationale “This is an example of the halting problem, which has been proved insoluble”. In fact, we can reason about the halting behavior for specific programs, including this one. We’ll talk more about that later.

Here’s the reasoning that most industry professionals use when looking at this problem:

  1. In the case where y is not positive, execution jumps to the end of the function g. That’s the easy case.
  2. If the value of the variable x decreases in every iteration of the loop, then, because y never changes, the loop condition x > y must eventually fail, and the end of g will be reached: an integer cannot keep decreasing forever while staying above y.
  3. The value of x does decrease in every iteration, because the update x = x - y makes x smaller exactly when y is positive, and y’s positivity is established by the enclosing conditional.

The experienced programmer will usually worry about underflow in the x = x - y command of the C program but will then notice that x > y > 0 holds just before the update, so x - y stays positive and cannot underflow.
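Here is one way the “always decreases” argument (item 3, together with the bound from item 2) can be turned into a mechanical check. This is a sketch of my own, using the Z3 solver over unbounded integers to match the Python version of g: we ask for a single loop iteration that violates the argument, and an unsatisfiable query means no such iteration exists.

from z3 import Ints, Solver, Or, unsat

x, y, x1 = Ints('x y x1')   # x1 stands for the value of x after one loop iteration

s = Solver()
s.add(y > 0, x > y)         # the loop body runs only when both guards hold
s.add(x1 == x - y)          # the effect of the loop body: x = x - y
s.add(Or(x1 >= x, x < 0))   # a violation: x fails to strictly decrease, or x is not bounded below

print(s.check() == unsat)   # True: every iteration decreases x, which stays nonnegative

Because x is a nonnegative integer that strictly decreases on every iteration, the loop can execute only finitely many times; termination tools call such a measure a ranking function.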

If you carried out the three steps above yourself, you now have a very intuitive view of the type of thinking an automated-reasoning tool is performing on our behalf when reasoning about a computer program. There are many nitty-gritty details that the tools have to face (e.g., heaps, stacks, strings, pointer arithmetic, recursion, concurrency, callbacks, etc.), but there’s also decades of research papers on techniques for handling these and other topics, along with various practical tools that put these ideas to work.

Automated reasoning can be applied to both policies (top) and code (bottom). In both cases, an essential step is reasoning about what's always true.

The main takeaway is that automated-reasoning tools are usually working through the three steps above on our behalf: Item 1 is reasoning about the program’s control structure. Item 2 is reasoning about what is eventually true within the program. Item 3 is reasoning about what is always true in the program.

Note that configuration artifacts such as AWS resource policies, VPC network descriptions, or even makefiles can be thought of as code. This viewpoint allows us to use the same techniques we use to reason about C or Python code to answer questions about the interpretation of configurations. It’s this insight that gives us tools like IAM Access Analyzer or VPC Reachability Analyzer.
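As a toy illustration of that viewpoint (my own example, not a description of how IAM Access Analyzer or VPC Reachability Analyzer are actually built), we can phrase a miniature reachability question, “is there any TCP port that both an edge rule set and an internal rule set would admit?”, as a constraint problem and hand it to a solver:

from z3 import Int, Solver, And, Or, sat

port = Int('port')

# Hypothetical rules, invented for this example: the edge layer admits only
# ports 80 and 443, while the internal layer admits only ports 1024-65535.
edge_allows     = Or(port == 80, port == 443)
internal_allows = And(port >= 1024, port <= 65535)

s = Solver()
s.add(port >= 0, port <= 65535, edge_allows, internal_allows)

print(s.check() == sat)   # False: no port passes both layers, so nothing is reachable end to end

A satisfiable query would come back with a concrete witness port, the kind of evidence that makes answers about what is always (or never) true of a configuration actionable.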

An end to testing?

As we saw above when looking at f and g, automated reasoning can be dramatically faster than exhaustive testing. With tools available today, we can show properties of f or g in milliseconds, rather than waiting lifetimes with exhaustive testing.

Can we throw away our testing tools now and just move to automated reasoning? Not quite. Yes, we can dramatically reduce our dependency on testing, but we will not be completely eliminating it any time soon, if ever. Consider our first example:

bool f(unsigned int x, unsigned int y) {
   return (x + y == y + x);
}

Recall the worry that a buggy compiler or microprocessor could in fact cause an executable program constructed from this source code to return false. We might also need to worry about the language runtime. For example, the C math library or the Python garbage collector might have bugs that cause a program to misbehave.

What’s interesting about testing, and something we often forget, is that it’s doing much more than just telling us about the C or Python source code. It’s also testing the compiler, the runtime, the interpreter, the microprocessor, etc. A test failure could be rooted in any of those tools in the stack.

Automated reasoning, in contrast, is usually applied to just one layer of that stack: the source code itself, or sometimes the compiler or the microprocessor. What we find so valuable about reasoning is that it allows us to clearly define both what we do know and what we do not know about the layer under inspection.

Furthermore, the models of the surrounding environment (e.g., the compiler or the procedure calling our procedure) that the automated-reasoning tool uses make our assumptions precise. Separating the layers of the computational stack in this way helps us make better use of our time, energy, and money, and of the capabilities of the tools we have today and will have tomorrow.

Unfortunately, we will almost always need to make assumptions about something when using automated reasoning — for example, the principles of physics that govern our silicon chips. Thus, testing will never be fully replaced. We will want to perform end-to-end testing to try and validate our assumptions as best we can.

An impossible program

I previously mentioned that automated-reasoning tools sometimes return “Don’t know” rather than “yes” or “no”. They also sometimes run forever (or time out), thus never returning an answer. Let’s look at the famous "halting problem" program, for which we know tools cannot return “yes” or “no”.

Imagine that we have an automated-reasoning API, called terminates, that returns “yes” if a C function always terminates or “no” when the function could execute forever. As an example, we could build such an API using the tool described here (shameless self-promotion of author’s previous work). To get the idea of what a termination tool can do for us, consider two basic C functions, g (from above),

void g(int x, int y) {
   if (y > 0)
      while (x > y)
         x = x - y;
}

and g2:

void g2(int x, int y) {
   while (x > y)
      x = x - y;
}

For the reasons we have already discussed, the function g always returns control back to its caller, so terminates(g) should return “yes”. Meanwhile, terminates(g2) should return “no” because, for example, g2(5, 0) will never terminate: subtracting y = 0 leaves x stuck at 5, so the loop condition x > y never fails.

Now comes the difficult function. Consider h:

void h() {
   if (terminates(h)) while(1){}
}

Notice that h is self-referential: it passes itself to terminates. What’s the right answer for terminates(h)? The answer cannot be "yes". It also cannot be "no". Why?

Imagine that terminates(h) were to return "yes". If you read the code of h, you’ll see that in this case the function does not terminate: the condition of the if statement holds, so h enters the infinite loop while(1){}. That contradicts the "yes" answer, so the answer would be wrong.

Similarly, if terminates(h) were to return "no", then h would in fact terminate and return control to its caller, because the if case of the conditional statement is not met, and there is no else branch. Again, the answer would be wrong. This is why the “Don’t know” answer is actually unavoidable in this case.

The program h is a variation of examples given in Turing’s famous 1936 paper on decidability and in Gödel’s incompleteness theorems from 1931. These papers tell us that problems like the halting problem cannot be “solved”, if by “solved” we mean that the solution procedure itself always terminates and answers either “yes” or “no” but never “Don’t know”. But that is not the definition of “solved” that many of us have in mind. For many of us, a tool that sometimes times out or occasionally returns “Don’t know” but, when it does give an answer, always gives the right answer is good enough.

This problem is analogous to airline travel: we know it’s not 100% safe, because crashes have happened in the past, and we are sure that they will happen in the future. But when you land safely, you know it worked that time. The goal of the airline industry is to reduce failure as much as possible, even though it’s in principle unavoidable.

To put that in the context of automated reasoning: for some programs, like h, we can never improve the tool enough to replace the "Don't know" answer. But there are many other cases where today's tools answer "Don't know", but future tools may be able to answer "yes" or "no". The modern scientific challenge for automated-reasoning subject-matter experts is to get the practical tools to return “yes” or “no” as often as possible. As an example of current work, check out CMU professor and Amazon Scholar Marijn Heule and his quest to solve the Collatz termination problem.
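To get a feel for why that problem is so hard, here is the loop in question, rendered in Python (my transcription, in the style of g above; whether it always terminates is exactly the open Collatz question):

def collatz(n):
   assert isinstance(n, int) and n >= 1
   while n != 1:
      if n % 2 == 0:
         n = n // 2      # halve even numbers
      else:
         n = 3 * n + 1   # triple and add one for odd numbers

# Whether collatz(n) returns for every positive integer n is the open
# Collatz (3n + 1) problem; today's termination tools answer "Don't know".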

Another thing to keep in mind is that automated-reasoning tools are regularly tackling “intractable” problems, e.g., NP-complete problems such as Boolean satisfiability. Here, the same thinking applies that we saw in the case of the halting problem: automated-reasoning tools have powerful heuristics that often work around the intractability for specific cases, but those heuristics can (and sometimes do) fail, resulting in “Don’t know” answers or impractically long execution times. The science is to improve the heuristics to minimize that problem.

Nomenclature

A host of names are used in the scientific literature to describe interrelated topics, of which automated reasoning is just one. Here’s a quick glossary:

  • A logic is a formal and mechanical system for defining what is true and untrue. Examples: propositional logic and first-order logic.
  • A theorem is a true statement in a logic. Example: the four-color theorem.
  • A proof is a valid argument in a logic establishing a theorem. Example: Gonthier's proof of the four-color theorem.
  • A mechanical theorem prover is a semi-automated-reasoning tool that checks a machine-readable expression of a proof, often written down by a human. These tools often require human guidance. Example: HOL Light, from Amazon researcher John Harrison. (A small example of a machine-checked proof follows this list.)
  • Formal verification is the use of theorem proving applied to models of computer systems to prove desired properties of those systems. Example: the CompCert verified C compiler.
  • Formal methods is the broadest term, meaning simply the use of logic to reason formally about models of systems.
  • Automated reasoning focuses on the automation of formal methods.
  • A semi-automated-reasoning tool is one that requires hints from the user but still finds valid proofs in a logic.
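To make the mechanical-theorem-prover entry concrete, here is what a machine-checked proof can look like. This is a minimal sketch in Lean 4 (my choice of prover purely for illustration; the glossary's example, HOL Light, plays the same role), stating and proving the commutativity fact we reasoned about informally for f:

-- A machine-checkable statement and proof: addition on natural numbers commutes.
-- The proof appeals to the library lemma Nat.add_comm.
example (x y : Nat) : x + y = y + x := Nat.add_comm x y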

As you can see, we have a choice of monikers when working in this space. At Amazon, we’ve chosen to use automated reasoning, as we think it best captures our ambition for automation and scale. In practice, some of our internal teams use both automated and semi-automated reasoning tools, because the scientists we've hired can often get semi-automated reasoning tools to succeed where the heuristics in fully automated reasoning might fail. For our externally facing customer features, we currently use only fully automated approaches.

Next steps

In this essay, I’ve introduced the idea of automated reasoning with the smallest of toy programs. I haven’t described how to handle realistic programs, with heaps or concurrency. In fact, there is a wide variety of automated-reasoning tools and techniques, solving problems in all kinds of different domains, some of them quite narrow. To describe them all, and the many branches and sub-disciplines of the field (e.g., “SMT solving”, “higher-order-logic theorem proving”, “separation logic”), would take thousands of blog posts and books.

Automated reasoning goes back to the early inventors of computers. And logic itself (the subject automated reasoning attempts to mechanize) is thousands of years old. To keep this post brief, I’ll stop here and suggest further reading. Note that it’s very easy to get lost in the weeds reading depth-first into this area, and you could emerge more confused than when you started. I encourage you to use a bounded depth-first search approach: look at a wide variety of tools and techniques sequentially and in only some detail, then move on, rather than learning only one aspect deeply.

Suggested books:

International conferences/workshops:

Tool competitions:

Some tools:

Interviews of Amazon staff about their use of automated reasoning:

AWS Lectures aimed at customers and industry:

AWS talks aimed at the automated-reasoning science community:

AWS blog posts and informational videos:

Some course notes by Amazon Scholars who are also university professors:

A fun deep track:

Some algorithms found in the automated theorem provers we use today date as far back as 1959, when Hao Wang used automated reasoning to prove the theorems from Principia Mathematica.
