Michael I. Jordan, Amazon scholar and professor at the University of California, Berkeley
Michael I. Jordan, Amazon scholar and professor at the University of California, Berkeley
Credit: Flavia Loreto

Artificial Intelligence—The revolution hasn’t happened yet

Michael I. Jordan, Amazon scholar and professor at the University of California, Berkeley, writes about the classical goals in human-imitative AI, and reflects on how in the current hubbub over the AI revolution it is easy to forget that these goals haven’t yet been achieved. This article is reprinted with permission from the Harvard Data Science Review, where it first appeared.

Artificial Intelligence (AI) is the mantra of the current era. The phrase is intoned by technologists, academicians, journalists, and venture capitalists alike. As with many phrases that cross over from technical academic fields into general circulation, there is significant misunderstanding accompanying use of the phrase. However, this is not the classical case of the public not understanding the scientists—here the scientists are often as befuddled as the public. The idea that our era is somehow seeing the emergence of an intelligence in silicon that rivals our own entertains all of us, enthralling us and frightening us in equal measure. And, unfortunately, it distracts us.

There is a different narrative that one can tell about the current era. Consider the following story, which involves humans, computers, data, and life-or-death decisions, but where the focus is something other than intelligence-in-silicon fantasies. When my spouse was pregnant 14 years ago, we had an ultrasound. There was a geneticist in the room, and she pointed out some white spots around the heart of the fetus. “Those are markers for Down syndrome,” she noted, “and your risk has now gone up to one in 20.” She let us know that we could learn whether the fetus in fact had the genetic modification underlying Down syndrome via an amniocentesis, but amniocentesis was risky—the chance of killing the fetus during the procedure was roughly one in 300. Being a statistician, I was determined to find out where these numbers were coming from. In my research, I discovered that a statistical analysis had been done a decade previously in the UK in which these white spots, which reflect calcium buildup, were indeed established as a predictor of Down syndrome. I also noticed that the imaging machine used in our test had a few hundred more pixels per square inch than the machine used in the UK study. I returned to tell the geneticist that I believed that the white spots were likely false positives, literal white noise.

She said, “Ah, that explains why we started seeing an uptick in Down syndrome diagnoses a few years ago. That’s when the new machine arrived.”

We didn’t do the amniocentesis, and my wife delivered a healthy girl a few months later, but the episode troubled me, particularly after a back-of-the-envelope calculation convinced me that many thousands of people had gotten that diagnosis that same day worldwide, that many of them had opted for amniocentesis, and that a number of babies had died needlessly. The problem that this episode revealed wasn’t about my individual medical care; it was about a medical system that measured variables and outcomes in various places and times, conducted statistical analyses, and made use of the results in other situations. The problem had to do not just with data analysis per se, but with what database researchers call provenance—broadly, where did data arise, what inferences were drawn from the data, and how relevant are those inferences to the present situation? While a trained human might be able to work all of this out on a case-by-case basis, the issue was that of designing a planetary-scale medical system that could do this without the need for such detailed human oversight.

I’m also a computer scientist, and it occurred to me that the principles needed to build planetary-scale inference-and-decision-making systems of this kind, blending computer science with statistics, and considering human utilities, were nowhere to be found in my education. It occurred to me that the development of such principles—which will be needed not only in the medical domain but also in domains such as commerce, transportation, and education—were at least as important as those of building AI systems that can dazzle us with their game-playing or sensorimotor skills.

Whether or not we come to understand ‘intelligence’ any time soon, we do have a major challenge on our hands in bringing together computers and humans in ways that enhance human life. While some view this challenge as subservient to the creation of artificial intelligence, another more prosaic, but no less reverent, viewpoint is that it is the creation of a new branch of engineering. Much like civil engineering and chemical engineering in decades past, this new discipline aims to corral the power of a few key ideas, bringing new resources and capabilities to people, and to do so safely. Whereas civil engineering and chemical engineering built upon physics and chemistry, this new engineering discipline will build on ideas that the preceding century gave substance to, such as information, algorithm, data, uncertainty, computing, inference, and optimization. Moreover, since much of the focus of the new discipline will be on data from and about humans, its development will require perspectives from the social sciences and humanities.

While the building blocks are in place, the principles for putting these blocks together are not, and so the blocks are currently being put together in ad-hoc ways. Thus, just as humans built buildings and bridges before there was civil engineering, humans are proceeding with the building of societal-scale, inference-and-decision-making systems that involve machines, humans, and the environment. Just as early buildings and bridges sometimes fell to the ground—in unforeseen ways and with tragic consequences—many of our early societal-scale inference-and-decision-making systems are already exposing serious conceptual flaws.

Unfortunately, we are not very good at anticipating what the next emerging serious flaw will be. What we’re missing is an engineering discipline with principles of analysis and design.

The current public dialog about these issues too often uses the term AI as an intellectual wildcard, one that makes it difficult to reason about the scope and consequences of emerging technology. Let us consider more carefully what AI has been used to refer to, both recently and historically.

Most of what is labeled AI today, particularly in the public sphere, is actually machine learning (ML), a term in use for the past several decades. ML is an algorithmic field that blends ideas from statistics, computer science and many other disciplines (see below) to design algorithms that process data, make predictions, and help make decisions. In terms of impact on the real world, ML is the real thing, and not just recently. Indeed, that ML would grow into massive industrial relevance was already clear in the early 1990s, and by the turn of the century forward-looking companies such as Amazon were already using ML throughout their business, solving mission-critical, back-end problems in fraud detection and supply-chain prediction, and building innovative consumer-facing services such as recommendation systems. As datasets and computing resources grew rapidly over the ensuing two decades, it became clear that ML would soon power not only Amazon but essentially any company in which decisions could be tied to large-scale data. New business models would emerge. The phrase ‘data science’ emerged to refer to this phenomenon, reflecting both the need of ML algorithms experts to partner with database and distributed-systems experts to build scalable, robust ML systems, as well as reflecting the larger social and environmental scope of the resulting systems.This confluence of ideas and technology trends has been rebranded as ‘AI’ over the past few years. This rebranding deserves some scrutiny.

Historically, the phrase “artificial intelligence” was coined in the late 1950s to refer to the heady aspiration of realizing in software and hardware an entity possessing human-level intelligence. I will use the phrase “human-imitative AI” to refer to this aspiration, emphasizing the notion that the artificially intelligent entity should seem to be one of us, if not physically then at least mentally (whatever that might mean). This was largely an academic enterprise. While related academic fields such as operations research, statistics, pattern recognition, information theory, and control theory already existed, and often took inspiration from human or animal behavior, these fields were arguably focused on low-level signals and decisions. The ability of, say, a squirrel to perceive the three-dimensional structure of the forest it lives in, and to leap among its branches, was inspirational to these fields. AI was meant to focus on something different: the high-level or cognitive capability of humans to reason and to think. Sixty years later, however, high-level reasoning and thought remain elusive. The developments now being called AI arose mostly in the engineering fields associated with low-level pattern recognition and movement control, as well as in the field of statistics, the discipline focused on finding patterns in data and on making well-founded predictions, tests of hypotheses, and decisions.

Indeed, the famous backpropagation algorithm that David Rumelhart rediscovered in the early 1980s, and which is now considered at the core of the so-called “AI revolution,” first arose in the field of control theory in the 1950s and 1960s. One of its early applications was to optimize the thrusts of the Apollo spaceships as they headed towards the moon.

Since the 1960s, much progress has been made, but it has arguably not come about from the pursuit of human-imitative AI. Rather, as in the case of the Apollo spaceships, these ideas have often hidden behind the scenes, the handiwork of researchers focused on specific engineering challenges. Although not visible to the general public, research and systems-building in areas such as document retrieval, text classification, fraud detection, recommendation systems, personalized search, social network analysis, planning, diagnostics, and A/B testing have been a major success—these advances have powered companies such as Google, Netflix, Facebook, and Amazon.

One could simply refer to all of this as AI, and indeed that is what appears to have happened. Such labeling may come as a surprise to optimization or statistics researchers, who find themselves suddenly called AI researchers, but labels aside, the bigger problem is that the use of this single, ill-defined acronym prevents a clear understanding of the range of intellectual and commercial issues at play.

The past two decades have seen major progress—in industry and academia—in a complementary aspiration to human-imitative AI that is often referred to as “Intelligence Augmentation” (IA). Here computation and data are used to create services that augment human intelligence and creativity. A search engine can be viewed as an example of IA, as it augments human memory and factual knowledge, as can natural language translation, which augments the ability of a human to communicate. Computer-based generation of sounds and images serves as a palette and creativity enhancer for artists. While services of this kind could conceivably involve high-level reasoning and thought, currently they don’t; they mostly perform various kinds of string-matching and numerical operations that capture patterns that humans can make use of.

Hoping that the reader will tolerate one last acronym, let us conceive broadly of a discipline of “Intelligent Infrastructure” (II), whereby a web of computation, data, and physical entities exists that makes human environments more supportive, interesting, and safe. Such infrastructure is beginning to make its appearance in domains such as transportation, medicine, commerce, and finance, with implications for individual humans and societies. This emergence sometimes arises in conversations about an Internet of Things, but that effort generally refers to the mere problem of getting ‘things’ onto the Internet, not to the far grander set of challenges associated with building systems that analyze those data streams to discover facts about the world and permit ‘things’ to interact with humans at a far higher level of abstraction than mere bits.

For example, returning to my personal anecdote, we might imagine living our lives in a societal-scale medical system that sets up data flows and data-analysis flows between doctors and devices positioned in and around human bodies, thereby able to aid human intelligence in making diagnoses and providing care. The system would incorporate information from cells in the body, DNA, blood tests, environment, population genetics, and the vast scientific literature on drugs and treatments. It would not just focus on a single patient and a doctor, but on relationships among all humans, just as current medical testing allows experiments done on one set of humans (or animals) to be brought to bear in the care of other humans. It would help maintain notions of relevance, provenance, and reliability, in the way that the current banking system focuses on such challenges in the domain of finance and payment. While one can foresee many problems arising in such a system—privacy issues, liability issues, security issues, etc.—these concerns should be viewed as challenges, not show-stoppers.

We now come to a critical issue: is working on classical human-imitative AI the best or only way to focus on these larger challenges? Some of the most heralded recent success stories of ML have in fact been in areas associated with human-imitative AI—areas such as computer vision, speech recognition, game-playing, and robotics. Perhaps we should simply await further progress in domains such as these. There are two points to make here. First, although one would not know it from reading the newspapers, success in human-imitative AI has in fact been limited; we are very far from realizing human-imitative AI aspirations. The thrill (and fear) of making even limited progress on human-imitative AI gives rise to levels of over-exuberance and media attention that is not present in other areas of engineering.

Second, and more importantly, success in these domains is neither sufficient nor necessary to solve important IA and II problems. On the sufficiency side, consider self-driving cars. For such technology to be realized, a range of engineering problems will need to be solved that may have little relationship to human competencies (or human lack-of-competencies). The overall transportation system (an II system) will likely more closely resemble the current air-traffic control system than the current collection of loosely coupled, forward-facing, inattentive human drivers. It will be vastly more complex than the current air-traffic control system, specifically in its use of massive amounts of data and adaptive statistical modeling to inform fine-grained decisions. Those challenges need to be in the forefront versus a potentially distracting focus on human-imitative AI.

As for the necessity argument, some say that the human-imitative AI aspiration subsumes IA and II aspirations, because a human-imitative AI system would not only be able to solve the classical problems of AI (e.g., as embodied in the Turing test), but it would also be our best bet for solving IA and II problems. Such an argument has little historical precedent. Did civil engineering develop by envisaging the creation of an artificial carpenter or bricklayer? Should chemical engineering have been framed in terms of creating an artificial chemist? Even more polemically: if our goal was to build chemical factories, should we have first created an artificial chemist who would have then worked out how to build a chemical factory?

A related argument is that human intelligence is the only kind of intelligence we know, thus we should aim to mimic it as a first step. However, humans are in fact not very good at some kinds of reasoning—we have our lapses, biases, and limitations. Moreover, critically, we did not evolve to perform the kinds of large-scale decision-making that modern II systems must face, nor to cope with the kinds of uncertainty that arise in II contexts. One could argue that an AI system would not only imitate human intelligence, but also correct it, and would also scale to arbitrarily large problems. Of course, we are now in the realm of science fiction—such speculative arguments, while entertaining in the setting of fiction, should not be our principal strategy going forward in the face of the critical IA and II problems that are beginning to emerge. We need to solve IA and II problems on their own merits, not as a mere corollary to a human-imitative AI agenda.

It is not hard to pinpoint algorithmic and infrastructure challenges in II systems that are not central themes in human-imitative AI research. II systems require the ability to manage distributed repositories of knowledge that are rapidly changing and are likely to be globally incoherent. Such systems must cope with cloud-edge interactions in making timely, distributed decisions, and they must deal with long-tail phenomena where there is lots of data on some individuals and little data on most individuals. They must address the difficulties of sharing data across administrative and competitive boundaries. Finally, and of particular importance, II systems must bring economic ideas such as incentives and pricing into the realm of the statistical and computational infrastructures that link humans to each other and to valued goods. Such II systems can be viewed as not merely providing a service, but as creating markets. There are domains such as music, literature, and journalism that are crying out for the emergence of such markets, where data analysis links producers and consumers. And this must all be done within the context of evolving societal, ethical, and legal norms.

Of course, classical human-imitative AI problems remain of great interest as well. However, the current focus on doing AI research via the gathering of data, the deployment of deep learning infrastructure, and the demonstration of systems that mimic certain narrowly defined human skills—with little in the way of emerging explanatory principles—tends to deflect attention from major open problems in classical AI. These problems include the need to bring meaning and reasoning into systems that perform natural language processing, the need to infer and represent causality, the need to develop computationally tractable representations of uncertainty and the need to develop systems that formulate and pursue long-term goals. These are classical goals in human-imitative AI, but in the current hubbub over the AI revolution it is easy to forget that they are not yet solved.

IA will also remain quite essential, because for the foreseeable future, computers will not be able to match humans in their ability to reason abstractly about real-world situations. We will need well-thought-out interactions of humans and computers to solve our most pressing problems. And we will want computers to trigger new levels of human creativity, not replace human creativity (whatever that might mean).

It was John McCarthy (while a professor at Dartmouth, and soon to take a position at MIT) who coined the term AI, apparently to distinguish his budding research agenda from that of Norbert Wiener (then an older professor at MIT). Wiener had coined “cybernetics” to refer to his own vision of intelligent systems—a vision that was closely tied to operations research, statistics, pattern recognition, information theory, and control theory. McCarthy, on the other hand, emphasized the ties to logic. In an interesting reversal, it is Wiener’s intellectual agenda that has come to dominate in the current era, under the banner of McCarthy’s terminology. (This state of affairs is surely, however, only temporary; the pendulum swings more in AI than in most fields.)

Beyond the historical perspectives of McCarthy and Wiener, we need to realize that the current public dialog on AI—which focuses on narrow subsets of both industry and of academia—risks blinding us to the challenges and opportunities that are presented by the full scope of AI, IA, and II.

This scope is less about the realization of science-fiction dreams or superhuman nightmares, and more about the need for humans to understand and shape technology as it becomes ever more present and influential in their daily lives. Moreover, in this understanding and shaping, there is a need for a diverse set of voices from all walks of life, not merely a dialog among the technologically attuned. Focusing narrowly on human-imitative AI prevents an appropriately wide range of voices from being heard.

While industry will drive many developments, academia will also play an essential role, not only in providing some of the most innovative technical ideas, but also in bringing researchers from the computational and statistical disciplines together with researchers from other disciplines whose contributions and perspectives are sorely needed—notably the social sciences, the cognitive sciences, and the humanities.

On the other hand, while the humanities and the sciences are essential as we go forward, we should also not pretend that we are talking about something other than an engineering effort of unprecedented scale and scope; society is aiming to build new kinds of artifacts. These artifacts should be built to work as claimed. We do not want to build systems that help us with medical treatments, transportation options, and commercial opportunities only to find out after the fact that these systems don’t really work, that they make errors that take their toll in terms of human lives and happiness. In this regard, as I have emphasized, there is an engineering discipline yet to emerge for the data- and learning-focused fields. As exciting as these latter fields appear to be, they cannot yet be viewed as constituting an engineering discipline.

We should embrace the fact that we are witnessing the creation of a new branch of engineering. The term engineering has connotations—in academia and beyond—of cold, affectless machinery, and of loss of control for humans, but an engineering discipline can be what we want it to be. In the current era, we have a real opportunity to conceive of something historically new: a human-centric engineering discipline. I will resist giving this emerging discipline a name, but if the acronym AI continues to serve as placeholder nomenclature going forward, let’s be aware of the very real limitations of this placeholder. Let’s broaden our scope, tone down the hype, and recognize the serious challenges ahead.

Research areas

Related content

GB, London
We are looking for a passionate, talented, and inventive Data Scientist with a strong machine learning and analytics background to help build industry-leading language technology powering Rufus, our AI-driven search and shopping assistant, helping customers with their shopping tasks at every step of their shopping journey. This innovative role focuses on developing and optimizing large language model (LLM)-powered conversational experiences. The core emphasis is to get the best performance out of state-of-the-art LLMs via careful and methodical instruction design, contextual grounding, informed choices of MCP tools and agent/multi-agent systems, evaluation frameworks, and experimentation to systematically improve LLM quality, robustness, and customer impact. The work combines scientific rigor with product intuition to systematically raise the bar for conversational AI performance at Amazon scale. Our mission in conversational shopping is to make it easy for customers to find and discover the best products to meet their needs by helping with their product research, providing comparisons and recommendations, answering product questions, enabling shopping directly from images or videos, providing visual inspiration, and more. We do this by leveraging advanced analytics, Natural Language Processing (NLP), Machine Learning (ML), A/B testing, causal inference, and data-driven insights to continuously improve our systems. Key job responsibilities As a Data Scientist on our team, you will develop and maintain LLM instructions iterations and evaluation frameworks, including automated eval pipelines, LLM-as-a-judge methodologies, rubric design, and dataset curation to measure nuanced aspects of response quality. You will partner with the wider org to experiment with techniques such as retrieval augmentation, context enrichment, prompt decomposition, and model fine-tuning or post-training strategies, if and when applicable. You will leverage petabytes of data and identify opportunities to leverage machine learning models aimed at making conversational systems more performant. A day in the life You will: Perform hands-on analysis of large-scale multimodal interaction datasets to develop insights into how customers engage with conversational AI systems and how to improve response quality and customer experience. Use statistical methods, experimentation, and data-driven analysis to develop scalable approaches for measuring, evaluating, and optimizing large language model (LLM)-based shopping assistant systems, leveraging structured and unstructured contextual signals. Design and analyze A/B tests and experiments to evaluate new features and model improvements, ensuring statistical rigor and actionable insights. Develop metrics, dashboards, and reporting frameworks to monitor system performance, customer engagement, and business impact. Conduct deep-dive analyses to identify opportunities for improving conversational relevance, grounding, customer satisfaction, and downstream business impact. Collaborate with Applied Scientists and Engineers to translate analytical insights into production systems, working closely on model evaluation and deployment. Establish automated processes for large-scale data analysis, ETL pipelines, metric generation, and experimentation frameworks. Communicate results and insights to both technical and non-technical audiences, including through presentations, written reports, and data visualizations. About the team The Rufus Features Science team, based in London, works alongside ~150 engineers, designers and product managers, shaping the future of AI-driven shopping experiences at Amazon. The team works on every aspect of the Rufus AI, from making Rufus agentic, enabling customers to set price alerts or empower Rufus to act on their behalf and automatically purchase products when the price is right, to understanding multimodal user queries and generating answers that combine text, image, audio and video, including deep research reports that scour the web and the Amazon catalog to provide detailed and personalised shopping guidance. We utilize and advance state-of-art techniques in the fields of Natural Language Processing, gen AI, Information Retrieval, Machine/Deep Learning, and Data Mining. We validate our work by actively participating in the internal and external scientific communities.
CN, 44, Shenzhen
职位:Applied scientist 应用科学家实习生 毕业时间:2026年10月 - 2027年7月之间毕业的应届毕业生 · 入职日期:2026年6月及之前 · 实习时间:保证一周实习4-5天全职实习,至少持续3个月 · 工作地点:深圳福田区 投递须知: 1 填写简历申请时,请把必填和非必填项都填写完整。提交简历之后就无法修改了哦! 2 学校的英文全称请准确填写。中英文对应表请查这里(无法浏览请登录后浏览)https://docs.qq.com/sheet/DVmdaa1BCV0RBbnlR?tab=BB08J2 关于职位 Amazon Device &Services Asia团队正在寻找一位充满好奇心、善于沟通的应用科学家实习生,成为连接前沿AI研究与现实世界认知的桥梁。这是一个独特的角色——既需要动手参与机器学习项目,又要接受将复杂AI概念转化为通俗易懂内容的创意挑战。D&S Asia是亚马逊设备与服务业务在亚洲的支柱组织,自2009年支持Kindle制造起步,现已发展为横跨软硬件、AI(Alexa)及智能家居(Ring/Blink)的综合性团队,持续驱动区域业务创新与人才发展。 你将做什么 • 解密AI: 将复杂的技术发现转化为直观的解释、博客文章、教程或互动演示,让非技术背景的业务方和更广泛的社区都能理解 • 技术叙事: 与工程团队协作,以清晰、引人入胜的方式记录AI的能力与局限性 • 知识共享: 协助开发内部工作坊或"AI入门"课程,提升跨职能团队(产品、设计、商务)的AI素养 • 保持前沿: 持续学习并整合最新突破(如大语言模型、扩散模型、智能体),为团队输出简明易懂的趋势简报 • 研究与应用: 参与端到端的应用研究项目,从文献综述到原型开发,涵盖自然语言处理、计算机视觉或多模态AI领域
US, MA, N.reading
Amazon Industrial Robotics Group is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine cutting-edge AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at an unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic dexterous manipulation, locomotion, and human-robot interaction. This role presents an opportunity to shape the future of robotics through innovative applications of deep learning and large language models. At Amazon Industrial Robotics Group, we leverage advanced robotics, machine learning, and artificial intelligence to solve complex operational challenges at an unprecedented scale. Our fleet of robots operates across hundreds of facilities worldwide, working in sophisticated coordination to fulfill our mission of customer excellence. We are pioneering the development of dexterous manipulation system that: - Enables unprecedented generalization across diverse tasks - Enables contact-rich manipulation in different environments - Seamlessly integrates low-level skills and high-level behaviors - Leverage mechanical intelligence, multi-modal sensor feedback and advanced control techniques. The ideal candidate will contribute to research that bridges the gap between theoretical advancement and practical implementation in robotics. You will be part of a team that's revolutionizing how robots learn, adapt, and interact with their environment. Join us in building the next generation of intelligent robotics systems that will transform the future of automation and human-robot collaboration. A day in the life - Lead design and implementation of methods for Visual SLAM, navigation and spatial reasoning - Leverage simulation and real-world data collection to create large datasets for model development - Develop a hierarchical system that combines low-level control with high-level planning - Collaborate effectively with multi-disciplinary teams to co-design hardware and algorithms for dexterous manipulation
US, WA, Seattle
Amazon Prime is looking for an ambitious Economist Intern to help create econometric insights for world-wide Prime. Prime is Amazon's premiere membership program, with over 200M members world-wide. This role is at the center of many major company decisions that impact Amazon's customers. These decisions span a variety of industries, each reflecting the diversity of Prime benefits. These range from fast-free e-commerce shipping, digital content (e.g., exclusive streaming video, music, gaming, photos), reading, healthcare, and grocery offerings. Prime Science creates insights that power these decisions. As an economist intern in this role, you will create statistical tools that embed causal interpretations. You will utilize massive data, state-of-the-art scientific computing, econometrics (causal, counterfactual/structural, experimentation), and machine-learning, to do so. Some of the science you create will be publishable in internal or external scientific journals and conferences. You will work closely with a team of economists, applied scientists, data professionals (business analysts, business intelligence engineers), product managers, and software/data engineers. You will create insights from descriptive statistics, as well as from novel statistical and econometric models. You will create internal-to-Amazon-facing automated scientific data products to power company decisions. You will write strategic documents explaining how senior company leaders should utilize these insights to create sustainable value for customers. These leaders will often include the senior-most leaders at Amazon. The team is unique in its exposure to company-wide strategies as well as senior leadership. It operates at the research frontier of utilizing data, econometrics, artificial intelligence, and machine-learning to form business strategies. A successful candidate will have demonstrated a capacity for building, estimating, and defending statistical models (e.g., causal, counterfactual, machine-learning) using software such as R, Python, or STATA. They will have a willingness to learn and apply a broad set of statistical and computational techniques to supplement deep training in one area of econometrics. For example, many applications on the team motivate the use of structural econometrics and machine-learning. They rely on building scalable production software, which involves a broad set of world-class software-building skills often learned on-the-job. As a consequence, already-obtained knowledge of SQL, machine learning, and large-scale scientific computing using distributed computing infrastructures such as Spark-Scala or PySpark would be a plus. Additionally, this candidate will show a track-record of delivering projects well and on-time, preferably in collaboration with other team members (e.g. co-authors). Candidates must have very strong writing and emotional intelligence skills (for collaborative teamwork, often with colleagues in different functional roles), a growth mindset, and a capacity for dealing with a high-level of ambiguity. Endowed with these traits and on-the-job-growth, the role will provide the opportunity to have a large strategic, world-wide impact on the customer experiences of Prime members.
US, WA, Bellevue
The Mission Build AI safety systems that protect millions of Alexa customers every day. As conversational AI evolves, you'll solve challenging problems in Responsible AI by ensuring LLMs provide safe, trustworthy responses, building AI systems that understand nuanced human values across cultures, and maintaining customer trust at scale. What You'll Build You'll pioneer breakthrough solutions in Responsible AI at Amazon's scale. Imagine training models that set new safety standards, designing automated testing systems that hunt for vulnerabilities before they surface, and certifying the systems that power millions of daily conversations. You'll create intelligent evaluation systems that judge responses with human-level insight, build models that truly understand what makes interactions safe and delightful, and craft feedback mechanisms that help Alexa+ grasp the nuances of complex customer conversations. Here's where it gets even more exciting: you'll build AI agents that act as your team's safety net—automatically detecting and fixing production issues in real-time, often before anyone notices there was a problem. Your innovations won't just improve Alexa+; they'll fundamentally shape how it learns, evolves, and earns customer trust. As Alexa+ continues to delight customers, your work ensures it becomes more trustworthy, safer, and deeply aligned with customer needs and expectations. Your work directly protects customer trust at Amazon's scale. Every innovation you create—from novel safety mechanisms to sophisticated evaluation techniques—shapes how millions of people interact with AI confidently. You're not just building products; you're defining industry standards for responsible AI. This is frontier research with immediate real-world impact. You'll tackle problems that require innovative solutions: training models that remain truthful and grounded across diverse contexts, building reward models that capture the nuanced spectrum of human values across cultures and languages, and creating automated systems that continuously discover and address potential issues before customers encounter them. You'll collaborate with world-class scientists, product managers, and engineers to transform state-of-the-art ideas into production systems serving millions. What We're Looking For * Deep expertise in state-of-the-art NLP and Large Language Models * Track record of building scalable ML systems * Passion for impactful research—where frontier science meets real-world responsibility at scale * Excitement about solving problems that will shape the future of AI Ready to work on AI safety challenges that define the industry? Join us. Key job responsibilities This is where you'll make your mark. You'll architect breakthrough Responsible AI solutions that become industry benchmarks, pioneering algorithms that eliminate false information, designing frameworks that hunt down vulnerabilities before bad actors find them, and developing models that understand human values across every culture we serve. Working with world-class engineers and scientists, you'll push the boundaries of model training—transforming bold research into production systems that protect millions of customers daily while withstanding attacks and delivering exceptional experiences. But here's what makes this role truly special: you'll shape the future. You'll lead certification processes, advance optimization techniques, build evaluation systems that reason like humans, and mentor the next generation of AI safety experts. Every innovation you drive will set new standards for trustworthy AI at the world's largest scale. A day in the life As a Responsible AI Scientist, you're at the frontier of AI safety—experimenting with breakthrough techniques that push the boundaries of what's possible. You partner with engineering to transform research into production-ready solutions, tackling complex optimization challenges. You brainstorm with Product teams, translating ambitious visions into concrete objectives that drive real impact. Your expertise shapes critical deployment decisions as you review impactful work and guide go/no-go calls. You mentor the next generation of AI safety leaders, watching ideas spark and capabilities grow. This is where science meets impact—building AI that's not just intelligent, but trustworthy and aligned with human values. About the team Our team pioneers Responsible AI for conversational assistants. We ensure Alexa delivers safe, trustworthy experiences across all devices, modalities, and languages worldwide. We work on frontier AI safety challenges—and we're looking for scientists who want to help shape the future of trustworthy AI.
US, WA, Bellevue
The Mission Build AI safety systems that protect millions of Alexa customers every day. As conversational AI evolves, you'll solve challenging problems in Responsible AI by ensuring LLMs provide safe, trustworthy responses, building AI systems that understand nuanced human values across cultures, and maintaining customer trust at scale. What You'll Build You'll pioneer breakthrough solutions in Responsible AI at Amazon's scale. Imagine training models that set new safety standards, designing automated testing systems that hunt for vulnerabilities before they surface, and certifying the systems that power millions of daily conversations. You'll create intelligent evaluation systems that judge responses with human-level insight, build models that truly understand what makes interactions safe and delightful, and craft feedback mechanisms that help Alexa+ grasp the nuances of complex customer conversations. Here's where it gets even more exciting: you'll build AI agents that act as your team's safety net—automatically detecting and fixing production issues in real-time, often before anyone notices there was a problem. Your innovations won't just improve Alexa+; they'll fundamentally shape how it learns, evolves, and earns customer trust. As Alexa+ continues to delight customers, your work ensures it becomes more trustworthy, safer, and deeply aligned with customer needs and expectations. Your work directly protects customer trust at Amazon's scale. Every innovation you create—from novel safety mechanisms to sophisticated evaluation techniques—shapes how millions of people interact with AI confidently. You're not just building products; you're defining industry standards for responsible AI. This is frontier research with immediate real-world impact. You'll tackle problems that require innovative solutions: training models that remain truthful and grounded across diverse contexts, building reward models that capture the nuanced spectrum of human values across cultures and languages, and creating automated systems that continuously discover and address potential issues before customers encounter them. You'll collaborate with world-class scientists, product managers, and engineers to transform state-of-the-art ideas into production systems serving millions. What We're Looking For * Deep expertise in state-of-the-art NLP and Large Language Models * Track record of building scalable ML systems * Passion for impactful research—where frontier science meets real-world responsibility at scale * Excitement about solving problems that will shape the future of AI Ready to work on AI safety challenges that define the industry? Join us. Key job responsibilities This is where you'll make your mark. You'll architect breakthrough Responsible AI solutions that become industry benchmarks, pioneering algorithms that eliminate false information, designing frameworks that hunt down vulnerabilities before bad actors find them, and developing models that understand human values across every culture we serve. Working with world-class engineers and scientists, you'll push the boundaries of model training—transforming bold research into production systems that protect millions of customers daily while withstanding attacks and delivering exceptional experiences. But here's what makes this role truly special: you'll shape the future. You'll lead certification processes, advance optimization techniques, build evaluation systems that reason like humans, and mentor the next generation of AI safety experts. Every innovation you drive will set new standards for trustworthy AI at the world's largest scale. A day in the life As a Responsible AI Scientist, you're at the frontier of AI safety—experimenting with breakthrough techniques that push the boundaries of what's possible. You partner with engineering to transform research into production-ready solutions, tackling complex optimization challenges. You brainstorm with Product teams, translating ambitious visions into concrete objectives that drive real impact. Your expertise shapes critical deployment decisions as you review impactful work and guide go/no-go calls. You mentor the next generation of AI safety leaders, watching ideas spark and capabilities grow. This is where science meets impact—building AI that's not just intelligent, but trustworthy and aligned with human values. About the team Our team pioneers Responsible AI for conversational assistants. We ensure Alexa delivers safe, trustworthy experiences across all devices, modalities, and languages worldwide. We work on frontier AI safety challenges—and we're looking for scientists who want to help shape the future of trustworthy AI.
GB, London
We are looking for an Economist to work on exciting and challenging business problems related to Amazon Retail’s worldwide product assortment. You will build innovative solutions based on econometrics, machine learning, and experimentation. You will be part of a interdisciplinary team of economists, product managers, engineers, and scientists, and your work will influence finance and business decisions affecting Amazon’s vast product assortment globally. If you have an entrepreneurial spirit, you know how to deliver results fast, and you have a deeply quantitative, highly innovative approach to solving problems, and long for the opportunity to build pioneering solutions to challenging problems, we want to talk to you. Key job responsibilities * Work on a challenging problem that has the potential to significantly impact Amazon’s business position * Develop econometric models and experiments to measure the customer and financial impact of Amazon’s product assortment * Collaborate with other scientists at Amazon to deliver measurable progress and change * Influence business leaders based on empirical findings
US, WA, Seattle
Innovators wanted! Are you an entrepreneur? A builder? A dreamer? This role is part of an Amazon Special Projects team that takes the company’s Think Big leadership principle to the limits. If you’re interested in innovating at scale to address big challenges in the world, this is the team for you. As an Applied Scientist on our team, you will focus on building state-of-the-art ML models for biology. Our team rewards curiosity while maintaining a laser-focus in bringing products to market. Competitive candidates are responsive, flexible, and able to succeed within an open, collaborative, entrepreneurial, startup-like environment. At the forefront of both academic and applied research in this product area, you have the opportunity to work together with a diverse and talented team of scientists, engineers, and product managers and collaborate with other teams. Key job responsibilities - Build, adapt and evaluate ML models for life sciences applications - Collaborate with a cross-functional team of ML scientists, biologists, software engineers and product managers
US, WA, Seattle
Amazon Prime is looking for an ambitious Economist Intern to help create econometric insights for world-wide Prime. Prime is Amazon's premiere membership program, with over 200M members world-wide. This role is at the center of many major company decisions that impact Amazon's customers. These decisions span a variety of industries, each reflecting the diversity of Prime benefits. These range from fast-free e-commerce shipping, digital content (e.g., exclusive streaming video, music, gaming, photos), reading, healthcare, and grocery offerings. Prime Science creates insights that power these decisions. As an economist intern in this role, you will create statistical tools that embed causal interpretations. You will utilize massive data, state-of-the-art scientific computing, econometrics (causal, counterfactual/structural, experimentation), and machine-learning, to do so. Some of the science you create will be publishable in internal or external scientific journals and conferences. You will work closely with a team of economists, applied scientists, data professionals (business analysts, business intelligence engineers), product managers, and software/data engineers. You will create insights from descriptive statistics, as well as from novel statistical and econometric models. You will create internal-to-Amazon-facing automated scientific data products to power company decisions. You will write strategic documents explaining how senior company leaders should utilize these insights to create sustainable value for customers. These leaders will often include the senior-most leaders at Amazon. The team is unique in its exposure to company-wide strategies as well as senior leadership. It operates at the research frontier of utilizing data, econometrics, artificial intelligence, and machine-learning to form business strategies. A successful candidate will have demonstrated a capacity for building, estimating, and defending statistical models (e.g., causal, counterfactual, machine-learning) using software such as R, Python, or STATA. They will have a willingness to learn and apply a broad set of statistical and computational techniques to supplement deep training in one area of econometrics. For example, many applications on the team motivate the use of structural econometrics and machine-learning. They rely on building scalable production software, which involves a broad set of world-class software-building skills often learned on-the-job. As a consequence, already-obtained knowledge of SQL, machine learning, and large-scale scientific computing using distributed computing infrastructures such as Spark-Scala or PySpark would be a plus. Additionally, this candidate will show a track-record of delivering projects well and on-time, preferably in collaboration with other team members (e.g. co-authors). Candidates must have very strong writing and emotional intelligence skills (for collaborative teamwork, often with colleagues in different functional roles), a growth mindset, and a capacity for dealing with a high-level of ambiguity. Endowed with these traits and on-the-job-growth, the role will provide the opportunity to have a large strategic, world-wide impact on the customer experiences of Prime members.
US, VA, Arlington
The People eXperience and Technology Central Science (PXTCS) team uses economics, behavioral science, statistics, and machine learning to proactively identify mechanisms and process improvements which simultaneously improve Amazon and the lives, well-being, and the value of work to Amazonians. The Benefits Science team is looking for a senior economist to transform complex business challenges into actionable scientific insights. In this role, you will partner directly with business leaders to design and evaluate pilots, build models using large-scale data, and scale successful prototypes into company-wide policies and programs. We're looking for someone who can combine rigorous scientific thinking with practical business acumen and is passionate about using economics to improve employee experiences at scale. The ideal candidate will thrive in interdisciplinary environments, working alongside engineers, data scientists, and business leaders from diverse backgrounds. Key job responsibilities * Design and evaluate innovative research pilots that address critical business challenges * Develop sophisticated economic models using large-scale organizational data * Collaborate with engineers, data scientists, and business leaders to transform research insights into actionable strategies * Write and present comprehensive research findings to senior leadership * Scale successful prototypes into company-wide policies and programs A day in the life Work with teammates to apply economic methods to business problems. This might include identifying the appropriate research questions, writing code to implement a DID analysis or estimate a structural model, or writing and presenting a document with findings to business leaders. Our economists also collaborate with partner teams throughout the process, from understanding their challenges, to developing a research agenda that will address those challenges, to help them implement solutions. About the team Our Benefits Science team is a dynamic group of economists, data scientists, and business strategists committed to understanding human capital at scale. We use interdisciplinary approaches to solve complex workforce challenges, combining economics, behavioral science, and advanced analytics to create meaningful workplace improvements.