On-device speech processing makes Alexa faster, lower-bandwidth

Innovative training methods and model compression techniques combine with clever engineering to keep speech processing local.

At Amazon, we always look to invent new technology for improving customer experience. One technology we have been working on at Alexa is on-device speech processing, which has multiple benefits: a reduction in latency, or the time it takes Alexa to respond to queries; lowered bandwidth consumption, which is important on portable devices; and increased availability in in-car units and other applications where Internet connectivity is intermittent. On-device processing also enables the fusion of the speech signal with other modalities, like vision, for features such as Alexa’s natural turn-taking.

In the last year, we’ve continued to build upon Alexa’s on-device speech-processing capabilities. As a result of these inventions, we are launching a new setting that gives customers the option of having the audio of their Alexa voice requests processed locally, without being sent to the cloud.

In the cloud, storage space and computational capacity are effectively unconstrained. To ensure accuracy, our cloud models can be large and computationally demanding. Executing the same functions on-device means compressing our models into less than 1% as much space — with minimal loss in accuracy.

Moreover, in the cloud, the separate components of Alexa’s speech-processing stack — automatic speech recognition (ASR), whisper detection, and speaker identification — run on separate server nodes with their own powerful processors. On-device, those functions have to share hardware not only with each other but with Alexa’s other core device functions, such as music playback.

Re-creating Alexa’s speech-processing stack on-device was a massive undertaking. New methods for training small-footprint ASR models were part of the solution, but so were innovations in system design and hardware-software codesign. It was a joint effort across science and engineering teams over a span of years. Here’s a quick overview of how it works.

System architecture

Our on-device ASR model takes in an acoustic speech signal and outputs a set of hypotheses about what the speaker said, ranked according to probability. We represent those hypotheses as a lattice — a graph whose edges represent recognized words and the probability that a given word follows from the previous one.

An example of a lattice representing ASR hypotheses.
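As a rough illustration of the idea (not Alexa's actual data structure), a lattice can be sketched as a small graph whose edges carry a word and a probability. The words, probabilities, and the names `LATTICE` and `ranked_hypotheses` below are invented for the example.

```python
import math

# Hypothetical toy lattice: node -> list of (next_node, word, probability).
# All words and probabilities are made up for illustration.
LATTICE = {
    0: [(1, "play", 0.9), (1, "pray", 0.1)],
    1: [(2, "a", 1.0)],
    2: [(3, "song", 0.7), (3, "long", 0.3)],
    3: [],  # final node
}


def ranked_hypotheses(lattice, start=0, final=3):
    """Enumerate every path from start to final, most probable first."""
    results = []

    def walk(node, words, log_prob):
        if node == final:
            results.append((" ".join(words), math.exp(log_prob)))
            return
        for next_node, word, prob in lattice[node]:
            # Summing log-probabilities multiplies the edge probabilities.
            walk(next_node, words + [word], log_prob + math.log(prob))

    walk(start, [], 0.0)
    return sorted(results, key=lambda item: item[1], reverse=True)


best_text, best_prob = ranked_hypotheses(LATTICE)[0]
print(best_text)  # play a song
```

Exhaustive enumeration is only workable for a toy graph like this one; a real decoder would keep the lattice compact and search it rather than list every path.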

With cloud-based ASR, encrypted audio streams to the cloud in small snippets called “frames”. With on-device ASR, only the lattice is sent to the cloud, where a large and powerful neural language model reranks the hypotheses. The lattice can’t be sent until the customer has finished speaking, as words later in a sequence can dramatically change the overall probability of a hypothesis.

The model that determines when the customer has finished speaking is called an end-pointer. End-pointers offer a natural trade-off between accuracy and latency: an aggressive end-pointer will initiate speech processing earlier, but it might cut the speaker off prematurely, resulting in a poor customer experience.

On the device, we in fact run two end-pointers: One is a speculative end-pointer that we have tuned to be about 200 milliseconds faster than the final end-pointer, so we can initiate downstream processing — such as natural-language understanding (NLU) — ahead of the final end-pointed ASR result. In exchange for speed, however, we trade off a little accuracy.

The final end-pointer takes longer to make a decision but is more accurate. In cases in which the speculative end-pointer cuts speech off too early, the final end-pointer sends a revised lattice and the instruction to reset downstream processing. In the large majority of cases, however, the speculative end-pointer is correct, which reduces user-perceived latency, since downstream tasks are initiated earlier.
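The interplay between the two end-pointers can be sketched with a simple stand-in. The real end-pointers are learned models; here each one is replaced by a silence counter, and the frame length (assumed to be 100 ms), the thresholds, and the function names are all hypothetical.

```python
def endpoint_frame(is_speech, silence_frames_needed):
    """Return the frame index at which the end-pointer fires, or None.

    It fires after `silence_frames_needed` consecutive non-speech frames
    that follow at least one speech frame.
    """
    heard_speech = False
    silent_run = 0
    for i, speech in enumerate(is_speech):
        if speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= silence_frames_needed:
                return i
    return None


def run_endpointers(is_speech, speculative_frames=3, final_frames=5):
    """Compare a fast (speculative) and a slow (final) end-pointer."""
    spec = endpoint_frame(is_speech, speculative_frames)
    final = endpoint_frame(is_speech, final_frames)
    # Map each firing frame back to the frame where the silence began.
    spec_end = None if spec is None else spec - speculative_frames + 1
    final_end = None if final is None else final - final_frames + 1
    # If the two disagree about where speech ended, downstream work restarts.
    return {"speculative_fire": spec, "final_fire": final,
            "reset": spec_end != final_end}


clean = [1, 1, 1, 0, 0, 0, 0, 0]                # speech, then silence
paused = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0]   # mid-utterance pause
print(run_endpointers(clean)["reset"])   # False
print(run_endpointers(paused)["reset"])  # True
```

With the assumed 100 ms frames, the speculative end-pointer fires two frames (200 ms) before the final one on the clean utterance, and the paused utterance triggers a reset.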

Another aspect of ASR that had to move on-device is context awareness. When computing the probabilities in a lattice, the ASR model should, for instance, give added weight to otherwise uncommon names that happen to be in the customer’s address book or the names the customer has assigned to household devices.

A diagram of the on-device ASR network, with a closeup of the biasing mechanism that allows the network to ingest dynamic content. (Based on figures in "Context-aware Transformer transducer for speech recognition")

This attention map indicates that the trained network is attending to the correct entry in a list of Alexa-linked home appliances. (From "Context-aware Transformer transducer for speech recognition")

Context awareness can’t wait for the cloud because the lattice, though it encodes multiple hypotheses, doesn’t come close to encoding all possible hypotheses. When constructing the lattice, the ASR system has to prune a lot of low-probability hypotheses. If context awareness isn’t built into the on-device model, names of contacts or linked skills might end up getting pruned.

Initially, we use a so-called shallow-fusion model to add context and personalize content on-device. When the system is building the lattice, it boosts the probabilities of contextually relevant words such as contact or appliance names.
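A minimal sketch of this kind of heuristic boosting, assuming each candidate word has a log-probability and the boost is a fixed bonus added for contextually relevant words. The names, scores, boost value, and beam width are invented for illustration and are not taken from Alexa's system.

```python
def boost_scores(candidates, context_words, boost=2.0):
    """Add a fixed bonus to the log-probability of contextual words."""
    return {
        word: log_prob + (boost if word in context_words else 0.0)
        for word, log_prob in candidates.items()
    }


def prune(candidates, beam=2):
    """Keep only the `beam` highest-scoring candidates."""
    return sorted(candidates, key=candidates.get, reverse=True)[:beam]


# Hypothetical next-word candidates after "call", with log-probabilities.
scores = {"john": -0.5, "joan": -1.0, "jiong": -3.5}
contacts = {"jiong"}  # assumed entry in the customer's address book

print(prune(scores))                                    # ['john', 'joan']
print(prune(boost_scores(scores, contacts, boost=3.2)))  # ['jiong', 'john']
```

Without the boost, the uncommon contact name is pruned before the lattice is built, so no later reranking could recover it.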

The probability boosts are heuristic, however — they’re not learned jointly with the core ASR model. To achieve even better accuracy on personalized and long-tail content, we have developed a multihead attention-based context-biasing mechanism that is jointly trained with the rest of the ASR subnetworks.

Model training

On-device ASR required us to build a new model from the ground up, an end-to-end recurrent neural network-transducer (RNN-T) model that directly maps the input speech signal to an output sequence of words. Using a single neural network results in a significantly reduced memory footprint. But we had to develop new techniques, both for inference and for training, to achieve the degree of accuracy and compression that would let this technology handle utterances on-device.

Previously on Amazon Science, we’ve discussed some of the techniques we used to increase the accuracy of small-footprint end-to-end ASR models. With teacher-student training, for instance, we teach a small, lean model to match the outputs of a more-powerful but slower model. We developed a training methodology that made it possible to do teacher-student training efficiently with a million hours of unannotated speech.
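To make the teacher-student idea concrete, here is a minimal sketch of one common choice of training signal: the KL divergence between the teacher's and the student's output distributions for a single frame. The post does not say which loss was actually used, so the loss, the logits, and the function names are assumptions.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits):
    """KL(teacher || student) for one output frame."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [2.0, 0.5, -1.0]       # hypothetical teacher scores for 3 tokens
far_student = [0.0, 0.0, 0.0]    # untrained student: uniform output
near_student = [1.5, 0.5, -0.5]  # student after some training

print(distillation_loss(teacher, far_student) >
      distillation_loss(teacher, near_student))  # True
```

Because the target is the teacher's output rather than a human transcript, a loss of this kind can be computed on unannotated speech.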

During the training of a context-aware ASR model, a long short-term memory (LSTM) encoder encodes both unlabeled and labeled segments of the audio stream, so the model can use the entire input audio to improve ASR accuracy. (From "Improving RNN-T ASR accuracy using context audio")

To further boost the accuracy of on-device RNN-T ASR, we developed techniques that allow the neural network to learn and exploit audio context within a stream. For example, for a stream comprising two utterances, “Alexa” and “Play a song”, the audio context from the keyword segment (“Alexa”) helps the model focus on the foreground speech and speaker. Separately, we implemented a novel discriminative-loss and training algorithm that aims at directly minimizing the word error rate (WER) of RNN-T ASR.

On top of these innovations, however, we still had to develop some new compression techniques to get the RNN-T to run efficiently on-device. A neural network consists of simple processing nodes, each of which is connected to several others. The connections between nodes have associated weights, which determine how much one node’s output contributes to the computation performed by the next node.

One way to shrink a neural network’s memory footprint is to quantize its weights — to divide the total range of weights into a small set of intervals and use a single value to represent all the weights in each interval. So, for instance, the weights 0.70, 0.76, and 0.79 might all get quantized to the single value 0.75. Specifying an interval requires fewer bits than specifying several different floating-point values.
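The example in the text can be reproduced with a simple uniform quantizer. This is a sketch under stated assumptions: a step of 0.25 between representative values and a 3-bit signed index are chosen only so that the three weights land on 0.75; they are not Alexa's actual settings.

```python
def quantize(weight, step=0.25, bits=3):
    """Map a weight to (integer index, representative value).

    Each interval is centered on a multiple of `step`; only the small
    integer index needs to be stored.
    """
    lowest = -(2 ** (bits - 1))     # -4 for 3 bits
    highest = 2 ** (bits - 1) - 1   # 3 for 3 bits
    index = round(weight / step)
    index = max(lowest, min(highest, index))  # clamp to the index range
    return index, index * step


for w in (0.70, 0.76, 0.79):
    print(quantize(w))  # (3, 0.75) each time
```

Storing the index 3 takes three bits here, compared with 32 bits for each original floating-point weight.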

If quantization is done after a network has been trained, performance can suffer. We developed a method of quantization-aware training that imposes a probability distribution on the network weights during training, so that they can be easily quantized with little effect on performance. Unlike previous quantization-aware training methods, which mostly take quantization into account in the forward pass, ours accounts for quantization in the backward direction, during weight updates, through network loss regularization. And it does that efficiently.

Another way to make neural networks run more efficiently, also a vital concern on resource-constrained devices, is to reduce low weights to zero. Computations involving zero weights can be discarded, reducing the computational burden.

Over successive training epochs, sparsification gradually drops low weights in a weight matrix.

But again, doing that reduction after the network is trained can compromise performance. We developed a sparsification method that enables the gradual reduction of low-value weights during training, so the network learns a model amenable to weight pruning.

Neural networks are typically trained in multiple passes, or epochs, through the same set of training data. During each epoch, we force the network weights to diverge more and more, so that at the end of the final epoch, a fixed number of weights (say, half) are effectively zero. They can then be safely discarded.
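The epoch-by-epoch schedule can be illustrated with plain gradual magnitude pruning, a simpler stand-in for the method described here: after each epoch, a growing fraction of the smallest-magnitude weights is set to zero. The linear schedule, the weight values, and the function names are assumptions, and the gradient updates of a real training loop are omitted.

```python
def sparsity_for_epoch(epoch, total_epochs, final_sparsity=0.5):
    """Linear ramp: the target fraction of zero weights after this epoch."""
    return final_sparsity * (epoch + 1) / total_epochs


def prune_smallest(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude."""
    count = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:count]:
        pruned[i] = 0.0
    return pruned


def train_with_gradual_pruning(weights, total_epochs=4):
    for epoch in range(total_epochs):
        # A real training loop would update the weights here.
        weights = prune_smallest(
            weights, sparsity_for_epoch(epoch, total_epochs))
    return weights


start = [0.9, -0.05, 0.4, 0.01, -0.7, 0.2, -0.03, 0.6]
print(train_with_gradual_pruning(start))
# [0.9, 0.0, 0.4, 0.0, -0.7, 0.0, 0.0, 0.6]
```

In actual training, the surviving weights would keep adapting between pruning steps, which is what lets the network compensate for the weights it loses.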

A demonstration of the branching encoder network.

To improve on-device efficiency, we also developed a branching encoder network that uses two different neural networks to convert speech inputs into numeric representations suitable for speech classification. One network is complex, one simple, and the ASR model decides on the fly whether it can get away with passing an input frame to the simple model, saving computational cost and time. We described this work in more detail in an earlier Amazon Science blog post.

Hardware-software codesign

Quantization and sparsification make no difference to performance if the underlying hardware can’t take advantage of them. Another key to getting ASR to run on-device was the design of Amazon’s AZ family of neural edge processors, which are optimized for our specific approach to compression.

For one thing, where a typical processor might represent data using 16 or 32 bits, the AZ processors accelerate certain core operations by using an 8-bit or even lower-bit representation, because that’s all we need to handle quantized values.

The weights of a neural network are typically represented using a matrix, a big grid of numbers. Stored naively, a matrix half of whose values are zeroes takes up as much space as a matrix that’s all nonzero.

On computer chips, transferring data tends to be much more time consuming than executing computations. So when we load our matrix into memory, we use a compression scheme that takes advantage of low-bit quantization and zero values. The circuitry for decoding the compressed representation is built into the chip.

In the neural processor’s memory, the matrix is reconstituted: the zeroes are filled back in. But the processor’s circuitry is designed to recognize zero values and discard computations involving them. So the time savings from sparsification are realized in the hardware itself.
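The three steps described above (compress for transfer, reconstitute in memory, skip zeros during computation) can be mimicked in software. The post does not describe the AZ processors' actual compression format, so the position-value pairs used here, like the function names and the sample row of quantized integer weights, are assumptions for illustration.

```python
def compress(row):
    """Keep only the nonzero values, each with its position."""
    return [(i, v) for i, v in enumerate(row) if v != 0]


def decompress(pairs, length):
    """Rebuild the full row, filling the zeros back in."""
    row = [0] * length
    for i, v in pairs:
        row[i] = v
    return row


def dot_skip_zeros(row, inputs):
    """Dot product that skips zero weights; also counts multiplications."""
    total = 0
    multiplies = 0
    for w, x in zip(row, inputs):
        if w == 0:
            continue  # nothing to compute for a zero weight
        total += w * x
        multiplies += 1
    return total, multiplies


row = [3, 0, 0, -2, 0, 1, 0, 0]   # hypothetical quantized weights
packed = compress(row)            # 3 pairs instead of 8 values
restored = decompress(packed, len(row))
print(dot_skip_zeros(restored, [1, 2, 3, 4, 5, 6, 7, 8]))  # (1, 3)
```

Only three of the eight multiplications are performed, which is the software analogue of the savings the hardware gets from discarding computations on zero values.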

Moving speech recognition on-device entails a number of innovations in other areas, such as reduction in the bandwidth required for model updates and compression of NLU models, to ensure basic functionality on devices with intermittent Internet connectivity. And we’re also hard at work on multilingual on-device ASR models for dynamic language switching, or automatically recognizing which of two languages a customer is speaking and responding in kind.

The launch of on-device speech processing is a huge step in bringing the benefits of “processing on the edge” to our customers, and we will continue to invent on their behalf in this area.

Research areas

Related content

IN, KA, Bengaluru
The Seller Fee Science Team integrates economic modeling, machine learning, and artificial intelligence to guide fee strategy, quantify its impact, and ensure fees are accurately computed and explained for billions of transactions between Amazon selling partners and customers. We help build the foundations for growing selling partner businesses, bringing the best selection and prices to Amazon customers, and helping Amazon leaders make and implement high impact decisions that optimally balance profitability and growth. Our team brings together world-class economists, physicists, mathematicians, and computer scientists to tackle diverse challenging problems that require theoretical rigor and deliver real-world impact. As an data scientist on our team, this role will focus on the application of data analysis, econometrics, machine learning, and artificial intelligence to measure and predict Amazon's P&L, with emphasis on fee revenue. This blends the tools of data science, statistics, and ML/AI. Your work will shape not only how fees are decided, but how they are interpreted and planned. We are seeking scientists who are motivated by first principles, disciplined experimentation, and the technical challenge of deploying ideas at global scale. This is an opportunity to work on consequential problems where analytic rigor meets real-world complexity, and where your analysis, models, algorithms, and systems will directly influence the experience of millions of sellers. If you are driven to build elegant solutions to hard problems—and to see them operate in production at meaningful scale—we would welcome the opportunity to build with you. Key job responsibilities ** Translate ambiguous business challenges into well-defined scientific problems with measurable impact. ** Identify opportunities to improve fee revenue measurement, prediction, planning, structure, and level. 
** Identify opportunities to improve measurement, and prediction of other items of the P&L, at appropriate levels of granularity. ** Design, develop, and deploy econometric or AI/ML models that improve our understanding of the relationship between fees and costs, or predict fee revenue, and other elements of the P&L. ** Partner closely with finance and fee strategy teams to formulate scientific questions, communicate results, and productionalize solutions. **Apply rigorous simulation methods to validate models and quantify business impact at scale. **Communicate scientific innovations and results clearly to cross-functional stakeholders and contribute to the broader internal and external scientific community through publications, talks, and technical artifacts. About the team Amazon’s third-party marketplace is a multibillion-dollar global service, connecting customers and sellers across through billions of transactions annually. The Seller Fee Science Team integrates economic modeling, machine learning, and artificial intelligence to guide business fee strategy, ensure fees are accurately computed for millions of products, and improve the seller experience with AI tools that support any fee related contact (understanding, audit, and dispute). We build the scientific foundation that empowers sellers to grow their businesses with clarity and confidence. Our team brings together world-class economists, physicists, mathematicians, and computer scientists to tackle diverse challenging problems that require theoretical rigor and deliver real-world impact.
US, NY, New York
The Sponsored Products and Brands team at Amazon Ads is re-imagining the advertising landscape through generative AI technologies, revolutionizing how millions of customers discover products and engage with brands across Amazon.com and beyond. We are at the forefront of re-inventing advertising experiences, bridging human creativity with artificial intelligence to transform every aspect of the advertising lifecycle from ad creation and optimization to performance analysis and customer insights. We are a passionate group of innovators dedicated to developing responsible and intelligent AI technologies that balance the needs of advertisers, enhance the shopping experience, and strengthen the marketplace. If you're energized by solving complex challenges and pushing the boundaries of what's possible with AI, join us in shaping the future of advertising. We are seeking a technical leader for our Supply Science team. This team is within the Sponsored Product team, and works on complex engineering, optimization, econometric, and user-experience problems in order to deliver relevant product ads on Amazon search and detail pages world-wide. The team operates with the dual objective of enhancing the experience of Amazon shoppers and enabling the monetization of our online and mobile page properties. Our work spans ML and Data science across predictive modeling, reinforcement learning (Bandits), adaptive experimentation, causal inference, data engineering. Key job responsibilities Search Supply and Experiences, within Sponsored Products, is seeking a Senior Applied Scientist to join a fast growing team with the mandate of creating new ads experience that elevates the shopping experience for our hundreds of millions customers worldwide. We are looking for a top analytical mind capable of understanding our complex ecosystem of advertisers participating in a pay-per-click model– and leveraging this knowledge to help turn the flywheel of the business. 
As a Senior Applied Scientist on this team you will: --Act as the technical leader in Machine Learning and drive full life-cycle Machine Learning projects. --Lead technical efforts within this team and across other teams. --Build machine learning models, perform proof-of-concept, experiment, optimize, and deploy your models into production. --Run A/B experiments, gather data, and perform statistical analysis. --Establish scalable, efficient, automated processes for large-scale data analysis, machine-learning model development, model validation and serving. --Work closely with software engineers to assist in productionizing your ML models. --Research new machine learning approaches. --Recruit Applied Scientists to the team and act as a mentor to other scientists on the team. A day in the life The successful candidate will be a self-starter comfortable with ambiguity, with strong attention to detail, and with an ability to work in a fast-paced, high-energy and ever-changing environment. The drive and capability to shape the direction is a must. About the team We are a customer-obsessed team of engineers, technologists, product leaders, and scientists. We are focused on continuous exploration of contexts and creatives where advertising delivers value to customers and advertisers. We specifically work on new ads experiences globally with the goal of helping shoppers make the most informed purchase decision. We obsess about our customers and we are continuously innovating on their behalf to enrich their shopping experience on Amazon
US, WA, Seattle
This role will contribute to developing the Economics and Science products and services in the Fee domain, with specialization in supply chain systems and fees. Through the lens of economics, you will develop causal links for how Amazon, Sellers and Customers interact. You will be a key and senior scientist, advising Amazon leaders how to price our services. You will work on developing frameworks and scaleable, repeatable models supporting optimal pricing and policy in the two-sided marketplace that is central to Amazon's business. The pricing for Amazon services is complex. You will partner with science and technology teams across Amazon including Advertising, Supply Chain, Operations, Prime, Consumer Pricing, and Finance. We are looking for an experienced Principal Economist to improve our understanding of seller Economics, enhance our ability to estimate the causal impact of fees, and work with partner teams to design pricing policy changes. In this role, you will provide guidance to scientists to develop econometric models to influence our fee pricing worldwide. You will lead the development of causal models to help isolate the impact of fee and policy changes from other business actions, using experiments when possible, or observational data when not. Key job responsibilities The ideal candidate will have extensive Economics knowledge, demonstrated strength in practical and policy relevant structural econometrics, strong collaboration skills, proven ability to lead highly ambiguous and large projects, and a drive to deliver results. They will work closely with Economists, Data / Applied Scientists, Strategy Analysts, Data Engineers, and Product leads to integrate economic insights into policy and systems production. Familiarity with systems and services that constitute seller supply chains is a plus but not required. About the team The Stores Economics and Sciences team is a central science team that supports Amazon's Retail and Supply Chain leadership. 
We tackle some of Amazon's most challenging economics and machine learning problems, where our mandate is to impact the business on massive scale.
US, WA, Bellevue
Are you inspired by invention? Do you like the idea of seeing how your work impacts the bigger picture? Answer yes to any of these and you’ll fit right in here at Amazon Last Mile Simulations and Analytics Engineering team. WW AMZL Simulations and Analytics Engineering team is looking to build out our Simulation team to drive innovation across our Last Mile network. We start with the customer and work backwards in everything we do. If you’re interested in joining a rapidly growing team working to build a unique, solutions advisory group with a relentless focus on the customer, you’ve come to the right place. This is a blue-sky role that gives you a chance to roll up your sleeves and dive into big data sets in order to build discrete event 3D simulations using tools like Flexsim, Anylogic, Emulate 3D etc and experimentation systems at scale, build optimization algorithms and leverage advanced technologies across Amazon. This is an opportunity to think big about how to solve a challenging problem for the customers. As a Sr. Simulation Scientist, you are expected to deep dive into complex problems and drive relentlessly towards innovative solutions working with cross functional teams. Be comfortable interfacing and influencing various functional teams and individuals at all levels of the organization in order to be successful. Lead strategic modelling and simulation projects related to drive process design decisions. Your expertise in synthesizing and communicating insights and recommendations to audiences of varying levels of technical sophistication will enable you to answer specific business questions and innovate for the future. You will apply advanced designs and methodologies for complex use cases across Last Mile network to drive innovation. In addition, you will contribute to the end state vision for simulation and experimentation of future delivery stations at Amazon. 
Key job responsibilities • Lead the design, implementation, and delivery of the simulation data science solutions to perform system of systems discrete event simulations for significantly complex operational processes that have a long-term impact on a product, business, or function using FlexSim, Demo 3D, AnyLogic or any other Discrete Event Simulation (DES) software packages • Lead strategic modeling and simulation research projects to drive process design decisions • Be an exemplary practitioner in simulation science discipline to establish best practices and simplify problems to develop discrete event simulations faster with higher standards • Identify and tackle intrinsically hard process flow simulation problems (e.g., highly complex, ambiguous, undefined, with less existing structure, or having significant business risk or potential for significant impact • Deliver artifacts that set the standard in the organization for excellence, from process flow control algorithm design to validation to implementations to technical documents using simulations • Be a pragmatic problem solver by applying judgment and simulation experience to balance cross-organization trade-offs between competing interests and effectively influence, negotiate, and communicate with internal and external business partners, contractors and vendors for multiple simulation projects • Provide simulation data and measurements that influence the business strategy of an organization. Write effective white papers and artifacts while documenting your approach, simulation outcomes, recommendations, and arguments • Lead and actively participate in reviews of simulation research science solutions. 
You bring clarity to complexity, probe assumptions, illuminate pitfalls, and foster shared understanding within simulation data science discipline • Pay a significant role in the career development of others, actively mentoring and educating the larger simulation data science community on trends, technologies, and best practices • Use advanced statistical /simulation tools and develop codes (python or another object oriented language) for data analysis , simulation, and developing modeling algorithms • Lead and coordinate simulation efforts between internal teams and outside vendors to develop optimal solutions for the network, including equipment specification, material flow control logic, process design, and site layout • Deliver results according to project schedules and quality A day in the life If you are not sure that every qualification on the list above describes you exactly, we'd still love to hear from you! At Amazon, we value people with unique backgrounds, experiences, and skillsets. If you’re passionate about this role and want to make an impact on a global scale, please apply!
AU, VIC, Melbourne
Are you excited about leveraging and extending state-of-the-art Deep Learning, Information Retrieval, Natural Language Processing, Computer Vision algorithms to solve customer problems at the scale of Amazon? As an Applied Scientist Intern, you will be working in the Melbourne office in a fast-paced, cross-disciplinary team of experienced R&D scientists. You will take on complex problems, work on solutions that leverage existing academic and industrial research, and utilize your own out-of-the-box pragmatic thinking. In addition to coming up with novel solutions and prototypes, you may even deliver these to production in customer facing products. Key job responsibilities - Develop novel solutions and build prototypes - Work on complex problems in Deep Learning and Generative AI - Contribute to research that could significantly impact Amazon operations - Collaborate with a diverse team of experts in a fast-paced environment - Present your research findings to both technical and non-technical audiences - Collaborate with scientists on writing and submitting papers to top ML conferences, e.g. NeurIPS, ICML, ICLR, AISTATS, ACL ICCV, CVPR, KDD. Key Opportunities: - Work in a team of ML scientists to solve applied science problems at the scale of Amazon - Access to Amazon services and hardware - Potentially deliver solutions to production in customer-facing applications - Opportunities to be hired full-time after the internship Join us in shaping the future of AI at Amazon. Apply now and turn your research into real-world solutions!
US, WA, Redmond
We are searching for a talented candidate with expertise in orbital mechanics and spaceflight navigation, including LEO Satellite Orbit Determination. This position requires experience in simulation and analysis of spacecraft orbital mechanics and sequential orbit determination methods, including Extended Kalman Filters (EKF) and/or Unscented Kalman Filter (UKF). Strong analysis skills are required to develop engineering studies of complex large-scale dynamical systems. This position requires demonstrated expertise in computational analysis automation and tool development. Key job responsibilities - Perform spacecraft maneuver or navigation analysis in support of multi-disciplinary trades within the Amazon Leo team. - Contribute to prototype software development of flight algorithms. - Test and assess navigation software for integration into flight systems. - Assess and trouble-shoot the performance of Leo on-board GNSS hardware and software systems. - Work closely with GNC engineers to manage on-orbit performance and develop flight dynamics operations processes. Export Control Requirement: Due to applicable export control laws and regulations, candidates must be a U.S. citizen or national, U.S. permanent resident (i.e., current Green Card holder), or lawfully admitted into the U.S. as a refugee or granted asylum. A day in the life - Interacting with GNC teams to evaluate and troubleshoot satellite issues. - Working within the Flight Dynamics Research team to prioritize tasks. - Performing analysis, simulation, testing and documentation to address assigned tasks.
US, CA, San Francisco
Amazon Industrial Robotics is on a mission to redefine the future of automation — and we're looking for exceptional talent to help lead the way. We are building the next generation of advanced robotic systems that seamlessly blend cutting-edge AI, sophisticated control systems, and novel mechanical design to create adaptable, intelligent automation solutions capable of operating safely alongside humans in dynamic, real-world environments. At Amazon Industrial Robotics, we leverage the power of machine learning, artificial intelligence, and advanced robotics to solve some of the most complex operational challenges at a scale unlike anywhere else in the world. Our fleet of robots spans hundreds of facilities globally, working in sophisticated coordination to deliver on our promise of customer excellence — and we're just getting started. As a Sr. Applied Scientist in Robot Perception, you will be at the forefront of this transformation. You will develop and deploy state-of-the-art perception algorithms that enable robots to truly understand and interact with the physical world — bridging the gap between theoretical research and realworld impact. Bringing deep expertise in Computer Vision and a nuanced understanding of the capabilities and limitations of modern Vision-Language Models (VLMs), you will innovate boldly and push the boundaries of what's possible. Our vision for the Perception layer is ambitious: to enable seamless, intelligent interaction between the user, the robot, and its environment. This is a rare opportunity to work at the intersection of deep learning, large language models, and robotics — contributing to research that doesn't just advance the field, but reshapes it. You will collaborate with world-class teams pioneering breakthroughs in dexterous manipulation, locomotion, and humanrobot interaction, all at an unprecedented scale. 
Key job responsibilities Design, develop, and deploy perception algorithms for robotics systems, including object detection, segmentation, tracking, depth estimation, and scene understanding • Lead research initiatives in computer vision, sensor fusion and 3D perception • Collaborate with cross-functional teams including robotics engineers, software engineers, and product managers to define and deliver perception capabilities • Drive end-to-end ownership of ML models — from data collection and labeling strategy to training, evaluation, and deployment • Mentor junior scientists and engineers; contribute to a culture of technical excellence • Define and track key metrics to measure perception system performance in real-world environments • Publish research findings in top-tier venues (CVPR, ICCV, ECCV, ICRA, NeurIPS, etc.) and contribute to patents A day in the life Train ML models for deployment in simulation and real-world robots, identify and document their limitations post-deployment • Drive technical discussions within your team and with key stakeholders to develop innovative solutions to address identified limitations • Actively contribute to brainstorming sessions on adjacent topics, bringing fresh perspectives that help peers grow and succeed — and in doing so, build lasting trust across the team • Mentor team members while maintaining significant hands-on contribution to technical solutions About the team Our Industrial Robotics Group is a diverse group of scientists and engineers passionate about building intelligent machines. We value curiosity, rigor, and a bias for action. We believe in learning from failure and iterating quickly toward solutions that matter.