Optimizing neural networks for special-purpose hardware

Curating the neural-architecture search space and taking advantage of human intuition reduces latency on real-world applications by up to 55%.

As neural networks grow in size, deploying them on-device increasingly requires special-purpose hardware that parallelizes common operations. But for maximum efficiency, it’s not enough to optimize the hardware for the networks; the networks should be optimized for the hardware, too.

The first step in training a neural network to solve a problem is usually the selection of an architecture: a specification of the number of computational nodes in the network and the connections between them. Architectural decisions are generally based on historical precedent, intuition, and plenty of trial and error.

The standard way to optimize a neural network is through neural-architecture search (NAS), where the goal is to minimize both the size of the network and the number of floating-point operations (FLOPS) it performs. But this approach doesn’t work with neural chips, which can often execute easily parallelized but higher-FLOPS tasks more rapidly than they can harder-to-parallelize but lower-FLOPS tasks.

Minimizing latency is a more complicated optimization objective than minimizing FLOPS, so in the Amazon Devices Hardware group, we’ve developed a number of strategies for adapting NAS to the problem of optimizing network architectures for Amazon’s new Neural Engine family of accelerators. Those strategies involve curating the architecture search space to, for instance, reduce the chances of getting stuck in local minima. We’ve also found that combining a little human intuition with the results of NAS for particular tasks can help us generalize to new tasks more reliably and efficiently.

In experiments involving several different machine learning tasks, we’ve found that our NAS strategies can reduce latencies by as much as 55%.

Varieties of neural-architecture search

NAS needs three things: a definition of the search space, which specifies the building blocks available to construct a network; a cost model, which is a function of the network's accuracy, latency, and memory; and an optimization algorithm. We use a performance estimator to measure latency and memory footprint, but to measure accuracy, we must train the network. This is a major bottleneck, as training a single network can take days. Sampling thousands of architectures would take thousands of GPU days, which is clearly neither practical nor environmentally sustainable.
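The exact form of the cost model is not spelled out here. As a minimal sketch, assuming a simple weighted sum of classification error, normalized latency, and normalized memory, it could look like the following. The `Metrics` class, the `nas_cost` function, the reference budgets, and the weights are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metrics:
    accuracy: float    # in [0, 1]; obtained by training the network
    latency_ms: float  # from the performance estimator
    memory_mb: float   # from the performance estimator


def nas_cost(m: Metrics,
             latency_ref_ms: float,
             memory_ref_mb: float,
             latency_weight: float = 1.0,
             memory_weight: float = 1.0) -> float:
    """Lower is better. Assumed form: error + weighted, normalized latency and memory."""
    error = 1.0 - m.accuracy
    latency_term = m.latency_ms / latency_ref_ms
    memory_term = m.memory_mb / memory_ref_mb
    return error + latency_weight * latency_term + memory_weight * memory_term


# Two made-up candidates with equal accuracy; the faster one gets the lower cost.
slow = Metrics(accuracy=0.90, latency_ms=10.0, memory_mb=4.0)
fast = Metrics(accuracy=0.90, latency_ms=5.0, memory_mb=4.0)
print(nas_cost(slow, 10.0, 8.0), nas_cost(fast, 10.0, 8.0))
```

The weights control the trade-off between accuracy and efficiency; other formulations, such as hard budget constraints, are equally plausible.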

There are three categories of NAS algorithm, which require networks to be trained different numbers of times: multishot, single-shot, and zero-shot.

Multishot methods sample a cohort of architectures in each iteration. Each network is trained and evaluated for accuracy and performance, and the next set of architectures is sampled based on their cost. Evolutionary or reinforcement-learning-based algorithms are generally used for multishot methods.
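The multishot loop can be sketched as a small evolutionary search. This is an illustration under stated assumptions, not the implementation used in this work: the architecture encoding (one channel size per layer), the mutation rule, and the keep-the-best-half selection are all hypothetical, and `cost_fn` stands in for the expensive train-and-evaluate step.

```python
import random
from typing import Callable, List, Sequence, Tuple

Arch = Tuple[int, ...]  # hypothetical encoding: one channel size per layer


def evolutionary_search(choices: Sequence[int],
                        num_layers: int,
                        cost_fn: Callable[[Arch], float],
                        population_size: int = 8,
                        iterations: int = 20,
                        seed: int = 0) -> Arch:
    rng = random.Random(seed)
    # Sample the initial cohort of architectures at random.
    population: List[Arch] = [
        tuple(rng.choice(choices) for _ in range(num_layers))
        for _ in range(population_size)
    ]
    for _ in range(iterations):
        # In a real multishot search, cost_fn would train and evaluate each network.
        ranked = sorted(population, key=cost_fn)
        parents = ranked[: population_size // 2]  # keep the lowest-cost half
        children = []
        for parent in parents:
            child = list(parent)
            # Mutate one randomly chosen layer to produce a new candidate.
            child[rng.randrange(num_layers)] = rng.choice(choices)
            children.append(tuple(child))
        population = parents + children
    return min(population, key=cost_fn)


# Toy cost for demonstration only: prefer channel sizes close to 32.
toy_cost = lambda arch: sum(abs(c - 32) for c in arch)
print(evolutionary_search((8, 16, 32, 64), num_layers=3, cost_fn=toy_cost))
```

Because the best half of each cohort is carried over unchanged, the best cost found never gets worse from one iteration to the next.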

Single-shot methods start with a large network called the supernet, which has multiple possible subgraphs. During training, the subgraphs start converging to a single, small network. Single-shot methods are designed to be trained only once, but their training takes much longer than that of a single network in multishot methods.

Zero-shot methods work like multishot methods, with the key difference that the network is never trained. As a proxy for accuracy, we use the network’s trainability score, which is computed using the network's topology, nonlinearity, and operations. Zero-shot methods are the fastest to converge, because calculating the score is computationally very cheap. The downside is that the trainability may not correlate well with model accuracy.

Search space curation

The NAS cost function can be visualized as a landscape, with each point representing a potential architecture. A cost function based on FLOPS changes monotonically with factors such as sizes or channels: that is, if you find a direction across the terrain in which the cost is going down, you can be sure that continuing in that direction will not cause the cost to go up.

However, the inclusion of accelerator-aware constraints disrupts the function by introducing more local minima, or points at which the cost switches from going down to going up. This results in a more complex and rocky landscape.

To address this issue, we reduced the number of options in the search space. We were exploring convolutional architectures, meaning that the inputs are decomposed into several different components, each of which has its own channel through the network. The data in each channel, in turn, is filtered in several different ways; each filter involves a different data convolution.

Previously, we would have explored the number of channels, known as the channel size, at increments of one; instead, we considered only a handful of channel sizes. We limited the options for channel sizes to certain values that were favorable for the parallelism factor of the Neural Engine. The parallelism factor is a count of operations, such as dot products, that can be performed in parallel. In some cases, we even added to the search space a "depth multiplier" ratio that could be used to scale the number of channels across the entire model.
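To make the effect of curation concrete, here is a minimal sketch that restricts channel sizes to multiples of a parallelism factor and applies a depth multiplier. The parallelism factor of 16, the layer count, the channel limit, and the rounding rule inside `apply_depth_multiplier` are assumptions for illustration; the actual values for the Neural Engine are not stated here.

```python
from typing import List, Tuple

PARALLELISM_FACTOR = 16  # assumed value, for illustration only


def curated_channel_sizes(max_channels: int, parallelism: int) -> List[int]:
    """Channel sizes restricted to multiples of the parallelism factor."""
    return list(range(parallelism, max_channels + 1, parallelism))


def apply_depth_multiplier(base_channels: Tuple[int, ...],
                           multiplier: float,
                           parallelism: int) -> Tuple[int, ...]:
    """Scale every layer's channels, rounding to a multiple of the parallelism factor.

    The rounding rule is an assumption; it keeps scaled sizes hardware-friendly.
    """
    scaled = []
    for c in base_channels:
        rounded = int(round(c * multiplier / parallelism)) * parallelism
        scaled.append(max(parallelism, rounded))
    return tuple(scaled)


num_layers = 4
max_channels = 128
# Uncurated: every channel size from 1 to max_channels, independently per layer.
full_space = max_channels ** num_layers
curated = curated_channel_sizes(max_channels, PARALLELISM_FACTOR)
curated_space = len(curated) ** num_layers
print(f"uncurated: {full_space} architectures, curated: {curated_space}")
print(apply_depth_multiplier((32, 64, 128), 0.5, PARALLELISM_FACTOR))
```

Under these assumed numbers, the per-layer choice shrinks from 128 options to 8, which is the "fewer, larger steps" effect: the search visits far fewer points, and each point is one the hardware can execute without idle parallel lanes.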

These improvements can be visualized as taking fewer, larger steps across a smoother terrain, rather than trying to navigate the rocky landscape that resulted from the inclusion of accelerator-aware performance in the cost function. During the optimization process, they resulted in a faster convergence rate because of the reduced number of options and in improved stability and reliability thanks to the monotonic nature of the curated search space.

Illustration of how the cost landscape (green) changes from smooth (left) to rocky (center and right) when a cost function based on Neural Engine performance replaces one based on FLOPS. Curation (right) reduces the discrete search space (black dots) and ensures that points are far apart. The trajectory of a search algorithm (blue arrows) shows how curation (right) ensures that with each step in a search, the cost is monotonically decreasing.

One key detail in our implementation is the performance estimator. Instead of deploying an architecture on real hardware or an emulator to obtain performance metrics, we estimated them using a machine learning regression model trained on measurements of different operators or subgraphs.

At inference time, the estimator would decompose the queried architecture into subgraphs and use the regression model to estimate the performance of each. Then it would accumulate these estimates to give the model-level performance. This regressor-based design simplified our NAS framework, as it no longer required compilation, inference, or hardware. This technique enables us to test accelerators in the design phase, before we’ve developed custom compilers and hardware emulators for them.
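The decompose, estimate, and accumulate flow can be sketched with a tiny least-squares regressor. Everything specific below is an assumption for illustration: the feature choice (a bias plus a multiply-accumulate count), the linear model, and the placeholder "measurements", which are made-up numbers and not data from any device. A production estimator would likely use richer features and a more expressive regressor.

```python
import numpy as np

# Placeholder operator "measurements" (made up for illustration):
# columns are input channels, output channels, kernel size.
features = np.array([
    [16, 16, 1],
    [16, 32, 3],
    [32, 32, 3],
    [32, 64, 1],
    [64, 64, 3],
], dtype=float)
measured_ms = np.array([0.10, 0.35, 0.50, 0.30, 1.10])


def design_matrix(x: np.ndarray) -> np.ndarray:
    """Regression inputs: a bias term plus a multiply-accumulate-like product."""
    macs = x[:, 0] * x[:, 1] * x[:, 2] ** 2
    return np.column_stack([np.ones(len(x)), macs])


# Fit the regression model once, offline, by ordinary least squares.
weights, *_ = np.linalg.lstsq(design_matrix(features), measured_ms, rcond=None)


def estimate_model_latency(subgraphs) -> float:
    """Estimate each subgraph's latency and sum to a model-level estimate."""
    x = np.atleast_2d(np.asarray(subgraphs, dtype=float))
    per_subgraph = design_matrix(x) @ weights
    return float(per_subgraph.sum())


# A queried architecture, already decomposed into three subgraphs.
print(estimate_model_latency([[16, 32, 3], [32, 32, 3], [32, 64, 1]]))
```

Because the model-level number is just a sum of per-subgraph predictions, no compilation or hardware access is needed at query time.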

Productizing NAS with expert-in-the-loop

Curating the search space improves convergence rate, stability, and reliability, but transferability to new use cases is not straightforward. NAS results for a detector model, for instance, may not be easy to transfer to a classification model. On the other hand, running NAS from scratch for each new dataset may not be feasible, due to time constraints. In these situations, we found that combining NAS results and human expertise was the fastest approach.

The initial channel reduction step (1x1 conv.) in the inverted-bottleneck (IBN) block at left is fused with the channel expansion step (KxK depth. conv.) in the fused IBN at right. This proved to be a common subgraph modification across datasets.

When we performed NAS on different datasets, we saw common patterns, such as the fusion of convolution layers with previous convolution layers, a reduction in the number of channels, and the alignment of channel counts with the hardware parallelism factor.

In particular, fusing convolution layers in inverted bottleneck (IBN) blocks contributed most to boosting efficiency. With just these modifications, we observed latency reductions of up to 50%, whereas a fully converged NAS model would yield a slightly better 53% reduction.

In situations where running NAS from scratch is not feasible, a human expert can rely on mathematical intuition and observations of the results of NAS on similar datasets to build the required model architecture.

Results and product impact

We applied this technique to multiple products in the Amazon Devices portfolio, ranging from Echo Show and Blink home security products to the latest Astro, the in-home consumer robot.

1. Reduced detection latency by half on Echo Show

Echo Show runs a model to detect human presence and locate the detected person in a room. The original model used IBN blocks. We used accelerator-aware NAS to reduce the latency of this model by 53%.

Schematic representation of human-presence detection.

We performed a search for depth multipliers (ratios that scale the number of channels across the model) and for opportunities to replace IBN blocks with fused-IBN blocks. The requirement was to maintain the mean average precision (mAP) of the original model while improving latency. Our V3 model reduced latency by more than 53% (i.e., 2.2x faster) while keeping the mAP score the same as the baseline's.
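As a quick check on how a percentage latency reduction maps to a speedup factor: if latency falls by a fraction r, the model runs 1 / (1 - r) times faster. The helper below only illustrates that arithmetic, and its name is hypothetical. By this relation, a 2.2x speedup corresponds to a reduction of roughly 54.5%.

```python
def speedup_from_latency_reduction(reduction: float) -> float:
    """Convert a fractional latency reduction (e.g. 0.53) into a speedup factor."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


for r in (0.14, 0.35, 0.53):
    print(f"{r:.0%} lower latency -> {speedup_from_latency_reduction(r):.2f}x faster")
```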

Latency results for the original model and three models found through NAS.

| Model | Fused-IBN search | Depth multiplier search | Latency reduction (%) |
| --- | --- | --- | --- |
| Baseline | No | No | Baseline |
| V1 | No | Yes | 14% |
| V2 | Yes | No | 35% |
| V3 | Yes | Yes | 53% |

After performing NAS, we found that not every IBN fusion improves latency and accuracy. The later layers are larger, and replacing them with fused layers hurt performance. For the layers where fusion was selected, the FLOPS, as expected, increased, but the latency did not.
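To see why fusion is expected to raise the operation count, the sketch below counts multiply-accumulates (MACs, roughly half the floating-point operations) for one block at stride 1. It assumes the common MobileNet-style layout of an IBN block (1x1 convolution, KxK depthwise convolution, 1x1 convolution) and a fused block in which the first two are replaced by a single KxK standard convolution. The function names and the layer dimensions in the example are hypothetical.

```python
def ibn_macs(h: int, w: int, c_in: int, c_out: int, expansion: int, k: int) -> int:
    """MACs for 1x1 conv -> KxK depthwise conv -> 1x1 conv (stride 1)."""
    c_mid = c_in * expansion
    pointwise_in = h * w * c_in * c_mid
    depthwise = h * w * c_mid * k * k
    pointwise_out = h * w * c_mid * c_out
    return pointwise_in + depthwise + pointwise_out


def fused_ibn_macs(h: int, w: int, c_in: int, c_out: int, expansion: int, k: int) -> int:
    """MACs for a single KxK standard conv -> 1x1 conv (stride 1)."""
    c_mid = c_in * expansion
    fused_conv = h * w * c_in * c_mid * k * k
    pointwise_out = h * w * c_mid * c_out
    return fused_conv + pointwise_out


# Example dimensions chosen for illustration only.
args = dict(h=56, w=56, c_in=32, c_out=32, expansion=4, k=3)
plain, fused = ibn_macs(**args), fused_ibn_macs(**args)
print(f"IBN: {plain} MACs, fused IBN: {fused} MACs, ratio: {fused / plain:.2f}")
```

The fused block does several times more arithmetic in this example, yet a standard convolution is far easier to parallelize than a depthwise one, which is how latency can stay flat or fall while the operation count rises. The gap also grows with the channel count, which fits the observation that fusing the larger, later layers did not pay off.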

2. Model fitting within the tight memory budget of the Blink Floodlight Camera

Blink cameras use a classification model for security assistance. Our goal was to fit the model parameters and peak activation memory within a tight memory budget. In this case, we combined NAS techniques with an expert-in-the-loop to provide fine-tuning. The NAS result on the classification dataset provided intuition on what operator/subgraph changes could extract benefits from the accelerator design.

Schematic representation of the classification model output.

The expert recommendations were to replace the depth-wise convolutions with standard convolutions and to reduce the channel counts, making them uniform across the model and preferably a multiple of the parallelism factor. With these changes, model developers were able to reduce both the model size and the intermediate memory usage by 47% and fit the model within the required budget.

3. Fast semantic segmentation for robotics

In the context of robotics, semantic segmentation is used to understand the objects and scenes the robot is interacting with. For example, it can enable the robot to identify chairs, tables, or other objects in the environment, allowing it to navigate and interact with its surroundings more effectively. Our goal for this model was to reduce latency by half. Our starting point was a semantic-segmentation model that was optimized to run on a CPU.

Left: original image of a room at night; center: semantic-segmentation image; right: semantic segmentation overlaid on original image.

For this model, we searched over channel sizes, fusion options, and input and output dimensions. We used the multishot method with the evolutionary search algorithm. NAS gave us multiple candidates with different performance characteristics. The best candidate was able to reduce the latency by half.

Latency improvement for different architectures found through NAS.

| Model | Latency reduction (%) |
| --- | --- |
| Original | Baseline |
| Model A | 27% |
| Model B | 37% |
| Model C | 38% |
| Model D | 41% |
| Model E | 51% |

4. User privacy with on-device inference

Amazon's Neural Engine supports large-model inference on-device, so we can process microphone and video feeds without sending data to the cloud. For example, the Amazon Neural Engine has enabled Alexa to perform automatic speech recognition on-device. On-device processing also provides a better user experience because the inference pipeline is not affected by intermittent connection issues. In our NAS work, we discovered that even larger, more accurate models can now fit on-device with no hit on latency.

Making edge AI sustainable

We mentioned earlier that multishot NAS with full training can take up to 2,000 GPU-days. However, with some of the techniques described in this blog, we were able to create efficient architectures in a substantially shorter amount of time, making NAS much more scalable and sustainable. But our sustainability efforts don't end there.

Because of its parallelism and mixed-precision features, the Neural Engine is more power efficient than a generic CPU. For a million average users, the difference is on the order of millions of kilowatt-hours per year, equivalent to 200 gasoline-powered passenger vehicles per year or the energy consumption of a hundred average US households.

When we optimize models through NAS, we increase the device's capability to run more neural-network models simultaneously. This allows us to use smaller application processors and, in some cases, fewer of them. By reducing the hardware footprint in this way, we are further reducing the carbon footprint of our devices.

Future work

We have identified that curation requires an expert who understands the hardware design well. This may not scale to future generations of more complex hardware. We have also identified that in situations where time is tight, having an expert in the loop is still faster than running NAS from scratch. Because of this, we are continuing to investigate how NAS algorithms with accelerator awareness can handle large search spaces. We are also working on improving the search algorithm’s efficiency and effectiveness by exploring how the three categories of algorithms can be combined. We also plan to explore model optimization by introducing sparsity through pruning and clustering. Stay tuned!

Acknowledgements: Manasa Manohara, Lingchuan Meng, Rahul Bakshi, Varada Gopalakrishnan, Lindo St. Angel

Research areas

Related content

US, MA, Boston
The Artificial General Intelligence (AGI) team is looking for a highly skilled and experienced Sr. Applied Scientist, to support the development and implementation of state-of-the-art algorithms and models for supervised fine-tuning and reinforcement learning through human feedback and complex reasoning; with a focus across text, image, and video modalities. As an Sr. Applied Scientist, you will play a critical role in supporting the development of Generative AI (Gen AI) technologies that can handle Amazon-scale use cases and have a significant impact on our customers' experiences. Key job responsibilities Collaborate with cross-functional teams of engineers, product managers, and scientists to identify and solve complex problems in Gen AI Design and execute experiments to evaluate the performance of different algorithms (PT, SFT, RL) and models, and iterate quickly to improve results Think big about the arc of development of Gen AI over a multi-year horizon, and identify new opportunities to apply these technologies to solve real-world problems Communicate results and insights to both technical and non-technical audiences, including through presentations and written reports About the team We are passionate scientists dedicated to pushing the boundaries of innovation in Gen AI with focus on Software Development use cases.
IN, HR, Gurugram
Do you want to join an innovative team of scientists who use machine learning and statistical techniques to create state-of-the-art solutions for providing better value to Amazon’s customers? Do you want to build and deploy advanced ML systems that help optimize millions of transactions every day? Are you excited by the prospect of analyzing and modeling terabytes of data to solve real-world problems? Do you like to own end-to-end business problems/metrics and directly impact the profitability of the company? Do you like to innovate and simplify? If yes, then you may be a great fit to join the Machine Learning team for India Consumer Businesses. Machine Learning, Big Data and related quantitative sciences have been strategic to Amazon from the early years. Amazon has been a pioneer in areas such as recommendation engines, ecommerce fraud detection and large-scale optimization of fulfillment center operations. As Amazon has rapidly grown and diversified, the opportunity for applying machine learning has exploded. We have a very broad collection of practical problems where machine learning systems can dramatically improve the customer experience, reduce cost, and drive speed and automation. These include product bundle recommendations for millions of products, safeguarding financial transactions across by building the risk models, improving catalog quality via extracting product attribute values from structured/unstructured data for millions of products, enhancing address quality by powering customer suggestions We are developing state-of-the-art machine learning solutions to accelerate the Amazon India growth story. Amazon India is an exciting place to be at for a machine learning practitioner. We have the eagerness of a fresh startup to absorb machine learning solutions, and the scale of a mature firm to help support their development at the same time. 
As part of the India Machine Learning team, you will get to work alongside brilliant minds motivated to solve real-world machine learning problems that make a difference to millions of our customers. We encourage thought leadership and blue ocean thinking in ML. Key job responsibilities Use machine learning and analytical techniques to create scalable solutions for business problems Analyze and extract relevant information from large amounts of Amazon’s historical business data to help automate and optimize key processes Design, develop, evaluate and deploy, innovative and highly scalable ML models Work closely with software engineering teams to drive real-time model implementations Work closely with business partners to identify problems and propose machine learning solutions Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model maintenance Work proactively with engineering teams and product managers to evangelize new algorithms and drive the implementation of large-scale complex ML models in production Leading projects and mentoring other scientists, engineers in the use of ML techniques About the team International Machine Learning Team is responsible for building novel ML solutions that attack India first (and other Emerging Markets across MENA and LatAm) problems and impact the bottom-line and top-line of India business. Learn more about our team from https://www.amazon.science/working-at-amazon/how-rajeev-rastogis-machine-learning-team-in-india-develops-innovations-for-customers-worldwide
US, CA, Sunnyvale
The Artificial General Intelligence (AGI) team is looking for a passionate, talented, and inventive Principal Applied Scientist with a strong deep learning background, to lead the development of industry-leading technology with multimodal systems. As a Principal Scientist within the Artificial General Intelligence (AGI) organization, you are a trusted part of the technical leadership. You bring business and industry context to science and technology decisions, set the standard for scientific excellence, and make decisions that affect the way we build and integrate algorithms. A Principal Applied Scientist will solicit differing views across the organization and are willing to change your mind as you learn more. Your artifacts are exemplary and often used as reference across organization. You are a hands-on scientific leader; develop solutions that are exemplary in terms of algorithm design, clarity, model structure, efficiency, and extensibility; and tackle intrinsically hard problems, acquiring expertise as needed. Principal Applied Scientists are expected to decompose complex problems into straightforward solutions. You amplify your impact by leading scientific reviews within your organization or at your location; and scrutinize and review experimental design, modeling, verification and other research procedures. You also probe assumptions, illuminate pitfalls, and foster shared understanding; align teams toward coherent strategies; and educate keeping the scientific community up to date on advanced techniques, state of the art approaches, the latest technologies, and trends. AGI Principal Applied Scientists help managers guide the career growth of other scientists by mentoring and play a significant role in hiring and developing scientists and leads. You will play a critical role in driving the development of Generative AI (GenAI) technologies that can handle Amazon-scale use cases and have a significant impact on our customers' experiences. 
Key job responsibilities You will be responsible for defining key research directions, inventing new machine learning techniques, conducting rigorous experiments, and ensuring that research is translated into practice. You will develop long-term strategies, persuade teams to adopt those strategies, propose goals and deliver on them. A Principal Applied Scientist will participate in organizational planning, hiring, mentorship and leadership development. You will also be build scalable science and engineering solutions, and serve as a key scientific resource in full-cycle development (conception, design, implementation, testing to documentation, delivery, and maintenance).
US, WA, Seattle
Innovators wanted! Are you an entrepreneur? A builder? A dreamer? This role is part of an Amazon Special Projects team that takes the company’s Think Big leadership principle to the next level. We focus on creating entirely new products and services with a goal of positively impacting the lives of our customers. No industries or subject areas are out of bounds. If you’re interested in innovating at scale to address big challenges in the world, this is the team for you. As a Research Scientist, you will work with a unique and gifted team developing exciting products for consumers and collaborate with cross-functional teams. Our team rewards intellectual curiosity while maintaining a laser-focus in bringing products to market. At the edge of both academic and applied research in this product area, you have the opportunity to work together with some of the most talented scientists, engineers, and product managers. Here at Amazon, we embrace our differences. We are committed to furthering our culture of inclusion. We have thirteen employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We are constantly learning through programs that are local, regional, and global. Amazon’s culture of inclusion is reinforced within our 16 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust. Our team highly values work-life balance, mentorship and career growth. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We care about your career growth and strive to assign projects and offer training that will challenge you to become your best. 
Key job responsibilities * Partner with laboratory science teams on design and analysis of experiments * Originate and lead the development of new data collection workflows with cross-functional partners * Develop and deploy scalable bioinformatics analysis and QC workflows * Evaluate and incorporate novel bioinformatic approaches to solve critical business problems
US, CA, Sunnyvale
As a Principal Scientist within the Artificial General Intelligence (AGI) organization, you are a trusted part of the technical leadership. You bring business and industry context to science and technology decisions, set the standard for scientific excellence, and make decisions that affect the way we build and integrate algorithms. A Principal Applied Scientist will solicit differing views across the organization and are willing to change your mind as you learn more. Your artifacts are exemplary and often used as reference across organization. You are a hands-on scientific leader; develop solutions that are exemplary in terms of algorithm design, clarity, model structure, efficiency, and extensibility; and tackle intrinsically hard problems, acquiring expertise as needed. Principal Applied Scientists are expected to decompose complex problems into straightforward solutions. You amplify your impact by leading scientific reviews within your organization or at your location; and scrutinize and review experimental design, modeling, verification and other research procedures. You also probe assumptions, illuminate pitfalls, and foster shared understanding; align teams toward coherent strategies; and educate keeping the scientific community up to date on advanced techniques, state of the art approaches, the latest technologies, and trends. AGI Principal Applied Scientists help managers guide the career growth of other scientists by mentoring and play a significant role in hiring and developing scientists and leads. You will play a critical role in driving the development of Generative AI (GenAI) technologies that can handle Amazon-scale use cases and have a significant impact on our customers' experiences. Key job responsibilities You will be responsible for defining key research directions, inventing new machine learning techniques, conducting rigorous experiments, and ensuring that research is translated into practice. 
You will develop long-term strategies, persuade teams to adopt those strategies, propose goals and deliver on them. A Principal Applied Scientist will participate in organizational planning, hiring, mentorship and leadership development. You will also be build scalable science and engineering solutions, and serve as a key scientific resource in full-cycle development (conception, design, implementation, testing to documentation, delivery, and maintenance). A day in the life About the team Amazon’s AGI team is focused on building foundational AI to solve real-world problems at scale, delivering value to all existing businesses in Amazon, and enabling entirely new services and products for people and enterprises around the world.
LU, Luxembourg
Are you a MS or PhD student interested in a 2026 internship in the field of machine learning, deep learning, generative AI, large language models and speech technology, robotics, computer vision, optimization, operations research, quantum computing, automated reasoning, or formal methods? If so, we want to hear from you! We are looking for students interested in using a variety of domain expertise to invent, design and implement state-of-the-art solutions for never-before-solved problems. You can find more information about the Amazon Science community as well as our interview process via the links below; https://www.amazon.science/ https://amazon.jobs/content/en/career-programs/university/science https://amazon.jobs/content/en/how-we-hire/university-roles/applied-science Key job responsibilities As an Applied Science Intern, you will own the design and development of end-to-end systems. You’ll have the opportunity to write technical white papers, create roadmaps and drive production level projects that will support Amazon Science. You will work closely with Amazon scientists and other science interns to develop solutions and deploy them into production. You will have the opportunity to design new algorithms, models, or other technical solutions whilst experiencing Amazon’s customer focused culture. The ideal intern must have the ability to work with diverse groups of people and cross-functional teams to solve complex business problems. A day in the life At Amazon, you will grow into the high impact person you know you’re ready to be. Every day will be filled with developing new skills and achieving personal growth. How often can you say that your work changes the world? At Amazon, you’ll say it often. Join us and define tomorrow. 
Some more benefits of an Amazon Science internship include; • All of our internships offer a competitive stipend/salary • Interns are paired with an experienced manager and mentor(s) • Interns receive invitations to different events such as intern program initiatives or site events • Interns can build their professional and personal network with other Amazon Scientists • Interns can potentially publish work at top tier conferences each year About the team Applicants will be reviewed on a rolling basis and are assigned to teams aligned with their research interests and experience prior to interviews. Start dates are available throughout the year and durations can vary in length from 3-6 months for full time internships. This role may available across multiple locations in the EMEA region (Austria, Estonia, France, Germany, Ireland, Israel, Italy, Jordan, Luxembourg, Netherlands, Poland, Romania, Spain, South Africa, UAE, and UK). Please note these are not remote internships.
US, WA, Seattle
Revolutionize the Future of AI at the Frontier of Applied Science Are you a brilliant mind seeking to push the boundaries of what's possible with artificial intelligence? Join our elite team of researchers and engineers at the forefront of applied science, where we're harnessing the latest advancements in natural language processing, deep learning, and generative AI to reshape industries and unlock new realms of innovation. As an Applied Science Intern, you'll have the unique opportunity to work alongside world-renowned experts, gaining invaluable hands-on experience with cutting-edge technologies such as large language models, transformers, and neural networks. You'll dive deep into complex challenges, fine-tuning state-of-the-art models, developing novel algorithms for named entity recognition, and exploring the vast potential of generative AI. This internship is not just about executing tasks – it's about being a driving force behind groundbreaking discoveries. You'll collaborate with cross-functional teams, leveraging your expertise in statistics, recommender systems, and question answering to tackle real-world problems and deliver impactful solutions. Throughout your journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated. Join us at the forefront of applied science, where your contributions will shape the future of AI and propel humanity forward. Seize this extraordinary opportunity to learn, grow, and leave an indelible mark on the world of technology. 
Amazon has positions available for LLM & GenAI Applied Science Internships in, but not limited to, Bellevue, WA; Boston, MA; Cambridge, MA; New York, NY; Santa Clara, CA; Seattle, WA; Sunnyvale, CA; Pittsburgh, PA. Key job responsibilities We are particularly interested in candidates with expertise in: LLMs, NLP/NLU, Gen AI, Transformers, Fine-Tuning, Recommendation Systems, Deep Learning, NER, Statistics, Neural Networks, Question Answering. In this role, you will work alongside global experts to develop and implement novel, scalable algorithms and modeling techniques that advance the state-of-the-art in areas at the intersection of LLMs and GenAI. You will tackle challenging, groundbreaking research problems on production-scale data, with a focus on recommendation systems, question answering, deep learning and generative AI. The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail and the ability to thrive in a fast-paced, ever-changing environment. A day in the life - Collaborate with cross-functional teams to tackle complex challenges in natural language processing, computer vision, and generative AI. - Fine-tune state-of-the-art models and develop novel algorithms to push the boundaries of what's possible. - Explore the vast potential of generative AI and its applications across industries. - Attend cutting-edge research seminars and engage in thought-provoking discussions with industry luminaries. - Leverage state-of-the-art computing infrastructure and access to the latest research papers to fuel your innovation. - Present your groundbreaking work and insights to the team, fostering a culture of knowledge-sharing and continuous learning.