Optimizing neural networks for special-purpose hardware

Curating the neural-architecture search space and taking advantage of human intuition reduces latency on real-world applications by up to 55%.

As neural networks grow in size, deploying them on-device increasingly requires special-purpose hardware that parallelizes common operations. But for maximum efficiency, it’s not enough to optimize the hardware for the networks; the networks should be optimized for the hardware, too.

Related content
The first step in training a neural network to solve a problem is usually the selection of an architecture: a specification of the number of computational nodes in the network and the connections between them. Architectural decisions are generally based on historical precedent, intuition, and plenty of trial and error.

The standard way to optimize a neural network is through neural-architecture search (NAS), where the goal is to minimize both the size of the network and the number of floating-point operations (FLOPS) it performs. But this approach doesn’t work with neural chips, which can often execute easily parallelized but higher-FLOPS tasks more rapidly than they can harder-to-parallelize but lower-FLOPS tasks.

Minimizing latency is a more complicated optimization objective than minimizing FLOPS, so in the Amazon Devices Hardware group, we’ve developed a number of strategies for adapting NAS to the problem of optimizing network architectures for Amazon’s new Neural Engine family of accelerators. Those strategies involve curating the architecture search space to, for instance, reduce the chances of getting stuck in local minima. We’ve also found that combining a little human intuition with the results of NAS for particular tasks can help us generalize to new tasks more reliably and efficiently.

In experiments involving several different machine learning tasks, we’ve found that our NAS strategies can reduce latencies by as much as 55%.

Varieties of neural-architecture search

NAS needs three things: a definition of the search space, which specifies the building blocks available to construct a network; a cost model, which is a function of the network's accuracy, latency, and memory; and an optimization algorithm. We use a performance estimator to measure latency and memory footprint, but to measure accuracy, we must train the network. This is a major bottleneck, as training a single network can take days. Sampling thousands of architectures would take thousands of GPU days, which is clearly neither practical nor environmentally sustainable.

There are three categories of NAS algorithm, which require networks to be trained different numbers of times: multishot, single-shot, and zero-shot.

Related content
A new approach that grows networks dynamically promises improvements over GANs with fixed architectures or predetermined growing strategies.

Multishot methods sample a cohort of architectures in each iteration. Each network is trained and evaluated for accuracy and performance, and the next set of architectures is sampled based on their cost. Evolutionary or reinforcement-learning-based algorithms are generally used for multishot methods.

Single-shot methods start with a large network called the supernet, which has multiple possible subgraphs. During training, the subgraphs start converging to a single, small network. Single-shot methods are designed to be trained only once, but their training takes much longer than that of a single network in multishot methods.

Zero-shot methods works like multishot methods, with the key difference that the network is never trained. As a proxy for accuracy, we use the network’s trainability score, which is computed using the network's topology, nonlinearity, and operations. Zero-shot methods are the fastest to converge, because calculating the score is computationally very cheap. The downside is that the trainability may not correlate well with model accuracy.

Search space curation

The NAS cost function can be visualized as a landscape, with each point representing a potential architecture. A cost function based on FLOPS changes monotonically with factors such as sizes or channels: that is, if you find a direction across the terrain in which the cost is going down, you can be sure that continuing in that direction will not cause the cost to go up.

However, the inclusion of accelerator-aware constraints disrupts the function by introducing more asymptotes, or points at which the cost switches from going down to going up. This results in a more complex and rocky landscape.

Related content
How to make trained systems evolve gracefully.

To address this issue, we reduced the number of options in the search space. We were exploring convolutional architectures, meaning that the inputs are decomposed into several different components, each of which has its own channel through the network. The data in each channel, in turn, is filtered in several different ways; each filter involves a different data convolution.

Previously, we would have explored the number of channels — known as the channel size — at increments of one; instead, we considered only a handful of channel sizes. We limited the options for channel sizes to certain values that were favorable for the parallelism factor of the Neural Engine. The parallelism factor is a count of operations, such as dot product, that can be performed in parallel. In some cases, we even added "depth multiplier" ratio that could be used to scale the number of channels across the entire model to the search space.

These improvements can be visualized as taking fewer, larger steps across a smoother terrain, rather than trying to navigate the rocky landscape that resulted from the inclusion of accelerator-aware performance in the cost function. During the optimization process, they resulted in a faster convergence rate because of the reduced number of options and in improved stability and reliability thanks to the monotonic nature of the curated search space.

NAS - 3x1.png
Illustration of how the cost landscape (green) changes from smooth (left) to rocky (center and right) when a cost function based on Neural Engine performance replaces one based on FLOPS. Curation (right) reduces the discrete search space (black dots) and ensures that points are far apart. The trajectory of a search algorithm (blue arrows) shows how curation (right) ensures that with each step in a search, the cost is monotonically decreasing.

One key detail in our implementation is the performance estimator. Instead of deploying an architecture on real hardware or an emulator to obtain performance metrics, we estimated them using a machine learning regression model trained on measurements of different operators or subgraphs.

At inference time, the estimator would decompose the queried architecture into subgraphs and use the regression model to estimate the performance of each. Then it would accumulate these estimates to give the model-level performance. This regressor-based design simplified our NAS framework, as it no longer required compilation, inference, or hardware. This technique enables us to test accelerators in the design phase, before we’ve developed custom compilers and hardware emulators for them.

Productizing NAS with expert-in-the-loop

Curating the search space improves convergence rate, stability, and reliability, but transferability to new use cases is not straightforward. NAS results for a detector model, for instance, may not be easy to transfer to a classification model. On the other hand, running NAS from scratch for each new dataset may not be feasible, due to time constraints. In these situations, we found that combining NAS results and human expertise was the fastest approach.

Channel reduction step.png
The initial channel reduction step (1x1 conv.) in the inverted-bottleneck (IBN) block at left is fused with the channel expansion step (KxK depth. conv.) in the fused IBN at right. This proved to be a common subgraph modification across datasets.

When we performed NAS on different datasets, we saw common patterns, such as the fusion of convolution layers with previous convolution layers, reducing the number of channels and, aligning them with the hardware parallelism factor.

In particular, fusing convolution layers in inverted bottleneck (IBN) blocks contributed most to boosting efficiency. With just these modifications, we observed latency reductions of up to 50%, whereas a fully converged NAS model would yield a slightly better 53% reduction.

In situations where running NAS from scratch is not feasible, a human expert can rely on mathematical intuition and observations of the results of NAS on similar datasets to build the required model architecture.

Results and product impact

We applied this technique to multiple products in the Amazon Devices portfolio, ranging from Echo Show and Blink home security products to the latest Astro, the in-home consumer robot.

1. Reduced detection latency by half on Echo Show

Echo Show runs a model to detect human presence and locate the detected person in a room. The original model used IBN blocks. We used accelerator-aware NAS to reduce the latency of this model by 53%.

Human-presence detection.png
Schematic representation of human-presence detection.

We performed a search for depth multipliers — that is, layers that multiply the number of channels — and for opportunities to replace IBN blocks with fused-IBN blocks. The requirement was to maintain the same mean average precision (mAP) of the original model while improving the latency. Our V3 model improved the latency by more than 53% (i.e. 2.2x faster) while keeping the mAP scores same as baseline.

Latency results for the original model and three models found through NAS.

Fused-IBN search

Depth multiplier search

Latency reduction (%)

Baseline

No

No

Baseline

V1

No

Yes

14%

V2

Yes

No

35%

V3

Yes

Yes

53%

After performing NAS, we found that not every IBN fusion improves latency and accuracy. The later layers are larger, and replacing them with fused layers hurt performance. For the layers where fusion was selected, the FLOPs, as expected, increased, but the latency did not.

2. Model fitting within the tight memory budget of the Blink Floodlight Camera

Blink cameras use a classification model for security assistance. Our goal was to fit the model parameters and peak activation memory within a tight memory budget. In this case, we combined NAS techniques with an expert-in-the-loop to provide fine-tuning. The NAS result on the classification dataset provided intuition on what operator/subgraph changes could extract benefits from the accelerator design.

Classification.png
Schematic representation of the classification model output.

The expert recommendations were to replace the depth-wise convolutions with standard convolutions and reduce the channels by making them even across the model, preferably by a multiple of the parallelism factor. With these changes, model developers were able to reduce both the model size and the intermediate memory usage by 47% and fit the model within the required budget.

3. Fast semantic segmentation for robotics

In the context of robotics, semantic segmentation is used to understand the objects and scenes the robot is interacting with. For example, it can enable the robot to identify chairs, tables, or other objects in the environment, allowing it to navigate and interact with its surroundings more effectively. Our goal for this model was to reduce latency by half. Our starting point was a semantic-segmentation model that was optimized to run on a CPU.

Semantic segmentation.png
Left: original image of a room at night; center: semantic-segmentation image; right: semantic segmentation overlaid on original image.

For this model, we searched for different channel sizes, fusion, and also output and input dimensions. We used the multishot method with the evolutionary search algorithm. NAS gave us multiple candidates with different performances. The best candidate was able to reduce the latency by half.

Latency improvement for different architectures found through NAS.

Latency reduction (%)

Original

Baseline

Model A

27%

Model B

37%

Model C

38%

Model D

41%

Model E

51%

4. User privacy with on-device inference

Amazon's Neural Engine supports large-model inference on-device, so we can process microphone and video feeds without sending data to the cloud. For example, the Amazon Neural Engine has enabled Alexa to perform automatic speech recognition on-device. On-device processing also provides a better user experience because the inference pipeline is not affected by intermittent connection issues. In our NAS work, we discovered that even larger, more accurate models can now fit on-device with no hit on latency.

Making edge AI sustainable

We mentioned earlier that multishot NAS with full training can take up to 2,000 GPU-days. However, with some of the techniques described in this blog, we were able to create efficient architectures in a substantially shorter amount of time, making NAS much more scalable and sustainable. But our sustainability efforts don't end there.

Related content
Innovative training methods and model compression techniques combine with clever engineering to keep speech processing local.

Because of its parallelism and mixed-precision features, the Neural Engine is more power efficient than a generic CPU. For a million average users, the difference is on order of millions of kilowatt-hours per year, equivalent to 200 gasoline-powered passenger vehicles per year or the energy consumption of a hundred average US households.

When we optimize models through NAS, we increase the device's capability to run more neural-network models simultaneously. This allows us to use smaller application processors and, in some cases, fewer of them. By reducing the hardware footprint in this way, we are further reducing the carbon footprint of our devices.

Future work

We have identified that curation requires an expert who understands the hardware design well. This may not scale to future generations of more complex hardware. We have also identified that in situations where time is tight, having an expert in the loop is still faster than running NAS from scratch. Because of this, we are continuing to investigate how NAS algorithms with accelerator awareness can handle large search spaces. We are also working on improving the search algorithm’s efficiency and effectiveness by exploring how the three categories of algorithms can be combined. We also plan to explore model optimization by introducing sparsity through pruning and clustering. Stay tuned!

Acknowledgements: Manasa Manohara, Lingchuan Meng, Rahul Bakshi, Varada Gopalakrishnan, Lindo St. Angel

Research areas

Related content

IN, KA, Bengaluru
Have you ever wondered how that Amazon box with the smile arrives so quickly, where it came from, and how much it cost Amazon to deliver? The WW Amazon Logistics, Business Analytics team manages the delivery of tens of millions of products every week to Amazon's customers, achieving on-time delivery in a cost-effective manner. We are seeking an enthusiastic, customer-obsessed Manager Research Science with strong analytical skills to join our team. This role is crucial in optimizing Amazon's vast delivery network and will have significant impact on the customer experience, particularly in the final phase of delivery. As a Manager Research Science, you will: 1. Address business challenges through building compelling cases and using data to influence change across the organization 2. Develop input and assumptions based on preexisting models to estimate costs and savings opportunities associated with varying levels of network growth and operations 3. Create metrics to measure business performance, identify root causes and trends, and prescribe action plans 4. Manage multiple high-impact projects simultaneously 5. Work with technology teams and product managers to develop new tools and systems supporting business growth 6. Communicate with and support various internal stakeholders and external audiences 7. Implement scheduling solutions, improve metrics, and develop scalable processes and tools The ideal candidate will have: - Extensive experience in operations research and data-driven decision making - Strong analytical and problem-solving skills - Robust program management and research science skills - Ability to work with a team and make independent decisions in ambiguous environments - Customer-obsessed mindset with a focus on improving the Amazon delivery experience This role offers the autonomy to think strategically and make data-driven decisions from day one. Join us in shaping the future of e-commerce delivery and addressing the core challenges in our world-class operations space! Key job responsibilities 1. Advanced Modeling and Algorithm Development: - Design and implement sophisticated machine learning models for logistics optimization - Develop complex time series forecasting algorithms for demand prediction and resource allocation 2. AI and Machine Learning Integration: - Architect and deploy AI-powered systems to enhance decision-making in logistics operations - Implement deep learning techniques for image recognition in package sorting and handling - Develop reinforcement learning algorithms for adaptive scheduling and resource management 3. Big Data Analytics and Processing: - Design and implement distributed computing solutions for processing massive logistics datasets - Utilize cloud computing platforms (e.g., AWS) for scalable data processing and analysis 4. AI-Driven Workflow Optimization: - Design and implement AI agents for autonomous decision-making in logistics processes - Create machine learning models for customer behavior analysis and personalized delivery options 5. Software Development and System Architecture: - Write efficient, scalable code in languages such as Python, Java, or C++ - Develop and maintain complex software systems for logistics optimization - Stay at the forefront of AI and ML research - Publish research findings in top-tier conferences and journals About the team We are Amazon's Last Mile Science and Analytics team, dedicated to improving e-commerce delivery. We work to optimize our vast network, forecast demand using machine learning, and enhance route efficiency. Our efforts focus on developing innovative delivery methods, applying AI to solve complex problems, and conducting geospatial analysis. We create simulations to refine processes and plan capacity effectively. Operating globally, we strive to develop adaptable solutions for diverse markets. We aim to advance logistics science, continually improving speed, efficiency, and customer satisfaction, in support of Amazon's mission to be Earth's most customer-centric company.
US, WA, Seattle
Ever wish you could use your quantitative and critical thinking skills to influence business decisions? Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems. As part of the Content Discovery and Experimentation Science team within Prime Video, you will leverage your expertise in causal inference and experimental design to make Prime Video the best-in-class digital video experience. Key job responsibilities - Build causal models and metrics that capture trade-off decisions when business and customer outcomes do not align - Partner with data scientists and product managers to integrate these metrics into Prime Video's experimentation tooling - Work with finance partners to ensure that the team's product metrics contribute to Prime Video's strategic business and financial objectives - Contribute to technical and business documents to communicate ideas and proposals to various audiences - Educate and advocate for best practices in experimentation and how to use it for decision-making
US, CA, Sunnyvale
MULTIPLE POSITIONS AVAILABLE Employer: AMAZON.COM SERVICES LLC Offered Position: Manager III, Economist Job Location: Sunnyvale, California Job Number: AMZ9803624 Position Responsibilities: Independently manage a team of economists and/or scientists in developing strategic economic analyses and demand estimation models. Translate business questions into econometric methodologies and causal inference analyses. Communicate economic insights to non-technical audiences to guide strategic-level, high-impact business decisions. Scale economic models through cross-functional collaboration with engineering teams. Establish scientific quality standards and research priorities. Drive operational efficiency and research excellence across the team. 40 hours / week, 8:00am-5:00pm, Salary Range: $201,300/year to $272,400/year. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, visit: https://www.aboutamazon.com/workplace/employee-benefits. Amazon.com is an Equal Opportunity-Affirmative Action Employer – Minority / Female / Disability / Veteran / Gender Identity / Sexual Orientation.#0000
AU, VIC, Melbourne
Are you excited about leveraging state-of-the-art Computer Vision algorithms and large datasets to solve real-world problems? Join Amazon as an Applied Scientist Intern and be at the forefront of AI innovation! As an Applied Scientist Intern, you'll work in a fast-paced, cross-disciplinary team of pioneering researchers. You'll tackle complex problems, developing solutions that either build on existing academic and industrial research or stem from your own innovative thinking. Your work may even find its way into customer-facing products, making a real-world impact. Please note: This internship is a duration of 6 months full time with a start date in Jan-March 2027. The successful intern is required to be based in Melbourne and relocation allowance will be provided if you are based outside of Melbourne. Key job responsibilities - Develop novel solutions and build prototypes - Work on complex problems in Computer Vision and Machine Learning - Contribute to research that could significantly impact Amazon's operations - Collaborate with a diverse team of experts in a fast-paced environment - Collaborate with scientists on writing and submitting papers to Tier-1 conferences (e.g., CVPR, ICCV, NeurIPS, ICML) - Present your research findings to both technical and non-technical audiences Key Opportunities - Collaborate with leading machine learning researchers - Access Amazon tools and hardware (large GPU clusters) - Address challenges at an unparalleled scale - Become a disruptor, innovator, and problem solver in the field of computer vision - Potentially deliver solutions to production in customer-facing applications - Opportunities to become an FTE after the internship Join us in shaping the future of AI at Amazon. Apply now and turn your research into real-world solutions!
IN, KA, Bengaluru
Are you passionate about solving complex business problems at scale through Generative AI? Do you want to help build intelligent systems that reason, act, and learn from minimal supervision? If so, we have an exciting opportunity for you on Amazon's Trustworthy Shopping Experience (TSE) team. At TSE, our vision is to guarantee customers a worry-free shopping experience by earning their trust that the products they buy are safe, authentic, and compliant with regulations and policy. We do this in close partnership with our selling partners, empowering them with best-in-class tools and expertise to offer a high-quality, compliant selection that customers trust. As an Applied Scientist I, you will bring subject matter expertise in at least one relevant discipline (e.g., NLP, computer vision, representation learning, agentic architecture) to contribute to next-generation agentic AI solutions that automate complex manual investigation processes at Amazon scale. Working alongside senior scientists, you will map business goals—such as reducing cost-of-serving while maintaining trust and safety standards—to well-defined scientific problems and metrics. You will invent, refine, and experiment with solutions spanning agentic reasoning, self-supervised representation learning, few-shot adaptation, multimodal understanding, and model compression. With guidance from senior scientists, you will stay current on research trends and benchmark your results against the state of the art. You will help design and execute experiments to identify optimal solutions, initiating the development and implementation of small components with team guidance. You will write secure, stable, testable, and well-documented production code at the level of an SDE I, rigorously evaluating models and quantifying performance. You will handle data in accordance with Amazon policies, troubleshoot issues to root cause, and ensure your work does not put the company at risk. Your scope of influence will typically be at the self-level, with the possibility of mentoring interns. You will participate in team design and prioritization discussions, learn the business context behind TSE's products, and escalate problems with proposed solutions. You will publish internal technical reports and may contribute to peer-reviewed publications and external review activities when aligned with business needs. This role offers a unique opportunity to contribute to end-to-end AI development—from research through production—with your contributions serving hundreds of millions of customers within months, not years. Key job responsibilities • Contribute to the design and development of agentic AI systems with multi-step reasoning, autonomous task execution, and multimodal intelligence, including feedback and memory mechanisms, leveraging reinforcement learning techniques for agent decision-making and policy optimization, with input and guidance from senior scientists • Help productionize models built on top of SFT (Supervised Fine-tuning) and RFT (Reinforced Fine-tuning) approaches, as well as few-shot approaches based on multimodal datasets spanning text, images, and structured data, applying mathematical optimization techniques to improve efficiency, resource allocation, and decision-making in complex workflows, working alongside senior scientists to identify optimal solutions • Contribute to building production-ready deep learning and conventional ML solutions, including multimodal fusion and cross-modal alignment techniques that seamlessly connect visual, textual, and relational understanding, to support automation requirements within your team's scope • Help identify customer and business problems; use reasonable assumptions, data, and customer requirements to solve well-defined scientific problems involving multimodal inputs such as unstructured text, documents, product images, and relational data, developing representations that capture complementary signals across modalities and mapping business goals to scientific metrics • May co-author research papers for peer-reviewed internal and/or external venues, including contributions in areas such as multimodal representation learning and vision-language modeling, and contribute to the wider scientific community by reviewing research submissions, when aligned with business needs • Prototype rapidly, iterate based on feedback, and deliver small components at SDE I level—including multimodal data pipelines and inference modules—that integrate into production-scale systems • Write secure, stable, testable, maintainable, and well-documented code, balancing model capability, deployment cost, and resource usage across multimodal architectures while understanding state-of-the-art data structures, algorithms, and performance tradeoffs • Rigorously test code and evaluate models across individual and combined modalities, quantifying their performance; troubleshoot issues, research root causes, and thoroughly resolve defects, leaving systems more maintainable • Participate in team design, scoping, and prioritization discussions through clear verbal and written communication; seek to learn the business context, science, and engineering behind your team's products, including how multimodal signals contribute to trust and safety decisions • Participate in engineering best practices with peer reviews; clearly document approaches and communicate design decisions; publish internal technical reports to institutionalize scientific learning • Help train and mentor scientist interns; identify and escalate problems with proposed solutions, taking ownership or ensuring clear hand-off to the right owner About the team Trustworthy Shopping Experience Product team in TSE is responsible for the human-in-the-loop products and technology used in the risk investigations at Amazon. The team is also responsible for reducing the cost of performing the investigations, by automating wherever possible and optimizing the experience where manual interventions are needed. The team leverages state-of-the art technology and GenAI to deliver the products and associated goals.
IN, KA, Bengaluru
The Trust CX Innovations team is looking for an Applied Scientist with strong background in Generative AI space to build solutions that help in upholding customer trust for Alexa+. As an Applied Scientist in Trust CX innovations, you will be at the forefront of developing innovative solutions to critical challenges in AI trust and privacy. You'll lead research in trust-preserving machine learning techniques. We are working on revolutionizing the way Amazonians work and collaborate. You will help us achieve new heights of productivity through the power of advanced generative AI technologies. Key job responsibilities - Lead research initiatives in generative AI, focusing on LLMs, multimodal models, and frontier AI capabilities - Develop innovative approaches for model optimization, including prompt engineering, few-shot learning, and efficient fine-tuning - Pioneer new methods for AI safety, alignment, and responsible AI development - Design and execute sophisticated experiments to evaluate model performance and behavior - Lead the development of production-ready AI solutions that scale efficiently - Collaborate with product teams to translate research innovations into practical applications - Guide engineering teams in implementing AI models and systems at scale - Author technical papers for top-tier conferences - File patents for novel AI technologies and applications A day in the life You will be working with a group of talented scientists on researching algorithm and running experiments to test scientific proposal/solutions to improve our trust-preserving experiences. This will involve collaboration with partner teams including engineering, PMs, data annotators, and other scientists to discuss data quality, policy, and model development. You work closely with partner teams across Alexa to deliver platform features that require cross-team leadership. About the team Who We Are: Trust CX Innovations is a strategic innovation team within Amazon Devices & Services that focuses on advancing AI technology while prioritizing customer trust and experience. Our team operates at the intersection of artificial intelligence, privacy engineering and customer-centric design. Our Mission: To pioneer trustworthy AI innovations that delight customers while setting new standards for privacy and responsible technology development. We aim to transform how Amazon builds AI products by creating solutions that balance innovation with customer trust.
US, CA, Pasadena
The Amazon Web Services (AWS) Center for Quantum Computing in Pasadena, CA, is looking to hire a Research Scientist with experience in semiconductor process development who will aid in AWS’s effort to bring cloud quantum computing services to its worldwide customer base. You will join a multi-disciplinary team of scientists, and hardware and software engineers working at the forefront of quantum computing. Through your work inside and outside of the cleanroom environment in the fabrication research and development group, you will solve problems related to developing next-generation quantum processors. Candidates must have a demonstrated background in sound scientific and engineering principles, and must have excellent data analysis, bias for action, problem solving, and communication skills, and be highly motivated and curious to research and learn new technical topics as needed. As a research scientist you will be expected to work on new ideas and stay abreast of novel approaches in fabricating and packaging superconducting quantum processors. Working effectively within a team environment is critical. Key job responsibilities Responsibilities include developing novel processes to fabricate high-coherence superconducting qubits; developing advanced 3DI interconnect and routing technologies for integrating superconducting quantum technologies; analyzing inline metrology and electrical test data; writing production standard operating procedures to transfer newly-developed processes to production teams; interacting with project leads to provide feedback that continuously improves different processes. A day in the life The candidate will develop novel technologies using micro-/nano-fabrication techniques inside the cleanroom (independently or in collaboration with other scientists and engineers) for next-generation quantum computing. Outside the cleanroom, the candidate will plan experiments, analyze data, and conceive future innovations. About the team AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (Iot), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Hybrid Work We value innovation and recognize this sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. Our team affords employees options to work in the office every day or in a flexible, hybrid work model near one of our U.S. Amazon offices.
US, WA, Redmond
We are searching for a talented candidate with expertise in orbital mechanics and spaceflight navigation, including LEO Satellite Orbit Determination. This position requires experience in simulation and analysis of spacecraft orbital mechanics and sequential orbit determination methods, including Extended Kalman Filters (EKF) and/or Unscented Kalman Filter (UKF). Strong analysis skills are required to develop engineering studies of complex large-scale dynamical systems. This position requires demonstrated expertise in computational analysis automation and tool development. Key job responsibilities - Perform spacecraft maneuver or navigation analysis in support of multi-disciplinary trades within the Amazon Leo team. - Contribute to prototype software development of flight algorithms. - Test and assess navigation software for integration into flight systems. - Assess and trouble-shoot the performance of Leo on-board GNSS hardware and software systems. - Work closely with GNC engineers to manage on-orbit performance and develop flight dynamics operations processes. Export Control Requirement: Due to applicable export control laws and regulations, candidates must be a U.S. citizen or national, U.S. permanent resident (i.e., current Green Card holder), or lawfully admitted into the U.S. as a refugee or granted asylum. A day in the life - Interacting with GNC teams to evaluate and troubleshoot satellite issues. - Working within the Flight Dynamics Research team to prioritize tasks. - Performing analysis, simulation, testing and documentation to address assigned tasks.
US, TX, Austin
What happens when you combine startup speed with Amazon-scale impact? You get this team. Amazon Enterprise Security Products is a newly launched group building intelligent, cloud-agnostic security tools using AI-first development practices. Here, you build AI and you build with AI at the same time. This role is a chance to define and lead the science strategy for the future of security tooling with a small, fast team that ships like a startup but deploys at Amazon scale. We're looking for a Senior Data Scientist who operates at the intersection of applied ML, agentic AI, and security; and who can set technical direction across ambiguous, undefined problem spaces. You won't just build models; you'll decide which problems are worth solving, architect the scientific approach for an entire product area, and raise the bar for how the team applies science. You'll partner with senior and principal engineers, applied scientists, security researchers, and PMs, and your judgment will shape roadmaps, not just deliverables. This is a role for someone who thrives in ambiguity, influences without authority, and turns "too ambitious" into shipped reality. Key job responsibilities - Set the science direction for a product area: Define the modeling strategy, scientific approach, and success metrics for entire categories of AI-first security capabilities, agentic systems, anomaly detection, threat classification, and automated response across multi-cloud environments. Decide where science can move the needle and where it can't. - Own the hardest, most ambiguous problems: Take on undefined, open-ended challenges where the path isn't clear, the data is messy or scarce, and the stakes are high. Frame the problem, choose the approach, and bring others along. - Build with AI to build AI and define how the team does it: Drive adoption of agentic coding tools, LLM-powered workflows, and experimental AI tooling across the science org. Establish the practices that multiply velocity for every scientist, not just yourself. - Architect agentic intelligence: Lead the design of models, embeddings, RAG pipelines, evaluation frameworks, and feedback loops that make multi-agent security systems smart, safe, and customer-ready at scale. Own the science architecture decisions others build on. - Drive technical strategy across teams: Influence roadmaps, dive deep with senior and principal scientists and engineers, and align cross-functional partners around a shared scientific vision. Your recommendations shape what the team invests in next. - Prototype, validate, and scale: Turn ambiguous hypotheses into prototypes in days, validate with real customer signal, and chart the path from prototype to production system that runs reliably at Amazon scale. - Communicate to influence at the executive level: Translate complex modeling results and scientific trade-offs into clear recommendations for engineers, product leaders, and senior executives. Drive organizational decisions with data and earn trust across the company. - Raise the bar and grow others: Mentor data scientists and applied scientists, lead technical and science reviews, and champion AI-first development practices. Shape the science culture and hiring bar of a fast-growing team from the ground floor. A day in the life No two days look the same on this fast-growing, AI-first team. You might start your morning setting direction in a roadmap review; making the call on which science investments will have the biggest customer impact and then dive into architecting an evaluation framework that the whole team will build on. Before lunch, you're pair-prompting with an agentic coding assistant to validate a new approach, then unblocking a teammate stuck on a thorny modeling problem. In the afternoon, you lead a design session with senior and principal scientists and engineers, then distill it into a crisp recommendation for senior leadership. You own ambiguous problems end to end, define how the team works, and see your decisions ripple across the product. This is where builders who want to lead with science come to do their best work. About the team Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Hybrid Work We value innovation and recognize this sometimes requires uninterrupted time to focus on a build. We also value in-person collaboration and time spent face-to-face. Our team affords employees options to work in the office every day or in a flexible, hybrid work model near one of our U.S. Amazon offices.
AU, NSW, Sydney
AWS Networking operates one of the largest and most complex networks on the planet. The team you'd join is responsible for the availability of that network — measuring how it performs for customers, predicting where it is most likely to degrade, and reshaping how we operate it as the workload grows. We are in the middle of a significant change in how network operations are run. Lessons from our recent work on automation, AI, and ML — including agentic systems that triage and mitigate incidents alongside engineers — are feeding into a broader rethink of where humans focus, where automation takes over, and how we measure whether either is working. We are looking for a Data Scientist to join the team in Sydney to drive the data science strategy behind that change. You will define the metrics that matter, own the evidence the team uses to make decisions, and measure whether each decision delivered the outcomes we expected. You'll be the data science voice on a team of senior network and software engineers — the person who decides what we measure, how we measure it, and what the numbers actually mean. Concretely, that means setting the analytical bar for the program, designing risk and reliability models against telemetry from millions of network devices, surfacing the patterns that drive customer-impact incidents, and turning that analysis into the dashboards and metrics our leaders use to set priorities. It also means owning the evaluations that tell us when a new piece of automation — including the agents we are rolling out to support engineers on the front line — is actually moving the needle on availability, and not just adding noise. If you are a scientist who wants to shape how a tier-one production network is run — using data to drive program strategy, not just to support it — at a scale no academic lab or startup can match, and you're at your best as the data science voice embedded in a team of engineers, this is the team for you. Key job responsibilities - Define and drive the data science strategy for the program — the metrics, the experiments, and what counts as evidence that a change worked - Lead the design and deployment of predictive risk and reliability models for network availability, using device failures, alarm telemetry, ticket data, and traffic signals - Own the evidence behind program decisions: where availability is at risk, where automation is ready to expand, where engineering effort has the highest leverage. Defend recommendations to senior technical and business audiences - Design and own the operational analytics and dashboards (Amazon QuickSight, Amazon CloudWatch, Python) used by senior leadership to track network health and the impact of operational change - Design and run experiments to evaluate the automation we are rolling out — including agentic systems supporting engineers on incidents — measuring whether each rollout improved availability - Drive data quality and classification improvements — event categorisation, root-cause attribution — so the program's metrics rest on solid ground - Build and own event-driven scoring pipelines (Python, SQL, AWS Lambda, Amazon S3, Amazon Athena) that keep the decide / measure / improve loop running - Bring statistical rigour to the engineers you partner with — review experiment designs, push back on unsupported assumptions, and raise the bar on how the team uses evidence A day in the life You might start the morning defining how the team will measure a new initiative — the success metrics, the counterfactual, the bar for calling it a win. By mid-morning you're with the engineering team turning a proposal into a decision: walking through trade-offs, pushing back where the data doesn't support an assumption. The afternoon is outcome measurement — refining the evaluation pipeline that tracks last week's rollout, updating the CloudWatch dashboard senior leadership uses to gate the next expansion, and prepping the data for an upcoming Director review. About the team We sit inside AWS Networking with a strong Sydney presence and a remit that spans network availability, the data and analytics that support it, and the automation we are building to change how operations are done. You'd be the data science voice in a small, senior team of network and software engineers in Sydney, partnering with the broader network engineering organisation across Seattle and Dublin. Small team, high autonomy, direct line to senior leadership, and a roadmap with real production impact rather than research demos.