Making deep learning practical for Earth system forecasting

Novel “cuboid attention” helps transformers handle large-scale multidimensional data, while diffusion models enable probabilistic prediction.

The Earth is a complex system. Variabilities ranging from regular events like temperature fluctuations to extreme events like drought, hailstorms, and the El Niño–Southern Oscillation (ENSO) phenomenon can influence crop yields, delay airline flights, and cause floods and forest fires. Precise and timely forecasting of these variabilities can help people take necessary precautions to avoid crises or better utilize natural resources such as wind and solar energy.

The success of transformer-based models in other AI domains has led researchers to attempt applying them to Earth system forecasting, too. But these efforts have encountered several major challenges. Foremost among these is the high dimensionality of Earth system data: naively applying the transformer’s quadratic-complexity attention mechanism is too computationally expensive.

Most existing machine-learning-based Earth system models also output single point forecasts, which are often averages across wide ranges of possible outcomes. Sometimes, however, it may be more important to know that there’s a 10% chance of an extreme weather event than to know the average across a range of possible outcomes. And finally, typical machine learning models don’t have guardrails imposed by physical laws or historical precedents and can produce outputs that are unlikely or even impossible.

In recent work, our team at Amazon Web Services has tackled all these challenges. Our paper “Earthformer: Exploring space-time transformers for Earth system forecasting”, published at NeurIPS 2022, proposes a novel attention mechanism we call cuboid attention, which enables transformers to process large-scale, multidimensional data much more efficiently.

And in “PreDiff: Precipitation nowcasting with latent diffusion models”, to appear at NeurIPS 2023, we show that diffusion models can both enable probabilistic forecasts and impose constraints on model outputs, making them much more consistent with both the historical record and the laws of physics.

Earthformer and cuboid attention

The heart of the transformer model is its “attention mechanism”, which enables it to weigh the importance of different parts of an input sequence when processing each element of the output sequence. This mechanism allows transformers to capture spatiotemporally long-range dependencies and relationships in the data, which have not been well modeled by conventional convolutional-neural-network- or recurrent-neural-network-based architectures.

Earth system data, however, is inherently high-dimensional and spatiotemporally complex. In the SEVIR dataset studied in our NeurIPS 2022 paper, for instance, each data sequence consists of 25 frames of data captured at five-minute intervals, each frame having a spatial resolution of 384 x 384 pixels. Using the conventional transformer attention mechanism to process such high-dimensional data would be extremely expensive.

In our NeurIPS 2022 paper, we proposed a novel attention mechanism we call cuboid attention, which decomposes input tensors into cuboids, or higher-dimensional analogues of cubes, and applies attention at the level of each cuboid. Since the computational cost of attention scales quadratically with the tensor size, applying attention locally in each cuboid is much more computationally tractable than trying to compute attention weights across the entire tensor at once. For instance, decomposing along the temporal axis can reduce the cost by a factor of 384² for the SEVIR dataset, since each frame has a spatial resolution of 384 x 384 pixels.
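
The size of that reduction can be checked by counting pairwise attention scores. The sketch below is only a back-of-the-envelope count, not a measurement of Earthformer itself; the function names are hypothetical, and it assumes the cuboid size divides the tensor shape evenly.

```python
def full_attention_pairs(t, h, w):
    """Pairwise scores when every element attends to every other element."""
    n = t * h * w
    return n * n


def cuboid_attention_pairs(t, h, w, cuboid):
    """Pairwise scores when attention is restricted to non-overlapping cuboids.

    Assumes (for illustration) that the cuboid size divides the shape evenly.
    """
    bt, bh, bw = cuboid
    num_cuboids = (t // bt) * (h // bh) * (w // bw)
    per_cuboid = (bt * bh * bw) ** 2
    return num_cuboids * per_cuboid


# SEVIR sequence shape quoted in the text: 25 frames of 384 x 384 pixels.
T, H, W = 25, 384, 384
full = full_attention_pairs(T, H, W)

# Decomposing along the temporal axis: one cuboid per pixel, spanning all frames.
temporal = cuboid_attention_pairs(T, H, W, cuboid=(T, 1, 1))

print(full // temporal)  # 147456, i.e. 384 * 384
```

Each of the 384 x 384 temporal cuboids computes only 25 x 25 scores, which is where the factor comes from.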

Of course, such decomposition introduces a limitation: attention functions independently within each cuboid, with no communication between cuboids. To address this issue, we also compute global vectors that summarize the cuboids’ attention weights. Other cuboids can factor the global vectors into their own attention weight computations.
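
A toy version of this idea, in plain Python, might look as follows. It is a deliberately simplified sketch rather than Earthformer's implementation: there are no learned projections or multiple heads, the names `attend` and `cuboid_attention` are hypothetical, and updating the global vectors before the cuboids read them is an ordering chosen here for illustration.

```python
import math


def attend(queries, keys, values):
    """Single-head scaled dot-product attention over lists of vectors."""
    dim = len(queries[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))
        ])
    return outputs


def cuboid_attention(cuboids, global_vectors=None):
    """Attend within each cuboid, optionally sharing context via global vectors."""
    global_vectors = list(global_vectors or [])
    if global_vectors:
        # The global vectors attend to every element, summarizing all cuboids.
        everything = [x for cuboid in cuboids for x in cuboid] + global_vectors
        global_vectors = attend(global_vectors, everything, everything)
    # Each cuboid attends to its own elements plus the shared global vectors.
    outputs = [attend(c, c + global_vectors, c + global_vectors) for c in cuboids]
    return outputs, global_vectors


cuboid_a = [[1.0, 0.0], [0.0, 1.0]]
cuboid_b = [[1.0, 1.0], [2.0, 0.0]]
outputs, updated_globals = cuboid_attention([cuboid_a, cuboid_b], [[0.5, 0.5]])
```

Without global vectors, changing an element of `cuboid_b` leaves the output for `cuboid_a` untouched; with them, the change propagates through the shared summary.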

cuboid_illustration.gif
Cuboid attention layer processing an input tensor (X) with global vectors (G).

We call our transformer-based model with cuboid attention Earthformer. Earthformer adopts a hierarchical encoder-decoder architecture, which gradually encodes the input sequence to multiple levels of representations and generates the prediction via a coarse-to-fine procedure. Each hierarchy includes a stack of cuboid attention blocks. By stacking multiple cuboid attention layers with different configurations, we are able to efficiently explore effective space-time attention.

earthforer_enc_dec.png
The Earthformer architecture is a hierarchical transformer encoder-decoder with cuboid attention. In this diagram, “×D” means to stack D cuboid attention blocks with residual connections, while “×M” means to have M layers of hierarchies.

We experimented with multiple methods for decomposing an input tensor into cuboids. Our empirical studies show that the “axial” pattern, which stacks three unshifted local decompositions along the temporal, height, and width axes, is both effective and efficient. It achieves the best performance while avoiding the quadratic computational cost of vanilla attention.

cub_pattern_together.png
Illustration of cuboid decomposition strategies when the input shape is (T, H, W) = (6, 4, 4), and cuboid size is (3, 2, 2). Elements that have the same color belong to the same cuboid and will attend to each other. Local decompositions aggregate contiguous elements of the tensor, and dilated decompositions aggregate elements according to a step function determined by the cuboid size. Both local and dilated decompositions, however, can be shifted by some number of elements along any of the tensor’s axes.
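
The grouping in the illustration can be reproduced with a few lines of index arithmetic. The sketch below is one plausible reading of the local, dilated, and shifted strategies, not code from the Earthformer repository: the function `decompose` is hypothetical, local grouping uses integer division by the cuboid size, dilated grouping uses the index modulo the number of cuboids along each axis, and a shift is a circular offset applied before grouping.

```python
from collections import defaultdict
from itertools import product
from math import ceil


def decompose(shape, cuboid, strategy="local", shift=(0, 0, 0)):
    """Group the indices of a (T, H, W) tensor into cuboids."""
    groups = defaultdict(list)
    for index in product(*(range(n) for n in shape)):
        # Optional circular shift along each axis before grouping.
        shifted = [(i + s) % n for i, s, n in zip(index, shift, shape)]
        if strategy == "local":
            # Contiguous blocks: neighbors fall into the same cuboid.
            key = tuple(i // b for i, b in zip(shifted, cuboid))
        elif strategy == "dilated":
            # Strided blocks: elements a fixed step apart share a cuboid.
            key = tuple(i % ceil(n / b) for i, n, b in zip(shifted, shape, cuboid))
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        groups[key].append(index)
    return dict(groups)


shape, cuboid = (6, 4, 4), (3, 2, 2)
local = decompose(shape, cuboid, "local")
dilated = decompose(shape, cuboid, "dilated")

# Time steps covered by the first cuboid under each strategy.
print(sorted({t for t, _, _ in local[(0, 0, 0)]}))    # [0, 1, 2]
print(sorted({t for t, _, _ in dilated[(0, 0, 0)]}))  # [0, 2, 4]
```

Both strategies yield eight cuboids of 12 elements each for this shape; they differ only in which elements end up attending to one another.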

Experimental results

To evaluate Earthformer, we compared it to six state-of-the-art spatiotemporal forecasting models on two real-world datasets: SEVIR, for the task of continuously predicting precipitation probability in the near future (“nowcasting”), and ICAR-ENSO, for forecasting sea surface temperature (SST) anomalies.

On SEVIR, the evaluation metrics we used were standard mean squared error (MSE) and critical success index (CSI), a standard metric in precipitation nowcasting evaluation. CSI is also known as intersection over union (IoU): at different thresholds, it's denoted as CSI-thresh; their mean is denoted as CSI-M.
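
For readers unfamiliar with CSI, the following sketch shows how it can be computed from a forecast and an observation. It is an illustration, not the evaluation code used in the paper: the function names are hypothetical, the sample pixel values are made up, and treating a pixel as an "event" when its value is greater than or equal to the threshold is an assumption.

```python
def csi(pred, target, threshold):
    """Critical success index: hits / (hits + misses + false alarms)."""
    hits = misses = false_alarms = 0
    for p, t in zip(pred, target):
        predicted_event = p >= threshold
        observed_event = t >= threshold
        if predicted_event and observed_event:
            hits += 1
        elif observed_event:
            misses += 1
        elif predicted_event:
            false_alarms += 1
    denominator = hits + misses + false_alarms
    # CSI is undefined when no event is predicted or observed.
    return hits / denominator if denominator else float("nan")


def csi_mean(pred, target, thresholds):
    """CSI-M: the mean of CSI over several thresholds."""
    scores = [csi(pred, target, threshold) for threshold in thresholds]
    return sum(scores) / len(scores)


# Made-up pixel values for four locations.
pred = [200, 230, 100, 220]
target = [225, 190, 90, 240]

print(csi(pred, target, 219))              # one hit, one miss, one false alarm
print(csi(pred, target, 181))              # three hits, no errors
print(csi_mean(pred, target, [181, 219]))
```

Correct rejections (pixels where no event is either predicted or observed) do not enter the score, which is why CSI coincides with intersection over union of the predicted and observed event masks.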

On both MSE and CSI, Earthformer outperformed all six baseline models across the board. Earthformer with global vectors also uniformly outperformed the version without global vectors.

| Model | #Params. (M) | GFLOPS | CSI-M↑ | CSI-219↑ | CSI-181↑ | MSE (10⁻³)↓ |
|---|---|---|---|---|---|---|
| Persistence | - | - | 0.2613 | 0.0526 | 0.0969 | 11.5338 |
| UNet | 16.6 | 33 | 0.3593 | 0.0577 | 0.1580 | 4.1119 |
| ConvLSTM | 14.0 | 527 | 0.4185 | 0.1288 | 0.2482 | 3.7532 |
| PredRNN | 46.6 | 328 | 0.4080 | 0.1312 | 0.2324 | 3.9014 |
| PhyDNet | 13.7 | 701 | 0.3940 | 0.1288 | 0.2309 | 4.8165 |
| E3D-LSTM | 35.6 | 523 | 0.4038 | 0.1239 | 0.2270 | 4.1702 |
| Rainformer | 184.0 | 170 | 0.3661 | 0.0831 | 0.1670 | 4.0272 |
| Earthformer w/o global | 13.1 | 257 | 0.4356 | 0.1572 | 0.2716 | 3.7002 |
| Earthformer | 15.1 | 257 | 0.4419 | 0.1791 | 0.2848 | 3.6957 |

On ICAR-ENSO, we report the correlation skill of the three-month-moving-averaged Nino3.4 index, which evaluates the accuracy of SST anomaly prediction across a certain area (170°-120°W, 5°S-5°N) of the Pacific. Earthformer consistently outperforms the baselines on all the evaluation metrics considered, and the version using global vectors further improves performance.

| Model | #Params. (M) | GFLOPS | C-Nino3.4-M↑ | C-Nino3.4-WM↑ | MSE (10⁻⁴)↓ |
|---|---|---|---|---|---|
| Persistence | - | - | 0.3221 | 0.447 | 4.581 |
| UNet | 12.1 | 0.4 | 0.6926 | 2.102 | 2.868 |
| ConvLSTM | 14.0 | 11.1 | 0.6955 | 2.107 | 2.657 |
| PredRNN | 23.8 | 85.8 | 0.6492 | 1.910 | 3.044 |
| PhyDNet | 3.1 | 5.7 | 0.6646 | 1.965 | 2.708 |
| E3D-LSTM | 12.9 | 99.8 | 0.7040 | 2.125 | 3.095 |
| Rainformer | 19.2 | 1.3 | 0.7106 | 2.153 | 3.043 |
| Earthformer w/o global | 6.6 | 23.6 | 0.7239 | 2.214 | 2.550 |
| Earthformer | 7.6 | 23.9 | 0.7329 | 2.259 | 2.546 |

PreDiff

Diffusion models have recently emerged as a leading approach to many AI tasks. Diffusion models are generative models that establish a forward process of iteratively adding Gaussian noise to training samples; the model then learns to incrementally remove the added noise in a reverse diffusion process, gradually reducing the noise level and ultimately resulting in clear and high-quality generation.
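
The forward (noising) half of that process is simple enough to sketch directly. The example below assumes a DDPM-style formulation with a linear noise schedule, which is a common choice but not one the text above specifies; the function names and the schedule endpoints are illustrative assumptions.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    alpha_bars, running = [], 1.0
    for step in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * step / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Sample x_t from N(sqrt(alpha_bar) * x0, (1 - alpha_bar) * I)."""
    scale = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [scale * x + sigma * rng.gauss(0.0, 1.0) for x in x0]


rng = random.Random(0)
alpha_bars = alpha_bar_schedule(1000)
x0 = [1.0, -0.5, 0.25]  # a toy "clean" sample

slightly_noisy = add_noise(x0, alpha_bars[0], rng)      # almost unchanged
nearly_pure_noise = add_noise(x0, alpha_bars[-1], rng)  # signal almost gone
```

The model is trained to undo these steps one at a time, so sampling starts from pure noise and walks the schedule backward.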

During training, the model learns a sequence of transition probabilities between each of the denoising steps it incrementally learns to perform. It is therefore an intrinsically probabilistic model, which is well suited for probabilistic forecasting.

A recent variation on diffusion models is the latent diffusion model: before passing to the diffusion model, an input is first fed to an autoencoder, which has a bottleneck layer that produces a compressed embedding (data representation); the diffusion model is then applied in the compressed space.

In our forthcoming NeurIPS paper, “PreDiff: Precipitation nowcasting with latent diffusion models”, we present PreDiff, a latent diffusion model that uses Earthformer as its core neural-network architecture.

By modifying the transition probabilities of the trained model, we can impose constraints on the model output, making it more likely to conform to some prior knowledge. We achieve this by simply shifting the mean of the learned distribution, until it complies better with the constraint we wish to impose. 
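
As a toy illustration of mean shifting, consider a Gaussian transition whose mean is nudged down the gradient of a constraint violation. This is not PreDiff's knowledge-control code: the function names are hypothetical, the constraint (matching a target average intensity), the scaling of the step by the variance, and all numeric values are assumptions made for the example.

```python
def mean_intensity(z):
    """Average value of a flattened sample."""
    return sum(z) / len(z)


def shift_mean(mu, variance, target_intensity, strength):
    """Shift a Gaussian transition's mean toward a target average intensity.

    The violation is |mean(z) - target|. Its gradient with respect to each
    element of z is sign(mean(z) - target) / len(z), so stepping against it
    moves the average intensity toward the target.
    """
    gap = mean_intensity(mu) - target_intensity
    sign = (gap > 0) - (gap < 0)
    gradient = sign / len(mu)
    return [m - strength * variance * gradient for m in mu]


mu = [0.2, 0.4, 0.3, 0.1]  # toy mean predicted by the denoiser
shifted = shift_mean(mu, variance=0.05, target_intensity=0.6, strength=4.0)

print(mean_intensity(mu))       # average before the shift
print(mean_intensity(shifted))  # closer to the target of 0.6
```

Only the mean moves; the variance of the transition is left as learned, so the sampler remains probabilistic.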

prediff_overview_new_v1.png
An overview of PreDiff. The autoencoder (e) encodes the input as a latent vector (z_cond). The latent diffusion model, which adopts the Earthformer architecture, then incrementally denoises (steps z_{t+1} to z_0) the noisy version of the input (z_T). In the knowledge control step, the transition distributions between denoising steps are modified to accord with prior knowledge.

Results

We evaluated PreDiff on the task of predicting precipitation intensity in the near future (“nowcasting”) on SEVIR. We use anticipated precipitation intensity as a knowledge control to simulate possible extreme weather events like rainstorms and droughts.

We found that knowledge control with anticipated future precipitation intensity effectively guides generation while maintaining fidelity and adherence to the true data distribution. For example, the third row of the following figure simulates how weather unfolds in an extreme case (with probability around 0.35%) where the future average intensity exceeds μτ + 4στ. Such simulation can be valuable for estimating potential damage in extreme-rainstorm cases.

nbody_vis_v6.png
A set of example forecasts from PreDiff with knowledge control (PreDiff-KC), i.e., PreDiff under the guidance of anticipated average intensity. From top to bottom: context sequence y, target sequence x, and forecasts from PreDiff-KC showcasing different levels of anticipated future intensity (μτ + nστ), where n takes the values −4, −2, 0, 2, and 4.

Related content

US, MA, N.reading
Amazon Industrial Robotics is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine cutting-edge AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at an unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic dexterous manipulation, locomotion, and human-robot interaction. This role presents an opportunity to shape the future of robotics through innovative applications of deep learning and large language models. At Amazon Industrial Robotics we leverage advanced robotics, machine learning, and artificial intelligence to solve complex operational challenges at an unprecedented scale. Our fleet of robots operates across hundreds of facilities worldwide, working in sophisticated coordination to fulfill our mission of customer excellence. We are pioneering the development of dexterous manipulation system that: - Enables unprecedented generalization across diverse tasks - Enables contact-rich manipulation in different environments - Seamlessly integrates low-level skills and high-level behaviors - Leverage mechanical intelligence, multi-modal sensor feedback and advanced control techniques. The ideal candidate will contribute to research that bridges the gap between theoretical advancement and practical implementation in robotics. You will be part of a team that's revolutionizing how robots learn, adapt, and interact with their environment. Join us in building the next generation of intelligent robotics systems that will transform the future of automation and human-robot collaboration. 
Key job responsibilities - Design and implement methods for dexterous manipulation with single and dual arm manipulation - Leverage simulation and real-world data collection to create large datasets for model development - Develop a hierarchical system that combines low-level control with high-level planning - Utilize state-of-the-art manipulation models and optimal control techniques - Collaborate effectively with multi-disciplinary teams to co-design hardware and algorithms for dexterous manipulation
US, WA, Seattle
Innovators wanted! Are you an entrepreneur? A builder? A dreamer? This role is part of an Amazon Special Projects team that takes the company’s Think Big leadership principle to the limits. If you’re interested in innovating at scale to address big challenges in the world, this is the team for you. As an Applied Scientist on our team, you will focus on building state-of-the-art ML models for biology. Our team rewards curiosity while maintaining a laser-focus in bringing products to market. Competitive candidates are responsive, flexible, and able to succeed within an open, collaborative, entrepreneurial, startup-like environment. At the forefront of both academic and applied research in this product area, you have the opportunity to work together with a diverse and talented team of scientists, engineers, and product managers and collaborate with other teams. Key job responsibilities As an Applied Science, you will have access to large datasets with billions of images and video to build large-scale machine learning systems. Additionally, you will analyze and model terabytes of text, images, and other types of data to solve real-world problems and translate business and functional requirements into quick prototypes or proofs of concept. We are looking for smart scientists capable of using a variety of domain expertise combined with machine learning and statistical techniques to invent, design, evangelize, and implement state-of-the-art solutions for never-before-solved problems. About the team Our team highly values work-life balance, mentorship and career growth. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We care about your career growth and strive to assign projects and offer training that will challenge you to become your best.
TW, TPE, Hsinchu City
Are you passionate about robotics and research? Do you want to solve real customer problems through innovative technology? Do you enjoy working on scalable research and projects in a collaborative team environment? Do you want to see your science solutions directly impact millions of customers worldwide? At Amazon, we hire the best minds in technology to innovate and build on behalf of our customers. Customer obsession is part of our company DNA, which has made us one of the world's most beloved brands. We’re looking for current PhD or Master students with a passion for robotic research and applications to join us as Robotics Applied Scientist II Intern/Co-ops in 2026 to shape the future of robotics and automation at an unprecedented scale across. For these positions, our Robotics teams at Amazon are looking for students with a specialization in one or more of the research areas in robotics such as: robotics, robotics manipulation (e.g., robot arm, grasping, dexterous manipulation, end of arm tools/end effector), autonomous mobile robots, mobile manipulation, movement, autonomous navigation, locomotion, motion/path planning, controls, perception, sensing, robot learning, artificial intelligence, machine learning, computer vision, large language models, human-robot interaction, robotics simulation, optimization, and more! We're looking for curious minds who think big and want to define tomorrow's technology. At Amazon, you'll grow into the high-impact engineer you know you can be, supported by a culture of learning and mentorship. Every day brings exciting new challenges and opportunities for personal growth. By applying to this role, you will be considered for Robotics Applied Scientist II Intern/Co-op (2026) opportunities across various Robotics teams at Amazon with different robotics research focus, with internship positions available for multiple locations, durations (3 to 6+ months), and year-round start dates (winter, spring, summer, fall). 
Amazon intern and co-op roles follow the same internship structure. "Intern/Internship" wording refers to both interns and co-ops. Amazon internships across all seasons are full-time positions during vacation, and interns should expect to work in office, Monday-Friday, up to 40 hours per week typically between 9am-6pm. Specific team norms around working hours will be communicated by your manager. Interns should not have other employment during the Amazon work-day. Applicants should have a minimum of one quarter/semester/trimester remaining in their studies after their internship concludes. The robotics internship join dates, length, location, and prospective team will be finalized at the time of any applicable job offers. In your application, you will be able to provide your preference of research interests, start dates, internship duration, and location. While your preference will be taken into consideration, we cannot guarantee that we can meet your selection based on several factors including but not limited to the internship availability and business needs of this role.
US, WA, Seattle
Here at Amazon, we embrace our differences. We are committed to furthering our culture of diversity and inclusion of our teams within the organization. How do you get items to customers quickly, cost-effectively, and—most importantly—safely, in less than an hour? And how do you do it in a way that can scale? Our teams of hundreds of scientists, engineers, aerospace professionals, and futurists have been working hard to do just that! We are delivering to customers, and are excited for what’s to come. Check out more information about Prime Air on the About Amazon blog (https://www.aboutamazon.com/news/transportation/amazon-prime-air-delivery-drone-reveal-photos). If you are seeking an iterative environment where you can drive innovation, apply state-of-the-art technologies to solve real world delivery challenges, and provide benefits to customers, Prime Air is the place for you. Come work on the Amazon Prime Air Team! Prime Air is seeking an experienced Applied Science Manager to help develop our advanced Navigation algorithms and flight software applications. In this role, you will lead a team of scientists and engineers to conduct analyses, support cross-functional decision-making, define system architectures and requirements, contribute to the development of flight algorithms, and actively identify innovative technological opportunities that will drive significant enhancements to meet our customers' evolving demands. This person must be comfortable working with a team of top-notch software developers and collaborating with our science teams. We’re looking for someone who innovates, and loves solving hard problems. You will work hard, have fun, and make history! Export Control License: This position may require a deemed export control license for compliance with applicable laws and regulations. Placement is contingent on Amazon’s ability to apply for and obtain an export control license on your behalf.
US, VA, Herndon
Application deadline: Applications will be accepted on an ongoing basis Are you excited to help the US Intelligence Community design, build, and implement AI algorithms, including advanced Generative AI solutions, to augment decision making while meeting the highest standards for reliability, transparency, and scalability? The Amazon Web Services (AWS) US Federal Professional Services team works directly with US Intelligence Community agencies and other public sector entities to achieve their mission goals through the adoption of Machine Learning (ML) and Generative AI methods. We build models for text, image, video, audio, and multi-modal use cases, leveraging both traditional ML approaches and state-of-the-art generative models including Large Language Models (LLMs), text-to-image generation, and other advanced AI capabilities to fit the mission. Our team collaborates across the entire AWS organization to bring access to product and service teams, to get the right solution delivered and drive feature innovation based on customer needs. At AWS, we're hiring experienced data scientists with a background in both traditional and generative AI who can help our customers understand the opportunities their data presents, and build solutions that earn the customer trust needed for deployment to production systems. In this role, you will work closely with customers to deeply understand their data challenges and requirements, and design tailored solutions that best fit their use cases. You should have broad experience building models using all kinds of data sources, and building data-intensive applications at scale. You should possess excellent business acumen and communication skills to collaborate effectively with stakeholders, develop key business questions, and translate requirements into actionable solutions. You will provide guidance and support to other engineers, sharing industry best practices and driving innovation in the field of data science and AI. 
This position requires that the candidate selected must currently possess and maintain an active TS/SCI Security Clearance with Polygraph. The position further requires the candidate to opt into a commensurate clearance for each government agency for which they perform AWS work. Key job responsibilities As an Data Scientist, you will: - Collaborate with AI/ML scientists and architects to research, design, develop, and evaluate AI algorithms to address real-world challenges - Interact with customers directly to understand the business problem, help and aid them in implementation of AI solutions, deliver briefing and deep dive sessions to customers and guide customer on adoption patterns and paths to production. - Create and deliver best practice recommendations, tutorials, blog posts, sample code, and presentations adapted to technical, business, and executive stakeholder - Provide customer and market feedback to Product and Engineering teams to help define product direction - This position may require up to 25% local travel. About the team About AWS Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences and inspire us to never stop embracing our uniqueness. 
Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
US, VA, Arlington
The Benefits Science team drives evidence-based decision-making across BXT (Benefits, eXperience & Technology) through causal evaluation, structural modeling, conjoint experiments, and the creation of tools that scale our analytic capabilities. We transform complex data into actionable insights that enhance the employee experience and advance innovative benefits design. We are looking for an economist who is able to work with business partners to hone complex problems into specific, scientific questions, and test those questions to generate insights. The ideal candidate will collaborate with business partners to design and evaluate pilots, estimate models on large scale data, develop and deploy conjoint surveys, and transform successful prototypes into improved policies and programs at scale. This job requires analysis of complex health claims data. Economists with experience working with claims data and an understanding of the structure of the health care industry are strongly encouraged to apply. Key job responsibilities - Design and conduct rigorous evaluations of benefits programs - Support the development and application of structural models - Develop experiments to evaluate the impact of benefits initiatives - Communicate complex findings to business stakeholders in clear, actionable terms - Work with engineering teams to develop scalable tools that automate and streamline evaluation processes A day in the life Work with teammates to apply economic methods to business problems. This might include identifying the appropriate research questions, writing code to implement a DID analysis or estimate a structural model, or writing and presenting a document with findings to business leaders. Our economists also collaborate with partner teams throughout the process, from understanding their challenges, to developing a research agenda that will address those challenges, to help them implement solutions.
US, WA, Seattle
Are you fascinated by the power of Large Language Models (LLM) and applying Generative AI to solve complex challenges within one of Amazon's most significant businesses? Amazon Selection and Catalog Systems (ASCS) builds the systems that host and run the world's largest e-Commerce products catalog, it powers the online buying experience for customers worldwide so they can find, discover and buy anything they want. Amazon's customers rely on the completeness, consistency and correctness of Amazon's product data to make well-informed purchase decisions. We develop LLM applications that make Catalog the best-in-class source of product information for all products worldwide. This problem is challenging due to sheer scale (billions of products in the catalog), diversity (products ranging from electronics to groceries) and multitude of input sources (millions of sellers contributing product data with different quality). We are seeking a passionate, talented, and inventive individual to join the Catalog AI team and help build industry-leading technologies that customers will love. You will apply machine learning and large language model techniques, such as fine-tuning, reinforcement learning, and prompt optimization, to solve real customer problems. You will work closely with scientists and engineers to experiment with new methods, run large-scale evaluations, and bring research ideas into production. Key job responsibilities * Design and implement LLM-based solutions to improve catalog data quality and completeness * Conduct experiments and A/B tests to validate model improvements and measure business impact * Optimize large language models for quality and cost on catalog-specific tasks * Collaborate with engineering teams to deploy models at scale serving billions of products
US, TX, Austin