How Amazon learned to cut its cardboard waste

Pioneering web-based PackOpt tool has resulted in an annual reduction in cardboard waste of 7% to 10% in North America, saving roughly 60,000 tons of cardboard annually.

In a world of ideal sustainability, every customer order received by Amazon that required a box would ship in a box tailored precisely to the size of its contents to minimize cardboard (corrugate) waste for the customer and maximize the efficiency of order fulfillment.

But with an ever-changing catalogue of hundreds of millions of items and multiple items often shipped in a shared box, this dream scenario would require a near-infinite range of box sizes standing ready at Amazon’s fulfillment centers (FCs).

While Amazon works toward producing right-sized boxes for each shipment, the current solution to minimizing waste is to furnish every fulfillment center with a limited suite of cardboard box options. These suites vary depending on the type of items being fulfilled. For example, some FCs are focused on shipping single or multiple items that have been sorted automatically by robots and packed by Amazon associates.

In North America, single items shipped from sortable FCs that require a box, with some exceptions, are typically shipped within one of a finite number of box sizes. Multiple items being shipped together are packed into a box drawn from a different suite of boxes that are designed for a larger and heavier payload.

Another type of FC, known as non-sortable, deals with larger items that require oversized boxes — patio furniture, for example — and these FCs need yet another suite of boxes.

The question that Amazon has addressed with increasing success over the past few years is this: Given the items typically shipped in a particular Amazon region, marketplace, or FC, what is the optimal box suite?

That answer has now been embodied in a pioneering web-based tool called PackOpt that is being embraced by Amazon managers all over the world.

By the end of 2022, about 90% of all boxes shipped by Amazon will be sent from an optimized box suite. In North America, applying PackOpt technology has resulted in an annual reduction in cardboard waste of 7% to 10%, saving roughly 60,000 tons of cardboard annually. In emerging countries such as Singapore, PackOpt has delivered more than double that percentage efficiency.

Matrix revolutions

David Gasperino, an Amazon principal research scientist, led the technical development of PackOpt, which is helping Amazon’s stakeholders not only to minimize the amount of “air” shipped to customers but also to deliver on the company's Climate Pledge commitment to reach net-zero carbon emissions across its business by 2040.

Arriving at the perfect suite of boxes is incredibly difficult, says Gasperino, partly because the number of possibilities is enormous.

To imagine the challenge in the simplest terms, first picture a matrix more than 100 million rows deep, where each row represents a shipment over a time period within a given region. Each of the 20,000 or so columns in the matrix, meanwhile, represents a candidate box of particular dimensions that might become part of a suite of boxes.

“To create an optimal set of boxes, you need to select a small subset of columns to pack all of the shipments, and those columns must lead to the smallest overall box volume when you sum it all up,” explains Gasperino.
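
The selection Gasperino describes can be illustrated at toy scale. The following is only a sketch under stated assumptions, not PackOpt's method: the function names, the five sample shipments, and the six candidate boxes are all hypothetical, each shipment is treated as a single rigid item that fits if its sorted dimensions are no larger than the box's, and the search simply tries every subset.

```python
from itertools import combinations

def fits(item, box):
    """An item fits if, once both are sorted, no dimension exceeds the box's."""
    return all(i <= b for i, b in zip(sorted(item), sorted(box)))

def volume(dims):
    length, width, height = dims
    return length * width * height

def suite_cost(shipments, suite):
    """Total box volume when each shipment uses the smallest box that fits it."""
    total = 0
    for item in shipments:
        usable = [volume(box) for box in suite if fits(item, box)]
        if not usable:
            return None  # this suite cannot pack every shipment
        total += min(usable)
    return total

def best_suite(shipments, candidates, suite_size):
    """Try every subset of `suite_size` candidate boxes (feasible at toy scale only)."""
    best = None
    for suite in combinations(candidates, suite_size):
        cost = suite_cost(shipments, suite)
        if cost is not None and (best is None or cost < best[0]):
            best = (cost, suite)
    return best

# Hypothetical data: item and box dimensions in inches.
shipments = [(2, 3, 4), (3, 3, 3), (8, 5, 2), (9, 6, 6), (4, 4, 10)]
candidates = [(4, 4, 4), (6, 4, 4), (8, 6, 2), (10, 4, 4), (10, 6, 6), (12, 8, 8)]

cost, suite = best_suite(shipments, candidates, suite_size=3)
print(cost, suite)  # 872 ((8, 6, 2), (10, 4, 4), (10, 6, 6))
```

Six candidates yield only 20 three-box suites. With roughly 20,000 candidate columns, the number of subsets grows far beyond what exhaustive search can cover, which is why a smarter optimizer is needed.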

It is a hard challenge — literally.

“This problem belongs to a theoretical class of problems called ‘NP-hard’: essentially, no one knows if there's a really efficient algorithm to solve them,” says Renan Garcia, a principal research scientist who helped to design PackOpt’s optimization framework. The infamous “traveling salesman problem” belongs to the same class.

The sheer size of the matrix is a challenge, says Garcia. “The matrix that you need to build is so big, you can't even store it in memory.”

The team addressed this computational tractability issue in several ways. First, to simplify the problem, their approach restricts candidate-box dimensions to 2-inch increments in every direction before the first phase of iterative improvements, reducing the initial set of candidate boxes to the hundreds.

After the optimizer discovers the best candidates in this “coarse” set of boxes, it will take those best prospects as a starting point and search again, this time using 1-inch dimensional increments, and so on toward finer dimensions.
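
The coarse-to-fine idea can be sketched as follows. This is a hedged illustration rather than the team's code: the helper names and the 24-inch maximum dimension are assumptions chosen only to show how the grid step changes the size of the candidate set.

```python
from itertools import combinations_with_replacement, product

def grid_candidates(max_dim, step):
    """All boxes whose dimensions lie on a `step`-inch grid, ignoring orientation."""
    sizes = range(step, max_dim + 1, step)
    return list(combinations_with_replacement(sizes, 3))

def refine_around(boxes, step, max_dim):
    """Boxes within +/- `step` inches of the given boxes on every dimension."""
    refined = set()
    for box in boxes:
        for deltas in product((-step, 0, step), repeat=3):
            dims = tuple(d + delta for d, delta in zip(box, deltas))
            if all(0 < d <= max_dim for d in dims):
                refined.add(tuple(sorted(dims)))
    return sorted(refined)

# Hypothetical 24-inch limit on any box dimension.
coarse = grid_candidates(max_dim=24, step=2)
fine = grid_candidates(max_dim=24, step=1)
print(len(coarse), len(fine))  # 364 2600

# After optimizing over the coarse grid, search only near the winners.
neighbours = refine_around([(4, 6, 8)], step=1, max_dim=24)
print(len(neighbours))  # 27
```

In this toy setting the 2-inch grid has 364 candidates against 2,600 on the 1-inch grid, and refining around one promising box adds only 27 finer candidates instead of reopening the whole fine grid.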

“Theoretically, the algorithm will converge on a high-quality box suite no matter where you start,” says Garcia.

The team also employed process parallelization across multiple computational cores to break the problem into smaller chunks.

“Multiple cores can be doing this in parallel, exploring alternate solutions. And every so often they communicate their best solution back to each other,” says Garcia. The result: PackOpt can solve in minutes what previously took weeks of computation time.
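
A minimal sketch of that pattern follows, assuming a simple swap-based local search that the article does not describe. All names and data are hypothetical, and threads stand in for cores to keep the example portable; a process pool would be needed for true multi-core execution in Python.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def fits(item, box):
    return all(i <= b for i, b in zip(sorted(item), sorted(box)))

def suite_cost(shipments, suite):
    """Total volume using the smallest fitting box; infinite if a shipment fits none."""
    total = 0
    for item in shipments:
        usable = [b[0] * b[1] * b[2] for b in suite if fits(item, b)]
        if not usable:
            return float("inf")
        total += min(usable)
    return total

def local_search(shipments, candidates, start, steps, seed):
    """One worker: repeatedly swap a random box in the suite, keeping improvements."""
    rng = random.Random(seed)
    best, best_cost = list(start), suite_cost(shipments, start)
    for _ in range(steps):
        trial = list(best)
        trial[rng.randrange(len(trial))] = rng.choice(candidates)
        cost = suite_cost(shipments, trial)
        if cost < best_cost:
            best, best_cost = trial, cost
    return best_cost, best

def parallel_search(shipments, candidates, start, workers=4, rounds=5, steps=50):
    """Workers explore independently, then share the best suite after each round."""
    best_cost, best = suite_cost(shipments, start), list(start)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for round_no in range(rounds):
            futures = [
                pool.submit(local_search, shipments, candidates, best, steps,
                            seed=round_no * workers + w)
                for w in range(workers)
            ]
            for future in futures:
                cost, suite = future.result()
                if cost < best_cost:
                    best_cost, best = cost, suite
    return best_cost, best

# Hypothetical data: item and box dimensions in inches.
shipments = [(2, 3, 4), (3, 3, 3), (8, 5, 2), (9, 6, 6), (4, 4, 10)]
candidates = [(4, 4, 4), (6, 4, 4), (8, 6, 2), (10, 4, 4), (10, 6, 6), (12, 8, 8)]
start = [(12, 8, 8)] * 3  # a deliberately poor starting suite

cost, suite = parallel_search(shipments, candidates, start)
print(cost, suite)
```

Each round starts every worker from the current shared best with a different random seed, so the workers explore alternate neighbourhoods and only the exchange step needs coordination.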

3D Tetris

PackOpt for box suites shipping single items launched in 2018. A year later, an enhanced version was capable of identifying the best box suite for shipments containing multiple items in the same box.

For this iteration, the team added a high-performance algorithm that very rapidly determines how the different items to be delivered together can be configured to fit inside a candidate box — think 3D Tetris. PackOpt also knows, for example, that foldable or compressible items such as clothing can easily be slotted in around other, more solid items.
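
The article does not describe the fitting algorithm itself, so the sketch below substitutes a much simpler and deliberately conservative heuristic: it asks whether rigid items can be stacked along one axis of the box, with each item rotated to any axis-aligned orientation. The function names and dimensions are hypothetical, and compressible items are not modelled.

```python
from itertools import permutations

def fits_single(item, box):
    """A single rigid item fits if its sorted dimensions do not exceed the box's."""
    return all(i <= b for i, b in zip(sorted(item), sorted(box)))

def fits_stacked(items, box):
    """Return True if the items can be stacked along one axis of the box.

    A False result does not prove the items cannot fit in some other arrangement.
    """
    for axis in range(3):
        others = [a for a in range(3) if a != axis]
        used = 0
        feasible = True
        for item in items:
            # Shortest extent along the stacking axis among orientations
            # whose footprint fits the other two box dimensions.
            extents = [
                orient[axis]
                for orient in set(permutations(item))
                if all(orient[a] <= box[a] for a in others)
            ]
            if not extents:
                feasible = False
                break
            used += min(extents)
        if feasible and used <= box[axis]:
            return True
    return False

box = (10, 8, 6)
print(fits_stacked([(8, 6, 2), (10, 8, 3)], box))  # True: stacked heights 2 + 3 <= 6
print(fits_stacked([(8, 6, 4), (10, 8, 3)], box))  # False: no single-axis stack fits
```

In the second case the two items total 432 cubic inches against a 480-cubic-inch box, so a volume comparison alone would wrongly accept them. That gap is the reason a geometric fitting check is needed.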

In theory, this meant packing more items into better-fitting boxes. But did it work in practice?

“One of our colleagues, Neb Getaneh, designed and conducted studies in the Amazon Packaging Lab to quantify the impact of packaging boxes with less air due to size and fitting algorithm optimization,” says Gasperino. “And we did not see any degradation in packing performance.”

But creating a clever algorithm doesn’t automatically translate into real-world impact.

“There are many different steps that must happen between solving this optimization problem and actually delivering optimized packaging to our customers’ doorsteps,” says Gasperino. “We needed the regional packaging leads all over the world, who aren’t scientists, to quickly understand how to use PackOpt and to see the economic value in it for themselves, and eventually become the champions for packaging optimization.”

Democratizing the tool

Ease of use would be critical in the push to democratize the tool.

“PackOpt’s algorithms have about 25 different parameters and they're all scientific in nature,” Garcia says. “We didn’t want the user to worry about that kind of thing, so we abstracted these parameters away, behind the scenes.”

Gasperino and team also partnered with AWS ProServe consultants to design and build a streamlined web app to democratize use of PackOpt. The resulting user interface is simple, essentially requiring two inputs: historical shipment data for the region aiming to optimize its boxes, and the dimensions of the boxes in its current suite.

“PackOpt will then simulate how well your products fit in your current boxes, giving you a total cardboard weight, box utilization rate, and packaging volume — among many other metrics — and compare those metrics with an optimized box suite,” says Chris Collins, a support engineer who helped develop the PackOpt web tool.
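
A comparison of this kind might be computed along the following lines. This is a sketch under assumptions, not the tool's actual calculation: the metric definitions, function names, and data are hypothetical, each shipment is a single rigid item, and outer surface area is used as a rough proxy for cardboard weight.

```python
def fits(item, box):
    return all(i <= b for i, b in zip(sorted(item), sorted(box)))

def volume(dims):
    length, width, height = dims
    return length * width * height

def surface_area(box):
    length, width, height = box
    return 2 * (length * width + length * height + width * height)

def suite_metrics(shipments, suite):
    """Simulate packing each shipment into the smallest fitting box of the suite."""
    item_volume = box_volume = board_area = 0
    for item in shipments:
        usable = [box for box in suite if fits(item, box)]
        if not usable:
            raise ValueError(f"no box in the suite fits {item}")
        box = min(usable, key=volume)
        item_volume += volume(item)
        box_volume += volume(box)
        board_area += surface_area(box)
    return {
        "packaging_volume": box_volume,
        "utilization": item_volume / box_volume,
        "air_per_shipment": (box_volume - item_volume) / len(shipments),
        "cardboard_area": board_area,  # proxy for cardboard weight (assumption)
    }

# Hypothetical data: item and box dimensions in inches.
shipments = [(2, 3, 4), (3, 3, 3), (8, 5, 2), (9, 6, 6), (4, 4, 10)]
current = suite_metrics(shipments, [(6, 4, 4), (10, 6, 6), (12, 8, 8)])
optimized = suite_metrics(shipments, [(8, 6, 2), (10, 4, 4), (10, 6, 6)])

for name in current:
    print(f"{name}: {current[name]:.2f} -> {optimized[name]:.2f}")
```

For this toy data the packaging volume falls from 1,272 to 872 cubic inches and utilization rises from about 48% to about 71%, the sort of gap that would signal a case for changing the suite.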

If a significant improvement is revealed, there is an immediate business and sustainability case for optimizing that suite with boxes of more appropriate dimensions. PackOpt can also identify whether increasing the number of box options in a given suite will boost efficiency significantly, and it can automatically track savings after teams have deployed their suite.

“The savings tracking function was developed to help stakeholders quantify the impacts of their optimized box suites in a scalable manner,” Collins explains. “This function could also be used to help the stakeholder keep their finger on the pulse of the optimized packaging suite, knowing that if the savings metrics begin to fall off it could signal to the team the need to re-optimize the current package selections.”

Another of the key metrics PackOpt reveals is air per shipment.

“It’s understandably a hot topic with Amazon customers who receive an order with too much air in the box compared with the item itself,” says Collins. “PackOpt helps improve our customer experience by really driving down such shipments.”

The word gets out

PackOpt has been embraced in fulfillment centers around the world. After the tool proved its operational effectiveness in North America, Amazon Japan was the first to show a keen interest and develop its own box suite.

“The products going through our Japan FCs are different to those going through North America’s, so there's no reason the box suites should be the same across those two regions,” notes Gasperino.

“Using PackOpt has simplified my team’s work significantly,” says Myles Lefkovitz, a customer packaging experience manager in Tokyo. “We’ve been able to accomplish things that simply wouldn’t have been possible without it and driven down our packaging costs.”

Use of the tool quickly spread around the world at the regional level. But such is the power and flexibility of PackOpt that it is increasingly being used at a more granular level by Amazon stakeholders, says Collins.

“In India, for example, customers’ purchasing behavior, and the items purchased, vary vastly across the country, so managers at Amazon India have used PackOpt to tailor bespoke box suites for each fulfillment center.”

“Packaging optimization is a crucial part of Amazon’s commitment to The Climate Pledge and reducing waste on behalf of customers,” says Alex Hartford, business lead for packaging optimization. “In a company the scale of Amazon, even seemingly small optimizations in material reduction make a big impact not only in terms of carbon impact, but also on Amazon’s ability to lower our cost structures and spin the Amazon flywheel.”

In addition to different Amazon regions selling different products, as much as a third of a given region’s Amazon catalogue might change from one year to the next, meaning the product profile is forever changing. Moreover, new packaging types — such as recycled padded mailers or poly bags — also affect the optimal box suite. As a result, PackOpt’s monitoring mission is ongoing.

Its creators envision how the technology could usefully spill over to the wider Amazon.

“Amazon itself is a nested packing problem, right?” says Garcia. “You put customer orders inside boxes, you put boxes inside tote bags, you put tote bags inside trucks … We have storage facilities of all shapes and sizes, and we need to optimize the dimensions of all of these.”

In fact, Garcia has begun applying the underlying PackOpt concepts to related applications throughout Amazon. For example, he has partnered with colleagues from Last Mile Transportation to redesign Amazon Robotics pods for outbound packages in sortation centers.

The team developed a local search framework to solve this more challenging nested packing variant (products in packages, packages in bins, and bins in pods). The framework generates designs requiring 33% fewer pods and leads to more efficient use of precious facility space.

“This sort of optimization opportunity exists throughout our supply chain,” says Hartford. “It is critical that we look at other parts of our network to see where we can apply both the fitting algorithms that we've developed and the optimization tools.”

Besides theoretical analysis and quality framework development, an Applied Scientist will also work closely with talented engineers, domain experts, and vendor teams to put quality strategies and automated judging systems into practice.