The rate of innovation in machine learning is off the charts: what is possible today was barely on the drawing board even a handful of years ago. At Amazon, this has manifested in a robotic system that can not only identify potential space in a cluttered storage bin, but also sensitively manipulate that bin’s contents to create that space before successfully placing additional items inside, a result that, until recently, was impossible.
This journey starts when a product arrives at an Amazon fulfillment center (FC). The first order of business is to make it available to customers by adding it to the FC's available inventory.
In practice, this means picking it up and stowing it in a storage pod. A pod is akin to a big bookcase, made of sturdy yellow fabric, that comprises up to 40 cubbies, known as bins. Each bin has strips of elastic across its front to keep the items inside from falling out. These pods are carried by a wheeled robot, or drive unit, to the workstation of the Amazon associate doing the stowing. When the pod is mostly full, it is wheeled back into the warehouse, where the items it contains await a customer order.
Stowing is a major component of Amazon’s operations. It is also a task that seemed an intractable problem from a robotic automation perspective, due to the subtlety of thought and dexterity required to do the job.
Picture the task. You have an item for stowing in your hand. You gauge its size and weight. You look at the array of bins before you, implicitly perceiving which are empty, which are already full, which bins have big chunks of space in them, and which have the potential to make space if you, say, pushed all the items currently in the bin to one side. You select a bin, move the elastic out of the way, make room for the item, and pop it in. Job done. Now repeat.
“Breaking all existing industrial robot thinking”
This stow task requires two high-level capabilities not generally found in robots. One, an excellent understanding of the three-dimensional world. Two, the ability to manipulate a wide range of packaged but sometimes fragile objects — from lightbulbs to toys — firmly, but sensitively: pushing items gently aside, flipping them up, slotting one item at an angle between other items and so on.
For a robotic system to stand a chance at this task, it would need intelligent visual perception, a free-moving robot arm, an end-of-arm manipulator unknown to engineering, and a keen sense of how much force it is exerting. In short: good luck with that.
“Stow fundamentally breaks all existing industrial robotic thinking,” says Siddhartha Srinivasa, director of Amazon Robotics AI. “Industrial manipulators are typically bulky arms that execute fixed trajectories very precisely. It’s very positional.”
When Srinivasa joined Amazon in 2018, multiple robotics programs had already attempted to stow to fabric pods using stiff positional manipulators.
“They failed miserably at it because it's a nightmare. It just doesn't work unless you have the right computational tool: you must not think physically, but computationally.”
Srinivasa knew the science for robotic stow didn’t exist yet, but he knew the right people to hire to develop it. He approached Parker Owan as he completed his PhD at the University of Washington.
A “beautiful problem”
“At the time I was working on robotic contact, imitation learning, and force control,” says Owan, now a Robotics AI senior applied scientist. “Sidd said ‘Hey, there’s this beautiful problem at Amazon that you might be interested in taking a look at’, and he left it at that.”
The seed was planted. Owan joined Amazon, and then in 2019 dedicated himself to the stow challenge.
“I came at it from the perspective of decision-making algorithms: the perception needs; how to match items to the appropriate bin; how to leverage information of what's in the bin to make better decisions; motion planning for a robot arm moving through free space; and then actually making contact with products and creating space in bins.”
About six months into his exploratory work, Owan was joined by a small team of applied scientists, and hardware expert Aaron Parness, now a Robotics AI senior manager of applied science. Parness admits he was skeptical.
“My initial reaction was ‘Oh, how brave and naïve that this guy, fresh out of his PhD, thinks robots can deal with this level of clutter and physical contact!’”
But Parness was quickly hooked. “Once you see how the problem can be broken down and structured, it suddenly becomes clear that there's something super useful and interesting here.”
“Uncharted territory”
From a hardware perspective, the team needed to find a robot arm with force feedback. They tried several before landing on an effective model. The arm reports, hundreds of times per second, how much force it is applying and any resistance it is meeting. Using this information to control the robot is called compliant manipulation.
“We knew from the beginning that we needed compliant manipulation, and we hadn't seen anybody in industry do this at scale before,” says Owan. “It was uncharted territory.”
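To make the contrast with a purely positional arm concrete, here is a minimal sketch of a force-limited push. It is a toy model, not the team's controller: the `SimulatedBin` class, the spring-like resistance, and the default limits are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class SimulatedBin:
    """Hypothetical 1-D bin: items slide freely, then resist like a spring."""
    free_travel_mm: float       # distance items slide before they are packed tight
    stiffness_n_per_mm: float   # resistance per mm once the items are compressed

    def resistance(self, position_mm: float) -> float:
        compression = max(0.0, position_mm - self.free_travel_mm)
        return compression * self.stiffness_n_per_mm


def compliant_push(bin_model, max_force_n=10.0, step_mm=1.0, max_travel_mm=200.0):
    """Advance in small steps, checking the sensed force before each one.

    The push stops as soon as the next step would exceed the force limit,
    instead of driving to a fixed position regardless of resistance.
    """
    position = 0.0
    while position < max_travel_mm:
        next_position = position + step_mm
        if bin_model.resistance(next_position) > max_force_n:
            break  # too much resistance: hold here rather than crush the items
        position = next_position
    return position


# Items slide 50 mm, then push back at 2 N per mm.
crowded_bin = SimulatedBin(free_travel_mm=50.0, stiffness_n_per_mm=2.0)
print(compliant_push(crowded_bin, max_force_n=10.0))          # 55.0
print(compliant_push(crowded_bin, max_force_n=float("inf")))  # 200.0
```

With the force limit in place the push stops 5 mm into the compressed items. With no limit, which is how a positional trajectory behaves in this toy model, the arm drives the full 200 mm no matter what is in the way.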
Parness got to work on the all-important hardware. The problem of moving the elastics aside to stow an item was resolved using a relatively simple hooking system.
The end-of-arm tool (EOAT) proved to be a next-level challenge. One reason that stowing is difficult for robots is the sheer diversity of items Amazon sells, and their associated packaging. You might have an unpumped soccer ball next to a book, next to a sports drink, next to a T-shirt, next to a jewelry box. A robot would need to handle this level of variety. The EOAT evolved quickly over two years, with multiple failures and iterations.
“In the end, we found that gently squeezing an item between two paddles was the more stable way to hold items than using suction cups or mechanical pinchers,” says Parness.
However, the paddle setup presented a challenge when trying to insert held items into bins: the paddles kept getting in the way. Parness and his growing team hit upon an alternative: holding the item next to a bin, before simultaneously opening the paddles and using a plunger to push the item in. This drop-and-push technique was prone to errors because not all items reacted to it in the same way.
The EOAT’s next iteration saw the team put miniature conveyor belts on each paddle, enabling the EOAT to feed items smoothly into the bins without having to enter the bin itself.
“With that change, our stowing success rate jumped from about 80% to 99%. That was a eureka moment for us — we knew we had our winner,” says Parness.
Making space with motion primitives
The ability to place items in bins is crucial, but so is making space in cluttered bins. To better understand what would be required of the robot system, the team closely studied how they performed the task themselves. Owan even donned a head camera to record his efforts.
The team was surprised to find that the vast majority of space-making hand movements within a fabric bin could be boiled down to four types, or “motion primitives”: a sideways sweep of the bin’s current contents, flipping upright things that are lying flat, stacking, and slotting something at an angle into the gap between other items.
The engineers realized that the EOAT’s paddles could not get involved with this bin-manipulation task, because they would get in the way. The solution, in the end, was surprisingly simple: a thin metal sheet that could extend from the EOAT, dubbed “the spatula”. The extended spatula can firmly, but sensitively, push items to one side, flip them up, and generally be used to make room in a bin, before the paddles eject an item into the space created.
But how does the system know how full the pod’s bins are, and how does it decide where, and how, it will make space for the next item to be stowed? This is where visual perception and machine learning come into play.
Deciding where to attempt to stow an item requires a good understanding of how much space, in total, is available in each fabric bin. In an ideal world, this is where 3D sensor technologies such as LiDAR would be used. However, because the elastic cords across the front of every bin partially block the view inside, this option isn’t feasible.
Instead, the system’s visual perception is based on cameras pointed at the pod that feed their image data to a machine learning system. Based on what it can see of each bin’s contents, the system “erases” the elastics and models what is lying unseen in the bin, and then estimates the total available space in each of the pod’s bins.
Often there is space available in a cluttered bin, but it is not contiguous: there are pockets of space here and there. The ML system — based in part on existing models developed by the Amazon Fulfillment Technologies team — then predicts how much contiguous space it can create in each bin, given the motion primitives at its disposal.
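The difference between total space and contiguous space is easy to see in a one-dimensional toy model of a bin. The sketch below is only an illustration under simplifying assumptions: items are reduced to a left edge and a width along the bin, and a sweep is modeled as packing everything against the left wall. The actual system relies on learned models working from camera images, and the function names here are hypothetical.

```python
def free_gaps(bin_width, items):
    """Return the widths of the empty gaps in a bin, left to right.

    items is a list of (left_edge, width) pairs measured along the bin.
    """
    gaps = []
    cursor = 0.0
    for left, width in sorted(items):
        if left > cursor:
            gaps.append(left - cursor)
        cursor = max(cursor, left + width)
    if cursor < bin_width:
        gaps.append(bin_width - cursor)
    return gaps


def sweep_left(items):
    """Model a sideways sweep: pack every item against the left wall."""
    packed = []
    cursor = 0.0
    for _, width in sorted(items):
        packed.append((cursor, width))
        cursor += width
    return packed


items = [(5.0, 10.0), (20.0, 8.0), (32.0, 4.0)]
before = free_gaps(40.0, items)
after = free_gaps(40.0, sweep_left(items))
print(sum(before), max(before))  # 18.0 5.0
print(sum(after), max(after))    # 18.0 18.0
```

The bin holds 18 units of free space both before and after the sweep, but the largest single pocket grows from 5 units to 18. That jump is the quantity the prediction step is trying to estimate for each bin.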
“These primitives, each of which can be varied as needed, can be chained in infinitely many ways,” Srinivasa explains. “It can, say, flip it over here, then push it across and drop the item in. Humans are great at identifying these primitives in the first place, and machine learning is great at organizing and orchestrating them.”
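One way to picture this orchestration is as a search over short chains of primitives, each scored by a predictor. The sketch below is a deliberately simplified assumption, not the team's method: it ignores the continuous variation of each primitive, enumerates only chains up to a fixed length, and uses a made-up scoring function (`toy_predictor`, with invented gain values) in place of a learned model.

```python
from enum import Enum
from itertools import product


class Primitive(Enum):
    """The four space-making motion primitives named in the article."""
    SWEEP = "sweep"
    FLIP_UP = "flip_up"
    STACK = "stack"
    SLOT = "slot"


def candidate_plans(max_length):
    """Yield every chain of primitives from length 1 up to max_length."""
    for length in range(1, max_length + 1):
        yield from product(Primitive, repeat=length)


def best_plan(predict_space, max_length=2):
    """Return the chain whose predicted contiguous space is largest."""
    return max(candidate_plans(max_length), key=predict_space)


# Hypothetical stand-in for a learned model: invented per-primitive gains
# for one particular bin, minus a small cost for every extra motion.
TOY_GAINS = {
    Primitive.SWEEP: 6.0,
    Primitive.FLIP_UP: 4.0,
    Primitive.STACK: 1.0,
    Primitive.SLOT: 0.0,
}


def toy_predictor(plan):
    return sum(TOY_GAINS[p] for p in set(plan)) - 0.5 * len(plan)


plan = best_plan(toy_predictor, max_length=2)
print([p.value for p in plan])  # ['sweep', 'flip_up']
```

Even with only four primitives and chains of at most two moves there are 20 candidates, and the count grows quickly with chain length, which is part of why a learned model is a natural fit for ranking them.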
When the system has a firm idea of the options, it considers the items in its buffer — an area near the robot arm’s gantry in which products of various shapes and sizes wait to be stowed — and decides which items are best placed in which bins for maximum efficiency.
“For every potential stow, the system will predict its likelihood of success,” says Parness. “When the best prediction of success falls to about 96%, which happens when a pod is nearly full, we send that pod off and wheel in a new one.”
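The decision Parness describes can be sketched as a simple rule: pick the item-and-bin pairing with the highest predicted success, and swap the pod once even the best option is no longer good enough. The threshold below comes from the "about 96%" figure in the quote. Everything else, including the function name, the dictionary layout, and the choice to swap strictly below the threshold, is an assumption made for illustration.

```python
SUCCESS_THRESHOLD = 0.96  # "about 96%" per the article; exact cutoff is assumed


def choose_stow(predictions, threshold=SUCCESS_THRESHOLD):
    """Pick the most promising stow, or signal that the pod should be swapped.

    predictions maps (item, bin) pairs to a predicted probability of success.
    Returns the best pair, or None when no option meets the threshold.
    """
    if not predictions:
        return None
    best_pair = max(predictions, key=predictions.get)
    if predictions[best_pair] < threshold:
        return None  # pod is effectively full: send it off, wheel in a new one
    return best_pair


# Hypothetical predictions for two buffered items across two bins.
fresh_pod = {("mug", "A1"): 0.99, ("mug", "B2"): 0.97, ("book", "B2"): 0.98}
nearly_full_pod = {("mug", "A1"): 0.95, ("book", "B2"): 0.90}

print(choose_stow(fresh_pod))        # ('mug', 'A1')
print(choose_stow(nearly_full_pod))  # None
```

Scoring every buffered item against every bin is what lets the system choose which item to stow next as well as where to put it.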
“Robots and people work together”
At the end of summer 2021, with its potential feasibility and value becoming clearer, the senior leadership team at Amazon gave the project their full backing.
“They said ‘As fast as you can go; whatever you need’. So this year has been a wild, wild ride. It feels like we’re a start-up within Amazon,” says Parness, who notes that the approach has significant advantages for FC employees as well.
“Robots and people work together in a hybrid system. Robots handle repetitive tasks and easily reach to the high and low shelves. Humans handle more complex items that require intuition and dexterity. The net effect will be more efficient operations that are also safer for our workers.”
Prototypes of the robotic stow workstation are installed at a lab in Seattle, Washington, and another system has been installed at an FC in Sumner, Washington, where it deals with live inventory. Already, the prototypes are stowing items well and showcasing the viability of the system.
“And there are always four or five scientists and engineers hovering around the robot, documenting issues and looking for improvements,” says Parness.
This year, in a stowing test designed to include a variety of challenging product attributes — bagged items, irregular items with an offset center of gravity, and so on — the system successfully stowed 94 of 95 items. Of course, some items can never be stowed by this system, including particularly bulky or heavy products, or cylindrical items that don’t behave themselves on conveyor belts. The team’s ultimate target is to be able to stow 85% of products stocked by a standard Amazon FC.
“Interacting with chaotic arrangements of items, unknown items with different shapes and sizes, and learning to manipulate them in intelligent ways, all at Amazon scale — this is ground-breaking,” says Owan. “I feel like I’m at ground zero for a big thing, and that’s what makes me excited to come to work every day.”
“Stow will be the first brownfield automation project, at scale, at Amazon,” says Srinivasa. “Surgically inserting automation into existing buildings is very challenging, but we're enacting a future in which robots and humans can actually work side by side without us having to dramatically change the human working environment.
“One of the advantages of the type of brownfield automation we do at Robotics AI is that it’s minimally disruptive to the process flow or the building space, which means that our robots can truly work alongside humans,” Srinivasa adds. “This is also a future benefit of compliant arms as they can, via software and AI, be made safer than industrial arms.”
Robots and humans working side by side is key to the long-term expansion of this technology beyond retail, says Parness.
“Think of robots loading delicate groceries or, longer term, loading dishwashers or helping people with tasks around the house. Robots with a sense of force in their control loop is a new paradigm in compliant-robotics applications.”