Every day at Amazon fulfillment centers, more than half a million robots assist with stocking inventory, filling orders, and sorting packages for delivery. These robots follow directions provided by cloud-based algorithms and navigate along a grid of encoded markers. Virtual and physical barriers restrict where they can and cannot go, as well as how they interact with people.
Now, the company is testing a new class of robots that use artificial intelligence and computer vision to roam freely throughout the fulfillment center (FC). They are helping associates accomplish tasks such as transporting oversized and unwieldy items through the shape-shifting maze of people, pallets, and pillars laid out across the fulfillment center floor, which can cover several dozen football fields.
“This is the first instance of AI being used in autonomous mobility at Amazon,” said Siddhartha Srinivasa, director of Amazon Robotics AI.
The key to success for these new robots is what Amazon scientists call semantic understanding: the ability to perceive the three-dimensional structure of the world, distinguish each object in it, and know how each object behaves. With that understanding updated in real time, the robots can safely navigate cluttered, dynamic environments.
For now, these robots are deployed in a few fulfillment centers where they are performing a narrow set of tasks. Researchers are exploring how to integrate these robots seamlessly and safely with the established processes that Amazon associates follow to fulfill millions of customer orders every day.
“We don’t develop technology for technology’s sake,” said Srinivasa. “We want to develop technology with an end goal in mind of empowering our associates to perform their activities better and safer. If we don’t integrate seamlessly end-to-end, then people will not use our technology.”
Robots today
About 10% of the items ordered from the Amazon Store are too long, wide, or otherwise unwieldy to fit in pods or on conveyor belts in many Amazon FCs. Today, FC employees transport these oversized items across the fulfillment center with pulleys and forklifts, navigating the ever-shifting maze of pods, pallets, robots, and people. The goal is to have robots handle this sometimes awkward task.
Ben Kadlec, perception lead for Amazon Robotics AI, is leading the development of the AI for the new robots. His team has deployed the robots for preliminary testing as autonomous transports for non-conveyable items.
To succeed, the robots need to map their environment in real time, distinguish stationary objects from moving ones, and use that information to make on-the-fly decisions about where to go and how to avoid collisions so they can safely deliver the oversized items to their intended destinations.
“Navigating through those dynamic spaces is one aspect of the challenge,” he said. “The other one is working in close proximity with humans. That has to do with first recognizing that this thing in front of you is a human and it might move, you might need to keep a further distance from it to be safe, you might need to predict the direction the human is going.”
Teaching robots what’s what
We humans learn about the objects in our environment and how to safely navigate around them through curiosity and trial and error, along with the guidance of family, friends, and teachers. Kadlec and his team use machine learning.
The process begins with semantic understanding, or scene comprehension, based on data collected with the robot’s cameras and LIDAR.
“When the robot takes a picture of the world, it gets pixel values and depth measurements,” explained Lionel Gueguen, an Amazon Robotics AI machine learning applied scientist. “So, it knows at that distance, there are points in space — an obstacle of some sort. But that is the only knowledge the robot has without semantic understanding.”
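As a rough illustration of how those depth measurements become "points in space," the sketch below back-projects a depth image into a camera-frame point cloud using the standard pinhole-camera model. The image size and the intrinsics (fx, fy, cx, cy) are hypothetical placeholders, not values from Amazon's robots.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) into camera-frame 3D points.

    depth: (H, W) array of depth values; fx, fy, cx, cy: pinhole intrinsics.
    Returns an (N, 3) array of [x, y, z] points for pixels with valid depth.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx                           # pinhole model: X = (u - cx) * Z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]                 # drop pixels with no depth reading

# Hypothetical example: a 480x640 depth image and made-up intrinsics.
depth = np.random.uniform(0.5, 10.0, size=(480, 640))
cloud = depth_to_points(depth, fx=525.0, fy=525.0, cx=319.5, cy=239.5)
print(cloud.shape)  # (number of valid points, 3)
```

Without labels, each of those points is just "an obstacle of some sort," which is exactly the gap semantic understanding fills.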
Semantic understanding, he continued, is about teaching the robot to define that point in space — to determine if it belongs to a person, a pod, or a pillar. Or, if it’s a cable lying across the floor, or a forklift, or another robot.
When these labels are layered on top of the three-dimensional visual representation, the robot can classify each point in space as static or mobile and use that information to calculate the safest path to its destination.
“The navigation system does what we call semantically aware planning and navigation,” said Srinivasa. “The intuition is very simple: The way a robot moves around a trash can is probably going to be different from the way it navigates around a person or a precious asset. The only way the robot can know that is if it’s able to identify, ‘Oh that’s the trash can or that’s the person.’ And that’s what our AI is able to do.”
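The article does not describe the planner itself, but one way to picture semantically aware planning is a cost map in which the clearance kept around an object depends on its class. The sketch below is a minimal, illustrative version of that idea; the class names, costs, and clearance radii are assumptions, not Amazon's actual values.

```python
import numpy as np

# Hypothetical semantic classes, with the clearance (in grid cells) a planner
# might keep around each; people get a much larger safety margin than clutter.
CLEARANCE = {"floor": 0, "trash_can": 1, "pillar": 2, "pallet": 2, "person": 6}
COST = {"floor": 0.0, "trash_can": 50.0, "pillar": 100.0, "pallet": 100.0, "person": 100.0}

def semantic_cost_map(labels):
    """Turn a 2D grid of semantic labels into a planning cost map.

    Occupied cells get their class's base cost, spread over a class-specific
    clearance radius, so planned paths stay farther from people than from
    static clutter like a trash can.
    """
    h, w = labels.shape
    cost = np.zeros((h, w), dtype=float)
    for (r, c), name in np.ndenumerate(labels):
        radius, base = CLEARANCE[name], COST[name]
        if base == 0.0:
            continue
        r0, r1 = max(0, r - radius), min(h, r + radius + 1)
        c0, c1 = max(0, c - radius), min(w, c + radius + 1)
        cost[r0:r1, c0:c1] = np.maximum(cost[r0:r1, c0:c1], base)
    return cost

# Toy scene: mostly open floor, one person and one trash can.
grid = np.full((20, 20), "floor", dtype=object)
grid[5, 5] = "person"
grid[15, 15] = "trash_can"
cost_map = semantic_cost_map(grid)
```

A path planner run over `cost_map` would naturally swing wide of the person while cutting much closer to the trash can, which is the behavior Srinivasa describes.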
To teach the robots semantics, scientists collected thousands of images taken by the robots as they navigated. Then, teams trace the shape of each object in each image and label it. Data scientists use this labeled data to train a machine learning model that segments and labels each object in the cameras’ field of view, a process known as semantic segmentation.
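The article does not specify the model architecture, so the snippet below is only a hedged sketch of what training on such labeled data looks like in general: a deliberately tiny fully convolutional network fit to (image, per-pixel label) pairs. The class count, tensor shapes, and random placeholder data stand in for the real labeled FC images.

```python
import torch
import torch.nn as nn

NUM_CLASSES = 6  # e.g., floor, person, pod, pillar, forklift, robot (illustrative)

# A tiny fully convolutional network; a production system would use a much
# deeper encoder-decoder, but the training loop has the same shape.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, NUM_CLASSES, 1),   # per-pixel class logits
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()      # compares (N, C, H, W) logits to (N, H, W) labels

# Placeholder batch standing in for labeled images: 4 RGB frames with
# human-traced per-pixel labels.
images = torch.rand(4, 3, 120, 160)
labels = torch.randint(0, NUM_CLASSES, (4, 120, 160))

for step in range(10):               # a real run would iterate over the full dataset
    logits = model(images)
    loss = loss_fn(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

pred = model(images).argmax(dim=1)   # per-pixel predicted classes, shape (4, 120, 160)
```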
Layered on top of the semantic understanding are predictive models that teach the robot how to treat each object it detects. When it detects a pillar, for example, it knows that pillars are static and will always be there. The team is working on another model to predict the paths of the people the robot encounters so the robot can adjust course accordingly.
“Our work is improving the representation of static obstacles in the present as well as starting to model the near future of where the dynamic obstacles are going to be,” said Gueguen. “And that representation is passed down in such a way that the robot can plan accordingly to, on one hand, avoid static obstacles and on the other hand avoid dynamic obstacles.”
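The article does not say which motion model the team uses, so the sketch below shows the simplest possible version of "modeling the near future" of a dynamic obstacle: a constant-velocity extrapolation of a tracked person's position, which a planner could treat as a moving keep-out zone. The function names, horizon, and clearance threshold are illustrative assumptions.

```python
import numpy as np

def predict_positions(position, velocity, horizon_s=2.0, dt=0.2):
    """Extrapolate a tracked obstacle forward assuming constant velocity."""
    steps = np.arange(dt, horizon_s + dt, dt)
    return position + steps[:, None] * velocity    # (T, 2) future x/y positions

def path_conflicts(path_xy, obstacle_future, clearance_m=1.5):
    """Return True if any waypoint comes within clearance_m of the obstacle's
    predicted positions; a planner would slow down or re-plan in that case."""
    dists = np.linalg.norm(path_xy[:, None, :] - obstacle_future[None, :, :], axis=-1)
    return bool((dists < clearance_m).any())

# Hypothetical scene: a person at (5, 0) walking toward the robot's planned path.
person_future = predict_positions(np.array([5.0, 0.0]), np.array([-1.0, 0.5]))
planned_path = np.stack([np.linspace(0, 6, 30), np.linspace(0, 3, 30)], axis=1)
print(path_conflicts(planned_path, person_future))
```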
Fulfillment center deployment
Kadlec and his team have deployed a few dozen robots for preliminary testing and refinement at a few fulfillment centers. There, they are moving packages, collecting more data, and delivering insights to the science team on how to improve their real-world performance.
“It’s really exciting,” Kadlec said. “We can see the future scale that we want to be operating at. We see a clear path to being successful.”
Once Kadlec and his colleagues succeed in the full-scale deployment of autonomous mobile robot fleets that can transport precious, oversized packages, they can apply what they have learned to additional robots.
“The particular problem we’re going after right now is pretty narrow, but the capability is very general,” Kadlec said.
The road ahead
Among the challenges of deploying free-roaming robots in Amazon fulfillment centers is making them acceptable to associates, Srinivasa noted.
“If the robot sneaks up on you really fast and hits the brake a millimeter before it touches you, that might be functionally safe, but not necessarily acceptable behavior,” he said. “And so, there’s an interesting question around how do you generate behavior that is not only safe and fluent, but also acceptable, that is also legible, which means that it’s human understandable.”
Amazon scientists who study human-robot interaction are developing techniques for robots to indicate their next move to other people without bright lights and loud sounds. One way they’re doing this is through imitation learning, where robots watch how people move around each other and learn to imitate the behavior.
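One common form of imitation learning is behavior cloning, sketched below purely as an illustration: a small policy network maps an observation of nearby people to a velocity command and is trained to reproduce the commands a human demonstrator chose in the same situations. The feature layout and the random placeholder data are assumptions, not details from Amazon's system.

```python
import torch
import torch.nn as nn

# Placeholder dataset: each row pairs an observation (relative positions and
# velocities of the two nearest people) with the velocity command a human
# demonstrator chose in that situation.
observations = torch.rand(256, 8)           # 2 neighbors x (dx, dy, dvx, dvy)
expert_commands = torch.rand(256, 2) - 0.5  # demonstrated (vx, vy)

policy = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Behavior cloning: regress the demonstrator's command from the observation.
for epoch in range(50):
    pred = policy(observations)
    loss = nn.functional.mse_loss(pred, expert_commands)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# At run time, the learned policy proposes a human-like velocity for the robot.
command = policy(torch.rand(1, 8))
```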
The challenge of acceptance, Srinivasa said, is part of the broader challenge of seamlessly integrating robots into the process path at Amazon fulfillment centers.
“We are writing the book of robotics at Amazon,” he said, noting that it’s an ongoing process. “One of the joys of being in a place like Amazon is that we have direct access to and direct contact with our end users. We get to talk to our associates and ask them, ‘How do you feel about this?’ That internal customer feedback is critical to our development process.”