Using hypergraphs to improve product retrieval

Augmenting query-product graphs with hypergraphs describing product-product relationships improves recall score by more than 48%.

Information retrieval engines like the one that helps Amazon customers find products in the Amazon Store commonly rely on bipartite graphs that map queries to products. The graphs are typically based on customer behaviors: if enough customers executing the same query click the link to a product or buy that product, the graph will include an edge between query and product. A graph neural network (GNN) can then ingest the graph and predict edges corresponding to new queries.
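As a minimal sketch of how such a behavior-based graph could be assembled, the snippet below counts (query, product) interactions and keeps an edge only when the count reaches a threshold. The function name, the `min_count` parameter, and the sample events are illustrative assumptions, not details from the paper.

```python
from collections import Counter

def build_query_product_edges(events, min_count=2):
    """Keep a (query, product) edge only if it was observed at least min_count times.

    `events` is a list of (query, product) click or purchase records.
    The threshold value is a hypothetical choice for illustration.
    """
    counts = Counter(events)
    return {pair for pair, n in counts.items() if n >= min_count}

# Toy behavior log (made-up data).
events = [
    ("running shoes", "P1"),
    ("running shoes", "P1"),
    ("running shoes", "P2"),
    ("trail shoes", "P2"),
    ("trail shoes", "P2"),
]
edges = build_query_product_edges(events, min_count=2)
print(sorted(edges))
```

A single click on P2 for "running shoes" does not clear the threshold, so that pair gets no edge.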

This approach has two drawbacks. The first is that most products in the Amazon Store belong to the long tail of items that are rarely searched for, so they don’t have enough associated data to make GNN training reliable. The second is that, when handling long-tail queries, a GNN will tend to match them to popular but probably unrelated products, simply because those products have high click and purchase rates overall. This phenomenon is known as disassortative mixing.

In a paper we presented at the ACM Conference on Web Search and Data Mining (WSDM), we address both of these problems by augmenting the bipartite query-product graph with information about which products customers tend to look at during the same online shopping sessions. The idea is that knowing which types of products are related to each other can help the GNN generalize from high-frequency to low-frequency queries.

To capture the information about product relationships, we use a hypergraph, a generalization of the graph structure; where an edge in an ordinary graph links exactly two nodes, an edge in a hypergraph can link multiple nodes. Other information retrieval approaches have used product similarity to improve performance, but modeling product similarity with the hypergraph allows us to use GNNs for prediction, so we can exploit the added structure available in graph representations of data.
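One common way to represent a hypergraph is an incidence matrix, with one row per node and one column per hyperedge. The sketch below treats each shopping session as a hyperedge over the products viewed in it; the product names and sessions are made up for illustration.

```python
def incidence_matrix(nodes, hyperedges):
    """Return H where H[i][j] is 1 if node i belongs to hyperedge j, else 0."""
    return [[1 if node in edge else 0 for edge in hyperedges] for node in nodes]

# Hypothetical products and sessions; each session is one hyperedge.
products = ["tent", "sleeping bag", "stove", "lantern"]
sessions = [
    {"tent", "sleeping bag", "stove"},  # links three nodes at once
    {"stove", "lantern"},               # an ordinary two-node edge is a special case
]
H = incidence_matrix(products, sessions)
for name, row in zip(products, H):
    print(name, row)
```

The first column has three nonzero entries, which an ordinary graph edge could not express.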

In tests, we compared our approach to one that uses GNNs on a bipartite graph only. Adding the hypergraph improved the mean reciprocal rank of the results, which assigns a higher score the closer the correct answer is to the top of a ranked list, by almost 25%. It improved the recall score, which measures the percentage of correct answers retrieved, by more than 48%.
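For readers unfamiliar with the two metrics, here is a small sketch of how they are commonly computed. The cutoff `k` and the toy rankings are assumptions for illustration; the paper's exact evaluation protocol is not reproduced here.

```python
def reciprocal_rank(ranked, relevant):
    """Return 1/position of the first relevant item, or 0.0 if none is retrieved."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0

def recall_at_k(ranked, relevant, k):
    """Return the fraction of relevant items that appear in the top k results."""
    hits = len(set(ranked[:k]) & set(relevant))
    return hits / len(relevant) if relevant else 0.0

# Two toy queries: (ranked results, set of correct products).
results = [
    (["A", "B", "C"], {"B"}),
    (["D", "E", "F"], {"D", "F"}),
]
mrr = sum(reciprocal_rank(r, rel) for r, rel in results) / len(results)
recall = sum(recall_at_k(r, rel, k=2) for r, rel in results) / len(results)
print(mrr, recall)
```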

Two-channel architecture

GNNs produce vector representations, or embeddings, of individual graph nodes that capture information about their neighbors. The process is iterative: the first embedding captures only information about the object associated with the node — in our case, product descriptions or query semantics. The second embedding combines the first embedding with those of the node’s immediate neighbors; the third embedding extends the node’s neighborhood by one hop; and so on. Most applications use one- or two-hop embeddings.
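The hop-by-hop growth of a node's neighborhood can be sketched with plain mean aggregation. This is a deliberate simplification: a real GNN layer also applies learned weights and a nonlinearity, and the tiny graph and feature vectors below are made up.

```python
def propagate(embeddings, neighbors):
    """One hop: average each node's vector with its immediate neighbors' vectors."""
    updated = {}
    for node, vector in embeddings.items():
        group = [vector] + [embeddings[n] for n in neighbors.get(node, [])]
        updated[node] = [sum(column) / len(group) for column in zip(*group)]
    return updated

# A path graph: query q -- product p1 -- product p2.
neighbors = {"q": ["p1"], "p1": ["q", "p2"], "p2": ["p1"]}

# First embedding: content only (one-hot stand-ins for text features).
hop0 = {"q": [1.0, 0.0, 0.0], "p1": [0.0, 1.0, 0.0], "p2": [0.0, 0.0, 1.0]}
hop1 = propagate(hop0, neighbors)  # q now reflects p1
hop2 = propagate(hop1, neighbors)  # q now also reflects p2, two hops away
print(hop1["q"], hop2["q"])
```

After one hop, the third component of `q` is still zero because `p2` is not an immediate neighbor; after the second hop it becomes nonzero.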

The embedding of a hypergraph modifies this procedure slightly. The first iteration, as in the standard case, embeds each of the item nodes individually. The second iteration creates an embedding for the entirety of each hyperedge. The third iteration then produces an embedding for each node that factors in both its own content-level embedding and the embeddings of all the hyperedges it touches.
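The node-to-hyperedge-to-node pattern described above can be sketched as follows. Mean pooling stands in for the learned aggregation, and the products, features, and sessions are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def hypergraph_embed(node_features, hyperedges):
    # Second iteration: one embedding per hyperedge, pooled from its member nodes.
    edge_embeddings = [
        mean_vector([node_features[n] for n in sorted(edge)]) for edge in hyperedges
    ]
    # Third iteration: each node combines its own content-level embedding
    # with the embeddings of every hyperedge it touches.
    node_embeddings = {}
    for node, features in node_features.items():
        touching = [e for edge, e in zip(hyperedges, edge_embeddings) if node in edge]
        node_embeddings[node] = mean_vector([features] + touching)
    return edge_embeddings, node_embeddings

# First iteration: content-level features per item (made-up values).
node_features = {"tent": [1.0, 0.0], "stove": [0.0, 1.0], "lantern": [1.0, 1.0]}
hyperedges = [{"tent", "stove"}, {"stove", "lantern"}]
edge_embeddings, node_embeddings = hypergraph_embed(node_features, hyperedges)
print(edge_embeddings)
print(node_embeddings["tent"])
```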

The architecture of our model has two channels, one for the query-item bipartite graph and one for the item-item hypergraph. Each graph passes to its own GNN (a graph convolutional network), yielding an embedding for each node.

Figure: An overview of the hypergraph-augmented product retrieval method.

During training, an attention mechanism learns how much weight to give the embedding produced by each channel. A common query with a few popular associated products, for instance, may be well represented by the standard GNN embedding of the bipartite graph. By contrast, a rarely purchased item associated with a few diverse queries may benefit from greater weighting of the hypergraph embedding.
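One simple form such a fusion could take is a softmax over per-channel scores, followed by a weighted sum of the channel embeddings. The scoring function (a dot product with an attention vector) and all values below are assumptions for illustration; the paper's exact attention formulation may differ.

```python
import math

def fuse(channel_embeddings, attention_vector):
    """Weight each channel's embedding by a softmax over dot-product scores.

    `attention_vector` stands in for parameters that would be learned in training.
    """
    scores = [
        sum(a * x for a, x in zip(attention_vector, emb)) for emb in channel_embeddings
    ]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(channel_embeddings[0])
    fused = [
        sum(w * emb[i] for w, emb in zip(weights, channel_embeddings)) for i in range(dim)
    ]
    return weights, fused

bipartite_embedding = [1.0, 0.0]   # hypothetical output of the bipartite-graph channel
hypergraph_embedding = [0.0, 1.0]  # hypothetical output of the hypergraph channel
weights, fused = fuse(
    [bipartite_embedding, hypergraph_embedding], attention_vector=[0.0, 2.0]
)
print(weights, fused)
```

With this attention vector, the hypergraph channel receives the larger weight, as might be appropriate for a long-tail item.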

To maximize the quality of our model’s predictions, we also experimented with two different unsupervised pretraining methods. One is a contrastive-learning approach in which the GNN is fed pairs of training examples. Some are positive pairs, whose embeddings should be as similar as possible, and some are negative pairs, whose embeddings should be as different as possible.
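To make the objective concrete, here is a sketch of one widely used contrastive loss (an InfoNCE-style loss over cosine similarities). The source does not specify which loss the paper uses, so the formula, the temperature, and the vectors are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def contrastive_loss(anchor, positive, negatives, temperature=0.5):
    """Low when the anchor is close to the positive and far from the negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

anchor = [1.0, 0.0]
# Positive is similar to the anchor, negative is dissimilar: small loss.
good = contrastive_loss(anchor, positive=[0.9, 0.1], negatives=[[0.0, 1.0]])
# Roles swapped: the "positive" is dissimilar, so the loss is larger.
bad = contrastive_loss(anchor, positive=[0.0, 1.0], negatives=[[0.9, 0.1]])
print(good, bad)
```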

Following existing practice, we produce positive pairs by randomly deleting edges or nodes of a source graph, so the resulting graphs are similar but not identical. Negative pairs pair the source graph with a different, random graph. We extend this procedure to the hypergraph and ensure consistency between the two channels’ training data; e.g., a node deleted from one channel’s inputs will also be deleted from the other channel’s.
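The cross-channel consistency requirement can be sketched as a single sampling step whose result is applied to both inputs. The function name, the drop rate, and the choice to discard hyperedges left with fewer than two nodes are illustrative assumptions.

```python
import random

def drop_nodes(query_product_edges, hyperedges, drop_rate, rng):
    """Sample product nodes to delete once, then remove them from both channels."""
    products = {p for _, p in query_product_edges} | set().union(*hyperedges)
    dropped = {p for p in sorted(products) if rng.random() < drop_rate}
    # Bipartite channel: remove edges that touch a dropped product.
    kept_edges = [(q, p) for q, p in query_product_edges if p not in dropped]
    # Hypergraph channel: remove the same products from every hyperedge.
    kept_hyperedges = [edge - dropped for edge in hyperedges]
    # Assumption: a hyperedge with fewer than two nodes carries no relation.
    kept_hyperedges = [edge for edge in kept_hyperedges if len(edge) >= 2]
    return kept_edges, kept_hyperedges, dropped

query_product_edges = [
    ("camping stove", "stove"),
    ("camping stove", "tent"),
    ("lamp", "lantern"),
]
hyperedges = [{"tent", "stove", "lantern"}, {"stove", "lantern"}]
view_edges, view_hyperedges, dropped = drop_nodes(
    query_product_edges, hyperedges, drop_rate=0.5, rng=random.Random(0)
)
print(dropped, view_edges, view_hyperedges)
```

Calling the function twice with different random states would yield two similar but not identical views of the same source data, which can serve as a positive pair.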

We also experiment with DropEdge, a procedure in which, in successive training epochs, slightly different versions of the same graph are used, with a few edges randomly dropped. This prevents overfitting and oversmoothing, as it encourages the GNN to learn more abstract representations of its inputs.
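The core of DropEdge is small enough to sketch directly: each epoch resamples which edges to keep. The drop rate, the edge list, and the placeholder training step are assumptions for illustration.

```python
import random

def drop_edge(edges, drop_rate, rng):
    """Return a copy of the edge list with each edge dropped with probability drop_rate."""
    return [edge for edge in edges if rng.random() >= drop_rate]

edges = [("q1", "p1"), ("q1", "p2"), ("q2", "p2"), ("q2", "p3"), ("q3", "p3")]
rng = random.Random(0)

epoch_graphs = []
for epoch in range(3):
    epoch_edges = drop_edge(edges, drop_rate=0.2, rng=rng)
    epoch_graphs.append(epoch_edges)
    # A real training loop would run one epoch of GNN training on epoch_edges here.
    print(epoch, len(epoch_edges))
```

The full edge list is never modified, so every epoch draws its perturbed version from the same underlying graph.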

Pretraining dramatically improves the quality of both our two-channel model and the baseline GNN, but it also widens the gap between the two. That is, our approach by itself sometimes yields only a modest improvement over the baseline model, whereas our approach with pretraining outperforms the baseline model with pretraining by a larger margin.
