Using supervised learning to train models for image clustering

Approach that uses a hierarchical graph neural network improves F-score by 49% relative to predecessors.

Most machine learning models use supervised learning, meaning they’re trained on annotated data, which is costly and time-consuming to acquire.

The chief method for doing unsupervised learning, which doesn’t require annotated data, is clustering, or grouping data points together by salient characteristics. The idea is that each cluster represents some category, such as photos of the same person or the same species of animal.

To decide where to draw boundaries between clusters, clustering algorithms typically rely on heuristics, such as a threshold distance between cluster centers or the shape of the clusters’ distributions. In a paper we’re presenting at the International Conference on Computer Vision (ICCV), we propose, instead, to learn from data how to draw boundaries.

We first represent visual data using a graph, then use a graph neural network (GNN) to produce vector representations of the graph’s nodes. Up to this point, we follow previous work.

Instead of relying on heuristics, however, we use labeled data to learn how to cluster the vectors and, crucially, to decide how fine-grained those clusters should be. We call the labeled data meta-training data, since the goal is to learn a general clustering technique, not a specific classification model. 

In particular, we propose a hierarchical GNN, meaning that it creates clusters by adding edges between nodes of a graph, then adds edges between the clusters to create still larger clusters, and so on, iterating until it decides that no more edges should be added.

A schematic of our graph-based hierarchical clustering approach. The colors of the image borders and of the graph nodes indicate data types (in this case, photos of the same actor). Our approach is hierarchical, iteratively treating small clusters generated at one level as the units of clustering for the next level. We call our base model LANDER, for link approximation and density estimation refinement, and our hierarchical clustering method Hi-LANDER.

Finally, we apply our hierarchical clustering technique to test sets whose classification categories are disjoint from those of the meta-training data. In our experiments we found that, compared with previous supervised GNN-based approaches and with unsupervised approaches, ours increased the F-score (which factors in both false positives and false negatives) by an average of 49% and 47%, respectively.
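To make the F-score concrete, here is a minimal sketch of one common pairwise formulation for clusterings. The article does not say which F-score variant was used, so the function name `pairwise_f_score` and the pair-counting definition are assumptions for illustration only.

```python
from itertools import combinations

def pairwise_f_score(true_labels, predicted_labels):
    """Pairwise F-score: every pair of items is a yes/no 'same cluster' decision."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = predicted_labels[i] == predicted_labels[j]
        if same_pred and same_true:
            tp += 1   # pair correctly placed together
        elif same_pred:
            fp += 1   # pair wrongly placed together (false positive)
        elif same_true:
            fn += 1   # pair wrongly split apart (false negative)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example: the third item is wrongly merged into the first cluster.
score = pairwise_f_score([0, 0, 1, 1], [0, 0, 0, 1])
print(round(score, 3))  # 0.4
```

In the toy example, precision is 1/3 and recall is 1/2, so the harmonic mean is 0.4. Both kinds of error lower the score.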

Constructing the graph

In our paper, we investigate the case in which we are training a model to cluster visual data that is similar to the meta-training data but has no class overlap with it. For instance, the meta-training data might be faces of movie stars, while the target application is to cluster faces of politicians, athletes, or other public figures.

The first step in our process is to use the meta-training data to build a supervised classifier: if the meta-training data is faces of movie stars, the classifier labels input images with names of movie stars.

The classifier is an encoder-decoder model: the encoder produces a fixed-length vector representation of the input, or feature vector, and the decoder uses that vector to predict a label. Once we’ve trained the classifier, however, we use only the encoder for the rest of the process.

The feature vectors define points in a multidimensional space. On the basis of the vectors’ locations, we construct a graph, in which each node represents an image, and each image’s k nearest neighbors in the feature space are connected to it (share edges with it) in the graph.
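The graph construction described above can be sketched in a few lines of NumPy. This is an illustration rather than the paper's code: the function name `knn_graph` is hypothetical, and cosine similarity is an assumed choice of metric, since the article does not state one.

```python
import numpy as np

def knn_graph(features: np.ndarray, k: int) -> np.ndarray:
    """Return an (n, k) array whose row i lists the k nearest neighbors of node i."""
    # Assumption: cosine similarity on L2-normalized feature vectors.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)            # a node is not its own neighbor
    return np.argsort(-sim, axis=1)[:, :k]    # most similar first

# Four toy feature vectors forming two obvious pairs.
features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
neighbors = knn_graph(features, k=1)
print(neighbors[:, 0].tolist())  # [1, 0, 3, 2]
```

The brute-force similarity matrix is quadratic in the number of images. At larger scale an approximate nearest-neighbor index would likely be needed, but that is outside what the article describes.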

This graph will serve as the input to the clustering model, which is also an encoder-decoder model. The encoder is a GNN, which produces a vector representation of each node in the graph, based on that node’s feature vector and those of the nodes it’s connected to. Call this vector the node embedding.
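As a rough illustration of how a node embedding can depend on both a node's own feature vector and those of its neighbors, the sketch below implements one generic mean-aggregation message-passing step. It is not the architecture from the paper. The function name `mean_aggregate_layer`, the concatenate-then-project design, and the random untrained weights are all assumptions.

```python
import numpy as np

def mean_aggregate_layer(features, neighbors, weight):
    """One message-passing step: mix each node's vector with the mean of its neighbors'."""
    neighbor_mean = features[neighbors].mean(axis=1)               # (n, d)
    combined = np.concatenate([features, neighbor_mean], axis=1)   # (n, 2d)
    return np.maximum(combined @ weight, 0.0)                      # linear map + ReLU

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 2))                       # toy feature vectors
neighbors = np.array([[1, 2], [0, 3], [3, 0], [2, 1]])   # toy k-NN lists, k = 2
weight = rng.normal(size=(4, 3))                         # hypothetical, untrained weights
embeddings = mean_aggregate_layer(features, neighbors, weight)
print(embeddings.shape)  # (4, 3)
```

In a trained model the weights would be learned from the meta-training data, and several such layers could be stacked so that information travels further through the graph.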

The clustering model

We adopt a hierarchical approach to clustering. Based on the node embeddings, the clustering model predicts edges between nodes. A cluster is defined as a group of nodes each of which shares an edge with at least one other node in the group and none of which shares an edge with any node outside the group.
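Under this definition, clusters are the connected components of the graph formed by the predicted edges. A minimal union-find sketch follows. The function name `connected_components` is hypothetical, and treating a node with no predicted edges as its own single-member cluster is an assumption.

```python
def connected_components(num_nodes, edges):
    """Assign a cluster id to every node, given a list of (a, b) predicted edges."""
    parent = list(range(num_nodes))

    def find(x):
        # Follow parent pointers to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a   # merge the two groups

    # Relabel roots as consecutive ids in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(num_nodes)]

labels = connected_components(5, [(0, 1), (1, 2), (3, 4)])
print(labels)  # [0, 0, 0, 1, 1]
```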

Note that the goal of the clustering model is not just to reproduce the nearest-neighbor graph but to link nodes that represent data of the same type. The nearest-neighbor linkages are useful for predicting clustering linkages, but they are not identical to them.

After the first pass through the data, we aggregate each cluster into a single, representative “supernode” and repeat the whole process. That is, we create edges between each supernode and its k nearest neighbors, pass the resulting graph through the same GNN, and predict edges based on the supernode embeddings. We repeat this process until the clustering model predicts no edges between nodes.
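The control flow of this iteration can be sketched as below. The learned GNN is replaced here by a plain similarity threshold, which is exactly the kind of heuristic the method is meant to avoid, so read this only as an outline of the loop. The names `predict_edges`, `components`, and `hierarchical_cluster`, the mean-of-members supernode, and the `threshold` and `max_levels` parameters are all assumptions.

```python
import numpy as np

def predict_edges(features, k, threshold):
    """Stand-in for the learned model: link k-nearest neighbors above a cosine threshold."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)   # never link a node to itself
    return [(i, int(j)) for i in range(len(features))
            for j in np.argsort(-sim[i])[:k] if sim[i, j] > threshold]

def components(n, edges):
    """Connected components via union-find, as consecutive integer ids."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for a, b in edges:
        parent[find(b)] = find(a)
    ids = {}
    return np.array([ids.setdefault(find(i), len(ids)) for i in range(n)])

def hierarchical_cluster(features, k=2, threshold=0.9, max_levels=10):
    assignment = np.arange(len(features))   # cluster id of each original image
    current = features                      # nodes (or supernodes) at this level
    for _ in range(max_levels):
        edges = predict_edges(current, k, threshold)
        if not edges:                       # stop when no edges are predicted
            break
        level = components(len(current), edges)
        assignment = level[assignment]      # carry merges down to the original images
        # Assumption: a supernode is the mean of its members' vectors.
        current = np.stack([current[level == c].mean(axis=0)
                            for c in range(level.max() + 1)])
    return assignment

features = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
print(hierarchical_cluster(features).tolist())  # [0, 0, 1, 1]
```

In the toy run, the first level merges the two obvious pairs, the second level finds no edges between the two supernodes, and the loop stops without being told how many clusters to produce.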

We train our clustering model on two different objectives. One is to correctly predict links between nodes, where a correct link is one that picks out two representatives of the same data type in the meta-training data (say, two photos of the same actor).
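The supervision for the link objective can be derived directly from the class labels of the meta-training data. The sketch below is an assumed formulation, not the paper's exact loss. The name `link_targets` is hypothetical, and each node is paired only with its k nearest neighbors.

```python
import numpy as np

def link_targets(labels, neighbors):
    """targets[i, j] is 1 when node i and its j-th neighbor share a class label, else 0."""
    return (labels[neighbors] == labels[:, None]).astype(np.int64)

labels = np.array([0, 0, 1, 1])                          # e.g. two actors, two photos each
neighbors = np.array([[1, 2], [0, 3], [3, 0], [2, 1]])   # toy k-NN lists, k = 2
print(link_targets(labels, neighbors).tolist())
# [[1, 0], [1, 0], [1, 0], [1, 0]]
```

A binary classification loss over these targets would then reward the model for scoring same-class pairs highly.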

We also train the model to correctly predict the density of a given data type in a given graph neighborhood. That is, for each node, the model should predict the proportion of nearby neighbors of the same data type.
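Read literally, the density target for a node is the fraction of its neighbors that carry the same label. The sketch below computes that unweighted proportion. The name `density_targets` is hypothetical, and the actual model may weight neighbors differently (for example by similarity), which the article does not specify.

```python
import numpy as np

def density_targets(labels, neighbors):
    """Fraction of each node's k nearest neighbors that share its class label."""
    same_class = labels[neighbors] == labels[:, None]   # (n, k) boolean
    return same_class.mean(axis=1)

labels = np.array([0, 0, 0, 1])                          # node 3 is a lone outlier
neighbors = np.array([[1, 2], [0, 2], [0, 1], [0, 1]])   # toy k-NN lists, k = 2
print(density_targets(labels, neighbors).tolist())  # [1.0, 1.0, 1.0, 0.0]
```

Nodes deep inside a class get a density near 1, while nodes on a boundary or surrounded by other classes get a density near 0.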

Past research on clustering has shown that factoring in data density improves results. Previously, however, link prediction and data density prediction were handled by separate models. By using a single model to jointly predict both, we significantly increase computational efficiency. We believe that the combination also contributes to our increase in accuracy.

The other novelty of our approach is that, because of our hierarchical processing scheme, we optimize clustering across the entire input graph. Previous approaches would first divide the graph into subgraphs, then perform inference within subgraphs. Splitting the graph in this way prevents natural parallelization, which would make inference faster, and limits how effectively information flows through the graph. Processing the full graph at once is another reason for our model’s improved efficiency.

In experiments, we considered two different sets of meta-training data. One consisted of closeups of human faces, the other of images of particular animal species. We tested the model trained on human faces on two other datasets, whose data categories had zero or very little overlap with those of the meta-training set — 0% and less than 2%. We tested the model trained on animal species on a dataset of previously unseen species. Across both models and the three test sets, our average improvements over previous GNN-based clustering models and unsupervised clustering methods were 49% and 47%, respectively.

In ongoing work, we are investigating the possibility of training a more general clustering model, whose performance at inference time will be more transferable across different data types, accurately clustering both faces and animal species, for instance.

Acknowledgements: Tianjun Xiao, Yongxin Wang, Yuanjun Xiong, Wei Xia, David Wipf, Zhang Zheng, Stefano Soatto
