Overview
The IEEE/CVF Computer Vision and Pattern Recognition Conference (CVPR) is the premier annual computer vision event, comprising the main conference and several co-located workshops and short courses. On June 19, Swami Sivasubramanian, AWS VP of AI and Data, will deliver an expo-track keynote, 'Computer vision at scale: Driving customer innovation and industry adoption'. Learn more about Amazon's accepted publications in our paper guide.
Organizing committee
- Applied Scientist
- Principal, Applied Scientist
- Amazon Scholar
- Senior Manager, Applied Science
- Principal, Applied Scientist
- Applied Scientist II
- Applied Scientist
- Research Scientist
- Senior Manager, Applied Science
- Applied Scientist
- Applied Scientist
- Manager, Applied Science
- Manager, Applied Science
- Senior Manager, Applied Science
- Applied Scientist
Accepted publications
- CVPR 2024, CVPR 2024 Workshop on What is Next in Multimodal Foundation Models?, CVPR 2024 Workshop on Robustness in Large Language Models, 2024
Workshops and events
CVPR 2024 Event: Diversity and Inclusion for Everyone
June 19, 7:00 PM - 9:00 PM EDT
Amazon is proud to sponsor the CVPR 2024 social event “Diversity and Inclusion for Everyone”, hosted by the organizers of the Women in Computer Vision (WiCV) and LatinX in Computer Vision workshops.
CVPR 2024 Workshop on Urban Scene Modeling: Where Vision Meets Photogrammetry and Graphics
June 17
Rapid urbanization poses social and environmental challenges. Addressing these issues effectively requires access to accurate and up-to-date 3D building models, obtained promptly and cost-effectively. Urban modeling is an interdisciplinary topic among computer vision, graphics, and photogrammetry. The demand for automated interpretation of scene geometry and semantics has surged due to various applications, including autonomous navigation, augmented reality, smart cities, and digital twins. As a result, substantial research effort has been dedicated to urban scene modeling within the computer vision and graphics communities, with a particular focus on photogrammetry, which has coped with urban modeling challenges for decades. This workshop is intended to bring researchers from these communities together. Through invited talks, spotlight presentations, a workshop challenge, and a poster session, it will increase interdisciplinary interaction and collaboration among photogrammetry, computer vision and graphics. We also solicit original contributions in the areas related to urban scene modeling.
Website: https://usm3d.github.io/
CVPR 2024 Workshop on Virtual Try-On
June 17
Featured Amazon keynote speakers: Ming Lin, Amazon Scholar; Sunil Hadap, Principal Applied Scientist
Website: https://vto-cvpr24.github.io/
CVPR 2024 Workshop on the Evaluation of Generative Foundation Models
June 18
The landscape of artificial intelligence is being transformed by the advent of Generative Foundation Models (GenFMs), such as Large Language Models (LLMs) and diffusion models. GenFMs offer unprecedented opportunities to enrich human lives and transform industries. However, they also pose significant challenges, including the generation of factually incorrect or biased information, which might be potentially harmful or misleading. With the emergence of multimodal GenFMs, which leverage and generate content in an increasing number of modalities, these challenges are set to become even more complex. This emphasizes the urgent need for rigorous and effective evaluation methodologies.
The 1st Workshop on Evaluation for Generative Foundation Models at CVPR 2024 aims to build a forum to discuss ongoing efforts in industry and academia, share best practices, and engage the community in working towards more reliable and scalable approaches for GenFMs evaluation.
Website: https://evgenfm.github.io/
CVPR 2024 Workshop on Fine-Grained Visual Categorization
June 18
CVPR 2024 Workshop on Generative Models for Computer Vision
June 18
CVPR 2024 Workshop on the GroceryVision Dataset @ RetailVision
June 18
CVPR 2024 Workshop on Learning with Limited Labelled Data for Image and Video Understanding
June 18
CVPR 2024 Workshop on Prompting in Vision
June 17
This workshop aims to provide a platform for pioneers in prompting for vision to share recent advancements, showcase novel techniques and applications, and discuss open research questions about how the strategic use of prompts can unlock new levels of adaptability and performance in computer vision.
Website: https://prompting-in-vision.github.io/index_cvpr24.html
CVPR 2024 Workshop on Open-Vocabulary 3D Scene Understanding
June 18
CVPR 2024 Workshop on Multimodal Learning and Applications
June 18
CVPR 2024 Workshop on RetailVision
June 18
The rapid development in computer vision and machine learning has caused a major disruption in the retail industry in recent years. In addition to the rise of online shopping, traditional markets also quickly embraced AI-related technology solutions at the physical store level. Following the introduction of computer vision to the world of retail, a new set of challenges emerged. These challenges were further expanded with the introduction of image and video generation capabilities.
The physical domain exhibits challenges such as detecting interactions between shoppers and products, fine-grained recognition of visually similar products, and new products introduced on a daily basis. The online domain presents similar challenges, each with its own twist. Product search and recognition span more than 100,000 classes, each including images, textual captions, and the text users enter during search. In addition to discriminative machine learning, image generation is now also used to generate product images and for virtual try-on.
All of these challenges are shared by different companies in the field, and are also at the heart of the computer vision community. This workshop aims to present the progress in these challenges and encourage the forming of a community for retail computer vision.
Website: https://retailvisionworkshop.github.io/
CVPR 2024 Workshop on Responsible Generative AI
June 18
Responsible Generative AI (ReGenAI) workshop aims to bring together researchers, practitioners, and industry leaders working at the intersection of generative AI, data, ethics, privacy and regulation, with the goal of discussing existing concerns, and brainstorming possible avenues forward to ensure the responsible progress of generative AI. We hope that the topics addressed in this workshop will constitute a crucial step towards ensuring a positive experience with generative AI for everyone.
Website: https://sites.google.com/view/cvpr-responsible-genai/home
CVPR 2024 Workshop on Visual Odometry and Computer Vision
June 18
Visual odometry and localization have attracted increasing interest in recent years, especially given their extensive applications in autonomous driving, augmented reality, and mobile computing. With location information obtained through odometry, services based on location cues are also rapidly emerging. This workshop focuses in particular on mobile platform applications.
Website: https://sites.google.com/view/vocvalc2024
CVPR 2024 Workshop on What is Next in Multimodal Foundation Models?
June 18
CVPR 2024 Demo: Amazon Lens & View in Your Room
June 20
June 20-21, 11-11:30am
Amazon Lens is a feature that lets customers search for products using their photos or live camera.
View in Your Room lets customers preview how products like furniture would look in their home using augmented reality.
Both features are available in the Amazon Mobile Shopping App today for anyone to use. We have videos showcasing these features available to show on conference displays, and team members can guide conference attendees to try the features out on their own devices.
CVPR 2024 Demo: Amazon Dash Cart and Amazon One
June 19 - June 21
June 19: 11:30am-12:00pm, 2:30-3pm
June 20: 11:30am-12:00pm, 1-1:30pm, 2:30-3pm
June 21: 11:30am-12:00pm, 1-1:30pm
Learn how Amazon Dash Cart and Amazon One are helping customers save money, time, and effort shopping for everyday groceries at scale, through computer vision and artificial intelligence! The Dash Cart is a smart cart that makes grocery trips faster and more personalized than ever. Find items quickly and easily. Add, remove, and weigh items right in your Dash Cart. When you're done shopping, skip the checkout line and roll out to your car. For more information, visit: https://aws.amazon.com/dash-cart/
CVPR 2024 Demo: Proteus
June 19 - June 21
June 19, 12:00-12:30pm
June 21, 12:30-1:00pm
Proteus is Amazon's first fully autonomous mobile robot. Historically, it’s been difficult to safely incorporate robotics where people are working in the same physical space as the robot. We believe Proteus will change that while remaining smart, safe, and collaborative.
CVPR 2024 Demo: Analyze data from AWS Databases with zero-ETL integrations
June 19 - June 20
June 19-20, 12:30-1:00pm
Making the most of your data often means using multiple AWS services. In this demo, learn about the zero-ETL integrations available for AWS Databases with AWS Analytics services and how they remove the need for you to build and manage complex data pipelines. Deep dive with a demo on how you can build your own pipeline with Amazon DynamoDB zero-ETL integration with Amazon OpenSearch.
CVPR 2024 Demo: Get started with GraphRAG on Amazon Neptune
June 19 - June 20
June 19, 11-11:30am
June 20, 1:30-2:00pm
Retrieval Augmented Generation (RAG) helps improve the accuracy of outputs from Large Language Models (LLMs) by retrieving information from authoritative, predetermined knowledge sources. However, baseline RAG may flounder when a query requires connecting disparate information or a higher-level understanding of large data sets. GraphRAG combines the power of knowledge graphs and RAG technology to improve your generative AI application’s ability to answer questions across data sets, summarize concepts across a broad corpus, and provide human-readable explanations of the results, thereby improving accuracy and reducing hallucinations. In this flash talk, learn how to use Amazon Neptune, our high-performance graph analytics and serverless database, to get started with GraphRAG and improve the accuracy of your generative AI applications.
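The retrieval step that distinguishes GraphRAG from baseline RAG can be sketched in a few lines. This is a conceptual illustration only: the toy graph, its entities, and the `retrieve_context` helper are all made up for the example. In a real deployment the graph would live in Amazon Neptune and the retrieved facts would be passed to an LLM as context.

```python
# Illustrative GraphRAG retrieval over a toy in-memory knowledge graph.
# Entity -> list of (relation, entity) edges; all content here is hypothetical.
GRAPH = {
    "Amazon Neptune": [("is_a", "graph database"), ("supports", "openCypher")],
    "GraphRAG": [("combines", "knowledge graphs"), ("combines", "RAG")],
    "RAG": [("improves", "LLM accuracy")],
}

def retrieve_context(query, hops=1):
    """Collect facts about entities mentioned in the query by walking edges."""
    facts = []
    frontier = [e for e in GRAPH if e.lower() in query.lower()]
    for _ in range(hops):
        next_frontier = []
        for entity in frontier:
            for relation, target in GRAPH.get(entity, []):
                facts.append(f"{entity} {relation} {target}")
                next_frontier.append(target)
        frontier = next_frontier
    return facts

context = retrieve_context("How does GraphRAG relate to RAG?")
```

Because retrieval follows typed edges rather than ranking isolated text chunks, a single query can surface connected facts that baseline RAG would have to find in one passage.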
CVPR 2024 Demo: How to use Amazon Aurora as a Knowledge Base for Amazon Bedrock
June 19
June 19-20, 2-2:30pm
Generative AI and Foundational Models (FMs) are powerful technologies for building richer, personalized applications. With pgvector on Amazon Aurora PostgreSQL-Compatible Edition, you can access vector database capabilities to store, search, index, and query ML embeddings. Aurora is available as a Knowledge Base for Amazon Bedrock to securely connect your organization’s private data sources to FMs and enable Retrieval Augmented Generation (RAG) workflows on them. With Amazon Aurora Optimized Reads, you can boost vector search performance by up to 9x for memory-intensive workloads. In this demo, learn to integrate Aurora with Bedrock and how to utilize Optimized Reads to improve generative AI application performance.
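What a vector store such as pgvector provides can be illustrated with a short sketch: rank rows of a "table" by cosine similarity to a query embedding. The documents, embeddings, and helper names below are invented for the example; with pgvector the same ranking is expressed in SQL against an Aurora PostgreSQL table rather than in Python.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Tiny stand-in for a pgvector table of (document, embedding) rows.
# The embeddings are made up; a real system would compute them with a model.
ROWS = [
    ("return policy",  [0.9, 0.1, 0.0]),
    ("shipping times", [0.1, 0.9, 0.2]),
    ("gift wrapping",  [0.0, 0.2, 0.9]),
]

def top_k(query_embedding, k=1):
    """Return the k documents most similar to the query embedding."""
    ranked = sorted(ROWS,
                    key=lambda row: cosine_similarity(query_embedding, row[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:k]]
```

In a RAG workflow, the documents returned by this kind of similarity search are what gets injected into the FM's prompt as grounding context.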
CVPR 2024 Demo: Getting started with Amazon ElastiCache Serverless
June 19 - June 20
June 19-20, 3-3:30pm
Serverless databases free you from capacity management while providing you with the economics of pay-per-use pricing. With AWS, customers have a broad choice of serverless databases to choose from, such as Amazon Aurora, Amazon DynamoDB, Amazon Neptune, and most recently Amazon ElastiCache. In this demo, learn how you can begin to instantly scale your own databases with Amazon ElastiCache Serverless and how to utilize the feature with the new open source project Valkey.
CVPR 2024 Demo: AR-ID
June 19
June 19, 3:30-4:00pm
Feedback from employees led us to create Amazon Robotics Identification (AR ID), an AI-powered scanning capability with innovative computer vision and machine learning technology to enable easier scanning of packages in our facilities. Currently, all packages in our facilities are scanned at each destination on their journey. In fulfillment centers, this scanning is currently manual—an item arrives at a workstation, the package is picked from a bin by an employee, and using a hand scanner, the employee finds the bar code and hand-scans the item.
AR ID removes the manual scanning process by using a unique camera system that runs at 120 frames per second, giving employees greater mobility and helping reduce the risk of injury. Employees can handle the packages freely with both hands instead of one hand while holding a scanner in the other, or they can work to position the package to scan it by hand. This creates a natural movement, and the technology does its job in the background.