CoD: Coherent detection of entities from images with multiple modalities

2024
Download Copy BibTeX
Copy BibTeX
Object detection is a fundamental problem in computer vision, whose research has primarily focused on unimodal models, solely operating on visual data. However, in many real-world applications, data from multiple modalities may be available, such as text accompanying the visual data. Leveraging traditional models on these multi-modal data sources may lead to difficulties in accurately delineating object boundaries. For example, in a document containing a combination of text and images, the model must encompass the images and texts pertaining to the same object in a single bounding box. To address this, we propose a model that takes in multi-scale image features, text extracted through OCR, and 2D positional embeddings of words as inputs, and returns bounding boxes that incorporate the image and associated description as single entities. Furthermore, to address the challenge posed by the irregular arrangement of images and their corresponding textual descriptions, we propose the concept of a ”Negative Product Bounding Box” (PBB). This box encapsulates instances where the model faces confusion and tends to predict incorrect bounding boxes. To enhance the model’s performance, we incorporate these negative boxes into the loss function governing matching and classification. Additionally, a domain adaptation model is proposed to handle scenarios involving a domain gap between training and test samples. In order to assess the effectiveness of our model, we construct a multimodal dataset comprising product descriptions from online retailers’ catalogs. On this dataset, our proposed model demonstrates significant improvements of 27.2%, 4.3%, and 1.7% in handling hard negative samples, multi-modal input, and domain shift scenarios, respectively.
Research areas

Latest news

US, WA, Seattle
Applied Scientists in AWS Automated Reasoning are dedicated to making AWS the best computing service in the world for customers who require advanced and rigorous solutions for automated reasoning, privacy, and sovereignty. Key job responsibilities The successful candidate will: - Solve large or significantly complex problems that require deep knowledge and understanding of your domain and scientific innovation. - Own strategic problem solving, and take the lead on the design, implementation, and delivery for solutions that have a long-term quantifiable impact. - Provide cross-organizational technical influence, increasing productivity and effectiveness by sharing your deep knowledge and experience. - Develop strategic plans to identify fundamentally new solutions for business problems. - Assist in the career development of others, actively mentoring individuals and the community on advanced technical issues. A day in the life This is a unique and rare opportunity to get in early on a fast-growing segment of AWS and help shape the technology, product and the business. You will have a chance to utilize your deep technical experience within a fast moving, start-up environment and make a large business and customer impact. About the team Diverse Experiences Amazon Automated Reasoning values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why Amazon Automated Reasoning? At Amazon, automated reasoning is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for automated reasoning across all of Amazon's products and services. We offer talented automated reasoning professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Automated Reasoning, it's in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest automated reasoning challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there's nothing we can't achieve.
US, MA, Boston
Applied Scientists in AWS Automated Reasoning are dedicated to making AWS the best computing service in the world for customers who require advanced and rigorous solutions for automated reasoning, privacy, and sovereignty. Key job responsibilities The successful candidate will: - Solve large or significantly complex problems that require deep knowledge and understanding of your domain and scientific innovation. - Own strategic problem solving, and take the lead on the design, implementation, and delivery for solutions that have a long-term quantifiable impact. - Provide cross-organizational technical influence, increasing productivity and effectiveness by sharing your deep knowledge and experience. - Develop strategic plans to identify fundamentally new solutions for business problems. - Assist in the career development of others, actively mentoring individuals and the community on advanced technical issues. A day in the life This is a unique and rare opportunity to get in early on a fast-growing segment of AWS and help shape the technology, product and the business. You will have a chance to utilize your deep technical experience within a fast moving, start-up environment and make a large business and customer impact. About the team Diverse Experiences Amazon Automated Reasoning values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why Amazon Automated Reasoning? At Amazon, automated reasoning is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for automated reasoning across all of Amazon's products and services. We offer talented automated reasoning professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Automated Reasoning, it's in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest automated reasoning challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there's nothing we can't achieve.
US, MA, Boston
Applied Scientists in AWS Automated Reasoning are dedicated to making AWS the best computing service in the world for customers who require advanced and rigorous solutions for automated reasoning, privacy, and sovereignty. Key job responsibilities The successful candidate will: - Solve large or significantly complex problems that require deep knowledge and understanding of your domain and scientific innovation. - Own strategic problem solving, and take the lead on the design, implementation, and delivery for solutions that have a long-term quantifiable impact. - Provide cross-organizational technical influence, increasing productivity and effectiveness by sharing your deep knowledge and experience. - Develop strategic plans to identify fundamentally new solutions for business problems. - Assist in the career development of others, actively mentoring individuals and the community on advanced technical issues. A day in the life This is a unique and rare opportunity to get in early on a fast-growing segment of AWS and help shape the technology, product and the business. You will have a chance to utilize your deep technical experience within a fast moving, start-up environment and make a large business and customer impact. About the team Diverse Experiences Amazon Automated Reasoning values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why Amazon Automated Reasoning? At Amazon, automated reasoning is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for automated reasoning across all of Amazon's products and services. We offer talented automated reasoning professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Automated Reasoning, it's in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest automated reasoning challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there's nothing we can't achieve.
US, TX, Austin
Applied Scientists in AWS Automated Reasoning are dedicated to making AWS the best computing service in the world for customers who require advanced and rigorous solutions for automated reasoning, privacy, and sovereignty. Key job responsibilities The successful candidate will: - Solve large or significantly complex problems that require deep knowledge and understanding of your domain and scientific innovation. - Own strategic problem solving, and take the lead on the design, implementation, and delivery for solutions that have a long-term quantifiable impact. - Provide cross-organizational technical influence, increasing productivity and effectiveness by sharing your deep knowledge and experience. - Develop strategic plans to identify fundamentally new solutions for business problems. - Assist in the career development of others, actively mentoring individuals and the community on advanced technical issues. A day in the life This is a unique and rare opportunity to get in early on a fast-growing segment of AWS and help shape the technology, product and the business. You will have a chance to utilize your deep technical experience within a fast moving, start-up environment and make a large business and customer impact. About the team Diverse Experiences Amazon Automated Reasoning values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why Amazon Automated Reasoning? At Amazon, automated reasoning is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for automated reasoning across all of Amazon's products and services. We offer talented automated reasoning professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Automated Reasoning, it's in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest automated reasoning challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there's nothing we can't achieve.
US, TX, Austin
Applied Scientists in AWS Automated Reasoning are dedicated to making AWS the best computing service in the world for customers who require advanced and rigorous solutions for automated reasoning, privacy, and sovereignty. Key job responsibilities The successful candidate will: - Solve large or significantly complex problems that require deep knowledge and understanding of your domain and scientific innovation. - Own strategic problem solving, and take the lead on the design, implementation, and delivery for solutions that have a long-term quantifiable impact. - Provide cross-organizational technical influence, increasing productivity and effectiveness by sharing your deep knowledge and experience. - Develop strategic plans to identify fundamentally new solutions for business problems. - Assist in the career development of others, actively mentoring individuals and the community on advanced technical issues. A day in the life This is a unique and rare opportunity to get in early on a fast-growing segment of AWS and help shape the technology, product and the business. You will have a chance to utilize your deep technical experience within a fast moving, start-up environment and make a large business and customer impact. About the team Diverse Experiences Amazon Automated Reasoning values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why Amazon Automated Reasoning? At Amazon, automated reasoning is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for automated reasoning across all of Amazon's products and services. We offer talented automated reasoning professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Automated Reasoning, it's in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest automated reasoning challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there's nothing we can't achieve.
US, WA, Seattle
Applied Scientists in AWS Automated Reasoning are dedicated to making AWS the best computing service in the world for customers who require advanced and rigorous solutions for automated reasoning, privacy, and sovereignty. Key job responsibilities The successful candidate will: - Solve large or significantly complex problems that require deep knowledge and understanding of your domain and scientific innovation. - Own strategic problem solving, and take the lead on the design, implementation, and delivery for solutions that have a long-term quantifiable impact. - Provide cross-organizational technical influence, increasing productivity and effectiveness by sharing your deep knowledge and experience. - Develop strategic plans to identify fundamentally new solutions for business problems. - Assist in the career development of others, actively mentoring individuals and the community on advanced technical issues. A day in the life This is a unique and rare opportunity to get in early on a fast-growing segment of AWS and help shape the technology, product and the business. You will have a chance to utilize your deep technical experience within a fast moving, start-up environment and make a large business and customer impact. About the team Diverse Experiences Amazon Automated Reasoning values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why Amazon Automated Reasoning? At Amazon, automated reasoning is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for automated reasoning across all of Amazon's products and services. We offer talented automated reasoning professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Automated Reasoning, it's in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest automated reasoning challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there's nothing we can't achieve.
US, WA, Seattle
Applied Scientists in AWS Automated Reasoning are dedicated to making AWS the best computing service in the world for customers who require advanced and rigorous solutions for automated reasoning, privacy, and sovereignty. Key job responsibilities The successful candidate will: - Solve large or significantly complex problems that require deep knowledge and understanding of your domain and scientific innovation. - Own strategic problem solving, and take the lead on the design, implementation, and delivery for solutions that have a long-term quantifiable impact. - Provide cross-organizational technical influence, increasing productivity and effectiveness by sharing your deep knowledge and experience. - Develop strategic plans to identify fundamentally new solutions for business problems. - Assist in the career development of others, actively mentoring individuals and the community on advanced technical issues. A day in the life This is a unique and rare opportunity to get in early on a fast-growing segment of AWS and help shape the technology, product and the business. You will have a chance to utilize your deep technical experience within a fast moving, start-up environment and make a large business and customer impact. About the team Diverse Experiences Amazon Automated Reasoning values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why Amazon Automated Reasoning? At Amazon, automated reasoning is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for automated reasoning across all of Amazon's products and services. We offer talented automated reasoning professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Automated Reasoning, it's in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest automated reasoning challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there's nothing we can't achieve.
US, MA, Boston
Sr. Applied Scientists in AWS Automated Reasoning are dedicated to making AWS the best computing service in the world for customers who require advanced and rigorous solutions for automated reasoning, privacy, and sovereignty. Key job responsibilities The successful candidate will: - Solve large or significantly complex problems that require deep knowledge and understanding of your domain and scientific innovation. - Own strategic problem solving, and take the lead on the design, implementation, and delivery for solutions that have a long-term quantifiable impact. - Provide cross-organizational technical influence, increasing productivity and effectiveness by sharing your deep knowledge and experience. - Develop strategic plans to identify fundamentally new solutions for business problems. - Assist in the career development of others, actively mentoring individuals and the community on advanced technical issues. A day in the life This is a unique and rare opportunity to get in early on a fast-growing segment of AWS and help shape the technology, product and the business. You will have a chance to utilize your deep technical experience within a fast moving, start-up environment and make a large business and customer impact. About the team Diverse Experiences Amazon Automated Reasoning values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why Amazon Automated Reasoning? At Amazon, automated reasoning is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for automated reasoning across all of Amazon's products and services. We offer talented automated reasoning professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Automated Reasoning, it's in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest automated reasoning challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there's nothing we can't achieve.
US, MA, Boston
Applied Scientists in AWS Automated Reasoning are dedicated to making AWS the best computing service in the world for customers who require advanced and rigorous solutions for automated reasoning, privacy, and sovereignty. Key job responsibilities The successful candidate will: - Solve large or significantly complex problems that require deep knowledge and understanding of your domain and scientific innovation. - Own strategic problem solving, and take the lead on the design, implementation, and delivery for solutions that have a long-term quantifiable impact. - Provide cross-organizational technical influence, increasing productivity and effectiveness by sharing your deep knowledge and experience. - Develop strategic plans to identify fundamentally new solutions for business problems. - Assist in the career development of others, actively mentoring individuals and the community on advanced technical issues. A day in the life This is a unique and rare opportunity to get in early on a fast-growing segment of AWS and help shape the technology, product and the business. You will have a chance to utilize your deep technical experience within a fast moving, start-up environment and make a large business and customer impact. About the team Diverse Experiences Amazon Automated Reasoning values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn't followed a traditional path, or includes alternative experiences, don't let it stop you from applying. Why Amazon Automated Reasoning? At Amazon, automated reasoning is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for automated reasoning across all of Amazon's products and services. We offer talented automated reasoning professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Automated Reasoning, it's in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest automated reasoning challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We're continuously raising our performance bar as we strive to become Earth's Best Employer. That's why you'll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there's nothing we can't achieve.
IN, KA, Bengaluru
Do you want to lead the development of advanced machine learning systems that protect millions of customers and power a trusted global eCommerce experience? Are you passionate about modeling terabytes of data, solving highly ambiguous fraud and risk challenges, and driving step-change improvements through scientific innovation? If so, the Amazon Buyer Risk Prevention (BRP) Machine Learning team may be the right place for you. We are seeking a Senior Applied Scientist to define and drive the scientific direction of large-scale risk management systems that safeguard millions of transactions every day. In this role, you will lead the design and deployment of advanced machine learning solutions, influence cross-team technical strategy, and leverage emerging technologies—including Generative AI and LLMs—to build next-generation risk prevention platforms. Key job responsibilities Lead the end-to-end scientific strategy for large-scale fraud and risk modeling initiatives Define problem statements, success metrics, and long-term modeling roadmaps in partnership with business and engineering leaders Design, develop, and deploy highly scalable machine learning systems in real-time production environments Drive innovation using advanced ML, deep learning, and GenAI/LLM technologies to automate and transform risk evaluation Influence system architecture and partner with engineering teams to ensure robust, scalable implementations Establish best practices for experimentation, model validation, monitoring, and lifecycle management Mentor and raise the technical bar for junior scientists through reviews, technical guidance, and thought leadership Communicate complex scientific insights clearly to senior leadership and cross-functional stakeholders Identify emerging scientific trends and translate them into impactful production solutions