A first look: Towards explainable textVQA models via visual and textual explanations

By Varun Nagaraj Rao, Xingjian Zhen, Karen Hovsepian, Mingwei Shen
2021
Explainable deep learning models are advantageous in many situations. Prior work mostly provides unimodal explanations through post hoc approaches that are not part of the original system design. Existing explanation mechanisms also ignore useful textual information present in images. In this paper, we propose MTXNet, an end-to-end trainable multimodal architecture that generates multimodal explanations and focuses on the text in the image. We curate a novel dataset, TextVQA-X, containing ground-truth visual and multi-reference textual explanations that can be leveraged during both training and evaluation. We then quantitatively show that training with multimodal explanations complements model performance and surpasses unimodal baselines by up to 7% in CIDEr scores and 2% in IoU. More importantly, we demonstrate that the multimodal explanations are consistent with human interpretations, help justify the models’ decisions, and provide useful insights to help diagnose an incorrect prediction. Finally, we describe a real-world e-commerce application for the generated multimodal explanations.
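For readers unfamiliar with the IoU (intersection over union) metric mentioned above, the following is a minimal sketch of how the overlap between a predicted visual explanation and a ground-truth one could be scored. It assumes both explanations have already been binarized into same-shaped masks; the function name `mask_iou` and the toy masks are hypothetical, and the abstract does not state the exact evaluation protocol used in the paper.

```python
from typing import Sequence

# A binary mask as rows of 0/1 values (assumed representation, for illustration).
Mask = Sequence[Sequence[int]]


def mask_iou(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two same-shaped binary masks."""
    same_rows = len(pred) == len(truth)
    if not same_rows or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("masks must have the same shape")

    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1  # cell highlighted by both masks
            if p or t:
                union += 1  # cell highlighted by at least one mask

    # Two empty masks agree perfectly; define IoU as 1.0 in that case.
    return intersection / union if union else 1.0


# Toy example: 2 cells overlap out of 4 highlighted in total.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
print(mask_iou(predicted, reference))  # 0.5
```

A score of 1.0 means the two highlighted regions coincide exactly, and 0.0 means they do not overlap at all.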
