Knowledge-centric hallucination detection

By Xiangkun Hu, Dongyu Ru, Lin Qiu, Qipeng Guo, Tianhang Zhang, Yang Xu, Yun Luo, Pengfei Liu, Zheng Zhang, Yue Zhang
2024
Large Language Models (LLMs) have shown impressive capabilities but also a concerning tendency to hallucinate. This paper presents REFCHECKER, a framework that introduces claim-triplets to represent claims in LLM responses, aiming to detect fine-grained hallucinations. In REFCHECKER, an extractor generates claim-triplets from a response, which are then evaluated by a checker against a reference. We delineate three task settings: Zero, Noisy and Accurate Context, to reflect various real-world use cases. We curated a benchmark spanning various NLP tasks and annotated 11k claim-triplets from 2.1k responses by seven LLMs. REFCHECKER supports both proprietary and open-source models as the extractor and checker. Experiments demonstrate that claim-triplets enable superior hallucination detection, compared to other granularities such as response, sentence and sub-sentence level claims. REFCHECKER outperforms prior methods by 18.2 to 27.2 points on our benchmark and the checking results of REFCHECKER are strongly aligned with human judgments.
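The extractor-then-checker flow described in the abstract can be illustrated with a toy sketch. This is not the REFCHECKER implementation or its API: the names `ClaimTriplet`, `toy_checker`, `check_response` and `hallucination_rate` are hypothetical, the triplets are written by hand rather than produced by an LLM extractor, the checker is a plain substring match standing in for a model-based checker, and the two labels used here are placeholders rather than the paper's label set.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ClaimTriplet:
    """Hypothetical container for one (subject, predicate, object) claim."""
    subject: str
    predicate: str
    obj: str


def toy_checker(triplet: ClaimTriplet, reference: str) -> str:
    """Stand-in checker: 'supported' if every element of the triplet
    occurs verbatim (case-insensitively) in the reference, else 'unverified'.
    A real checker would be an LLM or an NLI-style model."""
    ref = reference.lower()
    parts = (triplet.subject, triplet.predicate, triplet.obj)
    return "supported" if all(p.lower() in ref for p in parts) else "unverified"


def check_response(
    triplets: List[ClaimTriplet],
    reference: str,
    checker: Callable[[ClaimTriplet, str], str] = toy_checker,
) -> List[Tuple[ClaimTriplet, str]]:
    """Label each extracted triplet against the reference text."""
    return [(t, checker(t, reference)) for t in triplets]


def hallucination_rate(results: List[Tuple[ClaimTriplet, str]]) -> float:
    """Fraction of triplets that the checker did not mark as supported."""
    if not results:
        return 0.0
    flagged = sum(1 for _, label in results if label != "supported")
    return flagged / len(results)


# Hand-written example; an extractor model would normally produce the triplets.
reference = "Marie Curie was born in Warsaw and won the Nobel Prize in Physics in 1903."
triplets = [
    ClaimTriplet("Marie Curie", "born in", "Warsaw"),
    ClaimTriplet("Marie Curie", "born in", "Paris"),
]
results = check_response(triplets, reference)
for triplet, label in results:
    print(triplet, label)
print(hallucination_rate(results))
```

Labelling each triplet separately is what makes the detection fine-grained: a response with one wrong fact is flagged at that single claim rather than being rejected as a whole.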