Customer-obsessed science
Research areas
- June 3, 2026, 4 min read. Automatically fact-checking long, AI-generated research reports poses new challenges, including benchmarking.
- May 26, 2026, 5 min read
- May 14, 2026, 16 min read
Featured news
- ICML 2026 Workshop on Scalable Learning and Optimization for Efficient Multimodal AI Agents (SCALE), 2026. Enterprise environments differ fundamentally from the clean settings assumed in LLM research: knowledge is distributed across heterogeneous sources, often incomplete or inconsistent, and key procedural logic is implicitly encoded in artifacts rather than explicitly documented. In such settings, retrieval-based approaches are insufficient, as no single source contains the full workflow. We propose a replication-driven
- IEEE ICMA 2026, 2026. Deploying computer vision models in warehouse facilities traditionally requires extensive resources for camera mounting, image collection, annotation, training, and deployment, a process that often must be repeated in each new environment because of camera mounting constraints and environmental variability. This paper explores an innovative approach to streamline this process by conducting the standard procedure
- Transactions on Machine Learning Research, 2026. Inspired by the success of reinforcement learning (RL) in large language model (LLM) training for domains like math and code, recent work has begun training LLMs to dynamically plan, query, and reason with search engines as tools, a paradigm increasingly referred to as agentic search. Although these methods achieve performance improvements across popular short-form QA benchmarks, many prioritize final answer
- IEEE ICMA 2026, 2026. Efficient sorter diversion control of automated material handling systems (MHS) is critical for optimizing operational efficiency in large-scale warehouse environments. In this study, we use an inbound receiving sorter at a high-volume e-commerce warehouse as our primary use case, where the sorter diversion system relies on cost functions with static weight configurations that fail to adapt to highly dynamic
- 2026. LLM serving frameworks are evolving quickly, with a complex software stack and a vast number of optimizations. The rapid development process can introduce silent errors, where output quality degrades without any explicit error signal. Diagnosing silent errors is notoriously difficult due to the substantial semantic gap between the high-level symptoms and the low-level root causes. We observe that
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.