- 2025: Large language models (LLMs) have achieved remarkable performance on various natural language tasks. However, they are trained on static corpora, and their knowledge can quickly become outdated in a fast-changing world. This motivates the development of knowledge editing methods designed to update specific knowledge in LLMs without affecting unrelated knowledge. To make selective edits, previous efforts often
- NAACL Findings 2025: The next token prediction loss is the dominant self-supervised training objective for large language models and has achieved promising results in a variety of downstream tasks. However, upon closer investigation of this objective, we find that it lacks an understanding of sequence-level signals, leading to a mismatch between the training and inference processes. To bridge this gap, we introduce a contrastive
- ICSE 2025: In this study, we address the issue of API hallucinations in various software engineering contexts. We introduce CloudAPIBench, a new benchmark designed to measure API hallucination occurrences. CloudAPIBench also provides annotations for the frequencies of API occurrences in the public domain, allowing us to study API hallucinations at various frequency levels (see the sketch after this list). Our findings reveal that Code LLMs struggle with
- 2025: Reasoning and linguistic skills form the cornerstone of human intelligence, facilitating problem-solving and decision-making. Recent advances in Large Language Models (LLMs) have led to impressive linguistic capabilities and emergent reasoning behaviors, fueling widespread adoption across application domains. However, LLMs still struggle with complex reasoning tasks, highlighting their systemic limitations
- 2025: Large language models (LLMs) have demonstrated remarkable capabilities in handling complex dialogue tasks without requiring use-case-specific fine-tuning. However, analyzing live dialogues in real time necessitates low-latency processing systems, making it impractical to deploy models with billions of parameters due to latency constraints. As a result, practitioners often prefer smaller models with millions
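Below is a minimal sketch of how API-hallucination rates might be measured at different API-frequency levels, in the spirit of the CloudAPIBench description above. The task structure, API index, frequency counts, and bucket thresholds are illustrative assumptions, not the benchmark's actual data or interface.

```python
# Hypothetical sketch: measure how often a code model invokes non-existent
# ("hallucinated") APIs, bucketed by how frequently the *target* API appears
# in public code. All names and thresholds are assumptions for illustration.
import re
from collections import defaultdict
from dataclasses import dataclass

# Assumed reference index of valid, fully qualified API names.
VALID_APIS = {"boto3.client", "boto3.session.Session", "some_sdk.rare_call"}

@dataclass
class Task:
    prompt: str            # code context given to the model
    target_api: str        # API the completion is expected to use
    target_frequency: int  # assumed public-domain occurrence count of that API

def frequency_bucket(count: int) -> str:
    """Map a raw occurrence count to a coarse frequency level."""
    return "high" if count >= 10_000 else "medium" if count >= 1_000 else "low"

def extract_api_calls(completion: str) -> list[str]:
    """Very rough extraction of dotted call targets from generated code."""
    return re.findall(r"\b[\w.]+(?=\()", completion)

def hallucination_rates(tasks: list[Task], completions: list[str]) -> dict[str, float]:
    """Per frequency bucket, the fraction of tasks whose completion calls an invalid API."""
    totals, hallucinated = defaultdict(int), defaultdict(int)
    for task, code in zip(tasks, completions):
        bucket = frequency_bucket(task.target_frequency)
        totals[bucket] += 1
        if any(api not in VALID_APIS for api in extract_api_calls(code)):
            hallucinated[bucket] += 1
    return {b: hallucinated[b] / totals[b] for b in totals}

if __name__ == "__main__":
    tasks = [Task("s3 = ", "boto3.client", 50_000),
             Task("x = ", "some_sdk.rare_call", 12)]
    completions = ['s3 = boto3.client("s3")', "x = some_sdk.made_up_call()"]
    print(hallucination_rates(tasks, completions))  # e.g. {'high': 0.0, 'low': 1.0}
```

The key design choice in this sketch is bucketing by the target API's public-domain frequency, which lets low-frequency (long-tail) APIs be scored separately from common ones.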
Related content
- March 27, 2025: Training separate models on different datasets and then merging them reduces computational costs by as much as 91%.
- March 10, 2025: Inaugural global university competition focused on advancing secure, trusted AI-assisted software development.
- February 20, 2025: Using large language models to generate training data and updating models through both fine-tuning and reinforcement learning improves the success rate of code generation by 39%.
- February 06, 2025: Novel training procedure and decoding mechanism enable model to outperform much larger foundation model prompted to perform the same task.
- December 11, 2024: LLM-augmented clustering enables QualIT to outperform other topic-modeling methods in both topic coherence and topic diversity.
- December 09, 2024: The Amazon AGI SF Lab will focus on developing new foundational capabilities for enabling useful AI agents.