- 2024: Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, e.g., code generation. However, most existing work on code representation learning trains models at the hundred-million-parameter scale using very limited pre-training corpora. In this work, we fuel code representation learning with a vast amount of code data via a two-stage pre-training …
- 2024: It is often advantageous to train models on a subset of the available training examples, because the examples are of variable quality or because one would like to train with fewer examples without sacrificing performance. We present Gradient Information Optimization (GIO), a scalable, task-agnostic approach to this data selection problem that requires only a small set of (unlabeled) examples representing … (see the first sketch after this list)
- *SEM 2024: The majority of Neural Semantic Parsing (NSP) models are developed with the assumption that there are no concepts outside the ones such models can represent with their target symbols (closed-world assumption). This assumption leads models to generate hallucinated outputs rather than admit their lack of knowledge. Hallucinations can lead to wrong or potentially offensive responses to users. Hence, a mechanism …
- 2024: Training a supervised news summarization model requires large amounts of high-quality training data consisting of news articles paired with reference summaries. However, obtaining such data is costly, and existing datasets contain a considerable amount of noise. We present a new large-scale, high-quality dataset for supervised abstractive news summarization containing 1.3 million training samples, which …
- In this work, we propose sequence-level certainty as a common theme over hallucination in Knowledge-Grounded Dialogue Generation (KGDG). We explore the correlation between the level of hallucination in model responses and two types of sequence-level certainty: probabilistic certainty and semantic certainty. Empirical results reveal that higher levels of both types of certainty in model responses are correlated … (see the second sketch after this list)
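First sketch: the GIO entry above describes selecting training examples so that the chosen subset matches a small target set. The code below is a minimal, illustrative sketch of that general idea, i.e., greedily picking candidate embeddings that keep an estimated KL divergence to the target set low. It is not the paper's algorithm: the diagonal-Gaussian KL estimate, the seeding heuristic, and all function names are our assumptions, and GIO itself uses a more scalable, gradient-based optimization.

```python
import numpy as np

def kl_diag_gaussian(target, selected, eps=1e-6):
    """KL(N_t || N_s) for diagonal Gaussians fit to two embedding sets."""
    mu_t, var_t = target.mean(axis=0), target.var(axis=0) + eps
    mu_s, var_s = selected.mean(axis=0), selected.var(axis=0) + eps
    return 0.5 * np.sum(
        np.log(var_s / var_t) + (var_t + (mu_t - mu_s) ** 2) / var_s - 1.0
    )

def greedy_kl_selection(candidates, target, k, seed_count=2):
    """Greedily pick k candidate embeddings that keep KL(target || selected) low."""
    # Seed with the candidates nearest the target mean so variances are defined.
    dists = np.linalg.norm(candidates - target.mean(axis=0), axis=1)
    chosen = list(np.argsort(dists)[:seed_count])
    remaining = set(range(len(candidates))) - set(chosen)
    while len(chosen) < k and remaining:
        best_i = min(
            remaining,
            key=lambda i: kl_diag_gaussian(target, candidates[chosen + [i]]),
        )
        chosen.append(best_i)
        remaining.remove(best_i)
    return chosen

# Example: select 20 of 500 synthetic candidate embeddings toward a small target set.
rng = np.random.default_rng(0)
candidates = rng.normal(size=(500, 16))
target = rng.normal(loc=0.5, size=(50, 16))
picked = greedy_kl_selection(candidates, target, k=20)
```

The greedy loop re-fits the selected set after every addition, which is simple but quadratic in the number of candidates; a practical implementation would optimize the next point directly in embedding space rather than scanning all candidates.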
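Second sketch: the KGDG entry above contrasts probabilistic and semantic certainty. As a rough illustration of how such scores are often operationalized (these definitions are our assumptions, not necessarily the paper's): probabilistic certainty can be read off the model's own token log-probabilities, while semantic certainty can be approximated by agreement among several sampled responses to the same input.

```python
def probabilistic_certainty(token_logprobs):
    """Length-normalized log-probability of one generated response.

    token_logprobs: per-token log-probabilities from the decoder.
    One common (assumed, illustrative) proxy for sequence-level confidence.
    """
    return sum(token_logprobs) / max(len(token_logprobs), 1)

def semantic_certainty(responses, similarity):
    """Mean pairwise similarity across sampled responses to the same input.

    similarity: any callable scoring two texts in [0, 1]; higher agreement
    among samples suggests higher semantic certainty.
    """
    if len(responses) < 2:
        return 1.0
    pairs = [
        similarity(a, b)
        for i, a in enumerate(responses)
        for b in responses[i + 1:]
    ]
    return sum(pairs) / len(pairs)

# Toy similarity for demonstration only: Jaccard overlap of word sets.
def jaccard(a, b):
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

score = semantic_certainty(
    ["the capital is Paris", "Paris is the capital", "it is Lyon"], jaccard
)
```

In practice the similarity function would be an embedding- or NLI-based scorer rather than word overlap; the point is only that low agreement across samples is a signal of low semantic certainty.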
Related content
- May 02, 2023: ICLR workshop sponsored by Amazon CodeWhisperer features Amazon papers on a novel contrastive-learning framework for causal language models and a way to gauge the robustness of code generation models.
- April 12, 2023: From noisy cars to unreliable signals, researchers have worked to extend the Alexa experience to vehicles on the move.
- April 06, 2023: University teams are competing to help advance the science of conversational embodied AI and robust human-AI interaction.
- April 03, 2023: Combining acoustic and lexical information improves real-time voice sentiment analysis.
- March 31, 2023: Attendees explored new avenues of research in areas including robotics and conversational AI via roundtables moderated by researchers from Amazon.
- March 27, 2023: Initiative will advance artificial intelligence and machine learning research within speech, language, and multimodal-AI domains.