- EACL 2023: Despite significant progress in understanding and improving faithfulness in abstractive summarization, the question of how decoding strategies affect faithfulness is less studied. We present a systematic study of the effect of generation techniques such as beam search and nucleus sampling on faithfulness in abstractive summarization. We find a consistent trend where beam search with large beam sizes produces … (a sketch contrasting these two decoding strategies follows this list)
- EACL 2023: Temporal concept drift refers to the problem of data changing over time. In NLP, that would entail that language (e.g., new expressions, meaning shifts) and factual knowledge (e.g., new concepts, updated facts) evolve over time. Focusing on the latter, we benchmark 11 pretrained masked language models (MLMs) on a series of tests designed to evaluate the effect of temporal concept drift, as it is crucial that … (see the cloze-probing sketch after this list)
- EACL 2023: Neural models for abstractive summarization tend to generate output that is fluent and well-formed but lacks semantic faithfulness, or factuality, with respect to the input documents. In this paper, we analyze the tradeoff between abstractiveness and factuality of generated summaries across multiple datasets and models, using extensive human evaluations of factuality. In our analysis, we visualize the rates … (see the novel n-gram sketch after this list)
- ICML 2021, SDM 2023: Ensuring the privacy of users whose data are used to train Natural Language Processing (NLP) models is necessary to build and maintain customer trust. Differential Privacy (DP) has emerged as the most successful method to protect the privacy of individuals. However, applying DP to the NLP domain comes with unique challenges. The most successful previous methods use a generalization of DP for metric spaces … (see the metric-DP sketch after this list)
- EACL 2023: Opinion summarization provides an important solution for summarizing opinions expressed among a large number of reviews. However, generating aspect-specific and general summaries is challenging due to the lack of annotated data. In this work, we propose two simple yet effective unsupervised approaches to generate both aspect-specific and general opinion summaries by training on synthetic datasets constructed … (see the leave-one-out sketch after this list)
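To make the decoding strategies in the first item concrete, here is a minimal sketch contrasting beam search with nucleus sampling via the Hugging Face transformers API; the model name, beam size, and top-p value are illustrative assumptions, not the paper's settings.

```python
# Minimal sketch: beam search vs. nucleus sampling for summarization.
# Model choice and generation parameters are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "facebook/bart-large-cnn"  # assumed off-the-shelf summarizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

document = "Replace with the article to be summarized."
inputs = tokenizer(document, return_tensors="pt", truncation=True)

# Beam search: deterministic search over high-probability continuations.
beam_out = model.generate(**inputs, num_beams=8, max_length=128)

# Nucleus (top-p) sampling: sample from the smallest token set whose
# cumulative probability exceeds top_p.
sample_out = model.generate(**inputs, do_sample=True, top_p=0.9, max_length=128)

print("beam:   ", tokenizer.decode(beam_out[0], skip_special_tokens=True))
print("nucleus:", tokenizer.decode(sample_out[0], skip_special_tokens=True))
```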
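For the temporal-drift item, tests of this kind are commonly run by querying an MLM with cloze-style prompts whose correct answers change over time; the prompt and model below are assumptions for illustration, not the paper's benchmark.

```python
# Minimal cloze-probing sketch: does the MLM surface outdated facts?
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")  # assumed model

# A fact whose answer changes over time; a model trained on older text
# may rank an outdated completion highest.
for pred in fill("The prime minister of the United Kingdom is [MASK].", top_k=3):
    print(f"{pred['token_str']!r}  score={pred['score']:.3f}")
```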
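For the abstractiveness-factuality item, a standard proxy for abstractiveness is the rate of summary n-grams that never appear in the source document. This helper is a generic sketch of that proxy, not necessarily the paper's exact metric.

```python
# Fraction of summary n-grams absent from the source (novel n-gram rate).
def novel_ngram_rate(source: str, summary: str, n: int = 2) -> float:
    def ngrams(text: str, n: int) -> set:
        tokens = text.lower().split()
        return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

    source_ngrams = ngrams(source, n)
    summary_ngrams = ngrams(summary, n)
    if not summary_ngrams:
        return 0.0
    return len(summary_ngrams - source_ngrams) / len(summary_ngrams)

# Example: a fully extractive summary scores 0.0; paraphrases score higher.
print(novel_ngram_rate("the cat sat on the mat", "a cat rested on a mat"))
```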
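For the differential-privacy item, the metric-space generalization of DP is often instantiated for text by noising a word's embedding and snapping back to the nearest vocabulary word. The mechanism below is a simplified sketch under that assumption; the function names and noise details are not the paper's.

```python
# Simplified metric-DP sketch: perturb an embedding with multivariate
# Laplace noise (density proportional to exp(-epsilon * ||z||)), then
# post-process to the nearest real word.
import numpy as np

rng = np.random.default_rng(0)

def privatize_word(word: str, vocab: dict[str, np.ndarray], epsilon: float) -> str:
    v = vocab[word]
    # Uniform direction with Gamma-distributed magnitude gives the
    # d-dimensional Laplace radial law r^(d-1) * exp(-epsilon * r).
    direction = rng.normal(size=v.shape)
    direction /= np.linalg.norm(direction)
    magnitude = rng.gamma(shape=v.shape[0], scale=1.0 / epsilon)
    noisy = v + magnitude * direction
    # Snap the noisy vector to the closest word in the vocabulary.
    words = list(vocab)
    dists = [np.linalg.norm(vocab[w] - noisy) for w in words]
    return words[int(np.argmin(dists))]
```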
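For the opinion-summarization item, a common way to construct synthetic training pairs without annotations is the leave-one-out trick: treat one review as a pseudo-summary of an entity's remaining reviews. The sketch below is that generic construction, not necessarily the approach the paper proposes.

```python
# Leave-one-out synthetic pairs for unsupervised opinion summarization:
# each review serves once as the target, with its siblings as the input.
def make_synthetic_pairs(reviews_by_entity: dict[str, list[str]]) -> list[dict]:
    pairs = []
    for entity, reviews in reviews_by_entity.items():
        for i, pseudo_summary in enumerate(reviews):
            source = reviews[:i] + reviews[i + 1:]
            if source:
                pairs.append({"input": " ".join(source), "target": pseudo_summary})
    return pairs
```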
Related content
- January 06, 2023: Quantization with self-adjustable centroids, contrastive predictive coding for transfer learning, teacher ensembles for differential privacy, and more — Amazon's speech research features a battery of cutting-edge machine learning techniques.
- December 23, 2022: Program focuses on diversifying tech-industry talent.
- December 22, 2022: A system built on Amazon Translate reduces the workload of human translators.
- December 20, 2022: Ariadna Sanchez, a scientist who works in polyglot text-to-speech, draws on her musical background to help find novel solutions.
- December 19, 2022: Transfer learning using limited contrastive data improves formality accuracy without compromising performance.
- December 14, 2022: EMNLP papers examine constrained generation of rewrite candidates and automatic selection of information-rich training data.