- ACL 2022 Workshop on NLP for Conversational AI
- Journal of Cryptography, 2020: TLS 1.3 allows two parties to establish a shared session key from an out-of-band agreed Pre-Shared Key (PSK). The PSK is used to mutually authenticate the parties, under the assumption that it is not shared with others. This allows the parties to skip the certificate-verification steps, saving bandwidth, communication rounds, and latency. We identify a security vulnerability in this TLS 1.3 path, by showing …
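To make the key-schedule idea concrete, here is a minimal Python sketch of deriving a session key from a PSK with HKDF, the key-derivation function TLS 1.3 specifies in RFC 8446. The labels, transcript stand-in, and expand step are simplified placeholders, not the wire-accurate TLS 1.3 schedule, and the PSK value is invented.

```python
# Minimal sketch of PSK-based key derivation in the spirit of TLS 1.3's
# key schedule (RFC 8446). Labels and structure are simplified and NOT
# wire-accurate; only the Extract step below matches the real schedule.
import hashlib
import hmac

HASH = hashlib.sha256

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC-Hash(salt, IKM)."""
    return hmac.new(salt, ikm, HASH).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): stretch PRK into `length` output bytes."""
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        out += block
        counter += 1
    return out[:length]

# Both parties hold the same out-of-band PSK; each derives the same
# session key from it plus the handshake transcript, so a successful
# handshake implicitly authenticates the peer.
psk = b"example out-of-band pre-shared key"   # hypothetical value
transcript = b"client_hello || server_hello"  # stand-in for the transcript hash

# As in TLS 1.3: Early Secret = HKDF-Extract(salt=0, IKM=PSK).
early_secret = hkdf_extract(salt=b"\x00" * 32, ikm=psk)
session_key = hkdf_expand(early_secret, b"simplified key label || " + transcript, 32)
print(session_key.hex())
```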
- ACL 2020 Workshop on NLP for Conversational AI, 2020: Dialogue response generation models that use template ranking rather than direct sequence generation allow model developers to limit generated responses to pre-approved messages. However, manually creating templates is time-consuming and requires domain expertise. To alleviate this problem, we explore automating the process of creating dialogue templates by using unsupervised methods to cluster historical …
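A minimal sketch of the general idea, assuming a TF-IDF + k-means pipeline (the paper's actual clustering method and features are not shown in this snippet, and the example responses are invented): cluster historical agent responses, then surface the response nearest each centroid as a candidate template.

```python
# Hedged sketch: cluster historical responses with TF-IDF + k-means and
# pick the response closest to each centroid as a candidate template.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

responses = [
    "Your package will arrive on Tuesday.",
    "Your order is scheduled to arrive Tuesday.",
    "I have cancelled your subscription.",
    "Your subscription has been cancelled.",
]

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(responses)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Distance of every response to every centroid; the response closest to
# a centroid serves as that cluster's candidate template.
distances = kmeans.transform(X)
for cluster_id in range(kmeans.n_clusters):
    best = distances[:, cluster_id].argmin()
    print(f"cluster {cluster_id} template: {responses[best]}")
```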
- Transactions of the Association for Computational Linguistics, 2020: Recent work has shown that pre-trained language models such as BERT improve robustness to spurious correlations in the dataset. Intrigued by these results, we find that the key to their success is generalization from a small number of counterexamples where the spurious correlations do not hold. When such minority examples are scarce, pre-trained models perform as poorly as models trained from scratch.
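The claim hinges on splitting evaluation data into majority examples, where the spurious correlation holds, and minority counterexamples, where it does not. Below is a small sketch of that grouping, with an invented negation-as-shortcut sentiment example (the paper's actual tasks and features differ).

```python
# Hedged sketch: report accuracy separately on majority examples (the
# spurious feature agrees with the label) and minority counterexamples
# (it does not). Data and feature are invented for illustration.
from typing import Callable

examples = [
    # (text, label, has_negation) with label 1 = positive sentiment
    ("not worth anyone's time", 0, True),   # majority: negation, negative label
    ("an absolute delight", 1, False),      # majority: no negation, positive label
    ("the movie was not bad", 1, True),     # minority: negation, positive label
    ("a dull, plodding mess", 0, False),    # minority: no negation, negative label
]

def group_accuracy(predict: Callable[[str], int]) -> None:
    groups = {"majority": [], "minority": []}
    for text, label, has_negation in examples:
        # The spurious shortcut: negation implies a negative label.
        shortcut_label = 0 if has_negation else 1
        key = "majority" if shortcut_label == label else "minority"
        groups[key].append(predict(text) == label)
    for name, hits in groups.items():
        print(f"{name}: {sum(hits) / len(hits):.2f} ({len(hits)} examples)")

# A model that just follows the shortcut scores 1.00 on the majority
# group and 0.00 on the minority group.
group_accuracy(lambda text: 0 if "not" in text else 1)
```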
- ACL 2020, 2020: The Natural Language Understanding (NLU) component in task-oriented dialog systems processes a user’s request and converts it into structured information that can be consumed by downstream components such as the Dialog State Tracker (DST). This information is typically represented as a semantic frame that captures the intent and slot labels provided by the user. We first show that such a shallow representation …
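A minimal sketch of such a flat semantic frame, with illustrative intent and slot names rather than any actual production schema:

```python
# Hedged sketch of a semantic frame: one intent plus slot labels over
# spans of the utterance. Class and field names are illustrative.
from dataclasses import dataclass, field

@dataclass
class SemanticFrame:
    utterance: str
    intent: str
    slots: dict[str, str] = field(default_factory=dict)

frame = SemanticFrame(
    utterance="play river by joni mitchell",
    intent="PlayMusic",
    slots={"SongName": "river", "ArtistName": "joni mitchell"},
)

# A downstream component such as the DST consumes the frame, e.g.:
print(frame.intent, frame.slots["ArtistName"])
```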
- EAMT 2020, 2020: We present SOCKEYE 2, a modernized and streamlined version of the SOCKEYE neural machine translation (NMT) toolkit. New features include a simplified code base through the use of MXNet’s Gluon API, a focus on state-of-the-art model architectures, and distributed mixed-precision training. These improvements result in faster training and inference, higher automatic metric scores, and a shorter path from research …
Related content
- August 07, 2019: This year, at the Association for Computational Linguistics’ Workshop on Natural-Language Processing for Conversational AI, my colleagues and I won one of two best-paper awards for our work on slot carryover.
- July 31, 2019: Computerized question-answering systems usually take one of two approaches. Either they do a text search and try to infer the semantic relationships between entities named in the text, or they explore a hand-curated knowledge graph, a data structure that directly encodes relationships among entities.
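A hedged sketch of the second approach: facts stored as subject-relation-object triples, so answering a question becomes a direct graph lookup rather than inference over running text. The triples below are illustrative.

```python
# Minimal sketch of a hand-curated knowledge graph as triples, indexed
# for direct traversal: subject -> relation -> objects.
from collections import defaultdict

triples = [
    ("Joni Mitchell", "wrote", "River"),
    ("River", "appears_on", "Blue"),
    ("Blue", "released_in", "1971"),
]

graph: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
for subj, rel, obj in triples:
    graph[subj][rel].append(obj)

# Answering "What album is River on?" is a single edge lookup; the
# relationship is encoded directly, not inferred from text.
print(graph["River"]["appears_on"])  # ['Blue']
```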
- July 22, 2019: Using machine learning to train information retrieval models — such as Internet search engines — is difficult because it requires so much manually annotated data. Of course, training most machine learning systems requires manually annotated data, but because information retrieval models must handle such a wide variety of queries, they require a lot of data. Consequently, most information retrieval systems rely primarily on mechanisms other than machine learning.
- June 27, 2019: Earlier this month, Varun Sharma and Akshit Tyagi, two master’s students from the University of Massachusetts Amherst, began summer internships at Amazon, where, like many other scientists in training, they will be working on Alexa’s spoken-language-understanding systems.
- June 13, 2019: Alexa’s ability to respond to customer requests is largely the result of machine learning models trained on annotated data. The models are fed sample texts such as “Play the Prince song 1999” or “Play River by Joni Mitchell”. In each text, labels are attached to particular words — SongName for “1999” and “River”, for instance, and ArtistName for “Prince” and “Joni Mitchell”. By analyzing annotated data, the system learns to classify unannotated data on its own.
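A small sketch of what such token-level annotation can look like, assuming the common BIO tagging convention (B- begins a slot, I- continues it, O is outside any slot); this is not necessarily the exact scheme used for Alexa.

```python
# Hedged sketch of BIO-style slot annotation and how (slot, value)
# pairs are recovered from the aligned token and label sequences.
tokens = ["Play", "River", "by", "Joni", "Mitchell"]
labels = ["O", "B-SongName", "O", "B-ArtistName", "I-ArtistName"]

slots, current = [], None
for token, label in zip(tokens, labels):
    if label.startswith("B-"):          # a new slot value starts here
        current = [label[2:], token]
        slots.append(current)
    elif label.startswith("I-") and current is not None:
        current[1] += " " + token       # extend the current slot value
    else:
        current = None                  # outside any slot

print(slots)  # [['SongName', 'River'], ['ArtistName', 'Joni Mitchell']]
```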
- June 11, 2019: As Alexa expands into new countries, she usually has to be trained on new languages. But sometimes, she has to be re-trained on languages she’s already learned. British English, American English, and Indian English, for instance, are different enough that for each of them, we trained a new machine learning model from scratch.