Search - Amazon Science

signSGD: compressed optimisation for non-convex problems

Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Animashree Anandkumar

ICML 2018

2018

Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck. signSGD alleviates this problem by transmitting just the sign of each minibatch stochastic gradient. We prove that it can get the best of both worlds: compressed gradients and SGD-level convergence rate. The relative ℓ1/ℓ2 geometry of gradients
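The sign-only update described in this abstract can be sketched in a few lines. This is a minimal single-worker illustration, not the authors' implementation: the function names (`sign`, `compress`, `sign_sgd_step`), the learning rate, and the toy values are all assumptions made for the example, and the distributed aggregation step is left out.

```python
from typing import List


def sign(x: float) -> int:
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def compress(gradient: List[float]) -> List[int]:
    """Keep only the sign of each gradient component.

    In a distributed setting this is what a worker would transmit
    in place of the full-precision gradient (hypothetical helper).
    """
    return [sign(g) for g in gradient]


def sign_sgd_step(params: List[float], gradient: List[float],
                  learning_rate: float) -> List[float]:
    """Move each parameter by a fixed step against the sign of its gradient."""
    signs = compress(gradient)
    return [p - learning_rate * s for p, s in zip(params, signs)]


# Toy values, chosen only for illustration.
params = [1.0, -2.0, 0.5]
minibatch_gradient = [0.3, -4.2, 0.0]
compressed = compress(minibatch_gradient)
new_params = sign_sgd_step(params, minibatch_gradient, learning_rate=0.1)
```

Each component moves by the same fixed step regardless of the gradient's magnitude, so one sign per coordinate is all that needs to be communicated.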

Machine learning

Structured variational auto-encoded optimization

Xiaoyu Lu, Javier González, Zhenwen Dai, Neil Lawrence

ICML 2018

2018

We tackle the problem of optimizing a blackbox objective function defined over a highly structured input space. This problem is ubiquitous in machine learning. Inferring the structure of a neural network or the Automatic Statistician (AS), where the kernel combination for a Gaussian process is optimized, are two of many possible examples. We use the AS as a case study to describe our approach, that can

Machine learning

Detecting and correcting for label shift with black box predictors

Zachary Lipton, Yu-Xiang Wang, Alex Smola

ICML 2018

2018

Faced with distribution shift between training and test set, we wish to detect and quantify the shift, and to correct our classifiers without test set labels. Motivated by medical diagnosis, where diseases (targets) cause symptoms (observations), we focus on label shift, where the label marginal p(y) changes but the conditional p(x|y) does not. We propose Black Box Shift Estimation (BBSE) to estimate the

Machine learning

Optimal message scheduling for aggregation

Leyuan Wang, Mu Li, Edo Liberty, Alex Smola

SysML 2018

2018

We derive algorithms for producing optimal aggregation schedules for automatically aggregating gradients across different compute units, both CPUs and GPUs, with arbitrary topologies. We show that this can be accomplished by solving a linear program on the spanning tree polytope. We give analytic bounds for the value of the optimal solution for arbitrary graphs. We also propose simple schedules that meet

Machine learning

β-BNN: A rate-distortion perspective on Bayesian neural networks

Shell Hu, Andreas Damianou, Pablo Garcia Moreno

NeurIPS 2018

2018

We propose an alternative training framework for Bayesian neural networks (BNNs), which is motivated by viewing the Bayesian model for supervised learning as an autoencoder for data transmission. Then, a natural objective can be invoked from the rate-distortion theory. Specifically, we end up minimizing the mutual information between the weights and the dataset with a constraint that the negative log-likelihood

Machine learning

Learning to segment inputs for NMT favors character-level processing

Julia Kreutzer, Artem Sokolov

IWSLT 2018

2018

Most modern neural machine translation (NMT) systems rely on presegmented inputs. Segmentation granularity importantly determines the input and output sequence lengths, hence the modeling depth, and source and target vocabularies, which in turn determine model size, computational costs of softmax normalization, and handling of out-of-vocabulary words. However, the current practice is to use static, heuristic-based

Conversational AI

Unsupervised quality estimation without reference corpus for subtitle machine translation using word embeddings

Prabhakar Gupta, Shaktisingh Shekhawat, Keshav Kumar

ICSC 2018

2018

We demonstrate the potential for using aligned bilingual word embeddings to create an unsupervised method to evaluate machine translations without a need for a parallel translation corpus or reference corpus. We explain why movie subtitles differ from other text and share our experimental results conducted on them for four target languages (French, German, Portuguese and Spanish) with English-source subtitles

Conversational AI

A machine learning approach to detecting start reading location of eBooks

Sravan Bodapati, Sriraghavendra Ramaswamy, Gururaj Narayanan

ICDM 2018

2018

Machine Learning and NLP (Natural Language Processing) have aided the development of new and improved user experience features in many applications. We address the problem of automatically identifying the “Start Reading Location” (SRL) of eBooks, i.e. the location of the logical beginning or start of main content. This improves eBook reading experience by taking users automatically to the logical start

Conversational AI

MLZero: Towards zero touch machine learning

Tom Diethe, Tom Borchert, Eno Thereska, Borja de Balle Pigem, Cédric Archambeau, Neil Lawrence

NeurIPS 2018

2018

This paper describes a reference architecture for self-maintaining systems that can learn continually, as data arrives. In environments where data evolves, we need architectures that manage Machine Learning (ML) models in production, adapt to shifting data distributions, cope with outliers, retrain when necessary, and adapt to new tasks. This represents continual AutoML or Automatically Adaptive Machine

Cloud and systems

Fast Lexically Constrained Decoding with Dynamic Beam Allocation for Neural Machine Translation

Matt Post, David Vilar

NAACL 2018

2018

The end-to-end nature of neural machine translation (NMT) removes many ways of manually guiding the translation process that were available in older paradigms. Recent work, however, has introduced a new capability: lexically constrained or guided decoding, a modification to beam search that forces the inclusion of pre-specified words and phrases in the output. However, while theoretically sound, existing

Conversational AI

Cross-lingual approaches to reference resolution in spoken dialogue

Amr Sharaf, Arpit Gupta, Hancheng Ge, Chetan Naik, Rylan Conway, Lambert Mathias

NeurIPS 2018

2018

In the slot-filling paradigm, where a user can refer back to slots in the context during the conversation, the goal of the contextual understanding system is to resolve the referring expressions to the appropriate slots in the context. In this paper, we build on (Naik et al., 2018), which provides a scalable multi-domain framework for resolving references. However, scaling this approach across languages

Demand-weighted completeness prediction for a knowledge base

Andrew Hopkinson, Amit Gurdasani, Dave Palfrey, Arpit Mittal

NAACL 2018

2018

In this paper we introduce the notion of Demand-Weighted Completeness, allowing estimation of the completeness of a knowledge base with respect to how it is used. Defining an entity by its classes, we employ usage data to predict the distribution over relations for that entity. For example, instances of person in a knowledge base may require a birth date, name and nationality to be considered complete.

Conversational AI

Learning word embeddings for low-resource languages by PU learning

Chao Jiang, Cho-Jui Hsieh, Hsiang-Fu Yu, Kai-Wei Chang

NAACL 2018

2018

Word embedding is a key component in many downstream applications in processing natural languages. Existing approaches often assume the existence of a large collection of text for learning effective word embedding. However, such a corpus may not be available for some low-resource languages. In this paper, we study how to effectively learn a word embedding model on a corpus with only a few million tokens

Conversational AI

Unsupervised induction of linguistic categories with records of reading, speaking, and writing

Maria Barrett, Lea Frermann, Ana Valeria Gonzalez-Garduño, Anders Søgaard

NAACL 2018

2018

When learning POS taggers and syntactic chunkers for low-resource languages, different resources may be available, and often all we have is a small tag dictionary, motivating type-constrained unsupervised induction. Even small dictionaries can improve the performance of unsupervised induction algorithms. This paper shows that performance can be further improved by including data that is readily available

Conversational AI

Contextual topic modeling for conversational agents

Behnam Hedayatnia, Chandra Khatri, Rahul Goel, Anushree Venkatesh, Angeliki Metallinou

SLT 2018

2018

Accurate prediction of conversation topics can be a valuable signal for creating coherent and engaging dialog systems. In this work, we focus on context-aware topic classification methods for identifying topics in free-form human-chatbot dialogs. We extend previous work on neural topic classification and unsupervised topic keyword detection by incorporating conversational context and dialog act features

Conversational AI

Direct optimization of F-measure for retrieval-based personal question answering

Rasool Fakoor, Amanjit Kainth, Siamak Shakeri, Christopher Winestock, Abdel-Rahman Mohamed, Ruhi Sarikaya

SLT 2018

2018

Recent advances in spoken language technologies and the introduction of many customer-facing products have given rise to a wide customer reliance on smart personal assistants for many of their daily tasks. In this paper, we present a system to reduce users’ cognitive load by extending personal assistants with long-term personal memory where users can store and retrieve, by voice, arbitrary pieces of information

Conversational AI

Coupled representation learning for domains, intents and slots in spoken language understanding

Jihwan Lee

SLT 2018

2018

Representation learning is an essential problem in a wide range of applications and it is important for performing downstream tasks successfully. In this paper, we propose a new model that learns coupled representations of domains, intents, and slots by taking advantage of their hierarchical dependency in a Spoken Language Understanding system. Our proposed model learns the vector representation of intents

Conversational AI

Parsing Coordination for Spoken Language Understanding System

Sanchit Agarwal, Rahul Goel, Tagyoung Chung, Abhishek Sethi, Arindam Mandal, Spyros Matsoukas

SLT 2018

2018

Typical spoken language understanding systems provide narrow semantic parses using a domain-specific ontology. The parses contain intents and slots that are directly consumed by downstream domain applications. In this work we discuss expanding such systems to handle compound entities and intents by introducing a domain-agnostic shallow parser that handles linguistic coordination. We show that our model

Conversational AI

Scalable language model adaptation for spoken dialogue systems

Ankur Gandhe, Ariya Rastrow, Björn Hoffmeister

SLT 2018

2018

Language models (LM) for interactive speech recognition systems are trained on large amounts of data and the model parameters are optimized on past user data. New application intents and interaction types are released for these systems over time, imposing challenges to adapt the LMs since the existing training data is no longer sufficient to model the future user interactions. It is unclear how to adapt

Conversational AI

Design Challenges in Robust and Multilingual Named Entity Transliteration

Yuval Merhav, Steve Ash

ICCL 2018

2018

We analyze some of the fundamental design challenges that impact the development of a multilingual state-of-the-art named entity transliteration system, including curating bilingual named entity datasets and evaluation of multiple transliteration methods. We empirically evaluate the transliteration task using the traditional weighted finite state transducer (WFST) approach against two neural approaches:

Conversational AI
