Search results

16,559 results found
  • Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Animashree Anandkumar
    ICML 2018
    2018
    Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck. SIGNSGD alleviates this problem by transmitting just the sign of each minibatch stochastic gradient. We prove that it can get the best of both worlds: compressed gradients and SGD-level convergence rate. The relative ℓ1/ℓ2 geometry of gradients
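    A minimal sketch of the one-bit compression described in the entry above, assuming a majority-vote aggregation on the server (the function names and toy setup are illustrative, not the paper's):

        import numpy as np

        def worker_message(grad):
            # compress a stochastic gradient to one bit per coordinate
            return np.sign(grad)

        def server_update(w, messages, lr=0.01):
            # aggregate worker signs by majority vote, then take a sign step
            vote = np.sign(np.sum(messages, axis=0))
            return w - lr * vote

        # toy usage with fake gradients from three workers
        w = np.zeros(5)
        grads = [np.random.randn(5) for _ in range(3)]
        w = server_update(w, [worker_message(g) for g in grads])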
  • Xiaoyu Lu, Javier González, Zhenwen Dai, Neil Lawrence
    ICML 2018
    2018
    We tackle the problem of optimizing a blackbox objective function defined over a highly structured input space. This problem is ubiquitous in machine learning. Inferring the structure of a neural network or the Automatic Statistician (AS), where the kernel combination for a Gaussian process is optimized, are two of many possible examples. We use the AS as a case study to describe our approach, that can
  • ICML 2018
    2018
    Faced with distribution shift between training and test set, we wish to detect and quantify the shift, and to correct our classifiers without test set labels. Motivated by medical diagnosis, where diseases (targets), cause symptoms (observations), we focus on label shift, where the label marginal p(y) changes but the conditional p(x|y) does not. We propose Black Box Shift Estimation (BBSE) to estimate the
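    A minimal sketch of the estimator described in the entry above, under the standard label-shift reading: with a fixed black-box predictor f, the importance weights w(y) = p_test(y)/p_train(y) solve C w = mu, where C is the joint confusion matrix of f on held-out labeled source data and mu is the distribution of f's predictions on unlabeled target data (variable names are mine, not the paper's):

        import numpy as np

        def bbse_weights(y_val, yhat_val, yhat_test, n_classes):
            # C[i, j] ~= P(f(x) = i, y = j) on the held-out source split
            C = np.zeros((n_classes, n_classes))
            for yh, y in zip(yhat_val, y_val):
                C[yh, y] += 1.0 / len(y_val)
            # mu[i] ~= P(f(x) = i) on the unlabeled target data
            mu = np.bincount(yhat_test, minlength=n_classes) / len(yhat_test)
            # requires C to be invertible, i.e. a reasonably accurate predictor
            return np.linalg.solve(C, mu)  # estimated p_test(y) / p_train(y)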
  • Leyuan Wang, Mu Li, Edo Liberty, Alex Smola
    SysML 2018
    2018
    We derive algorithms for producing optimal aggregation schedules for automatically aggregating gradients across different compute units, both CPUs and GPUs, with arbitrary topologies. We show that this can be accomplished by solving a linear program on the spanning tree polytope. We give analytic bounds for the value of the optimal solution for arbitrary graphs. We also propose simple schedules that meet
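    The schedules above come from a linear program over the spanning tree polytope; the sketch below only illustrates the primitive being scheduled, a reduce of gradients up a given spanning tree of compute units (topology and values are made up for illustration):

        def reduce_up(tree, node, grads):
            # sum the gradient of `node` with those of all its descendants
            total = grads[node]
            for child in tree.get(node, []):
                total += reduce_up(tree, child, grads)
            return total

        # toy topology: gpu0 is the root with two children
        tree = {"gpu0": ["gpu1", "gpu2"], "gpu1": [], "gpu2": []}
        grads = {"gpu0": 1.0, "gpu1": 2.0, "gpu2": 3.0}
        print(reduce_up(tree, "gpu0", grads))  # 6.0, then broadcast back down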
  • NeurIPS 2018
    2018
    We propose an alternative training framework for Bayesian neural networks (BNNs), which is motivated by viewing the Bayesian model for supervised learning as an autoencoder for data transmission. Then, a natural objective can be invoked from the rate-distortion theory. Specifically, we end up minimizing the mutual information between the weights and the dataset with a constraint that the negative log-likelihood
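    Written out under my reading of the entry above (a hedged reconstruction, with D_0 denoting the allowed distortion):

        \min_{q(W \mid \mathcal{D})} \; I(W; \mathcal{D})
        \quad \text{subject to} \quad
        \mathbb{E}_{q}\!\left[ -\log p(\mathcal{D} \mid W) \right] \le D_0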
  • Julia Kreutzer, Artem Sokolov
    IWSLT 2018
    2018
    Most modern neural machine translation (NMT) systems rely on presegmented inputs. Segmentation granularity importantly determines the input and output sequence lengths, hence the modeling depth, and source and target vocabularies, which in turn determine model size, computational costs of softmax normalization, and handling of out-of-vocabulary words. However, the current practice is to use static, heuristic-based
  • We demonstrate the potential for using aligned bilingual word embeddings to create an unsupervised method to evaluate machine translations without a need for a parallel translation corpus or reference corpus. We explain why movie subtitles differ from other text and share our experimental results conducted on them for four target languages (French, German, Portuguese and Spanish) with English-source subtitles
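    A small sketch of how such reference-free scoring could look, assuming word embeddings already aligned into a shared bilingual space (the averaging and cosine scoring are my illustration, not necessarily the authors' exact procedure):

        import numpy as np

        def sentence_vec(words, emb):
            # average the embeddings of in-vocabulary words
            vecs = [emb[w] for w in words if w in emb]
            return np.mean(vecs, axis=0)

        def xling_score(src_words, hyp_words, src_emb, tgt_emb):
            # cosine similarity between the source sentence and a candidate
            # translation, computed directly in the shared embedding space
            s = sentence_vec(src_words, src_emb)
            h = sentence_vec(hyp_words, tgt_emb)
            return float(s @ h / (np.linalg.norm(s) * np.linalg.norm(h)))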
  • Sravan Bodapati, Sriraghavendra Ramaswamy, Gururaj Narayanan
    ICDM 2018
    2018
    Machine Learning and NLP (Natural Language Processing) have aided the development of new and improved user experience features in many applications. We address the problem of automatically identifying the “Start Reading Location” (SRL) of eBooks, i.e. the location of the logical beginning or start of main content. This improves eBook reading experience by taking users automatically to the logical start
  • Tom Diethe, Tom Borchert, Eno Thereska, Borja de Balle Pigem, Cédric Archambeau, Neil Lawrence
    NeurIPS 2018
    2018
    This paper describes a reference architecture for self-maintaining systems that can learn continually, as data arrives. In environments where data evolves, we need architectures that manage Machine Learning (ML) models in production, adapt to shifting data distributions, cope with outliers, retrain when necessary, and adapt to new tasks. This represents continual AutoML or Automatically Adaptive Machine
  • Matt Post, David Vilar
    NAACL 2018
    2018
    The end-to-end nature of neural machine translation (NMT) removes many ways of manually guiding the translation process that were available in older paradigms. Recent work, however, has introduced a new capability: lexically constrained or guided decoding, a modification to beam search that forces the inclusion of pre-specified words and phrases in the output. However, while theoretically sound, existing
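    A simplified sketch in the spirit of the entry above: hypotheses are bucketed by how many constraint tokens they have already produced, and beam slots are shared across buckets so that hypotheses still working through a constraint are not pruned away. The model interface (next_logprobs) and the even slot allocation are my simplifications, not the paper's exact dynamic beam allocation.

        import math

        def n_matched(prefix, constraint):
            # how many constraint tokens the prefix has produced, in order
            n = 0
            for tok in prefix:
                if n < len(constraint) and tok == constraint[n]:
                    n += 1
            return n

        def constrained_beam(next_logprobs, constraint, beam_size=4, steps=5):
            beams = [((), 0.0)]  # (token prefix, cumulative logprob)
            for _ in range(steps):
                cands = [(p + (t,), s + lp)
                         for p, s in beams
                         for t, lp in next_logprobs(p).items()]
                # bucket candidates by constraint coverage, share the beam
                banks = {}
                for c in cands:
                    banks.setdefault(n_matched(c[0], constraint), []).append(c)
                per_bank = max(1, beam_size // len(banks))
                beams = []
                for k in sorted(banks, reverse=True):
                    beams += sorted(banks[k], key=lambda c: -c[1])[:per_bank]
                beams = beams[:beam_size]
            done = [b for b in beams
                    if n_matched(b[0], constraint) == len(constraint)]
            return max(done or beams, key=lambda b: b[1])

        # toy "model": the same next-token distribution at every step
        def toy_model(prefix):
            return {0: math.log(0.5), 1: math.log(0.3), 2: math.log(0.2)}

        print(constrained_beam(toy_model, constraint=(2, 1)))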
  • Amr Sharaf, Arpit Gupta, Hancheng Ge, Chetan Naik, Rylan Conway, Lambert Mathias
    NeurIPS 2018
    2018
    In the slot-filling paradigm, where a user can refer back to slots in the context during the conversation, the goal of the contextual understanding system is to resolve the referring expressions to the appropriate slots in the context. In this paper, we build on (Naik et al., 2018), which provides a scalable multi-domain framework for resolving references. However, scaling this approach across languages
  • NAACL 2018
    2018
    In this paper we introduce the notion of Demand-Weighted Completeness, allowing estimation of the completeness of a knowledge base with respect to how it is used. Defining an entity by its classes, we employ usage data to predict the distribution over relations for that entity. For example, instances of person in a knowledge base may require a birth date, name and nationality to be considered complete.
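    A small sketch of the notion described above, as I read it: an entity's completeness is the usage-weighted fraction of expected relations it actually has, with the expected relation distribution predicted from the entity's classes (here just looked up from a table of made-up weights):

        def demand_weighted_completeness(entity_relations, relation_weights):
            # usage-weighted share of expected relations that are present
            present = sum(w for rel, w in relation_weights.items()
                          if rel in entity_relations)
            return present / sum(relation_weights.values())

        # e.g. a person entity missing its nationality
        weights = {"birth_date": 0.5, "name": 0.3, "nationality": 0.2}
        print(demand_weighted_completeness({"birth_date", "name"}, weights))  # 0.8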
  • NAACL 2018
    2018
    Word embedding is a key component in many downstream applications in processing natural languages. Existing approaches often assume the existence of a large collection of text for learning effective word embedding. However, such a corpus may not be available for some low-resource languages. In this paper, we study how to effectively learn a word embedding model on a corpus with only a few million tokens
  • Maria Barrett, Lea Frermann, Ana Valeria Gonzalez-Garduño, Anders Søgaard
    NAACL 2018
    2018
    When learning POS taggers and syntactic chunkers for low-resource languages, different resources may be available, and often all we have is a small tag dictionary, motivating type-constrained unsupervised induction. Even small dictionaries can improve the performance of unsupervised induction algorithms. This paper shows that performance can be further improved by including data that is readily available
  • Behnam Hedayatnia, Chandra Khatri, Rahul Goel, Anushree Venkatesh, Angeliki Metallinou
    SLT 2018
    2018
    Accurate prediction of conversation topics can be a valuable signal for creating coherent and engaging dialog systems. In this work, we focus on context-aware topic classification methods for identifying topics in free-form human-chatbot dialogs. We extend previous work on neural topic classification and unsupervised topic keyword detection by incorporating conversational context and dialog act features
  • Rasool Fakoor, Amanjit Kainth, Siamak Shakeri, Christopher Winestock, Abdel-Rahman Mohamed, Ruhi Sarikaya
    SLT 2018
    2018
    Recent advances in spoken language technologies and the introduction of many customer-facing products have given rise to a wide customer reliance on smart personal assistants for many of their daily tasks. In this paper, we present a system to reduce users’ cognitive load by extending personal assistants with long-term personal memory where users can store and retrieve by voice, arbitrary pieces of information
  • SLT 2018
    2018
    Representation learning is an essential problem in a wide range of applications and it is important for performing downstream tasks successfully. In this paper, we propose a new model that learns coupled representations of domains, intents, and slots by taking advantage of their hierarchical dependency in a Spoken Language Understanding system. Our proposed model learns the vector representation of intents
  • Typical spoken language understanding systems provide narrow semantic parses using a domain-specific ontology. The parses contain intents and slots that are directly consumed by downstream domain applications. In this work we discuss expanding such systems to handle compound entities and intents by introducing a domain-agnostic shallow parser that handles linguistic coordination. We show that our model
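    A toy sketch of the coordination handling described above (my illustration, not the authors' parser): a domain-agnostic shallow pass that splits a coordinated utterance span into compound entities before the usual intent/slot interpretation.

        import re

        def split_coordination(span):
            # 'a, b and c' -> ['a', 'b', 'c'] via a purely lexical pattern
            parts = re.split(r",\s*|\s+and\s+", span)
            return [p for p in (part.strip() for part in parts) if p]

        print(split_coordination("pop and rock and jazz"))  # ['pop', 'rock', 'jazz']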
  • Ankur Gandhe, Ariya Rastrow, Björn Hoffmeister
    SLT 2018
    2018
    Language models (LM) for interactive speech recognition systems are trained on large amounts of data and the model parameters are optimized on past user data. New application intents and interaction types are released for these systems over time, imposing challenges to adapt the LMs since the existing training data is no longer sufficient to model the future user interactions. It is unclear how to adapt
  • Yuval Merhav, Steve Ash
    ICCL 2018
    2018
    We analyze some of the fundamental design challenges that impact the development of a multilingual state-of-the-art named entity transliteration system, including curating bilingual named entity datasets and evaluation of multiple transliteration methods. We empirically evaluate the transliteration task using the traditional weighted finite state transducer (WFST) approach against two neural approaches:
IN, TS, Hyderabad
Welcome to the Worldwide Returns & ReCommerce team (WWR&R) at Amazon.com. WWR&R is an agile, innovative organization dedicated to ‘making zero happen’ to benefit our customers, our company, and the environment. Our goal is to achieve the three zeroes: zero cost of returns, zero waste, and zero defects. We do this by developing products and driving truly innovative operational excellence to help customers keep what they buy, recover returned and damaged product value, keep thousands of tons of waste from landfills, and create the best customer returns experience in the world. We have an eye to the future – we create long-term value at Amazon by focusing not just on the bottom line, but on the planet. We are building the most sustainable