Learning algorithms

18 results found

Sort

FastPoint: Scalable deep point processes

Alex Smola, Caner Turkmen, Yuyang (Bernie) Wang

ECML-PKDD 2019

2019

We propose FastPoint, a novel multivariate point process that enables fast and accurate learning and inference. FastPoint uses deep recurrent neural networks to capture complex temporal dependency patterns among different marks, while self-excitation dynamics within each mark are modeled with Hawkes processes. This results in substantially more efficient learning and scales to millions of correlated marks

Information and knowledge management
How we add new skills to Alexa’s name-free skill selector

Young-Bum Kim

May 03, 2019

Using cosine similarity rather than dot product to compare vectors helps prevent "catastrophic forgetting".

Conversational AI
To correct imbalances in training data, don’t oversample: Cluster

Ming Sun

March 11, 2019

In experiments involving sound recognition, technique reduces error rate by 15% to 30%.

Machine learning
New method for compressing neural networks better preserves accuracy

Anish Acharya, Rahul Goel

January 15, 2019

Neural networks have been responsible for most of the top-performing AI systems of the past decade, but they tend to be big, which means they tend to be slow. That’s a problem for systems like Alexa, which depend on neural networks to process spoken requests in real time.

Conversational AI
Learning fashion traits with label uncertainty

Assaf Neuberger, Sharon Alpert, Eli Alshan, Nati Bubis, Eduard Oks

CVPR 2018

2018

We consider the task of predicting subjective fashion traits from images. Specifically, we are interested in understanding which outfit actually better suites the user. Since these traits are highly subjective, they tend to be noisier. One solution is to annotate each example several times, but this makes it hard to collect large amounts of data.

Machine learning
signSGD: compressed optimisation for non-convex problems

Jeremy Bernstein, Yu-Xiang Wang, Kamyar Azizzadenesheli, Animashree Anandkumar

ICML 2018

2018

Training large neural networks requires distributing learning across multiple workers, where the cost of communicating gradients can be a significant bottleneck. SIGNSGD alleviates this problem by transmitting just the sign of each minibatch stochastic gradient. We prove that it can get the best of both worlds: compressed gradients and SGD-level convergence rate. The relative `1/`2 geometry of gradients

Machine learning
Generalization bounds for randomized learning with application to stochastic gradient descent

Ben London

KDD 2017

2017

Randomized algorithms are central to modern machine learning. In the presence of massive datasets, researchers often turn to stochastic optimization to solve learning problems. Of particular interest is stochastic gradient descent (SGD), a first-order method that approximates the learning objective and gradient by a random point estimate. A classical question in learning theory is, if a randomized learner

Machine learning
Learning structured predictors from bandit feedback for interactive NLP

Artem Sokolov, Julia Kreutzer, Chirstopher Lo, Stefan Riezler

ACL 2016

2016

Structured prediction from bandit feedback describes a learning scenario where instead of having access to a gold standard structure, a learner only receives partial feedback in form of the loss value of a predicted structure. We present new learning objectives and algorithms for this interactive scenario, focusing on convergence speed and ease of elicitability of feedback. We present supervised-to-bandit

Conversational AI

Learning algorithms

Work with us