-
Interspeech 2023. Conformer-based end-to-end automatic speech recognition (ASR) models have gained popularity in recent years due to their exceptional performance at scale. However, there are significant computation, memory and latency costs associated with running inference on such models. With the aim of mitigating these issues, we evaluate the efficacy of pruning Conformer layers while fine-tuning only on 20% of the data…
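The abstract above describes pruning Conformer layers to cut inference cost. As a minimal sketch of what layer-level structured pruning looks like, assuming a hypothetical per-layer importance score (the paper's actual selection criterion is not given in this snippet):

```python
def prune_layers(layers, scores, keep_ratio=0.75):
    """Keep the highest-scoring fraction of encoder layers, preserving
    their original order. The importance score (e.g. a held-out loss
    probe per layer) is an assumption, not the paper's criterion."""
    k = max(1, int(len(layers) * keep_ratio))
    ranked = sorted(range(len(layers)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:k])  # restore network order after ranking
    return [layers[i] for i in keep]

# 8 hypothetical Conformer blocks; drop the 2 least important (25%)
blocks = [f"conformer_block_{i}" for i in range(8)]
scores = [0.9, 0.2, 0.8, 0.7, 0.1, 0.6, 0.5, 0.4]
pruned = prune_layers(blocks, scores)
```

In practice the pruned model would then be fine-tuned (here, on a 20% subset of the training data) to recover accuracy.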
-
ACL Findings 2023. Recommending a diversity of product types (PTs) is important for a good shopping experience when customers are looking for products around their high-level shopping interests (SIs) such as hiking. However, the SI-PT connection is typically absent in e-commerce product catalogs and expensive to construct manually due to the volume of potential SIs, which prevents us from establishing a recommender with easily…
-
Interspeech 2023. We propose a methodology for information aggregation from the various transformer layer outputs of a generic speech Encoder (e.g. WavLM, HuBERT) for the downstream task of Speech Emotion Recognition (SER). The proposed methodology significantly reduces the dependency of model predictions on linguistic content, while leading to competitive performance without requiring costly Encoder re-training. The proposed…
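A common baseline for aggregating information across a frozen speech encoder's layers is a learned softmax-weighted sum of the per-layer hidden states, followed by temporal pooling. The sketch below illustrates that scheme; the layer weights and pooling choice are assumptions for illustration, not necessarily the paper's exact method:

```python
import numpy as np

def aggregate_layers(layer_states, layer_logits):
    """Softmax-weight the hidden states of every encoder layer, then
    mean-pool over time to get one utterance embedding.
    layer_states: (n_layers, time, dim); layer_logits: (n_layers,),
    learned jointly with the downstream SER head."""
    w = np.exp(layer_logits - layer_logits.max())
    w = w / w.sum()                                # softmax over layers
    mixed = np.tensordot(w, layer_states, axes=1)  # (time, dim) weighted sum
    return mixed.mean(axis=0)                      # (dim,) utterance embedding

# toy check: 4 layers, 10 frames, 16-dim states, uniform weights
states = np.random.default_rng(0).standard_normal((4, 10, 16))
emb = aggregate_layers(states, np.zeros(4))
```

Because the encoder stays frozen, only the layer weights and the small classification head are trained, avoiding costly encoder re-training.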
-
ACL 2023 Workshop on Trustworthy Natural Language Processing (TrustNLP). The issue of enhancing the robustness of Named Entity Recognition (NER) models against adversarial attacks has recently gained significant attention (Simoncini and Spanakis, 2021; Lin et al., 2021). The existing techniques for robustifying NER models rely on exhaustive perturbation of the input training data to generate adversarial examples, often resulting in adversarial examples that are not semantically…
-
ACL 2023. We present a new task setting for attribute mining on e-commerce products, serving as a practical solution to extract open-world attributes without extensive human intervention. Our supervision comes from a high-quality seed attribute set bootstrapped from existing resources, and we aim to expand the attribute vocabulary of existing seed types, and also to discover any new attribute types automatically.
Related content
-
April 01, 2019. The idea of using arrays of microphones to improve automatic speech recognition (ASR) is decades old. The acoustic signal generated by a sound source reaches multiple microphones with different time delays. This information can be used to create virtual directivity, emphasizing a sound arriving from a direction of interest and diminishing signals coming from other directions. In voice recognition, one of the more popular methods for doing this is known as “beamforming”.
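The mechanism described above can be sketched as classic delay-and-sum beamforming: compensate each microphone's arrival delay for the target direction, then average the channels. This is a minimal illustration (integer-sample shifts only), not the system's actual implementation:

```python
import numpy as np

def delay_and_sum(signals, delays, sr):
    """Delay-and-sum beamforming sketch: advance each channel by its
    arrival delay for the target direction, then average. Sound from
    that direction adds coherently; sound from other directions stays
    misaligned and partially cancels."""
    n_mics = len(signals)
    out = np.zeros_like(signals[0])
    for sig, d in zip(signals, delays):
        out += np.roll(sig, -int(round(d * sr)))  # undo the arrival delay
    return out / n_mics

# toy array: mic 1 hears the source 1 sample later than mic 0
sr = 8000
t = np.arange(256) / sr
src = np.sin(2 * np.pi * 440 * t)
mics = [src, np.roll(src, 1)]
steered = delay_and_sum(mics, [0.0, 1 / sr], sr)
```

With the delays compensated, the two channels align and their average reconstructs the source; an interfering signal from another direction would not align and would be attenuated.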
-
Animation by Nick Little. March 28, 2019. Audio watermarking is the process of adding a distinctive sound pattern — undetectable to the human ear — to an audio signal to make it identifiable to a computer. It’s one of the ways that video sites recognize copyrighted recordings that have been posted illegally. To identify a watermark, a computer usually converts a digital file into an audio signal, which it processes internally.
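To make the embed-then-detect idea concrete, here is a toy spread-spectrum sketch: add a low-amplitude pseudorandom pattern derived from a secret key, and later detect it by correlation. This is purely illustrative and not the watermarking scheme the article discusses:

```python
import numpy as np

def embed_watermark(audio, key, strength=0.05):
    """Add a low-amplitude pseudorandom pattern keyed by `key`.
    At small `strength` the pattern is inaudible against the host audio."""
    rng = np.random.default_rng(key)
    return audio + strength * rng.standard_normal(len(audio))

def detect_watermark(audio, key):
    """Correlate the signal with the key's pattern; a score well above
    zero suggests the watermark is present."""
    rng = np.random.default_rng(key)
    pattern = rng.standard_normal(len(audio))
    return float(np.dot(audio, pattern) / len(audio))

# embed into noise-like "audio", then probe with the right and wrong keys
host = np.random.default_rng(0).standard_normal(80000)
marked = embed_watermark(host, key=42)
```

Only the correct key's pattern correlates with the marked signal; a wrong key, or unmarked audio, yields a score near zero.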
-
March 21, 2019. Sentiment analysis is the attempt, computationally, to determine from someone’s words how he or she feels about something. It has a host of applications, in market research, media analysis, customer service, and product recommendation, among other things. Sentiment classifiers are typically machine learning systems, and any given application of sentiment analysis may suffer from a lack of annotated data for training purposes.
-
March 20, 2019. Although deep neural networks have enabled accurate large-vocabulary speech recognition, training them requires thousands of hours of transcribed data, which is time-consuming and expensive to collect. So Amazon scientists have been investigating techniques that will let Alexa learn with minimal human involvement, techniques that fall in the categories of unsupervised and semi-supervised learning.
-
March 11, 2019. In experiments involving sound recognition, technique reduces error rate by 15% to 30%.
-
March 05, 2019. The 2018 Alexa Prize featured eight student teams from four countries, each of which adopted distinctive approaches to some of the central technical questions in conversational AI. We survey those approaches in a paper we released late last year, and the teams themselves go into even greater detail in the papers they submitted to the latest Alexa Prize Proceedings. Here, we touch on just a few of the teams’ innovations.