-
ICASSP 2020, 2020: Acoustic event classification (AEC) and acoustic event detection (AED) refer to the task of detecting whether specific target events occur in audio. As long short-term memory (LSTM) leads to state-of-the-art results in various speech-related tasks, it is employed as a popular solution for AEC as well. This paper focuses on investigating the dynamics of the LSTM model on AEC tasks. It includes a detailed analysis
-
ICASSP 2020, 2020: Spoken Language Understanding (SLU) systems consist of several machine learning components operating together (e.g., intent classification, named entity recognition and resolution). Deep learning models have obtained state-of-the-art results on several of these tasks, largely attributed to their better modeling capacity. However, an increase in modeling capacity comes with added costs of higher latency and
-
ICASSP 2020, 2020: Spoken Language Understanding (SLU) systems typically consist of a set of machine learning models that operate in conjunction to produce an SLU hypothesis. The generated hypothesis is then sent to downstream components for further action. However, it is desirable to discard an incorrect hypothesis before sending it downstream. In this work, we present two designs for SLU hypothesis rejection modules: (i
-
ICASSP 2020, 2020: In this paper, we present an end-to-end deep convolutional neural network operating on multi-channel raw audio data to localize multiple simultaneously active acoustic sources in space. Previously reported deep-learning-based approaches work well in localizing a single source directly from multi-channel raw audio but are not easily extendable to localize multiple sources due to the well-known permutation
-
ICASSP 2020, 2020: We study few-shot acoustic event detection (AED) in this paper. Few-shot learning enables detection of new events with very limited labeled data. Compared to other research areas like computer vision, few-shot learning for audio recognition has been understudied. We formulate the few-shot AED problem and explore different ways of utilizing traditional supervised methods for this setting as well as a variety
Related content
-
May 12, 2021: Ström discusses his career journey in conversational AI, his published research, and where he sees the field of conversational AI headed next.
-
May 11, 2021: Resnik is a featured speaker at the first virtual Amazon Web Services Machine Learning Summit on June 2.
-
May 4, 2021: A new study has found that, compared with curated playlists and silence, personalized AI soundscapes generated by Alexa Fund company Endel are more effective in helping people focus.
-
April 14, 2021: ADePT model transforms the texts used to train natural-language-understanding models while preserving semantic coherence.
-
April 9, 2021: Matsoukas discusses his focus on automatic speech recognition, natural understanding, and dialogue management, as well as how those research domains are making Alexa more intelligent and useful.
-
April 7, 2021: A technique that lets devices convey information in natural language improves on the state of the art.