-
SLT 2018. This article presents a whisper speech detector in the far-field domain. The proposed system consists of a long short-term memory (LSTM) neural network trained on log-filterbank energy (LFBE) acoustic features. This model is trained and evaluated on recordings of human interactions with voice-controlled, far-field devices in whisper and normal phonation modes. We compare multiple inference approaches for
-
ICSC 2018. We demonstrate the potential of aligned bilingual word embeddings for building an unsupervised method that evaluates machine translations without needing a parallel translation corpus or reference corpus. We explain why movie subtitles differ from other text and share the results of our experiments on them for four target languages (French, German, Portuguese and Spanish) with English-source subtitles
-
ICDM 2018. Machine learning and natural language processing (NLP) have aided the development of new and improved user experience features in many applications. We address the problem of automatically identifying the “Start Reading Location” (SRL) of eBooks, i.e., the location of the logical beginning or start of the main content. This improves the eBook reading experience by taking users automatically to the logical start
-
EUSIPCO 2018. This paper proposes an efficient real-time multirate fast transient-sound detection algorithm based on an emerging microphone array configuration intended for multimedia signal processing systems such as the digital smart home. The proposed detection algorithm first extracts the dynamics and periodicity features, then trains the model parameters of these features on the Amazon machine learning platform
-
Interspeech 2018. This paper proposes a Region-based Convolutional Recurrent Neural Network (R-CRNN) for audio event detection (AED). The proposed network is inspired by Faster-RCNN [1], a well-known region-based convolutional network framework for visual object detection. Unlike in the original Faster-RCNN, a recurrent layer is added on top of the convolutional network to capture the long-term temporal context from
Related content
-
August 21, 2020: Watch the recording of Marcu's live interview with Alexa evangelist Jeff Blankenburg.
-
August 20, 2020: The team’s non-real-time system is the top performer, while its real-time system finishes third overall and second among real-time systems, despite using only 4% of a CPU core.
-
August 18, 2020: New approach scales manageably while achieving state-of-the-art results.
-
August 10, 2020: Detecting comic product-related questions could improve customer engagement and Amazon recommendations.
-
August 04, 2020: A judge and some of the finalists from the Alexa Prize Grand Challenge 3 talk about the competition, the role of COVID-19, and the future of socialbots.
-
August 04, 2020: Team awarded $500,000 prize for the performance of its Emora socialbot.