-
2024The sequential process of conceptualization and instantiation is essential to generalizable commonsense reasoning as it allows the application of existing knowledge to unfamiliar scenarios. However, existing works tend to undervalue the step of instantiation and heavily rely on pre-built concept taxonomies and human annotations to collect both types of knowledge, resulting in a lack of instantiated knowledge
-
2024Abstraction ability is crucial in human intelligence, which can also benefit various tasks in NLP study. Existing work shows that LLMs are deficient in abstract ability, and how to improve it remains unexplored. In this work, we design the framework AbsInstruct to enhance LLMs’ abstraction ability through instruction tuning. The framework builds instructions with in-depth explanations to assist LLMs in
-
KDD 2024 Workshop on GenAI Evaluation2024Utilizing Large Language Models (LLM) as chatbots in diverse business scenarios often presents the challenge of maintaining topic continuity. Abrupt shifts in topics can lead to poor user experiences and inefficient utilization of computational resources. In this paper, we present a topic continuity model aimed at assessing whether a response aligns with the initial conversation topic. Our model is built
-
AI-ML Systems 20242024While we can customize large language models (LLMs) on specific domains by finetuning using the domain specific labeled data, performance of the customized models is highly dependent on the quality of the labeled data. Obtaining high-quality labeled data for custom domains often requires considerable human effort and associated costs. However, in many cases, unlabeled data is readily available at little
-
ACM SIGSPATIAL 20242024Determining the precise location of customers is important for an efficient and reliable delivery experience, both for customers and delivery associates. Address text is a primary source of information provided by customers about their location. In this paper, we study the important and challenging task of matching free-form customer address text to determine if two addresses represent the same physical
Related content
-
April 18, 2019Last year, Amazon announced the beta release of Alexa Guard, a new service that lets customers who are leaving the house instruct their Echo devices to listen for glass breaking or smoke and carbon dioxide alarms going off. At this year’s International Conference on Acoustics, Speech, and Signal Processing, our team is presenting several papers on sound detection. I wrote about one of them a few weeks ago, a new method for doing machine learning with unbalanced data sets.
-
April 11, 2019Multiband dynamics processing, which separately modifies volume in different frequency bands of an audio signal, is known to improve listeners’ audio experiences. But in the context of voice-controlled systems like the Amazon Echo family of products, it can also improve automatic speech recognition by making echo cancellation easier.
-
April 8, 2019Transfer learning is the technique of adapting a machine learning model trained on abundant data to a new context in which training data is sparse. On the Alexa team, we’ve explored transfer learning as a way to bootstrap new functions and to add new classification categories to existing machine learning systems.
-
April 4, 2019Customer interactions with Alexa are constantly growing more complex, and on the Alexa science team, we strive to stay ahead of the curve by continuously improving Alexa’s speech recognition system. Increasingly, keeping pace with Alexa’s expanding capabilities will require automating the learning process, through techniques such as semi-supervised learning, which leverages a small amount of annotated data to extract information from a much larger store of unannotated data.
-
April 1, 2019The idea of using arrays of microphones to improve automatic speech recognition (ASR) is decades old. The acoustic signal generated by a sound source reaches multiple microphones with different time delays. This information can be used to create virtual directivity, emphasizing a sound arriving from a direction of interest and diminishing signals coming from other directions. In voice recognition, one of the more popular methods for doing this is known as “beamforming”.
-
Animation by Nick LittleMarch 28, 2019Audio watermarking is the process of adding a distinctive sound pattern — undetectable to the human ear — to an audio signal to make it identifiable to a computer. It’s one of the ways that video sites recognize copyrighted recordings that have been posted illegally. To identify a watermark, a computer usually converts a digital file into an audio signal, which it processes internally.