-
ACM MMSports 2023: Sports highlights are an important form of media for fans worldwide, as they provide short videos that capture key moments from games, often accompanied by the original commentaries of the game’s announcers. However, traditional forms of presenting sports highlights have limitations in conveying the complexity and nuance of the game. In recent years, the use of Large Language Models (LLMs) for natural language
-
NeurIPS 2023: We introduce Alexa Arena, a user-centric simulation platform to facilitate research in building assistive conversational embodied agents. Alexa Arena features multi-room layouts and an abundance of interactable objects. With user-friendly graphics and control mechanisms, the platform supports the development of gamified robotic tasks readily accessible to general human users, allowing high-efficiency data
-
EMNLP 2023: The growing popularity of conversational AI agents such as Alexa, Google Assistant, and Siri relies on accurate spoken-language comprehension. The query reformulation (QR) method, which reformulates defective user queries, has been broadly adopted to mitigate the challenges posed by understanding the user’s intent from an imperfect spoken recognition result. However, due to the scarcity of non-English
-
EMNLP 2023: Contextual query rewriting (CQR) is a crucial component in conversational AI agents, leveraging the contextual information from previous user-agent conversations to improve the comprehension of current user intent. However, traditional CQR methods often concentrate on supervised fine-tuning only, neglecting the opportunities to learn from user feedback to align with user preferences. Inspired by recent
-
EMNLP 2023: Embodied task completion is a challenge where an agent in a simulated environment must predict environment actions to complete tasks based on natural language instructions and egocentric visual observations. We propose a variant of this problem where the agent predicts actions at a higher level of abstraction, called a plan, which more directly tests language understanding and reasoning. We show that multimodal
Related content
-
July 12, 2023: Data augmentation, novel loss functions, and weakly supervised training enable a state-of-the-art model for recognizing mispronunciations.
-
July 10, 2023: Familiar topics such as question answering and natural-language understanding remain well represented, but a new concentration on language modeling and multimodal models reflects the spread of generative AI.
-
July 09, 2023: Finding that 70% of attention heads and 20% of feed-forward networks can be excised with minimal effect on in-context learning suggests that large language models are undertrained.
-
July 07, 2023: Amazon’s Yang Liu, general chair of this year’s meeting of the Association for Computational Linguistics, on the road ahead for LLMs.
-
July 06, 2023: The program exposes students to computer science as they create their own Alexa skills.
-
July 05, 2023: Amazon Research Award recipient Shrikanth Narayanan is on a mission to make inclusive human-AI conversational experiences.