- 2024: With the rapid development of large language models (LLMs), aligning LLMs with human values and societal norms to ensure their reliability and safety has become crucial. Reinforcement learning from human feedback (RLHF) and Constitutional AI (CAI) have been proposed for LLM alignment. However, these methods require either heavy human annotations or explicitly pre-defined constitutions, which are labor-intensive …
- 2024: Developing a unified model that can effectively harness heterogeneous resources and respond to a wide range of personalized needs has been a longstanding community aspiration. Our daily choices, especially in domains like fashion and retail, are substantially shaped by multimodal data, such as pictures and textual descriptions. The vision and language modalities not only offer intuitive guidance but also …
- ConEC: Earnings call dataset with real-world contexts for benchmarking contextual speech recognition (2024): Knowing the particular context associated with a conversation can help improve the performance of an automatic speech recognition (ASR) system. For example, if we are provided with a list of in-context words or phrases, such as the speaker's contacts or recent song playlists, during inference, we can bias the recognition process towards this list (see the biasing sketch after this list). There are many works addressing contextual ASR; however …
- 2024: We present BYOKG, a universal question-answering (QA) system that can operate on any knowledge graph (KG), requires no human-annotated training data, and can be ready to use within a day, attributes that are out of scope for current KGQA systems. BYOKG draws inspiration from the remarkable ability of humans to comprehend information present in an unseen KG through exploration (see the random-walk sketch after this list): starting at random nodes, inspecting …
- 2024: Most multimodal large language models (MLLMs) learn language-to-object grounding through causal language modeling, where grounded objects are captured by bounding boxes encoded as sequences of location tokens (see the location-token sketch after this list). This paradigm lacks the pixel-level representations that are important for fine-grained visual understanding and diagnosis. In this work, we introduce GROUNDHOG, an MLLM developed by grounding Large Language …
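As a rough illustration of the list-based biasing idea the ConEC abstract describes, here is a minimal Python sketch that rescores an ASR decoder's n-best hypotheses toward a list of context phrases. The function name, bonus weight, and rescoring scheme are illustrative assumptions, not the paper's method.

```python
# A minimal sketch of list-based contextual biasing for ASR, assuming a
# generic n-best rescoring setup. The bonus weight is a hypothetical value.

def bias_nbest(nbest, context_phrases, bonus=2.0):
    """Rescore (text, log_score) pairs, boosting matches to a context list."""
    rescored = []
    for text, score in nbest:
        lowered = text.lower()
        # Count how many context phrases appear in this hypothesis.
        matches = sum(1 for p in context_phrases if p.lower() in lowered)
        rescored.append((text, score + bonus * matches))
    # Return hypotheses sorted by the biased score, best first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

nbest = [("play songs by the beetles", -4.1),
         ("play songs by the beatles", -4.3)]
print(bias_nbest(nbest, ["The Beatles"]))  # the biased hypothesis now ranks first
```

The same idea is usually applied inside beam search rather than as a post-hoc rerank, but the rescoring view keeps the sketch self-contained.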
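To make the BYOKG entry's exploration idea concrete, the following toy sketch samples relation paths by random walks from random nodes over a small adjacency-list graph. The graph format, walk policy, and all names here are assumptions for illustration, not BYOKG's implementation.

```python
# A toy sketch of exploration-based KG familiarization: start at random
# nodes and follow outgoing edges to collect relation paths.

import random

def explore(kg, num_walks=3, max_hops=2, seed=0):
    """Sample relation paths by random walks over a {head: [(rel, tail)]} graph."""
    rng = random.Random(seed)
    paths = []
    nodes = list(kg)
    for _ in range(num_walks):
        node, path = rng.choice(nodes), []
        for _ in range(max_hops):
            edges = kg.get(node)
            if not edges:  # dead end: node has no outgoing edges
                break
            rel, tail = rng.choice(edges)
            path.append((node, rel, tail))
            node = tail
        if path:
            paths.append(path)
    return paths

toy_kg = {"Ada_Lovelace": [("field", "Mathematics"),
                           ("collaborator", "Charles_Babbage")],
          "Charles_Babbage": [("invented", "Analytical_Engine")]}
for p in explore(toy_kg):
    print(p)
```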
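The GROUNDHOG entry contrasts pixel-level grounding with the common location-token paradigm. As a sketch of that baseline paradigm (the bin count and token format are assumptions, not GROUNDHOG's scheme), the snippet below quantizes normalized box coordinates into discrete tokens of the form <loc_k>.

```python
# A minimal sketch of encoding a bounding box as discrete location tokens,
# the paradigm the abstract contrasts with pixel-level grounding.

def box_to_tokens(box, image_w, image_h, bins=1000):
    """Map an (x0, y0, x1, y1) pixel box to four discrete location tokens."""
    x0, y0, x1, y1 = box
    coords = (x0 / image_w, y0 / image_h, x1 / image_w, y1 / image_h)
    # Clamp each coordinate to [0, 1] and quantize into one of `bins` bins.
    return [f"<loc_{min(bins - 1, int(max(0.0, min(1.0, c)) * bins))}>"
            for c in coords]

print(box_to_tokens((64, 32, 512, 480), image_w=640, image_h=480))
# ['<loc_100>', '<loc_66>', '<loc_800>', '<loc_999>']
```

A causal LM trained on such sequences predicts boxes token by token, which is why the abstract notes the paradigm cannot express pixel-level masks.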
Related content
- October 03, 2023: Team TWIZ from NOVA School of Science and Technology awarded $500,000 prize for first-place overall performance.
- September 20, 2023: Leveraging large language models will make interactions with Alexa more natural and engaging.
- September 12, 2023: GauchoChat wins $250,000 first-place prize in the overall competition; Chirpy Cardinal earns $250,000 for first place in the scientific-innovation category.
- August 28, 2023: AWS service enables machine learning innovation on a robust foundation.
- August 23, 2023: Senior principal scientist Jasha Droppo on the shared architectures of large language models and spectrum quantization text-to-speech models, and other convergences between the two fields.
- August 18, 2023: Speech recognition predominates, but Amazon's research takes in data representation, dialogue management, question answering, and more.