Customer-obsessed science
Research areas
-
January 14, 2025
Key exchange protocols and authentication mechanisms solve distinct problems and must be integrated in a secure communication system.
-
December 24, 2024
-
December 24, 2024
Featured news
-
ECIR 2025
Traditional Query Auto-completion (QAC) systems optimise for query relevance based on past user interactions. This approach excels at surfacing frequently searched queries, but ensuring a diverse range of suggestions and incorporating new products or trends often requires post-processing heuristics. This limitation stems from relying on user search logs, which may not fully capture the evolving product
-
ICASSP 2025
Audio-Visual Speech-to-Speech Translation (AVS2S) typically prioritizes improving translation quality and naturalness. However, an equally critical aspect in audio-visual content is lip-synchrony, that is, ensuring that the movements of the lips match the spoken content, which is essential for maintaining realism in dubbed videos. Despite its importance, the inclusion of lip-synchrony constraints in AVS2S models has been largely
-
ICASSP 2025
We propose a lightweight neural front-end framework for on-device speech generation and highlight its benefits towards low-resource language scaling. While data-driven models have shown potential in front-end literature, especially since they can enable fast language expansion, they are often extremely large and of high latency. There is limited work focusing on their usability in real-time settings, and
-
ICASSP 2025
Self-supervised pretraining has transformed speech representation learning, enabling models to generalize across various downstream tasks. However, empirical studies have highlighted two notable gaps. First, different speech tasks require varying levels of acoustic and semantic information, which are encoded at different layers within the model. This adds the extra complexity of layer selection on downstream
-
ICASSP 2025
Speaker Diarization (SD) is a crucial component of modern end-to-end ASR pipelines. Traditional SD systems, which are typically audio-based and operate independently of ASR, often introduce speaker errors, particularly during speaker transitions and overlapping speech. Recently, language models including fine-tuned large language models (LLMs) have been shown to be effective as a second-pass speaker error corrector
Academia
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.