Interspeech
This year's Interspeech will be held in Graz, Austria, whose famed clock tower was built in the mid-1500s
Photo courtesy of Getty Images

The 16 Alexa-related papers at this year’s Interspeech

At next week’s Interspeech, the largest conference on the science and technology of spoken-language processing, Alexa researchers have 16 papers, which span the five core areas of Alexa functionality: device activation, or recognizing speech intended for Alexa and other audio events that require processing; automatic speech recognition (ASR), or converting the speech signal into text; natural-language understanding, or determining the meaning of customer utterances; dialogue management, or handling multiturn conversational exchanges; and text-to-speech, or generating natural-sounding synthetic speech to convey Alexa’s responses. Two of the papers are also more-general explorations of topics in machine learning.

Device Activation

Model Compression on Acoustic Event Detection with Quantized Distillation
Bowen Shi, Ming Sun, Chieh-Chi Kao, Viktor Rozgic, Spyros Matsoukas, Chao Wang

The researchers combine two techniques to shrink neural networks trained to detect sounds by 88%, with no loss in accuracy. One technique, distillation, involves using a large, powerful model to train a leaner, more-efficient one. The other technique, quantization, involves using a fixed number of values to approximate a larger range of values.

Sub-band Convolutional Neural Networks for Small-footprint Spoken Term Classification
Chieh-Chi Kao, Ming Sun, Yixin Gao, Shiv Vitaladevuni, Chao Wang

Convolutional neural nets (CNNs) were originally designed to look for the same patterns in every block of pixels in a digital image. But they can also be applied to acoustic signals, which can be represented as two-dimensional mappings of time against frequency-based “features”. By restricting an audio-processing CNN’s search only to the feature ranges where a particular pattern is likely to occur, the researchers make it much more computationally efficient. This could make audio processing more practical for power-constrained devices.

A Study for Improving Device-Directed Speech Detection toward Frictionless Human-Machine Interaction
Che-Wei Huang, Roland Maas, Sri Harish Mallidi, Björn Hoffmeister

This paper is an update of prior work on detecting device-directed speech, or identifying utterances intended for Alexa. The researchers find that labeling dialogue turns (distinguishing initial utterances from subsequent utterances) and using signal representations based on Fourier transforms rather than mel-frequencies improve accuracy. They also find that, among the features extracted from speech recognizers that the system considers, confusion networks, which represent word probabilities at successive sentence positions, have the most predictive power.

Automatic Speech Recognition (ASR)

Acoustic Model Bootstrapping Using Semi-Supervised Learning
Langzhou Chen, Volker Leutnant

The researchers propose a method for selecting machine-labeled utterances for semi-supervised training of an acoustic model, the component of an ASR system that takes an acoustic signal as input. First, for each training sample, the system uses the existing acoustic model to identify the two most probable word-level interpretations of the signal at each position in the sentence. Then it finds examples in the training data that either support or contradict those probability estimates, which it uses to adjust the uncertainty of the ASR output. Samples that yield significant reductions in uncertainty are preferentially selected for training.

Improving ASR Confidence Scores for Alexa Using Acoustic and Hypothesis Embeddings
Prakhar Swarup, Roland Maas, Sri Garimella, Sri Harish Mallidi, Björn Hoffmeister

Speech recognizers assign probabilities to different interpretations of acoustic signals, and these probabilities can serve as inputs to a machine learning model that assesses the recognizer’s confidence in its classifications. The resulting confidence scores can be useful to other applications, such as systems that select machine-labeled training data for semi-supervised learning. The researchers append embeddings — fixed-length vector representations — of both the raw acoustic input and the speech recognizer’s best estimate of the word sequence to the inputs to a confidence-scoring network. The result: a 6.5% reduction in equal-error rate (the error rate that results when the false-negative and false-positive rates are set as equal).

Multi-Dialect Acoustic Modeling Using Phone Mapping and Online I-Vectors
Harish Arsikere, Ashtosh Sapru, Sri Garimella

Multi-dialect acoustic models, which help convert multi-dialect speech signals to words, are typically neural networks trained on pooled multi-dialect data, with separate output layers for each dialect. The researchers show that mapping the phones — the smallest phonetic units of speech — of each dialect to those of the others offers comparable results with shorter training times and better parameter sharing. They also show that recognition accuracy can be improved by adapting multi-dialect acoustic models, on the fly, to a target speaker.

Neural Machine Translation for Multilingual Grapheme-to-Phoneme Conversion
Alex Sokolov, Tracy Rohlin, Ariya Rastrow

Grapheme-to-phoneme models, which translate written words into their phonetic equivalents (“echo” to “E k oU”), enable speech recognizers to handle words they haven’t seen before. The researchers train a single neural model to handle grapheme-to-phoneme conversion in 18 languages. The results are comparable to those of state-of-the-art single-language models for languages with abundant training data and better for languages with sparse data. Multilingual models are more flexible and easier to maintain in production environments.

Scalable Multi Corpora Neural Language Models for ASR
Anirudh Raju, Denis Filimonov, Gautam Tiwari, Guitang Lan, Ariya Rastrow

Language models, which compute the probability of a given sequence of words, help distinguish between different interpretations of speech signals. Neural language models promise greater accuracy than existing models, but they’re difficult to incorporate into real-time speech recognition systems. The researchers describe several techniques to make neural language models practical, from a technique for weighting training samples from out-of-domain data sets to noise contrastive estimation, which turns the calculation of massive probability distributions into simple binary decisions.

Natural-Language Understanding

Neural Named Entity Recognition from Subword Units
Abdalghani Abujabal, Judith Gaspers

Named-entity recognition is crucial to voice-controlled systems — as when you tell Alexa “Play ‘Spirit’ by Beyoncé”. A neural network that recognizes named entities typically has dedicated input channels for every word in its vocabulary. This has two drawbacks: (1) the network grows extremely large, which makes it slower and more memory intensive, and (2) it has trouble handling unfamiliar words. The researchers trained a named-entity recognizer that instead takes subword units — characters, phonemes, and bytes — as inputs. It offers comparable performance with a vocabulary of only 332 subwords, versus 74,000-odd words.

Dialogue Management

HyST: A Hybrid Approach for Flexible and Accurate Dialogue State Tracking
Rahul Goel, Shachi Paul, Dilek Hakkani-Tür

Dialogue-based computer systems need to track “slots” — types of entities mentioned in conversation, such as movie names — and their values — such as Avengers: Endgame. Training a machine learning system to decide whether to pull candidate slot values from prior conversation or compute a distribution over all possible slot values improves slot-tracking accuracy by 24% over the best-performing previous system.

Towards Universal Dialogue Act Tagging for Task-Oriented Dialogues
Shachi Paul, Rahul Goel, Dilek Hakkani-Tür

Dialogue-based computer systems typically classify utterances by “dialogue act” — such as requesting, informing, and denying — as a way of gauging progress toward a conversational goal. As a first step in developing a system that will automatically label dialogue acts in human-human conversations (to, in turn, train a dialogue-act classifier), the researchers create a “universal tagging scheme” for dialogue acts. They use this scheme to reconcile the disparate tags used in different data sets.

Topical-Chat: Towards Knowledge-Grounded Open-Domain Conversations
Karthik Gopalakrishnan, Behnam Hedayatnia, Qinlang Chen, Anna Gottardi, Sanjeev Kwatra, Anu Venkatesh, Raefer Gabriel, Dilek Hakkani-Tür

The researchers report a new data set, which grew out of the Alexa Prize competition and is intended to advance research on AI agents that engage in social conversations. Pairs of workers recruited through Mechanical Turk were given information on topics that arose frequently during Alexa Prize interactions and asked to converse about them, documenting the sources of their factual assertions. The researchers used the resulting data set to train a knowledge-grounded response generation network, and they report automated and human evaluation results as state-of-the-art baselines.

Text-to-Speech

Towards Achieving Robust Universal Neural Vocoding
Jaime Lorenzo Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote, Alexis Moinet, Vatsal Aggarwal

A vocoder is the component of a speech synthesizer that takes the frequency-spectrum snapshots generated by other components and fills in the information necessary to convert them to audio. The researchers trained a neural-network-based vocoder using data from 74 speakers of both genders in 17 languages. The resulting “universal vocoder” outperformed speaker-specific vocoders, even on speakers and languages it had never encountered before and unusual tasks such as synthesized singing.

Fine-Grained Robust Prosody Transfer for Single-Speaker Neural Text-to-Speech
Viacheslav Klimkov, Srikanth Ronanki, Jonas Rohnke, Thomas Drugman

The researchers present a new technique for transferring prosody (intonation, stress, and rhythm) from a recording to a synthesized voice, enabling the user to choose whose voice will read recorded content, with inflections preserved. Where earlier prosody transfer systems used spectrograms — frequency spectrum snapshots — as inputs, the researchers’ system uses easily normalized prosodic features extracted from the raw audio.

Machine Learning

Two Tiered Distributed Training Algorithm for Acoustic Modeling
Pranav Ladkat, Oleg Rybakov, Radhika Arava, Sree Hari Krishnan Parthasarathi,I-Fan Chen, Nikko Strom

When neural networks are trained on large data sets, the training needs to be distributed, or broken up across multiple processors. A novel combination of two state-of-the-art distributed-learning algorithms — GTC and BMUF — achieves both higher accuracy and more-efficient training then either, when learning is distributed to 128 parallel processors.

BMUF-GTC.gif._CB436386414_.gif
The researchers' new method splits distributed processors into groups, and within each group, the processors use the highly accurate GTC method to synchronize their models. At regular intervals, designated representatives from all the groups use a different method — BMUF — to share their models and update them accordingly. Finally, each representative broadcasts its updated model to the rest of its group.
Animation by Nick Little

One-vs-All Models for Asynchronous Training: An Empirical Analysis
Rahul Gupta, Aman Alok, Shankar Ananthakrishnan

A neural network can be trained to perform multiple classifications at once: it might recognize multiple objects in an image, or assign multiple topic categories to a single news article. An alternative is to train a separate “one-versus-all” (OVA) classifier for each category, which classifies data as either in the category or out of it. The advantage of this approach is that each OVA classifier can be re-trained separately as new data becomes available. The researchers present a new metric that enables comparison of multiclass and OVA strategies, to help data scientists determine which is more useful for a given application.

Research areas

Related content

IN, TS, Hyderabad
Welcome to the Worldwide Returns & ReCommerce team (WWR&R) at Amazon.com. WWR&R is an agile, innovative organization dedicated to ‘making zero happen’ to benefit our customers, our company, and the environment. Our goal is to achieve the three zeroes: zero cost of returns, zero waste, and zero defects. We do this by developing products and driving truly innovative operational excellence to help customers keep what they buy, recover returned and damaged product value, keep thousands of tons of waste from landfills, and create the best customer returns experience in the world. We have an eye to the future – we create long-term value at Amazon by focusing not just on the bottom line, but on the planet. We are building the most sustainable re-use channel we can by driving multiple aspects of the Circular Economy for Amazon – Returns & ReCommerce. Amazon WWR&R is comprised of business, product, operational, program, software engineering and data teams that manage the life of a returned or damaged product from a customer to the warehouse and on to its next best use. Our work is broad and deep: we train machine learning models to automate routing and find signals to optimize re-use; we invent new channels to give products a second life; we develop highly respected product support to help customers love what they buy; we pilot smarter product evaluations; we work from the customer backward to find ways to make the return experience remarkably delightful and easy; and we do it all while scrutinizing our business with laser focus. You will help create everything from customer-facing and vendor-facing websites to the internal software and tools behind the reverse-logistics process. You can develop scalable, high-availability solutions to solve complex and broad business problems. We are a group that has fun at work while driving incredible customer, business, and environmental impact. We are backed by a strong leadership group dedicated to operational excellence that empowers a reasonable work-life balance. As an established, experienced team, we offer the scope and support needed for substantial career growth. Amazon is earth’s most customer-centric company and through WWR&R, the earth is our customer too. Come join us and innovate with the Amazon Worldwide Returns & ReCommerce team!
GB, MLN, Edinburgh
We’re looking for a Machine Learning Scientist in the Personalization team for our Edinburgh office experienced in generative AI and large models. You will be responsible for developing and disseminating customer-facing personalized recommendation models. This is a hands-on role with global impact working with a team of world-class engineers and scientists across the Edinburgh offices and wider organization. You will lead the design of machine learning models that scale to very large quantities of data, and serve high-scale low-latency recommendations to all customers worldwide. You will embody scientific rigor, designing and executing experiments to demonstrate the technical efficacy and business value of your methods. You will work alongside a science team to delight customers by aiding in recommendations relevancy, and raise the profile of Amazon as a global leader in machine learning and personalization. Successful candidates will have strong technical ability, focus on customers by applying a customer-first approach, excellent teamwork and communication skills, and a motivation to achieve results in a fast-paced environment. Our position offers exceptional opportunities for every candidate to grow their technical and non-technical skills. If you are selected, you have the opportunity to make a difference to our business by designing and building state of the art machine learning systems on big data, leveraging Amazon’s vast computing resources (AWS), working on exciting and challenging projects, and delivering meaningful results to customers world-wide. Key job responsibilities Develop machine learning algorithms for high-scale recommendations problems. Rapidly design, prototype and test many possible hypotheses in a high-ambiguity environment, making use of both quantitative analysis and business judgement. Collaborate with software engineers to integrate successful experimental results into large-scale, highly complex Amazon production systems capable of handling 100,000s of transactions per second at low latency. Report results in a manner which is both statistically rigorous and compellingly relevant, exemplifying good scientific practice in a business environment.
US, CA, Palo Alto
Amazon’s Advertising Technology team builds the technology infrastructure and ad serving systems to manage billions of advertising queries every day. The result is better quality advertising for publishers and more relevant ads for customers. In this organization you’ll experience the benefits of working in a dynamic, entrepreneurial environment, while leveraging the resources of Amazon.com (AMZN), one of the world's leading companies. Amazon Publisher Services (APS) helps publishers of all sizes and on all channels better monetize their content through effective advertising. APS unites publishers with advertisers across devices and media channels. We work with Amazon teams across the globe to solve complex problems for our customers. The end results are Amazon products that let publishers focus on what they do best - publishing. The APS Publisher Products Engineering team is responsible for building cloud-based advertising technology services that help Web, Mobile, Streaming TV broadcasters and Audio publishers grow their business. The engineering team focuses on unlocking our ad tech on the most impactful Desktop, mobile and Connected TV devices in the home, bringing real-time capabilities to this medium for the first time. As a successful Data Scientist in our team, · You are an analytical problem solver who enjoys diving into data, is excited about investigations and algorithms, and can credibly interface between technical teams and business stakeholders. You will collaborate directly with product managers, BIEs and our data infra team. · You will analyze large amounts of business data, automate and scale the analysis, and develop metrics (e.g., user recognition, ROAS, Share of Wallet) that will enable us to continually measure the impact of our initiatives and refine the product strategy. · Your analytical abilities, business understanding, and technical aptitude will be used to identify specific and actionable opportunities to solve existing business problems and look around corners for future opportunities. Your expertise in synthesizing and communicating insights and recommendations to audiences of varying levels of technical sophistication will enable you to answer specific business questions and innovate for the future. · You will have direct exposure to senior leadership as we communicate results and provide scientific guidance to the business. Major responsibilities include: · Utilizing code (Apache, Spark, Python, R, Scala, etc.) for analyzing data and building statistical models to solve specific business problems. · Collaborate with product, BIEs, software developers, and business leaders to define product requirements and provide analytical support · Build customer-facing reporting to provide insights and metrics which track system performance · Influence the product strategy directly through your analytical insights · Communicating verbally and in writing to business customers and leadership team with various levels of technical knowledge, educating them about our systems, as well as sharing insights and recommendations
US, WA, Seattle
Amazon Advertising operates at the intersection of eCommerce and advertising, and is investing heavily in building a world-class advertising business. We are defining and delivering a collection of self-service performance advertising products that drive discovery and sales. Our products are strategically important to our Retail and Marketplace businesses driving long-term growth. We deliver billions of ad impressions and millions of clicks daily and are breaking fresh ground to create world-class products to improve both shopper and advertiser experience. With a broad mandate to experiment and innovate, we grow at an unprecedented rate with a seemingly endless range of new opportunities. The Ad Response Prediction team in Sponsored Products organization build advanced deep-learning models, large-scale machine-learning pipelines, and real-time serving infra to match shoppers’ intent to relevant ads on all devices, for all contexts and in all marketplaces. Through precise estimation of shoppers’ interaction with ads and their long-term value, we aim to drive optimal ads allocation and pricing, and help to deliver a relevant, engaging and delightful ads experience to Amazon shoppers. As the business and the complexity of various new initiatives we take continues to grow, we are looking for talented Applied Scientists to join the team. Key job responsibilities As a Applied Scientist II, you will: * Conduct hands-on data analysis, build large-scale machine-learning models and pipelines * Work closely with software engineers on detailed requirements, technical designs and implementation of end-to-end solutions in production * Run regular A/B experiments, gather data, perform statistical analysis, and communicate the impact to senior management * Establish scalable, efficient, automated processes for large-scale data analysis, machine-learning model development, model validation and serving * Provide technical leadership, research new machine learning approaches to drive continued scientific innovation * Be a member of the Amazon-wide Machine Learning Community, participating in internal and external MeetUps, Hackathons and Conferences
US, WA, Seattle
Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and programming from Prime Video add-on subscriptions such as Apple TV+, Max, Crunchyroll and MGM+. All customers, regardless of whether they have a Prime membership or not, can rent or buy titles via the Prime Video Store, and can enjoy even more content for free with ads. Are you interested in shaping the future of entertainment? Prime Video's technology teams are creating best-in-class digital video experience. As a Prime Video technologist, you’ll have end-to-end ownership of the product, user experience, design, and technology required to deliver state-of-the-art experiences for our customers. You’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people. We’ll look for you to bring your diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. With global opportunities for talented technologists, you can decide where a career Prime Video Tech takes you! In Prime Video READI, our mission is to automate infrastructure scaling and operational readiness. We are growing a team specialized in time series modeling, forecasting, and release safety. This team will invent and develop algorithms for forecasting multi-dimensional related time series. The team will develop forecasts on key business dimensions with optimization recommendations related to performance and efficiency opportunities across our global software environment. As a founding member of the core team, you will apply your deep coding, modeling and statistical knowledge to concrete problems that have broad cross-organizational, global, and technology impact. Your work will focus on retrieving, cleansing and preparing large scale datasets, training and evaluating models and deploying them to production where we continuously monitor and evaluate. You will work on large engineering efforts that solve significantly complex problems facing global customers. You will be trusted to operate with complete independence and are often assigned to focus on areas where the business and/or architectural strategy has not yet been defined. You must be equally comfortable digging in to business requirements as you are drilling into design with development teams and developing production ready learning models. You consistently bring strong, data-driven business and technical judgment to decisions. You will work with internal and external stakeholders, cross-functional partners, and end-users around the world at all levels. Our team makes a big impact because nothing is more important to us than delivering for our customers, continually earning their trust, and thinking long term. You are empowered to bring new technologies to your solutions. If you crave a sense of ownership, this is the place to be.
US, WA, Bellevue
mmPROS Surface Research Science seeks an exceptional Applied Scientist with expertise in optimization and machine learning to optimize Amazon's middle mile transportation network, the backbone of its logistics operations. Amazon's middle mile transportation network utilizes a fleet of semi-trucks, trains, and airplanes to transport millions of packages and other freight between warehouses, vendor facilities, and customers, on time and at low cost. The Surface Research Science team delivers innovation, models, algorithms, and other scientific solutions to efficiently plan and operate the middle mile surface (truck and rail) transportation network. The team focuses on large-scale problems in vehicle route planning, capacity procurement, network design, forecasting, and equipment re-balancing. Your role will be to build innovative optimization and machine learning models to improve driver routing and procurement efficiency. Your models will impact business decisions worth billions of dollars and improve the delivery experience for millions of customers. You will operate as part of a team of innovative, experienced scientists working on optimization and machine learning. You will work in close collaboration with partners across product, engineering, business intelligence, and operations. Key job responsibilities - Design and develop optimization and machine learning models to inform our hardest planning decisions. - Implement models and algorithms in Amazon's production software. - Lead and partner with product, engineering, and operations teams to drive modeling and technical design for complex business problems. - Lead complex modeling and data analyses to aid management in making key business decisions and set new policies. - Write documentation for scientific and business audiences. About the team This role is part of mmPROS Surface Research Science. Our mission is to build the most efficient and optimal transportation network on the planet, using our science and technology as our biggest advantage. We leverage technologies in optimization, operations research, and machine learning to grow our businesses and solve Amazon's unique logistical challenges. Scientists in the team work in close collaboration with each other and with partners across product, software engineering, business intelligence, and operations. They regularly interact with software engineering teams and business leadership.
US, WA, Seattle
Come be a part of a rapidly expanding $35 billion dollar global business. At Amazon Business, a fast-growing startup passionate about building solutions, we set out every day to innovate and disrupt the status quo. We stand at the intersection of tech & retail in the B2B space developing innovative purchasing and procurement solutions to help businesses and organizations thrive. At Amazon Business, we strive to be the most recognized and preferred strategic partner for smart business buying. Bring your insight, imagination and a healthy disregard for the impossible. Join us in building and celebrating the value of Amazon Business to buyers and sellers of all sizes and industries. Unlock your career potential. We are seeking an Applied Scientist who has a solid background in applied Machine Learning and Data Science, deep passion for building data-driven products, ability to formulate data insights and scientific vision, and has a proven track record of executing complex projects and delivering business impact. Key job responsibilities • Data driven insights to accelerate acquisition of new members. • Develop and implement personalized marketing strategies and campaigns tailored to individual customer preferences, behaviors, and demographics to enhance engagement and drive customer loyalty. • Develop, implement, and optimize marketing attribution models to accurately measure the impact of various marketing channels and campaigns, and create valuation frameworks to assess the ROI and contribution of each channel to overall business objectives. • Work with a group of both applied scientists and software engineers to deliver machine-learning and data science solutions to production. • Advance team's engineering craftsmanship and drive continued scientific innovation as a thought leader and practitioner. • Mentor talented members, provide technical and career development guidance to both scientists and engineers in the organization. About the team The Marketing Science team applies scientific methods and research techniques to enhance our understanding of AB consumer behavior, market trends, and the effectiveness of marketing strategies. Our goal is to develop and advance theories and models that can be used to make informed decisions in marketing and to provide insights into consumer decision-making processes. Additionally, we seek to identify and explore emerging trends and technologies in marketing, and to develop innovative approaches for addressing the challenges and opportunities in the field.
US, WA, Seattle
Amazon’s eCommerce Foundation (eCF) organization provides the core technologies that drive and power Amazon's Stores, Digital, and Other (SDO) businesses. Millions of customer page views and orders per day are enabled by the systems eCF builds from the ground up. CloudTune, within eCF, empowers growth and business agility needs by automatically and efficiently managing AWS capacity and business processes needed to safely meet Amazon’s customer demand. CloudTune serves its primary customers, internal software teams, through forecast driven automation of cost controllership, capacity management and scaling. We predict expected load, and drive procurement and allocation of AWS capacity for new product launches and high velocity events like Prime Day and Cyber Monday. CloudTune, in partnership with Region Flexibility, is driving an SDO-wide program to diversify our use of AWS regions beyond DUB, IAD, and PDX regions. The objective of the Diversify AWS Region Usage (DARU) program is to mitigate the risk of capacity concentration by encouraging teams to design workloads that are region-flexible, utilize AWS automation such as Flexible Fleets to access multiple capacity pools, and optimize workload placement so SDO efficiently utilizes AWS. This is a strategic, highly visible, multi-year program which spans all Amazon business. CloudTune is looking for a Data Scientist to join our forecasting team and support DARU program. The team develops sophisticated algorithms that involve learning from large amounts of past data, such as actual sales, website traffic, merchandising activities, promotions, similar products and product attributes to forecast the demand for our compute infrastructure. These forecasts are used to determine the level of investment in capital expenditures, promotional activity, engineering efficiency projects and determining financial performance. As a Data Scientist CloudTune, you will work with other scientists, software engineers, data engineers, and product managers on a variety of important applied machine learning problems in the area of time series modeling. You will be an expert at communicating insights and recommendations to audiences of varying levels of technical sophistication. You will lead the design, implementation, and delivery of data science solutions for complex capacity planning problems. Key job responsibilities - Research and develop new methodologies for capacity demand forecasting. - Translate analytic insights into concrete, actionable recommendations for business or product improvement. Develop and present these as papers to senior stakeholders. - Given anecdotes about anomalies or generate automatic scripts to define anomalies, deep dive to explain why they happen, and identify fixes. - Drive scalable solutions for multi-year capacity demand forecasting horizons. - Play an integral role in developing a roadmap to expand and enhance demand forecasting for cloud compute resources. - Create and track accuracy and performance metrics (both technical and business metrics). - Create, enhance, and maintain technical documentation, and present to other scientists, engineers and business leaders.
US, CA, Sunnyvale
The Artificial General Intelligence (AGI) team is looking for a passionate, talented, and innovative applied scientist with a strong background in deep learning and speech processing techniques. You will have an enormous opportunity to impact the customer experience, design, architecture, and implementation of an industry-leading product used every day by people you know. As an Applied Scientist, you will leverage Amazon’s heterogeneous data sources and large-scale computing resources to develop novel machine learning algorithms to advance the state of the art in speech and audio processing. Key job responsibilities · Conduct applied research project(s) effectively and know when to ask for help and when to work independently · Engage with an experienced cross-disciplinary staff to conceive and design innovative solutions for consumer products. · Actively participate and contribute to research activities including publications and patents · Work closely with an internal inter-disciplinary team, and outside partners to drive key aspects of product definition, execution and test. · Be proactive, flexible and able to succeed within an open collaborative peer environment
US, CA, Santa Clara
AWS AI/ML is looking for world class scientists and engineers to work on foundation models, large-scale representation learning, and distributed learning methods and systems. At AWS AI/ML you will invent, implement, and deploy state of the art machine learning algorithms and systems. You will build prototypes and innovate on new representation learning solutions. You will interact closely with our customers and with the academic and research communities. You will be at the heart of a growing and exciting focus area for AWS and work with other acclaimed engineers and world famous scientists. Large-scale foundation models have been the powerhouse in many of the recent advancements in computer vision, natural language processing, automatic speech recognition, recommendation systems, and time series modeling. Developing such models requires not only skillful modeling in individual modalities, but also understanding of how to synergistically combine them, and how to scale the modeling methods to learn with huge models and on large datasets. Join us to work as an integral part of a team that has diverse experiences in this space. We actively work on these areas: Hardware-informed efficient model architecture, training objective and curriculum design Distributed training, accelerated optimization methods Continual learning, multi-task/meta learning Reasoning, interactive learning, reinforcement learning Robustness, privacy, model watermarking Model compression, distillation, pruning, sparsification, quantization A day in the life Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful start-ups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon conferences, inspire us to never stop embracing our uniqueness. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.