Amazon senior applied scientist Jack FitzGerald, delivering a keynote talk at the joint Language Intelligence @ Work and SEMANTiCS conference in Vienna, Austria.

Scaling multilingual virtual assistants to 1,000 languages

Self-supervised training, distributed training, and knowledge distillation have delivered remarkable results, but they’re just the tip of the iceberg.

Yesterday at the joint Language Intelligence @ Work and SEMANTiCS conference in Vienna, Austria, Amazon senior applied scientist Jack FitzGerald delivered a keynote talk on multilingual virtual assistants and the path toward a massively multilingual future. This is an edited version of his talk.

The evolution of human-computer interaction paradigms

In the past 50 years, computing technology has progressed from text-based terminal inputs, to graphical user interfaces, to predominantly web-based applications, through the mobile era, and finally into the era of voice user interfaces and ambient computing.

A brief history of computing interfaces.

Each of these paradigms has its own challenges with respect to multilingualism, whether it was the migration from ASCII to Unicode or proper character rendering on a website. However, I would argue that a voice AI system is the most difficult paradigm yet with respect to massive multilingualism.

The first reason is that the input space for voice interface commands is unbounded: the user can phrase each command in hundreds of different ways, all of which are valid. Another reason is that even within a single language, there can be many different dialects and accents.

Most important, the coupling between language and culture is inescapable. Whether it’s the level of formality used, preferred activities, or religious differences, there isn’t a one-size-fits-all solution. Instead, we must adapt the virtual assistant to understand cultural context and say only things that are appropriate for a given locale.

Voice AI systems today

A typical voice AI system includes automatic-speech-recognition models, which convert raw audio into text; natural-language understanding models, which determine the user’s intent and recognize named entities; a central service for arbitration and dialogue management, which routes commands to the proper services or skills; and finally, a text-to-speech model, which issues the output. Additional tasks might include expansion of the underlying knowledge graph and semantic parsing, localization of touch screen content, or local information services.

An overview of Alexa’s design.

Let’s look at some of the operational considerations for supporting multiple languages in such models. One is the training data: they must be topically exhaustive, meaning that they cover the full spectrum of possible user utterances, and they must be culturally exhaustive — for instance, covering all of the holidays a user might celebrate. They must also remain up-to-date, and it’s not always easy to add something new to the model without regression on existing functionalities.

A second consideration is in-house testing. Though in many cases one can get away with synthetic or otherwise artificial data for model training, for testing it’s important to have realistic utterances. Those typically need to come from humans, and collecting them can be a major expense. It’s also useful to perform live, interactive testing, which requires people who can speak and understand each language that the system supports.

Finally, it’s important to have the ability to support users and process their feedback. In most cases, this again requires staff who understand each of the supported languages.

Ultimately, human-based processes are not very scalable if our goal is to support thousands of languages. Instead, we must turn to technology to the greatest extent possible.

Multilingual modeling today

One of the leading reasons for the current success of multilingual text models is self-supervision.

In traditional supervised learning, a model would be trained from scratch on the desired task. If we wanted a model that would classify the sentiment of a product review, for example, we would manually annotate a bunch of product reviews, and we would use that dataset to train the model.

Today, however, we make use of transfer learning, in which text models are pretrained on terabytes of text data that don’t require manual annotation. Instead, the training procedure leverages the structure inherent to the text itself.

Self-supervised-training objectives.

We’ll call this self-supervised pretraining. With the masked-language-modeling training objective, for instance, the model is fed the input “for [MASK] out loud!”, and it must predict that “[MASK]” should be filled with the word “crying”. Other objectives, such as causal language modeling, span filling, deshuffling, and denoising, can also be used.
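To make the masked-language-modeling objective concrete, here is a minimal sketch of how a training example could be built from raw text. The function names, the whitespace tokenization, and the 15% masking rate are illustrative assumptions, not details from the talk.

```python
import random

MASK = "[MASK]"

def make_mlm_example(tokens, positions):
    """Replace the tokens at `positions` with [MASK].

    Returns (inputs, labels), where labels hold the original token at
    masked positions and None everywhere else.
    """
    inputs = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else None for i, tok in enumerate(tokens)]
    return inputs, labels

def sample_positions(num_tokens, mask_prob=0.15, seed=0):
    """Pick positions to mask at random (the masking rate is an assumption)."""
    rng = random.Random(seed)
    return {i for i in range(num_tokens) if rng.random() < mask_prob}

# Whitespace tokenization is a simplification; real models use subword tokenizers.
tokens = "for crying out loud!".split()
inputs, labels = make_mlm_example(tokens, positions={1})
print(" ".join(inputs))  # prints: for [MASK] out loud!
```

No annotation is needed here: the label for the masked position is the word that was removed, which is why the objective scales to any unlabeled text.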

Because the datasets required for self-supervised pretraining are unlabeled and monolingual, we can leverage troves of data, such as Common Crawl web scrapes, every Wikipedia page in existence, thousands of books and news articles, and more. Couple these large datasets with highly parallelizable architectures such as transformers, which can be trained on over a thousand GPUs with near linear scaling, and we can build models with tens or hundreds of billions of dense parameters. Such has been the focus for many people in the field for the past few years, including the Alexa Teacher Model team.

One incredible consequence of the transfer learning paradigm is called zero-shot learning. In the context of multilingual modeling, it works like this: the modeler begins by pretraining the model on some set of languages, using self-supervision. As an example, suppose that the modeler trains a model on English, French, and Japanese using every Wikipedia article in those three languages.

The next step is to adapt the model to a particular task using labeled data. Suppose that the modeler has a labeled dataset for intent classification, but only in English. The modeler can go ahead and fine-tune the model on the English data, then run it on the remaining languages.

Despite the fact that the model was never trained to do intent classification with French or Japanese data, it can still classify intents in those languages, by leveraging what it learned about those languages during pretraining. Given that the acquisition of labeled data is often a bottleneck, this property of language models is highly valuable for language expansion. Of course, zero-shot learning is just the extreme end of a continuum: transfer learning helps even out performance when the labeled data in different languages is imbalanced.

Zero-shot learning for multilingual adaptation.
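The mechanism can be sketched with a toy example. Assume a pretrained multilingual encoder that places sentences with the same meaning close together regardless of language. The two-dimensional vectors, the utterances, and the intent names below are all hypothetical, and a nearest-centroid classifier stands in for fine-tuning.

```python
import math

# Hypothetical sentence embeddings from a multilingual encoder (toy 2-D values).
EMBED = {
    "play some jazz": (0.9, 0.1),
    "turn up the music": (0.8, 0.2),
    "set an alarm for six": (0.1, 0.9),
    "wake me up at seven": (0.2, 0.8),
    "joue du jazz": (0.85, 0.15),            # French, never seen with a label
    "六時にアラームをセットして": (0.15, 0.85),  # Japanese, never seen with a label
}

# Labeled data exists in English only.
ENGLISH_TRAIN = [
    ("play some jazz", "play_music"),
    ("turn up the music", "play_music"),
    ("set an alarm for six", "set_alarm"),
    ("wake me up at seven", "set_alarm"),
]

def fit_centroids(examples):
    """Average the embeddings of each intent's training utterances."""
    sums, counts = {}, {}
    for text, intent in examples:
        x, y = EMBED[text]
        sx, sy = sums.get(intent, (0.0, 0.0))
        sums[intent] = (sx + x, sy + y)
        counts[intent] = counts.get(intent, 0) + 1
    return {k: (sx / counts[k], sy / counts[k]) for k, (sx, sy) in sums.items()}

def classify(text, centroids):
    """Assign the intent whose centroid is nearest in embedding space."""
    vec = EMBED[text]
    return min(centroids, key=lambda intent: math.dist(vec, centroids[intent]))

centroids = fit_centroids(ENGLISH_TRAIN)
print(classify("joue du jazz", centroids))             # play_music
print(classify("六時にアラームをセットして", centroids))  # set_alarm
```

The classifier only ever sees English labels. The French and Japanese predictions are correct purely because the assumed encoder maps those utterances near their English counterparts.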

The next step up the data efficiency ladder is performing tasks without any additional training or fine-tuning, using only a couple of labeled records or none at all. This is possible through “in-context learning,” which was popularized in the GPT-3 paper.

To perform in-context learning, simply take a pretrained model and feed it the appropriate prompts. Think of a prompt as a hint to the model about the task it should perform. Suppose that we want the model to summarize a passage. We might prefix the passage with the word “Passage” and a colon and follow it with the word “Summary” and a colon. The model would then generate a summary of the passage.

This is the zero-shot in-context learning case, meaning that no fine-tuning is performed, and no labeled data are needed. To improve task performance, we can feed a few examples to the model before asking it to perform the task. Though this does require some labeled data, the amount is small, usually in the tens of examples only.
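As a small sketch, the zero-shot and few-shot prompts for the summarization example could be assembled as follows. The function names and the newline separators are assumptions; the talk specifies only the “Passage:” and “Summary:” markers.

```python
def zero_shot_prompt(passage):
    """Prefix the passage with 'Passage:' and end with 'Summary:' for the model to complete."""
    return f"Passage: {passage}\nSummary:"

def few_shot_prompt(examples, passage):
    """Prepend a few solved (passage, summary) pairs before the new passage."""
    blocks = [f"Passage: {p}\nSummary: {s}" for p, s in examples]
    blocks.append(zero_shot_prompt(passage))
    return "\n\n".join(blocks)

demo = few_shot_prompt(
    examples=[("The cat sat on the mat all day.", "A cat rested.")],
    passage="The team trained a multilingual model on many languages.",
)
print(demo)
```

In both cases the model weights stay fixed. The only thing that changes between the zero-shot and few-shot settings is the text placed in front of the final “Summary:” marker.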

Our Alexa Teacher Model team recently trained and tested a 20-billion-parameter sequence-to-sequence model that was multilingual and showed nice performance for in-context learning. For example, we showed state-of-the-art performance on machine translation with in-context learning. The model can achieve competitive BLEU scores even for some low-resource languages, which is incredible given that no parallel data was used during pretraining, and no labeled data besides a single example was used at any step in the process.

We were particularly proud of the relatively small size of this model, which could compete with much larger models because it was trained on more data. (The Chinchilla model from DeepMind showed a similar result.) Though a large model trained on a smaller dataset and a smaller model trained on a larger dataset may use the same total compute at training time, the smaller model will require less compute and memory during inference, which is a key factor in real applications.
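A back-of-the-envelope calculation illustrates the tradeoff. The sketch below assumes the common rule of thumb that training a dense transformer costs roughly 6 FLOPs per parameter per token and that inference costs roughly 2 FLOPs per parameter per generated token. The two model configurations are hypothetical and chosen only so that their training budgets match.

```python
def training_flops(params, tokens):
    """Rule-of-thumb estimate: about 6 FLOPs per parameter per training token."""
    return 6 * params * tokens

def inference_flops_per_token(params):
    """Rule-of-thumb estimate: about 2 FLOPs per parameter per generated token."""
    return 2 * params

# Hypothetical configurations with identical training budgets.
small = {"params": 20 * 10**9, "tokens": 1000 * 10**9}   # smaller model, more data
large = {"params": 100 * 10**9, "tokens": 200 * 10**9}   # larger model, less data

for name, cfg in (("small", small), ("large", large)):
    print(
        name,
        f"train={training_flops(cfg['params'], cfg['tokens']):.2e}",
        f"infer/token={inference_flops_per_token(cfg['params']):.2e}",
    )
```

Under these assumptions both models cost the same to train, but every token the larger model generates costs five times as much, and its weights take five times as much memory.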

Given that models demonstrate multilingual understanding even without labeled data or parallel data, you may be wondering what’s happening inside of the model. Since the days of word2vec and earlier, we’ve represented characters, words, sentences, documents, and other inputs as vectors of floats, also known as embeddings, hidden states, and representations. Concepts cluster in certain areas of the representational space.

As humans, we can think only in three dimensions, whereas these representations are high-dimensional, but you can visualize this clustering in two or three dimensions as a reductive approximation. All the languages the model supports would cluster the concept of sitting in a chair in one region of the representational space; the concept of the ocean would inhabit a different cluster; and so forth.

Indeed, Pires et al. have shown that synonymous words across languages cluster together in the mBERT model. When examining 5,000 sentence pairs from the WMT16 dataset, they found that, given a sentence and its embedding in one language, the correct translation from another language is the closest embedding to the source embedding up to 75% of the time.
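This kind of measurement can be sketched as a nearest-neighbor retrieval test. The code below is a toy illustration with made-up three-dimensional vectors. The use of cosine similarity and the function names are assumptions, not a reproduction of the procedure in the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieval_accuracy(source_vecs, target_vecs):
    """Fraction of source sentences whose nearest target embedding is the true translation.

    source_vecs[i] and target_vecs[i] are assumed to embed the same sentence
    in two different languages.
    """
    hits = 0
    for i, src in enumerate(source_vecs):
        best = max(range(len(target_vecs)), key=lambda j: cosine(src, target_vecs[j]))
        hits += best == i
    return hits / len(source_vecs)

# Toy embeddings: the fourth pair is deliberately confusable with the third.
source = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
target = [(0.9, 0.1, 0), (0.1, 0.9, 0), (0.7, 0.7, 0.1), (0.6, 0.8, 0)]
print(retrieval_accuracy(source, target))
```

A high score on such a test indicates that the model has aligned the languages in its representational space even though it was never given translation pairs.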

This manner of clustering can also be manipulated by changing the objective function. In their work on speech-to-text modeling, Adams et al., from Johns Hopkins, were seeing undesirable clustering by language, rather than by phonemes, in the representational space. They were able to correct this by adding training objectives around phoneme prediction and language identification.

The Alexa Teacher Model distillation pipeline

Once we have multilingual models, how do we adapt them to a real system? At the recent KDD conference, we presented a paper describing the Alexa Teacher Model pipeline, consisting of the following steps.

First, a multilingual model with billions of parameters is trained on up to a trillion tokens taken from Common Crawl web scrapes, Wikipedia articles, and more. Second, the model is further trained on in-domain, unlabeled data from a real system. Third, the model is distilled into smaller sizes that can be used in production. The final models can then be fine-tuned using labeled data and deployed.

The Alexa Teacher Model (AlexaTM) pipeline. The Alexa Teacher Model is trained on a large set of GPUs (left), then distilled into smaller variants (center), whose size depends on their uses. The end user adapts a distilled model to its particular use by fine-tuning it on in-domain data (right).
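The distillation step can be illustrated with a generic soft-target loss, in which the student is trained to match the teacher's output distribution. The talk does not specify the objective used in the pipeline, so the formulation below (KL divergence over temperature-softened distributions) and the temperature value are assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # Terms with zero teacher probability contribute nothing to the sum.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher = [4.0, 1.0, 0.5]   # hypothetical teacher logits for one input
close = [3.5, 1.2, 0.4]     # a student that roughly agrees
far = [0.5, 4.0, 1.0]       # a student that disagrees
print(distillation_loss(teacher, close) < distillation_loss(teacher, far))  # True
```

Minimizing a loss of this kind pushes the small model toward the large model's full output distribution rather than only its top prediction, which is one way the extra capacity of the teacher can be passed on.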

In tests, we found that our model was more accurate than a publicly available pretrained model fine-tuned on labeled data, and it significantly reduced customer dissatisfaction relative to a model trained by a smaller teacher model (85 million parameters, say, instead of billions). In short, we’ve verified that we can leverage the additional learning capacity of large, multilingual models for production systems requiring low latency and low memory consumption.

Scaling to 1,000 languages

I mentioned the fascinating ability of language models to learn joint representations of multiple languages without labeled or parallel data. This ability is crucial for us to scale to many languages. However, as we scale, we need test data that we can trust so that we can evaluate our progress.

Toward this end, my team at Amazon recently released a new benchmark for multilingual natural-language understanding called MASSIVE, which is composed of one million labeled records spanning 51 languages, 18 domains, 60 intents, and 55 slots. All of the data were created by native speakers of the languages. We also released a GitHub repository with code that can be used as a baseline for creating multilingual NLU models, as well as leaderboards on eval.ai.
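To give a feel for what a labeled NLU record involves, here is a sketch that splits a slot-annotated utterance into plain text and slot values. It assumes a bracketed annotation of the form `[slot : value]`; the example string, the function name, and the exact format are illustrative assumptions, so check the dataset documentation before relying on them.

```python
import re

# Matches "[slot_name : slot value]" and captures the name and the value.
SLOT_PATTERN = re.compile(r"\[\s*([^\]:]+?)\s*:\s*([^\]]+?)\s*\]")

def parse_annotated(annot_utt):
    """Return (plain_text, [(slot_name, slot_value), ...]) for an annotated utterance."""
    slots = [(m.group(1), m.group(2)) for m in SLOT_PATTERN.finditer(annot_utt)]
    plain = SLOT_PATTERN.sub(lambda m: m.group(2), annot_utt)
    return plain, slots

plain, slots = parse_annotated("wake me up at [time : nine am] on [date : friday]")
print(plain)  # wake me up at nine am on friday
print(slots)  # [('time', 'nine am'), ('date', 'friday')]
```

An intent label (for example, setting an alarm) would accompany each such utterance, and the same intent and slot inventory applies across every language in the benchmark.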

Now, you may retort that 51 languages is still a long way from 1,000 languages. This is true, but we purposefully chose our languages in order to maximize typological diversity while staying within our budget. Our languages span 29 language genera, 14 language families, and 21 distinct scripts or alphabets. The diversity of the chosen languages allows a modeler to test technology that should scale to many more languages within each represented genus, family, and script.

That said, we certainly have some major gaps in language coverage, including across native North and South American languages, African languages, and Australian languages. Yet we are optimistic that our fellow researchers across the field will continue to produce new labeled benchmark datasets for the world’s thousands of low-resource languages.

The 51 languages of MASSIVE, including scripts and genera.

Another difficulty with our current modeling approaches is that they rely on data sources such as web scrapes, encyclopedic articles, and news articles, which are highly skewed toward a small set of languages. Wang, Ruder, and Neubig recently presented some fascinating work leveraging bilingual lexicons — corpora consisting of word-level translations — to improve language model performance for low-resource languages. Lexicons cover a far greater portion of the world’s languages than our typical data sources for language modeling, making this an exciting approach.

Researchers, missionaries, and businesspeople have been creating fundamental linguistic resources for decades, from Bible translations to the Unimorph corpus. The Unimorph datasets are used for the SIGMORPHON shared task, in which a model must predict the correct form of a word given that word’s root and certain morphological transformations, such as part of speech, tense, and person. We must find more ways to leverage such resources when creating massively multilingual voice AI systems.

As a final technique for scaling to many more languages, we can consider what we in Alexa call “self-learning.” Some of my Alexa colleagues published a paper showing that we can mine past utterances to improve overall system performance. For example, if a user rephrases a request as part of a multiturn interaction, as shown on the left in the figure below, or if different users provide variations for the same desired goal, as shown on the right, then we can make soft assumptions that the different formulations are synonymous.

All of these cases can be statistically aggregated to form new training sets to update the system, without the need to manually annotate utterances. In a multilingual system, such technology is particularly valuable after the initial launch of a language, both to improve performance generally and to adapt to changes in the lexicon.

Alexa’s self-learning mechanism.
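The statistical aggregation can be sketched as simple counting over mined rephrase pairs. This is a toy illustration, not the method from the paper: the function name, the thresholds, and the example utterances are all hypothetical.

```python
from collections import Counter

def mine_rewrites(rephrase_pairs, min_count=3, min_share=0.6):
    """Aggregate (original, rephrase) pairs into high-confidence rewrites.

    A rewrite is kept only if the pair was observed at least `min_count`
    times and accounts for at least `min_share` of all rephrases of that
    original utterance.
    """
    pairs = list(rephrase_pairs)
    pair_counts = Counter(pairs)
    totals = Counter(original for original, _ in pairs)
    rewrites = {}
    for (original, rephrase), count in pair_counts.most_common():
        share = count / totals[original]
        if original not in rewrites and count >= min_count and share >= min_share:
            rewrites[original] = rephrase
    return rewrites

# Hypothetical mined interactions: a failed request followed by a successful rephrase.
observed = (
    [("play maj and dragons", "play imagine dragons")] * 4
    + [("play maj and dragons", "play magic dragon")]
    + [("turn of lights", "turn off the lights")] * 2
)
print(mine_rewrites(observed))  # {'play maj and dragons': 'play imagine dragons'}
```

Requiring both a minimum count and a minimum share is one way to keep the assumptions soft: a rephrase seen only a couple of times, or one that competes with other rephrases, is not promoted to training data.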

The road ahead

I hope that you share my wonder at the current state of the art — the scale of language-model training, the magic of zero-shot learning, and the distillation of knowledge into compact models that can run in latency-sensitive systems. All of this is incredible, but we’ve only scratched the surface of supporting the world’s 7,000 languages.

To move into the next era of massive multilingualism, we must build new and increasingly powerful models that can take advantage of low-cost data, particularly unlabeled monolingual data. We must also build models that can leverage existing and upcoming linguistic resources, such as bilingual lexicons and morphological-transformation databases. And finally, we must expand available language resources across more languages and domains, including more unlabeled monolingual corpora, more parallel resources, and more realistic, labeled, task-specific datasets.

Increased multilingualism is a win for all people everywhere. Each language provides a unique perspective on the world in which we live. A rich plurality of perspectives leads to a deeper understanding of our fellow people and of all creation.

Keep building.

Research areas

Related content

US, CA, San Francisco
We are seeking a highly motivated PhD Research Scientist Intern to join our robotics teams at Amazon. This internship offers a unique opportunity to work on cutting-edge robotics projects that directly impact millions of customers worldwide. You will collaborate with world-class experts, tackle groundbreaking research problems, and contribute to the development of innovative solutions that shape the future of robotics and artificial intelligence. As a Research Scientist intern, you will be challenged to apply theory into practice through experimentation and invention, develop new algorithms using modeling software and programming techniques for complex problems, implement prototypes, and work with massive datasets. You'll find yourself at the forefront of innovation, working with large language models, multi-modal models, and modern reinforcement learning techniques, especially as applied to real-world robots. Imagine waking up each morning, fueled by the excitement of solving intricate puzzles that have a direct impact on Amazon's operational excellence. Your day might begin by collaborating with cross-functional teams, exchanging ideas and insights to develop innovative solutions in robotics and AI. You'll then immerse yourself in a world of data and algorithms, leveraging your expertise in large language models and multi-modal systems to uncover hidden patterns and drive operational efficiencies. Throughout your journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated. 
Amazon has positions available for Research Scientist Internships in, but not limited to, Bellevue, WA; Boston, MA; Cambridge, MA; New York, NY; Santa Clara, CA; Seattle, WA; Sunnyvale, CA, and San Francisco, CA. We are particularly interested in candidates with expertise in: Robotics, Computer Vision, Artificial Intelligence, Causal Inference, Time Series, Large Language Models, Multi-Modal Models, and Reinforcement Learning. In this role, you gain hands-on experience in applying cutting-edge analytical and AI techniques to tackle complex business challenges at scale. If you are passionate about using data-driven insights and advanced AI models to drive operational excellence in robotics, we encourage you to apply. The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail, and have the ability to thrive in a fast-paced, ever-changing environment. A day in the life Work alongside global experts to develop and implement novel scalable algorithms in robotics, incorporating large language models and multi-modal systems. Develop modeling techniques that advance the state-of-the-art in areas of robotics, particularly focusing on modern reinforcement learning for real-world robotic applications. Anticipate technological advances and work with leading-edge technology in AI and robotics. Collaborate with Amazon scientists and cross-functional teams to develop and deploy cutting-edge robotics solutions into production, leveraging the latest in language models and multi-modal AI. Contribute to technical white papers, create technical roadmaps, and drive production-level projects that support Amazon Science in the intersection of robotics and advanced AI. 
Embrace ambiguity, maintain strong attention to detail, and thrive in a fast-paced, ever-changing environment at the forefront of AI and robotics research.
US, MA, Westborough
Are you inspired by invention? Is problem solving through teamwork in your DNA? Do you like the idea of seeing how your work impacts the bigger picture? Answer yes to any of these and you’ll fit right in here at Amazon Robotics. We are a smart team of doers that work passionately to apply cutting edge advances in robotics and software to solve real-world challenges that will transform our customers’ experiences in ways we can’t even imagine yet. We invent new improvements every day. We are Amazon Robotics and we will give you the tools and support you need to invent with us in ways that are rewarding, fulfilling and fun. Amazon Robotics is seeking Research Science Interns and Co-ops with a passion for robotic research to work on cutting edge algorithms for robotics. Our team works on challenging and high-impact projects within robotics. Examples of projects include allocating resources to complete a million orders a day, coordinating the motion of thousands of robots, autonomous navigation in warehouses, identifying objects and damage, and learning how to grasp all the products Amazon sells. As an Research Science Intern/Co-op at Amazon Robotics, you will be working on one or more of our robotic technologies such as autonomous mobile robots, robot manipulators, and computer vision identification technologies. The intern/co-op project(s) and the internship/co-op location are determined by the team the student will be working on. Please note that by applying to this role you would be considered for Research Scientist summer intern, spring co-op, and fall co-op roles on various Amazon Robotics teams. These teams work on robotics research within areas such as computer vision, machine learning, robotic manipulation, navigation, path planning, perception, optimization and more. Learn more about Amazon Robotics: https://amazon.jobs/en/teams/amazon-robotics
US, NY, New York
Amazon is looking for an Applied Scientist to help build the next generation of sourcing and vendor experience systems. The Optimal Sourcing Systems (OSS) owns the optimization of inventory sourcing and the orchestration of inbound flows from vendors worldwide. We source inventory from thousands of vendors for millions of products globally while orchestrating the inbound flow for billions of units. Our goals are to increase reliable access to supply, improve supply chain-driven vendor experience, and reduce end-to-end supply chain costs, all in service of maximizing Long-Term Free Cash Flow (LTFCF) for Amazon. As an Applied Scientist, you will work with software engineers, product managers, and business teams to understand the business problems and requirements, distill that understanding to crisply define the problem, and design and develop innovative solutions to address them. Our team is highly cross-functional and employs a wide array of scientific tools and techniques to solve key challenges, including optimization, causal inference, and machine learning/deep learning. Some critical research areas in our space include modeling buying decisions under high uncertainty, vendors' behavior and incentives, supply risk and enhancing visibility and reliability of inbound signals. Key job responsibilities You will be a science tech leader for the team. As a Applied Scientist you will: - Set the scientific strategic vision for the team. You - - lead the decomposition of problems and development of roadmaps to execute on it. - Set an example for other scientists with exemplary scientific analyses; maintainable, extensible, and well-tested code; and simple, intuitive, and effective solutions. - Influence team business and engineering strategies. - Exercise sound judgment to prioritize between short-term vs. long-term and business vs. technology needs. - Communicate clearly and effectively with stakeholders to drive alignment and build consensus on key initiatives. 
- Foster collaborations between scientists across Amazon researching similar or related problems. - Actively engage in the development of others, both within and outside the team. - Engage with the broader scientific community through presentations, publications, and patents.
US, CA, San Francisco
If you are interested in this position, please apply on Twitch's Career site https://www.twitch.tv/jobs/en/ About Us: Twitch is the world’s biggest live streaming service, with global communities built around gaming, entertainment, music, sports, cooking, and more. It is where thousands of communities come together for whatever, every day. We’re about community, inside and out. You’ll find coworkers who are eager to team up, collaborate, and smash (or elegantly solve) problems together. We’re on a quest to empower live communities, so if this sounds good to you, see what we’re up to on LinkedIn and X, and discover the projects we’re solving on our Blog. Be sure to explore our Interviewing Guide to learn how to ace our interview process. About the Role Data is central to Twitch's decision-making process, and data scientists are a critical component to evangelize data-driven decision making in all of our operations. As a data scientist at Twitch, you will be on the ground floor with your team, shaping the way product performance is measured, defining what questions should be asked, and scaling analytics methods and tools to support our growing business, leading the way for high quality, high velocity decisions for your team. For this role, we're looking for an experienced product data scientist who will help develop the strategy and evaluate/improve product initiatives within our Creator product team. You will be responsible to define and track KPIs, design experiments, evaluate A/B tests, implement data instrumentation, and inform on investment. Our ideal candidate is a "full-stack" data powerhouse who uses data to drive decision making to make the best products for our creators and their communities. Your input will be core to decision making across all major product strategies and initiatives that our team builds. 
You will work closely with product managers, technical program managers, engineering, data scientists, and organization leadership within and outside of the Creator organization. You Will - Inform product strategies by defining and updating core metrics for each initiative - Establish analytical framework for your team: ad-hoc analysis, automated dashboards, and self-service reporting tools to surface key data to stakeholders - Evaluate and forecast impact of product features on creators, viewers, and the entire Twitch ecosystem - Design A/B experiments to drive product direction with iterative innovation and measurement - Drive the team's analysis roadmap and prioritize the most valuable projects - Tackle complex and ambiguous analytic projects, resolve ambiguity and accurately identify the trade-offs between speed and quality and apply or route work as necessary - Dive deep into the data to understand how creator and viewer behaviors change with the evolution of our product - Act as our team's thought leader on best practices and move towards long-term vision of sustainable and thriving data processes - Own data collection and product instrumentation implementation and quality assurance - Work hand-in-hand with business, product, engineering, and design to proactively influence and inform teammates' decisions throughout the product life cycle - Distill ambiguous product or business questions, find clever ways to answer them, and to quantify the uncertainty Perks - Medical, Dental, Vision & Disability Insurance - 401(k) - Maternity & Parental Leave - Flexible PTO - Amazon Employee Discount About the team Twitch is all about community, and our Community Team is a core pillar of what makes Twitch, Twitch. Teams within Community are responsible for a myriad of product areas impacting the creator, viewer, and moderator journeys on our platform. 
As a member of our team, you'll build solutions that improve g the experience of millions of daily active users on our platform and create tools that keep both streamers and viewers engaged and connected on our platform.
US, NY, New York