Text normalization with only 3% as much training data

Proteno model dramatically increases the efficiency of the first step in text-to-speech conversion.

With services like Alexa, which use synthesized speech for output, text normalization (TN) is usually the first step in text-to-speech conversion. TN takes raw text as input (say, the string 6-21-21) and expands it into a verbalized form that a text-to-speech model can use to produce the final speech: “twenty first of June twenty twenty one”.

Historically, TN algorithms relied on hard-coded rules, which didn’t generalize across languages and were hard to maintain: a typical rule-based TN system for a single language might have thousands of rules, which evolve over time and whose development requires linguistic expertise.

Text normalization converts the output of computational processes — such as the natural-language-understanding models that handle Alexa customers' requests — into a form that will make sense when read out as synthesized speech.
Credit: Glynis Condon

More recently, academic and industry researchers have begun developing machine-learning-based TN models. But these have drawbacks, too. 

Sequence-to-sequence models occasionally make unacceptable errors, such as converting “$5” to “five pounds”. Semiotic-classification models require domain-specific information classes created by linguistic experts (classes such as emoticon or telephone number), which limits their generalizability. And both types of models require large amounts of training data, which makes it difficult to scale them across languages.

At this year’s meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), my colleagues and I are presenting a new text normalization model, called Proteno, that addresses these challenges.

We evaluated Proteno on three languages: English, Spanish, and Tamil. There’s a large body of research on TN in English, but no TN datasets existed for Spanish and Tamil. Consequently, we created our own datasets, which we have publicly released for use by other TN researchers.

Proteno specifies only a few, low-level normalization classes — such as ordinal number, cardinal number, or Roman numeral — which generalize well across languages. From the data, Proteno then learns a huge variety of additional, fine-grained classes. 

In our experiments on English, for instance, we used eight predefined classes, and Proteno automatically generated another 2,658. By contrast, semiotic-classification models typically have about 20 classes.

Proteno also uses a simple but effective scheme for tokenization, or splitting texts up into smaller chunks. Prior tokenization techniques required linguistic knowledge or data-heavy training; Proteno’s tokenization technique, by contrast, simply breaks text up at spaces and at transitions between Unicode classes, such as letter, numeral, or punctuation mark. Consequently, it can generalize across languages, it enables the majority of normalizations to be learned from the data, and it reduces the incidence of unacceptable errors.
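The tokenization scheme described above can be sketched in a few lines of Python. This is an illustrative sketch, not Proteno's actual implementation: the function names are hypothetical, and it assumes that a "Unicode class" can be approximated by the first letter of the Unicode general category (L for letters, N for numbers, P for punctuation, and so on), with combining marks folded into letters so that words in scripts such as Tamil are not broken apart.

```python
import unicodedata


def unicode_class(ch: str) -> str:
    """Coarse Unicode class of a character (assumption: first letter of
    the general category, e.g. 'Nd' -> 'N', 'Pd' -> 'P')."""
    major = unicodedata.category(ch)[0]
    # Assumption: treat combining marks (M) as part of letters (L).
    return "L" if major == "M" else major


def tokenize(text: str) -> list[str]:
    """Split at whitespace, then at every change of coarse Unicode class."""
    tokens = []
    for chunk in text.split():
        current = chunk[0]
        for prev, ch in zip(chunk, chunk[1:]):
            if unicode_class(ch) != unicode_class(prev):
                tokens.append(current)  # class changed: close the token
                current = ch
            else:
                current += ch
        tokens.append(current)
    return tokens


print(tokenize("6-21-21"))         # ['6', '-', '21', '-', '21']
print(tokenize("Pay $5 by 12pm"))  # ['Pay', '$', '5', 'by', '12', 'pm']
```

Because the splitter relies only on whitespace and character categories, it needs no language-specific rules; deciding what the resulting fine-grained tokens mean is left to the learned classifier.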

Together, these techniques also allow Proteno to make do with much less training data than previous machine learning approaches. In our experiments, Proteno offered performance comparable to that of the previous state of the art in English while requiring only 3% as much training data.

Because there were no prior TN models trained on Spanish and Tamil, we had no benchmarks for our experiments. But on comparable amounts of training data, the Proteno models trained on Tamil and Spanish achieved accuracies comparable to that of the one trained on English (99.1% for Spanish, 96.7% for Tamil, and 97.4% for English).

Methods

Proteno treats TN as a sequence classification problem, where most of the classes are learned. The figure below illustrates Proteno’s training and run-time processing pipelines, which have slightly different orders.

[Figure: Proteno’s training and run-time processing pipelines]

The training pipeline consists of the following steps:

  • Tokenization: Previous tokenization methods relied on language-specific rules devised by linguists. For instance, the string 6-21-21 would be treated as a single token of the type date. We propose a granular tokenization mechanism that is language independent and applicable to any space-separated language. The text to be normalized is first split at its spaces and then further split wherever there is a change in the Unicode class. The string 6-21-21 thus becomes five tokens, and we count on Proteno to learn how to handle them properly.
  • Annotation: The tokenized, unnormalized text is annotated token by token, which gives us a one-to-one mapping between each unnormalized token and its ground-truth normalization. This data will be used to train the model.
  • Class generation: Each token is then mapped to a class. A class may receive tokens only of a particular type; so, for instance, the class corresponding to dollars will not accept the type pounds, and vice versa. This prevents the model from making unacceptable errors. Each class also has an associated normalization function.

    There are two kinds of classes:

    1. Predefined: We define a limited number of classes (around 8-10) containing basic normalization rules. A small subset of these (3-5) contain language-specific rules, such as how to distinguish cardinal and ordinal uses of a number. Other classes — such as self, digit, and Roman numerals — remain similar across many languages.
    2. Auto-generated (AG): The model also generates classes automatically by analyzing the unnormalized-to-normalized token mappings in the dataset. If no existing class (predefined or AG) can generate the target normalization for a token in the training data, a new class is automatically generated. For instance, if the dataset includes the annotation “12→December”, and if none of the existing classes can generate this normalization, then the class “12_to_December_AG” is created. This class accepts only “12”, and its normalization function returns “December”. AGs enable Proteno to learn the majority of normalizations automatically from data.
  • Classification: We model TN as a sequence-tagging problem, where the input is a sequence of unnormalized tokens and the output is the sequence of classes that can generate the normalized text. We experimented with four different types of classifiers: conditional random fields (CRFs), bi-directional long-short-term-memory models (bi-LSTMs), bi-LSTM-CRF combinations, and Transformers.
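The class-generation step above can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not Proteno's code: the NormClass structure, the two toy predefined classes ("self" and "digit"), and all function names are hypothetical, and the sequence classifier itself (CRF, bi-LSTM, or Transformer) is left out. Each class has an acceptance test and a normalization function, and a new auto-generated class is created only when no existing class can produce the annotated target.

```python
from dataclasses import dataclass
from typing import Callable, Optional

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


@dataclass
class NormClass:
    name: str
    accepts: Callable[[str], bool]    # which tokens the class may receive
    normalize: Callable[[str], str]   # the class's normalization function

    def try_normalize(self, token: str) -> Optional[str]:
        return self.normalize(token) if self.accepts(token) else None


def predefined_classes() -> list[NormClass]:
    """Two toy predefined classes (hypothetical stand-ins)."""
    return [
        NormClass("self", lambda t: True, lambda t: t),
        NormClass("digit",
                  lambda t: t.isascii() and t.isdigit(),
                  lambda t: " ".join(DIGIT_WORDS[int(d)] for d in t)),
    ]


def make_auto_class(token: str, target: str) -> NormClass:
    """AG class that accepts exactly one token and returns one target."""
    return NormClass(f"{token}_to_{target}_AG",
                     lambda t, token=token: t == token,
                     lambda t, target=target: target)


def generate_classes(annotations, classes):
    """Label each (token, target) pair, creating AG classes as needed."""
    classes = list(classes)
    labels = []
    for token, target in annotations:
        match = next((c for c in classes
                      if c.try_normalize(token) == target), None)
        if match is None:  # no existing class yields the target
            match = make_auto_class(token, target)
            classes.append(match)
        labels.append(match.name)
    return classes, labels


def apply_classes(tokens, predicted, classes):
    """Run time: apply each predicted class's normalization function."""
    by_name = {c.name: c for c in classes}
    return [by_name[name].normalize(tok)
            for tok, name in zip(tokens, predicted)]


annotations = [("12", "December"), ("the", "the"),
               ("7", "seven"), ("12", "December")]
classes, labels = generate_classes(annotations, predefined_classes())
print(labels)
# ['12_to_December_AG', 'self', 'digit', '12_to_December_AG']
```

The labels produced this way are what a sequence tagger would be trained to predict. At run time, the predicted class names are mapped back to normalization functions, as in apply_classes, so the output for a token can only be something its class is able to generate.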

Datasets

As the goal of Proteno is to be applicable to multiple languages, we evaluated it across three languages: English, Spanish, and Tamil. English had significantly more auto-generated classes than Tamil or Spanish, as written English tends to use many more abbreviations than the other two languages.

Language   Total predefined classes   Language-specific predefined classes   Auto-generated classes
Spanish    10                         5                                      279
Tamil      8                          3                                      74
English    8                          4                                      2,658

Proteno’s performance on 11 classes found in existing datasets, compared to the performance of two state-of-the-art models trained on 32 times as much data.

To benchmark Proteno’s performance in English, we could compare it to earlier models on only 11 of the 13 predefined classes found in existing datasets; differences in tokenization schemes meant that there were no logical mappings for the other two classes.

These results indicate that Proteno is a strong candidate for doing TN with low data annotation requirements while curbing unacceptable errors, which would make it a robust and scalable solution for production text-to-speech models.

Research areas

Related content

US, CA, San Francisco
We are seeking a highly motivated PhD Research Scientist Intern to join our robotics teams at Amazon. This internship offers a unique opportunity to work on cutting-edge robotics projects that directly impact millions of customers worldwide. You will collaborate with world-class experts, tackle groundbreaking research problems, and contribute to the development of innovative solutions that shape the future of robotics and artificial intelligence. As a Research Scientist intern, you will be challenged to apply theory into practice through experimentation and invention, develop new algorithms using modeling software and programming techniques for complex problems, implement prototypes, and work with massive datasets. You'll find yourself at the forefront of innovation, working with large language models, multi-modal models, and modern reinforcement learning techniques, especially as applied to real-world robots. Imagine waking up each morning, fueled by the excitement of solving intricate puzzles that have a direct impact on Amazon's operational excellence. Your day might begin by collaborating with cross-functional teams, exchanging ideas and insights to develop innovative solutions in robotics and AI. You'll then immerse yourself in a world of data and algorithms, leveraging your expertise in large language models and multi-modal systems to uncover hidden patterns and drive operational efficiencies. Throughout your journey, you'll have access to unparalleled resources, including state-of-the-art computing infrastructure, cutting-edge research papers, and mentorship from industry luminaries. This immersive experience will not only sharpen your technical skills but also cultivate your ability to think critically, communicate effectively, and thrive in a fast-paced, innovative environment where bold ideas are celebrated. 
Amazon has positions available for Research Scientist Internships in, but not limited to, Bellevue, WA; Boston, MA; Cambridge, MA; New York, NY; Santa Clara, CA; Seattle, WA; Sunnyvale, CA, and San Francisco, CA. We are particularly interested in candidates with expertise in: Robotics, Computer Vision, Artificial Intelligence, Causal Inference, Time Series, Large Language Models, Multi-Modal Models, and Reinforcement Learning. In this role, you gain hands-on experience in applying cutting-edge analytical and AI techniques to tackle complex business challenges at scale. If you are passionate about using data-driven insights and advanced AI models to drive operational excellence in robotics, we encourage you to apply. The ideal candidate should possess the ability to work collaboratively with diverse groups and cross-functional teams to solve complex business problems. A successful candidate will be a self-starter, comfortable with ambiguity, with strong attention to detail, and have the ability to thrive in a fast-paced, ever-changing environment. A day in the life Work alongside global experts to develop and implement novel scalable algorithms in robotics, incorporating large language models and multi-modal systems. Develop modeling techniques that advance the state-of-the-art in areas of robotics, particularly focusing on modern reinforcement learning for real-world robotic applications. Anticipate technological advances and work with leading-edge technology in AI and robotics. Collaborate with Amazon scientists and cross-functional teams to develop and deploy cutting-edge robotics solutions into production, leveraging the latest in language models and multi-modal AI. Contribute to technical white papers, create technical roadmaps, and drive production-level projects that support Amazon Science in the intersection of robotics and advanced AI. 
Embrace ambiguity, maintain strong attention to detail, and thrive in a fast-paced, ever-changing environment at the forefront of AI and robotics research.
US, MA, Westborough
Are you inspired by invention? Is problem solving through teamwork in your DNA? Do you like the idea of seeing how your work impacts the bigger picture? Answer yes to any of these and you’ll fit right in here at Amazon Robotics. We are a smart team of doers that work passionately to apply cutting edge advances in robotics and software to solve real-world challenges that will transform our customers’ experiences in ways we can’t even imagine yet. We invent new improvements every day. We are Amazon Robotics and we will give you the tools and support you need to invent with us in ways that are rewarding, fulfilling and fun. Amazon Robotics is seeking Research Science Interns and Co-ops with a passion for robotic research to work on cutting edge algorithms for robotics. Our team works on challenging and high-impact projects within robotics. Examples of projects include allocating resources to complete a million orders a day, coordinating the motion of thousands of robots, autonomous navigation in warehouses, identifying objects and damage, and learning how to grasp all the products Amazon sells. As an Research Science Intern/Co-op at Amazon Robotics, you will be working on one or more of our robotic technologies such as autonomous mobile robots, robot manipulators, and computer vision identification technologies. The intern/co-op project(s) and the internship/co-op location are determined by the team the student will be working on. Please note that by applying to this role you would be considered for Research Scientist summer intern, spring co-op, and fall co-op roles on various Amazon Robotics teams. These teams work on robotics research within areas such as computer vision, machine learning, robotic manipulation, navigation, path planning, perception, optimization and more. Learn more about Amazon Robotics: https://amazon.jobs/en/teams/amazon-robotics
US, NY, New York
Amazon is looking for an Applied Scientist to help build the next generation of sourcing and vendor experience systems. The Optimal Sourcing Systems (OSS) owns the optimization of inventory sourcing and the orchestration of inbound flows from vendors worldwide. We source inventory from thousands of vendors for millions of products globally while orchestrating the inbound flow for billions of units. Our goals are to increase reliable access to supply, improve supply chain-driven vendor experience, and reduce end-to-end supply chain costs, all in service of maximizing Long-Term Free Cash Flow (LTFCF) for Amazon. As an Applied Scientist, you will work with software engineers, product managers, and business teams to understand the business problems and requirements, distill that understanding to crisply define the problem, and design and develop innovative solutions to address them. Our team is highly cross-functional and employs a wide array of scientific tools and techniques to solve key challenges, including optimization, causal inference, and machine learning/deep learning. Some critical research areas in our space include modeling buying decisions under high uncertainty, vendors' behavior and incentives, supply risk and enhancing visibility and reliability of inbound signals. Key job responsibilities You will be a science tech leader for the team. As a Applied Scientist you will: - Set the scientific strategic vision for the team. You - - lead the decomposition of problems and development of roadmaps to execute on it. - Set an example for other scientists with exemplary scientific analyses; maintainable, extensible, and well-tested code; and simple, intuitive, and effective solutions. - Influence team business and engineering strategies. - Exercise sound judgment to prioritize between short-term vs. long-term and business vs. technology needs. - Communicate clearly and effectively with stakeholders to drive alignment and build consensus on key initiatives. 
- Foster collaborations between scientists across Amazon researching similar or related problems. - Actively engage in the development of others, both within and outside the team. - Engage with the broader scientific community through presentations, publications, and patents.
US, CA, San Francisco
If you are interested in this position, please apply on Twitch's Career site https://www.twitch.tv/jobs/en/ About Us: Twitch is the world’s biggest live streaming service, with global communities built around gaming, entertainment, music, sports, cooking, and more. It is where thousands of communities come together for whatever, every day. We’re about community, inside and out. You’ll find coworkers who are eager to team up, collaborate, and smash (or elegantly solve) problems together. We’re on a quest to empower live communities, so if this sounds good to you, see what we’re up to on LinkedIn and X, and discover the projects we’re solving on our Blog. Be sure to explore our Interviewing Guide to learn how to ace our interview process. About the Role Data is central to Twitch's decision-making process, and data scientists are a critical component to evangelize data-driven decision making in all of our operations. As a data scientist at Twitch, you will be on the ground floor with your team, shaping the way product performance is measured, defining what questions should be asked, and scaling analytics methods and tools to support our growing business, leading the way for high quality, high velocity decisions for your team. For this role, we're looking for an experienced product data scientist who will help develop the strategy and evaluate/improve product initiatives within our Creator product team. You will be responsible to define and track KPIs, design experiments, evaluate A/B tests, implement data instrumentation, and inform on investment. Our ideal candidate is a "full-stack" data powerhouse who uses data to drive decision making to make the best products for our creators and their communities. Your input will be core to decision making across all major product strategies and initiatives that our team builds. 
You will work closely with product managers, technical program managers, engineering, data scientists, and organization leadership within and outside of the Creator organization. You Will - Inform product strategies by defining and updating core metrics for each initiative - Establish analytical framework for your team: ad-hoc analysis, automated dashboards, and self-service reporting tools to surface key data to stakeholders - Evaluate and forecast impact of product features on creators, viewers, and the entire Twitch ecosystem - Design A/B experiments to drive product direction with iterative innovation and measurement - Drive the team's analysis roadmap and prioritize the most valuable projects - Tackle complex and ambiguous analytic projects, resolve ambiguity and accurately identify the trade-offs between speed and quality and apply or route work as necessary - Dive deep into the data to understand how creator and viewer behaviors change with the evolution of our product - Act as our team's thought leader on best practices and move towards long-term vision of sustainable and thriving data processes - Own data collection and product instrumentation implementation and quality assurance - Work hand-in-hand with business, product, engineering, and design to proactively influence and inform teammates' decisions throughout the product life cycle - Distill ambiguous product or business questions, find clever ways to answer them, and to quantify the uncertainty Perks - Medical, Dental, Vision & Disability Insurance - 401(k) - Maternity & Parental Leave - Flexible PTO - Amazon Employee Discount About the team Twitch is all about community, and our Community Team is a core pillar of what makes Twitch, Twitch. Teams within Community are responsible for a myriad of product areas impacting the creator, viewer, and moderator journeys on our platform. 
As a member of our team, you'll build solutions that improve g the experience of millions of daily active users on our platform and create tools that keep both streamers and viewers engaged and connected on our platform.
US, NY, New York
The Think Forward Lab team at Deep Science for Systems & Services (DS3), AWS AI/ML is looking for world class scientists and engineers to join its group working on deployment of autonomous agents. Agents with full autonomy need to be trustworthy and verifiable. The team develops AI systems that exhibit autonomous proficiency across a wide range of domains, demonstrating competency in many (complex) tasks previously performed by human knowledge workers. Such agents sense, plan, and act effectively in interactive and previously unseen environments. To accomplish this goal we are seeking scientists with expertise in large language models, user alignment, neuro-symbolic AI, synthetic data generation and agentic environments. This is a role that combines science knowledge, technical strength, and product focus. It will be your job to develop novel generative AI-based agentic systems and algorithms while working with the engineering team to integrate them into different projects in the AWS AI portfolio of services. You will be at the heart of a growing and exciting focus area for AWS and work with other acclaimed engineers and world famous scientists. Key job responsibilities You will be a hands on contributor to science at Amazon. You will help raise the scientific bar by mentoring, educating, and publishing in your field. You will help build the scientific roadmap for agents, neuro-symbolic AI and LLMs. You will be a technical leader in your domain. You will be a strong mentor and lead for your team. About the team The DS3 org encompasses scientists who work closely with different AWS AI/ML product services, innovating on the behalf of our customers customers. About AWS Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. 
If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (Iot), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Why AWS Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Utility Computing (UC) AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. 
Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Mentorship and Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Diverse Experiences Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
US, NY, New York
The Think Forward Lab team at Deep Science for Systems & Services (DS3), AWS AI/ML is looking for world class scientists and engineers to join its group working on deployment of autonomous agents. Agents with full autonomy need to be trustworthy and verifiable. The team develops AI systems that exhibit autonomous proficiency across a wide range of domains, demonstrating competency in many (complex) tasks previously performed by human knowledge workers. Such agents sense, plan, and act effectively in interactive and previously unseen environments. To accomplish this goal we are seeking scientists with expertise in large language models, user alignment, neuro-symbolic AI, synthetic data generation and agentic environments. This is a role that combines science knowledge, technical strength, and product focus. It will be your job to develop novel generative AI-based agentic systems and algorithms while working with the engineering team to integrate them into different projects in the AWS AI portfolio of services. You will be at the heart of a growing and exciting focus area for AWS and work with other acclaimed engineers and world famous scientists. Key job responsibilities You will be a hands on contributor to science at Amazon. You will help raise the scientific bar by mentoring, educating, and publishing in your field. You will help build the scientific roadmap for agents, neuro-symbolic AI and LLMs. You will be a technical leader in your domain. You will be a strong mentor and lead for your team. About the team The DS3 org encompasses scientists who work closely with different AWS AI/ML product services, innovating on the behalf of our customers customers. About AWS Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. 
If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (Iot), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Why AWS Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Utility Computing (UC) AWS Utility Computing (UC) provides product innovations — from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. 
Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Mentorship and Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Diverse Experiences Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying.
US, CA, Santa Clara