Improving LLM pretraining with better data organization

“Best-fit packing” adapts bin-packing to avoid unnecessary truncation of training documents, improving LLM performance across a wide range of tasks and reducing hallucination.

The documents used to train a large language model (LLM) are typically concatenated to form a single “superdocument”, which is then divided into sequences that match the model's context length. This improves training efficiency but often results in unnecessary truncations, where individual documents are broken up across successive sequences.
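As a minimal illustration of this concatenate-and-chunk approach (the function name and the character-level "tokens" are ours, purely for illustration; real pipelines operate on token IDs from the model's tokenizer), the sketch below shows how documents end up split across sequence boundaries:

```python
# A minimal sketch of concatenate-and-chunk (toy example).
def concat_and_chunk(documents, context_length):
    """Concatenate documents into one stream, then cut fixed-length sequences."""
    stream = []
    for doc in documents:
        stream.extend(doc)  # each doc is a list of tokens
    # Any document straddling a chunk boundary is truncated across two sequences.
    return [stream[i:i + context_length]
            for i in range(0, len(stream), context_length)]

docs = [list("AAAAA"), list("BBBB"), list("CC")]
print(concat_and_chunk(docs, context_length=4))
# [['A','A','A','A'], ['A','B','B','B'], ['B','C','C']] -- documents A and B are each split.
```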


In a paper we’re presenting at this year’s International Conference on Machine Learning (ICML 2024), titled “Fewer truncations improve language modeling”, we report an in-depth study of this common concatenation-chunking document-processing method. We found that it severely impairs the model's ability to understand contextual coherence and factual consistency. This not only affects the model's performance on downstream tasks but also increases the risk of hallucinations.

To address this issue, we propose best-fit packing, an innovative document-processing strategy that optimizes document combinations to eliminate unnecessary text truncations. In experiments, we compared a model trained using best-fit packing to one trained in the ordinary way on six downstream tasks: reading comprehension, natural-language inference, context following, summarization, commonsense and closed-book question answering, and program synthesis. We found that best-fit packing monotonically improves performance across an array of 22 sub-tasks, by as much as 15% (program synthesis) and 17% (context following). Importantly, best-fit packing also effectively reduces closed-domain hallucination, by up to 58.3%.

A comparison of best-fit packing (left), which seeks to minimize document truncation, with the standard approach to large-language-model training, which concatenates training documents and then divides them into fixed-length sequences.

Consequences of truncation

In the analysis reported in our paper, we identified several problems caused by document truncation, including undefined names, ungrounded content, and missing knowledge.


Undefined names: In programming languages like Python, truncation may separate definitions of variables from their invocations, introducing syntax errors and causing some variables to be undefined. As a consequence, the model may learn misleading patterns and possibly hallucinate on downstream tasks.

Ungrounded content: Truncation damages data integrity. In the example below, for instance, a reference (“the earthquake on Monday morning”) is separated from its antecedent, resulting in unfaithful generation.

Missing knowledge: Truncation hinders knowledge acquisition. In the example below, the model cannot learn the location of the ICML conference because the conference name and location occur in different training sequences.

Examples of three common truncation errors: (a) undefined names, (b) ungrounded content, and (c) missing knowledge.

Best-fit packing

To address this issue, we propose optimizing the assignment of documents to training sequences so as to eliminate unnecessary truncations, while minimally increasing the number of sequences relative to concatenation. This is a variation of the well-known bin-packing problem, which is NP-hard in general, but we use a heuristic called the best-fit-decreasing (BFD) algorithm that tends to work well in practice. We thus call our method best-fit packing.

The standard implementation of BFD has quasi-linear time complexity, which is not efficient enough for LLM pretraining, where datasets typically contain millions of documents. By taking advantage of the unique nature of pretraining data, however, we were able to optimize BFD so that it scales linearly with data size, ensuring its applicability to large-scale pretraining datasets. Further, we show that in practical applications, best-fit packing generates approximately the same number of training sequences as the traditional method, while significantly reducing data loss caused by truncation.

Truncations per document as a function of document length, for both best-fit packing (pack) and concatenation (concat), for natural-language data (top) and programming-language data (bottom). The natural-language data is evaluated with context lengths of both 2,000 and 8,000.

Curious how we achieve this? Let’s take a closer look.

Best-fit packing — an example

Following the standard bin-packing nomenclature, we call each training sequence a “bin”, and each bin has a capacity equal to the LLM’s context size. The goal is to assign a combination of whole documents to each bin so as to minimize the wasted bin capacity.

First, we divide any document that’s larger than the LLM context into context-length chunks, plus a remainder. Then we sort the documents (and document fragments) from largest to smallest. Finally, we work our way down the sorted list, assigning each document to the bin whose available space is as close to the document size as possible.
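As a straightforward (unoptimized) rendering of these three steps, the sketch below uses a linear scan over open bins to find the tightest fit; the helper names are ours, and this scan is exactly the part that the tree-based construction described next replaces:

```python
def best_fit_decreasing(documents, context_length):
    """Naive best-fit-decreasing packing (O(n * m) over n items and m bins)."""
    # 1. Split documents into context-length chunks plus a remainder
    #    (documents no longer than the context stay whole).
    items = []
    for doc in documents:
        for i in range(0, len(doc), context_length):
            items.append(doc[i:i + context_length])

    # 2. Sort items from largest to smallest.
    items.sort(key=len, reverse=True)

    # 3. Place each item in the bin whose remaining space fits it most tightly.
    bins = []          # each bin is a list of items (whole documents or chunks)
    remaining = []     # remaining capacity of each bin
    for item in items:
        candidates = [j for j, r in enumerate(remaining) if r >= len(item)]
        if candidates:
            j = min(candidates, key=lambda j: remaining[j])  # tightest fit
            bins[j].append(item)
            remaining[j] -= len(item)
        else:
            bins.append([item])                              # open a new bin
            remaining.append(context_length - len(item))
    return bins
```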


To maximize efficiency, we use three data structures to manage the assignment of documents to bins: a binary tree and two tables. This design works because (1) the maximum bin size is the model’s context size, so the tree won’t be too deep, and (2) we do not need to distinguish bins with the same remaining capacity, which simplifies the tree; the tables instead map remaining capacities to the bins that have them.

Consider a simple example, in which the context size (the bin size) is eight. The binary tree has eight leaves, corresponding to the eight possibilities for available space in any given bin. (In a real LLM, the context size is on the order of thousands of tokens, so the tree would have thousands of leaves.)

Each parent node of the tree has an associated number, indicating the size of the largest available bin slot among its descendants. The number associated with the parent’s right child is always greater than or equal to the number associated with the left child.

Initially, the value of the rightmost node in each layer of the tree is eight, and all the other nodes have values of zero. This means that all the available bin slots have a capacity of eight.

The initial states of the three data structures we use to implement best-fit packing. The rightmost node of each layer of the tree has a value of eight, and all other nodes have values of zero, indicating that all the bins are empty (i.e., are at maximum capacity).
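The following sketch (our own illustration, not necessarily the paper’s exact implementation) initializes these structures for the toy example with a context size of eight. The tree is stored as a flat array in the usual max-segment-tree layout, where the leaf for capacity c holds c if at least one bin currently has c tokens of free space and 0 otherwise, and each internal node holds the maximum over its subtree. For the second table we record which documents each bin contains, which is one natural choice; the array layout and all names here are ours.

```python
from collections import defaultdict

CONTEXT = 8   # bin capacity (context size); toy value. This simple flat layout
              # assumes CONTEXT is a power of two (2,048 and 8,192 both are).

# Segment tree over remaining capacities 1..CONTEXT, stored as a flat array.
# tree[CONTEXT + (c - 1)] is the leaf for capacity c; internal node i has
# children 2*i and 2*i + 1 and stores the max over its subtree.
tree = [0] * (2 * CONTEXT)

def set_leaf(capacity, value):
    """Set the leaf for `capacity` and propagate the max toward the root."""
    i = CONTEXT + (capacity - 1)
    tree[i] = value
    i //= 2
    while i >= 1:
        tree[i] = max(tree[2 * i], tree[2 * i + 1])
        i //= 2

space_to_bins = defaultdict(list)   # remaining capacity -> bins with that capacity
bin_contents = defaultdict(list)    # bin id -> documents placed in it (our bookkeeping)

# Initially every (as-yet-unopened) bin is empty, so a slot of size CONTEXT
# is always available; this makes the rightmost leaf, and hence the rightmost
# node of every layer, equal to eight, as in the figure above.
set_leaf(CONTEXT, CONTEXT)
```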

Now consider a later state, when four documents of size eight, six, six, and four have been packed. The two bins containing documents of size six have available slots of size two (8 – 6), and the bin containing a document of size four has an available slot of size four (8 – 4). These sizes are represented by the numbers two and four at leaves two and four of the tree. Multiple bins remain empty, so leaf eight has a value of eight, too.

Note that the value two at leaf two indicates only that at least one bin slot of size two is available; it doesn’t indicate how many such slots there are or where they can be found. That information is contained in the tables.

The state of the data structures after four documents of sizes six, six, four, and eight have been packed.

Now consider a document of size three, which we wish to assign to a bin. To find the best available bin slot, simply go left at each node of the tree, unless going left leads to a node whose value is less than the document size, in which case, go right.

Tree traversal identifies the available bin slot that best fits the new document.
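Continuing the sketch above, this traversal can be written as a walk down from the root: go left whenever the left subtree still contains a large-enough slot, otherwise go right. Because the leaves are ordered by increasing capacity, the leaf this reaches is the smallest available capacity that fits the document. (Again, this is our rendering of the rule, with our own names.)

```python
def find_best_capacity(size):
    """Return the smallest available remaining capacity >= size, or None."""
    if tree[1] < size:          # the root holds the largest available capacity
        return None
    i = 1
    while i < CONTEXT:          # descend until we reach a leaf
        left, right = 2 * i, 2 * i + 1
        i = left if tree[left] >= size else right
    return tree[i]              # a leaf's value is the capacity itself
```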

The best fit for a document of size three is a slot of size four, and in the “space-to-bins” table, we see that there is one bin — bin three — with a slot of that size. So there we place the document.

Finally, we update all three data structures to reflect the new placement:

Data structure updates after the document (item four) of size three has been packed. The tree leaf corresponding to slot sizes of four is reset to zero, and the tree leaf corresponding to slot sizes of one is set to one. The tables are updated accordingly.
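In code, continuing the same sketch (and again using our own names and bookkeeping), placing a document and updating the tree and tables might look like this:

```python
next_bin_id = 0   # bins are created lazily; an empty bin is always available

def place_document(doc_id, size):
    """Assign a document of `size` tokens (size <= CONTEXT) to the best-fitting bin."""
    global next_bin_id
    cap = find_best_capacity(size)            # smallest available slot that fits
    if space_to_bins[cap]:
        bin_id = space_to_bins[cap].pop()     # reuse a partially filled bin
    else:
        bin_id = next_bin_id                  # cap == CONTEXT: open a fresh empty bin
        next_bin_id += 1
    bin_contents[bin_id].append(doc_id)

    # If no remaining bin has exactly `cap` free space, clear that leaf
    # (except the CONTEXT leaf, since a fresh empty bin can always be opened).
    if not space_to_bins[cap] and cap != CONTEXT:
        set_leaf(cap, 0)

    new_cap = cap - size
    if new_cap > 0:
        space_to_bins[new_cap].append(bin_id)
        set_leaf(new_cap, new_cap)            # a slot of size new_cap is now available
    return bin_id

# Reproduce the walkthrough: pack documents of sizes six, six, four, and eight,
# then the size-three document ("item four"), which lands in the bin that had
# a slot of size four.
for doc_id, size in enumerate([6, 6, 4, 8]):
    place_document(doc_id, size)
place_document(4, 3)
print(dict(space_to_bins))   # {2: [0, 1], 4: [], 1: [2]} -- the size-four slot is gone,
                             # and a slot of size one has appeared.
```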

Results

To evaluate the effect of bin-packing on downstream tasks, we pretrained models of 7 billion and 13 billion parameters with context lengths of 2,000 and 8,000 on text and code using both best-fit packing and concatenation. We then tested both sets of models on our six downstream tasks. On average, across multiple datasets, context lengths, and metrics, best-fit packing offered better performance on all six tasks. The biggest gains came in reading comprehension (+4.7%), natural-language inference (+9.3%), context following (+16.8%), and program synthesis (+15.0%).


We also found that best-fit packing helped prevent closed-domain hallucination, particularly in program synthesis tasks, where it reduced "undefined name" errors by up to 58.3%, indicating a more complete understanding of program structure and logic.

Additionally, models trained with best-fit packing were better at following instructions, such as adhering to length constraints. Best-fit packing also helped the model acquire “tail knowledge” that is truncation-sensitive because of its scarcity in the training data. Indeed, this result suggests one possible reason why LLMs struggle to learn long-tail knowledge.

While the experiments conducted in our paper primarily focused on LLM pretraining, best-fit packing is broadly applicable to fine-tuning as well. Determining the benefits it can offer during fine-tuning is an intriguing topic for future study.
