Amazon scientists help SK telecom create Korean-based natural language processor

AWS services used to process massive amounts of data needed to develop the sophisticated, open-source artificial language model.

Korean is a major world language, spoken by some 80 million people. Although it has a long history, dating back to what is believed to be its start in Manchuria, Korean is what linguists call an “isolate,” with no apparent link to other languages, such as English has with French and Latin.

SK telecom logo

But now Korean is part of the revolution in natural language processing, a branch of artificial intelligence that helps computers recognize and interpret human language. In late April, Amazon announced that Korean mobile telecommunications company SK telecom, working with Amazon Web Services researchers, have released the first open-source, advanced Korean language Generative Pre-trained Transformer-2 (GPT-2) model, called KoGPT-2.

GPT-2 is a language model that has been trained to predict – to “generate” – the completion of a sentence or a paragraph based on as little as a one-word prompt. It was developed in 2019 by OpenAI, an AI research firm. The GPT-2 model is similar to the next-word prediction on your smartphone keyboard, but much larger and more sophisticated.

KoGPT-2 is an open-source GPT-2 model pre-trained with Korean texts to improve machine learning (ML) performance in the Korean language. It can be used for chatbots, search engines, and other purposes.

In creating KoGPT-2, a team of deep-learning engineers from the Amazon Machine Learning (ML) Solutions Lab at AWS was paired with the Conversational AI Team from the SK telecom AI Center. Using AWS services such as Amazon Elastic Compute Cloud, Amazon Elastic Fabric Adaptor, and Amazon FSx for Lustre, the researchers built KoGPT-2 using a large Korean-language data set provided by SK telecom.

We wanted to help scale out SK telecom’s burgeoning natural language efforts by training the state of the art KoGPT-2 model.
Kim Tae Yoon, conversational AI team leader, SK telecom

Natural language processing models utilize a large collection of language samples to train a computer on the structure of the language, the meaning of words, and more. GPT-2 requires a particularly large dataset for its algorithm to infer the intent of someone speaking to it or writing a question. In the original GPT-2, OpenAI used some 1.5 billion parameters on a text corpus with more than 40 gigabytes of internet data. GPT-2 is trained with the objective of predicting the next word, given all of the previous words within some text.

OpenAI researchers have described the GPT-2 model as “chameleon-like”, saying it adapts to the style and context of the conditioning text. This allows researchers and engineers to generate coherent sentence about topics of their choosing. GPT-2 has already proved itself to be astonishingly powerful, with an ability to generate perfectly plausible text with a prompt of just a few words or a generalized scenario. GPT-2 has mimicked a writer creating a new Lord of the Rings battle scene, posed as a presidential speechwriter, and performed other linguistic feats.

To train KoGPT-2, SK telecom created a corpus of 125 million sentences and more than 1.6 billion words, drawing on data from the Korean Wiki Project, Korean news sources, and other sources.

That posed a formidable technical challenge, says Muhyun Kim, a senior data scientist in the Amazon ML Solutions Lab. “We needed a lot of computing power to train the model,” he says. “We used 64 GPUs (graphics processing units) for one week. Before that, though, we did a lot of experimentation to find the right configuration for analyzing the data and to troubleshoot possible errors.”

Muhyun
Muhyun Kim, senior data scientist

“Without human expertise, however, nothing can happen. Our experience helped us work with SK telecom to refine their models and speed up training. AWS is perfect for training a large model like KoGPT-2. It’s easy to use and offers a tremendous amount of bandwidth. But even if the network is fast, if the storage is slow, training will be slow. With Amazon FSx for Lustre we were able to accelerate the entire process,” Muhyun added.

SK telecom also used GluonNLP, an open-source deep-learning toolkit for natural language processing, to speed up the model-training process.

“GluonNLP offers various tokenizers and data pipeline utilities, which make it easy to train state-of-the-art models on custom data sets. We adopted techniques such as mixed precision training, efficient GPU kernels for activation functions, and integration with Amazon Elastic Fabric Adaptor, which significantly accelerated large scale distributed training with GluonNLP,” said Haibin Lin, an applied scientist from the AWS MXNet team.

With the Amazon ML Solutions Lab implementing and providing the large-scale infrastructure to make training feasible, SKT AI Center’s Conversational AI Team provided the key ingredients and linguistic expertise. As mentioned above, the team painstakingly created the dataset to train the model. They also implemented the code to make model training possible in the first place, as well as trained the KoGPT-2 model.

Kim Tae Yoon
Kim Tae Yoon, team leader, SK telecom

“We wanted to help scale out SK telecom’s burgeoning natural language efforts by training the state of the art KoGPT-2 model,” added Kim Tae Yoon, the leader of the Conversational AI Team from SK telecom. “Open source and contribution back to the growing Korean NLP community are core values of our team, so it was only natural to open source this model,” added Tae Yoon.

From a practical standpoint KoGPT-2 will give SK telecom’s customers a surprisingly human-like experience when speaking with a chatbot or finding answers to questions.

KoGPT-2 is available from the GitHub repository of SKT AI Center (https://github.com/SKT-AI/KoGPT2) under a Modified MIT License. AWS also has released a Git repository with guidance on how to deploy the KoGPT-2 model into Amazon SageMaker.

Research areas

Related content

US, NY, New York
We are seeking a Robotics/AI Motor Control Scientist to develop cutting-edge machine learning algorithms for motor control systems in robots. In this role, you will focus on creating and optimizing intelligent motor control strategies to enable robots to perform complex, whole-body tasks. Your contributions will be essential in advancing robotics by enabling fluid, reliable, and safe interactions between robots and their environments. Key job responsibilities - Develop controllers that leverage reinforcement learning, imitation learning, or other advanced AI techniques to achieve natural, robust, and adaptive motor behaviors - Collaborate with multi-disciplinary teams to integrate motor control systems with robotic hardware, ensuring alignment with real-world constraints such as actuator dynamics and energy efficiency - Use simulation and real-world testing to refine and validate control algorithms - Stay updated on advancements in robotics, AI, and control systems to apply advanced techniques to robotic motion challenges - Lead technical projects from conception through production deployment - Mentor junior scientists and engineers - Bridge research initiatives with practical engineering implementation About the team Fauna Robotics, an Amazon company, is building capable, safe, and genuinely delightful robots for everyday life. Our goal is simple: make robots people actually want to live and interact with in everyday human spaces. We believe that future won’t arrive until building for robotics becomes far more accessible. Today, too much effort is spent reinventing the fundamentals. We’re changing that by developing tightly integrated hardware and software systems that make it faster, safer, and more intuitive to create real-world robotic products. Our work spans the full stack: mechanical design, control systems, dynamic modeling, and intelligent software. The focus is not just functionality, but experience. We’re building robots that feel responsive, expressive, and genuinely useful. At Fauna, you’ll work at the frontier of this space, helping define how robots move, manipulate, and interact with people in natural environments. It’s an opportunity to solve hard problems across hardware and software with a team focused on making robotics accessible and joyful to build. If you care about making robotics real for everyone and building systems that are as delightful as they are capable, we’re interested in hearing from you. an opportunity to solve hard problems across hardware and software with a team focused on making robotics accessible and joyful to build. If you care about making robotics real for everyone and building systems that are as delightful as they are capable, we’re interested in hearing from you.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists in the Forecasting, Macroeconomics & Finance field document, interpret and forecast Amazon business dynamics. This track is well suited for economists adept at combining times-series statistical methods with strong economic analysis and intuition. This track could be a good fit for candidates with research experience in: macroeconometrics and/or empirical macroeconomics; international macroeconomics; time-series econometrics; forecasting; financial econometrics and/or empirical finance; and the use of micro and panel data to improve and validate traditional aggregate models. Economists at Amazon are expected to work directly with our senior management and scientists from other fields on key business problems faced across Amazon, including retail, cloud computing, third party merchants, search, Kindle, streaming video, and operations. The Forecasting, Macroeconomics & Finance field utilizes methods at the frontier of economics to develop formal models to understand the past and the present, predict the future, and identify relevant risks and opportunities. For example, we analyze the internal and external drivers of growth and profitability and how these drivers interact with the customer experience in the short, medium and long-term. We build econometric models of dynamic systems, using our world class data tools, formalizing problems using rigorous science to solve business issues and further delight customers.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Economists in the Forecasting, Macroeconomics & Finance field document, interpret and forecast Amazon business dynamics. This track is well suited for economists adept at combining times-series statistical methods with strong economic analysis and intuition. This track could be a good fit for candidates with research experience in: macroeconometrics and/or empirical macroeconomics; international macroeconomics; time-series econometrics; forecasting; financial econometrics and/or empirical finance; and the use of micro and panel data to improve and validate traditional aggregate models. Economists at Amazon are expected to work directly with our senior management and scientists from other fields on key business problems faced across Amazon, including retail, cloud computing, third party merchants, search, Kindle, streaming video, and operations. The Forecasting, Macroeconomics & Finance field utilizes methods at the frontier of economics to develop formal models to understand the past and the present, predict the future, and identify relevant risks and opportunities. For example, we analyze the internal and external drivers of growth and profitability and how these drivers interact with the customer experience in the short, medium and long-term. We build econometric models of dynamic systems, using our world class data tools, formalizing problems using rigorous science to solve business issues and further delight customers.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists in the Forecasting, Macroeconomics & Finance field document, interpret and forecast Amazon business dynamics. This track is well suited for economists adept at combining times-series statistical methods with strong economic analysis and intuition. This track could be a good fit for candidates with research experience in: macroeconometrics and/or empirical macroeconomics; international macroeconomics; time-series econometrics; forecasting; financial econometrics and/or empirical finance; and the use of micro and panel data to improve and validate traditional aggregate models. Economists at Amazon are expected to work directly with our senior management and scientists from other fields on key business problems faced across Amazon, including retail, cloud computing, third party merchants, search, Kindle, streaming video, and operations. The Forecasting, Macroeconomics & Finance field utilizes methods at the frontier of economics to develop formal models to understand the past and the present, predict the future, and identify relevant risks and opportunities. For example, we analyze the internal and external drivers of growth and profitability and how these drivers interact with the customer experience in the short, medium and long-term. We build econometric models of dynamic systems, using our world class data tools, formalizing problems using rigorous science to solve business issues and further delight customers.