A technical introduction to OpenAI's GPT-3 language model

Introduced in May 2020, Generative Pre-trained Transformer 3 (GPT-3) is OpenAI's groundbreaking third-generation predictive language model. Widely considered among the most powerful NLP technologies at the time of its release, it has sparked many discussions in the tech and business communities about its potential use cases and impact on existing business processes and applications.

While some aspects of the GPT-3 model described in the paper "Language Models are Few-Shot Learners" may seem too technical for non-AI researchers, it is worth zooming in on some of the key features of the model in order to better understand what it does and how it can be used in practice.

What is GPT-3?

Strictly speaking, GPT-3 is a family of autoregressive language models that includes GPT-3 Small, GPT-3 Medium, GPT-3 Large, GPT-3 XL, GPT-3 2.7B, GPT-3 6.7B, GPT-3 13B, and GPT-3 175B. Introduced in the paper "Language Models are Few-Shot Learners", these models share a transformer-based architecture similar to that of their predecessor, GPT-2. All GPT-3 models are trained on a mixture of datasets consisting of Common Crawl, WebText2, Books1, Books2, and English-language Wikipedia.

GPT-3 175B is the largest of all GPT-3 models and is commonly referred to as "the GPT-3". With 175 billion trainable parameters, it is about two orders of magnitude larger than the 1.5 billion parameter GPT-2.
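The "two orders of magnitude" claim follows directly from the reported parameter counts, as a quick back-of-the-envelope check shows:

```python
# Scale difference between GPT-2 and GPT-3 175B, using the parameter
# counts reported in the respective papers.
import math

gpt2_params = 1.5e9    # GPT-2: 1.5 billion parameters
gpt3_params = 175e9    # GPT-3 175B: 175 billion parameters

ratio = gpt3_params / gpt2_params
print(f"{ratio:.0f}x larger, ~{math.log10(ratio):.1f} orders of magnitude")
# → 117x larger, ~2.1 orders of magnitude
```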

According to OpenAI's paper, GPT-3 175B outperforms other large-scale models on a number of NLP tasks. Being a meta-learning model, it is capable of both recognizing and rapidly adapting to the desired task at inference time after having developed a broad set of skills and pattern recognition abilities during unsupervised pre-training.
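In practice, few-shot task specification happens entirely in the prompt: a handful of input-to-output demonstrations are concatenated with the query, and the model infers the task from the pattern, with no weight updates at inference time. A minimal sketch of how such a prompt is assembled (the translation task and examples here are purely illustrative):

```python
# Build a few-shot prompt: the task is conveyed to the model entirely
# through in-context examples; the model's weights are never updated.
demonstrations = [
    ("cheese", "fromage"),
    ("house", "maison"),
    ("cat", "chat"),
]
query = "dog"

prompt = "Translate English to French:\n"
for english, french in demonstrations:
    prompt += f"{english} => {french}\n"
prompt += f"{query} =>"  # the model is expected to complete this line

print(prompt)
```

The same mechanism covers zero-shot (no demonstrations, only an instruction) and one-shot (a single demonstration) settings described in the paper.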

OpenAI API, or GPT-3-as-a-Service

In June 2020, OpenAI launched an API product that provides access to the AI models developed by the company, including those based on GPT-3. Available in a private beta, the OpenAI API is equipped with a general-purpose "text in, text out" interface and enables users to experiment with GPT-3-based models, explore their strengths and weaknesses, and integrate them into their own products.
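The "text in, text out" interface boils down to a single HTTP call: the client POSTs a prompt plus decoding parameters and receives generated text back. A rough sketch of the request shape, assuming the beta-era engine-scoped completions endpoint (the exact path, engine name, and field names are illustrative and subject to change):

```python
import json
import urllib.request

api_key = "YOUR_API_KEY"  # placeholder; issued with private-beta access

# Text in: a prompt plus decoding parameters.
body = {
    "prompt": "Translate English to French: cheese =>",
    "max_tokens": 5,
    "temperature": 0.0,
}

req = urllib.request.Request(
    url="https://api.openai.com/v1/engines/davinci/completions",
    data=json.dumps(body).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    },
    method="POST",
)

# urllib.request.urlopen(req) would send the request; it is omitted here
# because a valid API key and network access are required. Text out: the
# response carries the generated completion under "choices".
print(req.get_method(), req.full_url)
```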

Bottom line

The GPT-3 autoregressive language model made its debut in May 2020 and marked an important milestone in NLP research. Trained on a large internet-based text corpus, it boasts 175 billion parameters and is two orders of magnitude larger than its predecessor GPT-2.

A number of models based on GPT-3 are available via OpenAI API, OpenAI's commercial product released in private beta in June 2020.
