A large language model (LLM) is a type of artificial intelligence (AI) trained to understand and generate text much as a human would. Learn about large language models, their popular applications, and how LLMs differ from other machine learning models.
How large language models work
Large language models are deep learning models built on the transformer architecture, a type of neural network that looks for patterns in sequential data (such as words in a sentence) to determine context. Given a text prompt, the model returns an appropriate, human-like response.
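At their core, LLMs learn which tokens tend to follow which contexts. Real models learn this with billions of parameters, but the idea can be sketched with a toy bigram counter (a deliberately simplified stand-in, not how an actual LLM is implemented):

```python
from collections import Counter, defaultdict

# Toy corpus standing in for the vast text data an LLM trains on.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count which word follows which: the simplest possible
# "pattern in sequential data".
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often here
```

An LLM does conceptually the same thing, except its "counts" are learned weights and its context is an entire prompt rather than a single preceding word.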
The most popular applications of LLMs are AI chatbots. Examples of large language models include GPT-4o, which powers the popular ChatGPT, and PaLM 2, which powered Google Bard before it became Gemini. They live up to their name: LLMs are typically too large to run on a single consumer computer, so they are usually offered as web services rather than standalone programs.
Transformer models consist of layers that can be stacked to create increasingly complex algorithms. LLMs rely in particular on two key features of transformer models: positional encoding and self-attention.