Large Language Models

Overview

Large language models are sophisticated computer programs that process and generate natural language. Anyone using generative AI tools to produce text-based content interacts with an LLM, usually through a chatbot or other user interface.

By finding patterns in the sequence of words and subwords (e.g., prefixes) in massive amounts of text, these models can predict the most likely next word in a generated sequence and repeat this prediction until the output is complete. Predictions improve with more data, and more specific outputs can be created using specialized data.
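
The predict-and-repeat loop described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the corpus is made up, the names `train_bigrams` and `generate` are hypothetical, and it counts whole words instead of using a neural network over subword tokens as real LLMs do.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    words = text.split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers

def generate(followers, start, max_words=5):
    """Repeatedly append the most frequent follower of the last word."""
    output = [start]
    for _ in range(max_words - 1):
        candidates = followers.get(output[-1])
        if not candidates:
            break  # no known follower, so stop early
        output.append(candidates.most_common(1)[0][0])
    return " ".join(output)

# Tiny made-up corpus (an assumption for illustration only).
corpus = "the cat sat on the mat . the cat sat on the rug ."
model = train_bigrams(corpus)
print(generate(model, "the", max_words=4))
```

With more text, the counts cover more word pairs, which loosely mirrors the point that predictions improve with more data.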

Beyond data, engineers build and improve LLMs through refinements to architecture (the way data is processed) and training, where humans can rate or correct model outputs to refine the model's predictions. Despite these efforts, LLMs may hallucinate, generating false or misleading information, and may propagate stereotypes, discrimination, and bias in their outputs.

1440 Findings

Hours of research by our editors, distilled into minutes of clarity.

LLMs explained simply and quickly
Large language models generate responses by predicting one token at a time, based on patterns they’ve learned from massive datasets. This explainer covers their core components: data, architecture, training, and how transformer models use context to simulate conversation.
Found via 1440
EOS tokens and maximum length limits tell an LLM when to stop generating a response
Tokens are words or subwords (e.g., prefixes) that a large language model processes to generate a response. During training, the model learns to generate a special token called end-of-sequence (EOS) when its response is complete. A maximum length limit stops generation if no end-of-sequence token appears first.
Found via Louis Bouchard
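
As a minimal sketch of the two stopping conditions, the loop below ends either when the model emits an end-of-sequence token or when a length cap is hit. The `"<eos>"` string, the `generate` function, and the `scripted_model` stand-in are all hypothetical; a real model would predict each token instead of replaying a script.

```python
EOS = "<eos>"  # hypothetical end-of-sequence marker

def generate(next_token, prompt, max_new_tokens=8):
    """Call next_token until it returns EOS or the length cap is reached."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        token = next_token(tokens)
        if token == EOS:
            return tokens, "eos"
        tokens.append(token)
    return tokens, "max_length"

def scripted_model(script):
    """Stand-in for a trained model: replays a fixed script of tokens."""
    remaining = iter(script)
    return lambda tokens: next(remaining)

# Stops because the model signals that its response is complete.
reply, reason = generate(scripted_model(["Hi", "there", EOS]), ["Hello"])
print(reply, reason)

# Stops because the cap is reached before any EOS token appears.
clipped, reason2 = generate(scripted_model(["a", "b", "c", "d"]), [], max_new_tokens=2)
print(clipped, reason2)
```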

The Internet Archive is a nonprofit digital library of the internet
Founded in 1996, the Archive promotes open access and digital preservation of online information and, as of 2024, contains more than 145 petabytes of web pages, books, audio recordings, software programs, and other digital content.
Found via IA_Publications
LLMs are the backbone of AI tools that process and generate natural language
Large language models identify patterns in massive datasets of books, code, and unlabeled text from which to generate coherent responses. GPT-3 was trained on about 45 terabytes of data and uses 175 billion parameters—each parameter being a value the model adjusts as it learns.
Found via IBM Technology
Tokenization breaks text into pieces that LLMs use to process language
Instead of understanding the meaning of words, large language models break up sentences into words and subwords called tokens and assign each token a set of numbers. Grammar rules are then "learned" by finding associations between these sets of numbers.
Found via Microsoft
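
To make the idea of splitting text into subwords concrete, here is a small sketch that uses greedy longest-match against a tiny vocabulary. The vocabulary, its ID numbers, and the `tokenize` function are assumptions for illustration; production tokenizers learn much larger vocabularies from data and use different algorithms.

```python
# Hypothetical vocabulary mapping subword pieces to integer IDs.
VOCAB = {"un": 0, "break": 1, "able": 2, "re": 3, "play": 4, "ing": 5, " ": 6}

def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split of text into known subword pieces."""
    pieces = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink it one character at a time.
        for end in range(len(text), i, -1):
            if text[i:end] in vocab:
                pieces.append(text[i:end])
                i = end
                break
        else:
            raise ValueError(f"no vocabulary entry covers {text[i]!r}")
    return pieces

pieces = tokenize("unbreakable replaying")
ids = [VOCAB[p] for p in pieces]
print(pieces)
print(ids)
```

The model never sees the letters themselves, only the numbers that stand for each piece.
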
Visualize how transformer architecture generates an LLM's outputs
Transformer architecture is the neural network design that large language models use to predict and generate text based on user inputs. This interactive lets you explore the "thinking" a GPT does to generate responses based on your input.
Found via Georgia Institute of Technology
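
One core operation inside a transformer is attention, which lets each token draw on the other tokens in its context. The sketch below is a generic, simplified version of scaled dot-product attention in plain Python and is not taken from the interactive itself. The token vectors are made up, and GPT-style models additionally hide later tokens from earlier ones, which is omitted here.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    scale = math.sqrt(len(keys[0]))
    outputs = []
    for q in queries:
        # Score how strongly this query matches every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        mixed = [sum(w * v[d] for w, v in zip(weights, values))
                 for d in range(len(values[0]))]
        outputs.append(mixed)
    return outputs

# Three made-up 2-dimensional token vectors used as queries, keys, and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
for row in attention(tokens, tokens, tokens):
    print([round(x, 3) for x in row])
```
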
LLMs are trained via language modeling, imitation learning, and reinforcement learning
Training large language models begins with pretraining the model to predict the next word in a sequence by finding patterns in massive amounts of text. The model is then fine-tuned to imitate human-written dialogue and to align its responses with user preferences.
Found via Ari Seff
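
For the pretraining step, a common way to score a next-word prediction is cross-entropy loss: the lower the probability the model gave to the word that actually came next, the higher the loss. The sketch below assumes that choice of loss, and the probability tables and the `next_token_loss` name are hypothetical.

```python
import math

def next_token_loss(predicted_probs, actual_next):
    """Cross-entropy: negative log of the probability given to the true next word."""
    return -math.log(predicted_probs[actual_next])

# Two made-up predictions for the word after "the cat sat on the".
confident = {"mat": 0.9, "dog": 0.05, "sky": 0.05}
unsure = {"mat": 0.2, "dog": 0.4, "sky": 0.4}

print(round(next_token_loss(confident, "mat"), 3))  # small loss
print(round(next_token_loss(unsure, "mat"), 3))     # larger loss
```

Training adjusts the model's parameters to push this loss down across many examples.
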
Temperature is a value that controls how predictable or creative an LLM's output is
In large language models, temperature adjusts the randomness in text generation by changing how often words with lower probabilities are selected next in the sequence. Lower temperatures produce more predictable outputs, while higher temperatures produce more varied, creative responses.
Found via IBM
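
A common way to apply temperature is to divide the model's raw scores (often called logits) by the temperature before converting them to probabilities. The sketch below assumes that formulation. The scores are made up and `softmax_with_temperature` is a hypothetical name.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up scores for three candidate next words.
logits = [2.0, 1.0, 0.1]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At the lowest temperature the top-scoring word takes most of the probability, while at the highest the three options move closer together.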

Agentic AI systems can proactively pursue user goals without ongoing user direction
Unlike large language models, which do not take follow-up actions after generating their output, agentic AI uses an initial prompt to take multiple steps, learn from its environment and outcomes, and continue working without follow-up human prompting.
Found via IBM Technology
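
The multi-step behavior described above can be pictured as a loop: choose an action, run it, record what happened, and repeat until done. The sketch below is a loose illustration only. `run_agent`, `choose_action`, and the `add` tool are hypothetical stand-ins, and a real agentic system would typically ask an LLM to choose each action.

```python
def run_agent(goal, choose_action, tools, max_steps=5):
    """Loop: pick an action, run the tool, record the result, repeat."""
    history = []
    for _ in range(max_steps):
        action, argument = choose_action(goal, history)
        if action == "finish":
            return argument, history
        observation = tools[action](argument)
        history.append((action, argument, observation))
    return None, history  # gave up after max_steps without finishing

def choose_action(goal, history):
    """Hard-coded stand-in for the decision a model would make."""
    if not history:
        return "add", (2, 3)
    return "finish", history[-1][2]  # report the last observation

tools = {"add": lambda pair: pair[0] + pair[1]}
answer, steps = run_agent("add 2 and 3", choose_action, tools)
print(answer, steps)
```

The single initial goal is the only human input; every later step is driven by the recorded history.
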
AI models incorporate a randomness parameter when generating responses to prompts
AI models use this parameter to prevent repeated outputs by sometimes choosing less likely next words during sequential generation. However, this randomness—alongside insufficient data and training—may cause hallucinations of incorrect results.
Found via Google Cloud
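
To illustrate how randomness changes which word is chosen, the sketch below compares always taking the most likely word with sampling in proportion to probability. The probability table is made up (imagine candidate answers for the capital of France), and `greedy_pick` and `sample_pick` are hypothetical names.

```python
import random

def greedy_pick(probs):
    """Always return the single most likely word."""
    return max(probs, key=probs.get)

def sample_pick(probs, rng):
    """Return a word at random, weighted by its probability."""
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=1)[0]

# Made-up next-word probabilities.
probs = {"Paris": 0.7, "Lyon": 0.2, "Rome": 0.1}

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [sample_pick(probs, rng) for _ in range(1000)]
counts = {w: samples.count(w) for w in probs}

print(greedy_pick(probs))
print(counts)
```

Sampling avoids giving the identical answer every time, but it also means a less likely and possibly incorrect word is occasionally selected.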

