Large Language Models

Overview

Large language models are sophisticated computer programs that process and generate natural language. Anyone using generative AI tools to produce text-based content interacts with an LLM, usually through a chatbot or other user interface.

1440 Findings

  • LLMs explained simply and quickly

    Large language models generate responses by predicting one token at a time, based on patterns they’ve learned from massive datasets. This explainer covers their core components: data, architecture, training, and how transformer models use context to simulate conversation.

  • EOS tokens and maximum-length limits tell an LLM when to stop generating a response

    Tokens are words or subwords (e.g., prefixes) that a large language model processes to generate a response. During training, the model learns that generating a special end-of-sequence (EOS) token marks its response as complete. A maximum token limit acts as a backstop, cutting generation off if the EOS token never appears.

  • LLMs are the backbone of AI tools that process and generate natural language

    Large language models identify patterns in massive datasets of books, code, and other unlabeled text, then use those patterns to generate coherent responses. GPT-3's training data was filtered from about 45 terabytes of raw text, and the model uses 175 billion parameters. Each parameter is a value the model adjusts as it learns.

  • Transformer architecture was the breakthrough that made AI chatbots possible

    The transformer replaced sequential, "one word at a time" text processing with an approach in which every word in a text is processed in parallel. Self-attention lets the model weigh how each word relates to the others, which helps it recognize context, and positional encoding preserves word order.

  • LLMs are trained via language modeling, imitation learning, and reinforcement learning

    Training a large language model begins with pretraining: the model learns to predict the next word in a sequence by finding patterns in massive amounts of text. The pretrained model is then fine-tuned to imitate human-written dialogue and to align its responses with user preferences.

  • Agentic AI systems can proactively pursue user goals without step-by-step user direction

    Unlike a standalone large language model, which takes no follow-up action after generating its output, agentic AI starts from an initial prompt, takes multiple steps, learns from its environment and outcomes, and continues working without follow-up human prompting.
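
The stopping behavior described in the findings above (one token at a time, ending on an end-of-sequence token or a maximum length) can be sketched in a few lines of Python. This is a toy illustration, not how any particular LLM is implemented. The lookup table NEXT_TOKEN, the "<eos>" marker, the generate function, and the max_new_tokens parameter are all hypothetical names chosen for this sketch. A real model would compute a probability distribution over its whole vocabulary at each step instead of reading from a fixed table.

```python
EOS = "<eos>"  # hypothetical end-of-sequence marker for this sketch

# Toy stand-in for a model: maps the most recent token to the next token.
# A real LLM scores every token in its vocabulary using the full context.
NEXT_TOKEN = {
    "the": "cat",
    "cat": "sat",
    "sat": "down",
    "down": EOS,
    "again": "again",  # never reaches EOS, so only the length limit stops it
}


def generate(prompt, max_new_tokens=10):
    """Append one token at a time until EOS appears or the limit is hit."""
    if not prompt:
        raise ValueError("prompt must contain at least one token")
    tokens = list(prompt)
    stop_reason = "max_length"
    for _ in range(max_new_tokens):
        # Unknown tokens fall back to EOS so the toy table stays small.
        next_token = NEXT_TOKEN.get(tokens[-1], EOS)
        if next_token == EOS:
            stop_reason = "eos"
            break
        tokens.append(next_token)
    return tokens, stop_reason


if __name__ == "__main__":
    print(generate(["the"]))
    print(generate(["again"], max_new_tokens=3))
```

The first call stops because the toy table produces the end-of-sequence marker after "down". The second call never produces it, so the loop ends only when the three-token limit is reached, which is the backstop role a maximum length plays.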
