Science & Technology

Large Language Models

Related to Generative AI, OpenAI, and Deepfakes

What We Learned

Background

Providing the foundation for tools like OpenAI's ChatGPT and Google's Gemini, large language models are sophisticated computer programs that process and generate natural language.

By finding patterns in the sequence of words and subwords (prefixes, for example) in massive amounts of text, these models can predict the most likely next word in a generated sequence and repeat this prediction until the output is complete. Predictions improve with more data, and more specific outputs can be created using specialized data.
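
As a rough illustration, here is a minimal Python sketch of that prediction loop. The tiny word table and the "<end>" marker are invented stand-ins for a trained model, which would score every token in a vocabulary of tens of thousands:

```python
# Toy sketch of autoregressive generation: repeatedly predict the most
# likely next token and append it until the model signals completion.
BIGRAMS = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}

def predict_next_token(tokens):
    # A real LLM conditions on the whole sequence; this toy table
    # looks only at the last word and falls back to an end marker.
    return BIGRAMS.get(tokens[-1], "<end>")

def generate(prompt, max_tokens=10):
    tokens = prompt.lower().split()      # toy whitespace "tokenizer"
    for _ in range(max_tokens):
        nxt = predict_next_token(tokens)
        if nxt == "<end>":               # output is complete
            break
        tokens.append(nxt)
    return " ".join(tokens)

print(generate("The"))  # -> "the cat sat on the cat sat on the cat sat"
```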

AI chatbots are user-friendly interfaces that act as intermediaries between users and the underlying LLM.

How Are LLMs Created?

Engineers combine three key ingredients to build and improve an LLM: data, architecture, and training.

Data can be any digital resource containing text, including online textbooks, website articles, and podcast transcripts. As of 2024, the Internet Archive estimates there are 44 million digital books and texts and 835 billion archived web pages, and hundreds of trillions of words from sources like these can make up an LLM’s training data set.

The software that processes this data is built on the transformer architecture, which Google researchers introduced in 2017 to improve language translation.

First, transformers break a block of text into words and subwords called tokens, each represented as a set of numbers. All tokens are then reviewed in parallel, identifying which go together most often to “learn” patterns in meaning, grammar, and context. These patterns are encoded as parameters: numeric values that can be adjusted to improve output quality.
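
To make that concrete, here is a minimal Python sketch of the parallel comparison at the heart of the transformer, known as attention. All numbers are invented, and the matrices Wq, Wk, and Wv stand in for a few of a real model's billions of parameters:

```python
import numpy as np

# Toy sketch of the transformer's attention step: every token's numbers
# are compared against every other token's in parallel to measure how
# strongly they "go together" in this context.
rng = np.random.default_rng(0)
seq_len, dim = 4, 8                      # 4 tokens, 8 numbers per token
x = rng.normal(size=(seq_len, dim))      # the tokens' numeric values
Wq, Wk, Wv = (rng.normal(size=(dim, dim)) for _ in range(3))  # parameters

q, k, v = x @ Wq, x @ Wk, x @ Wv         # queries, keys, values
scores = q @ k.T / np.sqrt(dim)          # all pairwise comparisons at once
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
output = weights @ v                     # each token blended with its context
print(weights.round(2))                  # 4x4 grid of token-to-token attention
```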

After the model sets its parameters from data analysis alone, humans train the LLM, ranking the quality of its outputs and identifying boundaries for what should be excluded (e.g., violence). The model then adjusts its parameters, changing the associations between specific tokens to mimic natural language more accurately.
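
One common way to turn those human rankings into parameter updates is a pairwise preference loss, sketched below with invented scores; this is an illustrative formulation, not the exact recipe of any particular lab:

```python
import math

# Annotators rank two candidate outputs; a scoring model is then nudged
# so the preferred answer scores higher than the rejected one.
def preference_loss(score_preferred, score_rejected):
    # Small when the ranking is already respected, large when violated.
    return -math.log(1 / (1 + math.exp(score_rejected - score_preferred)))

print(round(preference_loss(2.0, 0.5), 3))  # 0.201: ranking respected
print(round(preference_loss(0.5, 2.0), 3))  # 1.701: parameters need adjusting
```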

How Are LLMs Used?

Anyone using generative AI tools to produce text-based content interacts with an LLM, usually through a chatbot or other user interface.

When a user enters a prompt, the chatbot sends the input to the LLM, which sends back a unique response by combining the user’s instructions (e.g., “Using an engaging tone …”) with a randomization parameter called temperature, which sometimes tells the model to choose less probable next tokens.
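
A minimal Python sketch of how temperature reshapes those choices; the candidate tokens and scores are invented:

```python
import numpy as np

# The model's raw scores (logits) for candidate next tokens are divided
# by the temperature before being turned into probabilities: low values
# concentrate on the top choice, high values let unlikely tokens through.
rng = np.random.default_rng(0)

def sample(tokens, logits, temperature):
    probs = np.exp(np.array(logits) / temperature)
    return rng.choice(tokens, p=probs / probs.sum())

tokens = ["blue", "cloudy", "falling", "lava"]
logits = [3.0, 2.0, 1.0, -2.0]                  # scores for "The sky is ..."
print(sample(tokens, logits, temperature=0.2))  # almost always "blue"
print(sample(tokens, logits, temperature=1.5))  # occasionally surprising
```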

LLMs can also be fine-tuned: an existing model’s parameters are adjusted based on new data so it accomplishes a specific task more effectively. For instance, a business can fine-tune an LLM with product-specific troubleshooting knowledge so that its customer service chatbot can explain how to fix a hardware problem in a friendly tone.
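
A conceptual Python sketch of that workflow; the stand-in "model" and the support examples are invented, and a real fine-tune adjusts billions of numeric parameters rather than a reply table:

```python
# Fine-tuning starts from an existing model and continues training on a
# small, task-specific dataset instead of building from scratch.
class TinyModel:
    def __init__(self):
        self.replies = {}                 # stands in for learned parameters

    def generate(self, prompt):
        return self.replies.get(prompt, "Sorry, I can't help with that.")

    def fine_tune(self, examples):
        self.replies.update(examples)     # adjust "parameters" toward new data

model = TinyModel()
print(model.generate("Router light blinking red?"))   # generic fallback

model.fine_tune({
    "Router light blinking red?":
        "No worries! Hold the reset button for 10 seconds, then try again.",
})
print(model.generate("Router light blinking red?"))   # specialized, friendly
```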

Over time, LLMs will more commonly serve as the foundation for agentic AI, in which a system takes actions to fulfill a goal given by a user, such as booking a vacation. While built on generative AI, these systems would not require subsequent human direction or refinement.

Limitations of LLMs

Because of the temperature parameter, insufficient data, or inadequate training, LLMs may hallucinate, generating false or misleading information. Due to the complex nature of transformer architecture, explicitly identifying what led to a hallucination is often impossible.

Language models are limited to text-based generation; image, audio, and video generators use other tools, such as diffusion models.

They are also confined to the information in their data and may propagate stereotypes, discrimination, and bias in their outputs. LLMs need ongoing updates and training to incorporate new information and ensure accuracy.


Dive Deeper

Relevant articles, podcasts, videos, and more from around the internet — curated and summarized by our team

OpenAI's Sam Altman live at TED2025 (youtube.com)

In 2022, OpenAI's CEO, Sam Altman, popularized the use of large language models (LLMs) with the release of ChatGPT. Where does he see the technology going next? In this conversation at TED2025, Altman discusses the growth and future of AI as agentic systems and how they may impact human creativity, safety, and morality.

Georgia Institute of Technology

Visualization of transformer architecture (poloclub.github.io)

Transformer architecture is the neural network framework that large language models use to predict and generate text from user inputs. But how exactly does it make its predictions? This interactive visualizes what powers deep-learning models and lets you explore the "thinking" a GPT performs on your input to generate responses.

The Wall Street Journal

The chips that power AI (youtube.com)

Generative AI relies on multiple data streams processed in parallel to optimize computing workloads, which is only possible with specialized computer chips called graphics processing units (GPUs). This video shows how these chips are manufactured in Amazon's chip lab and discusses the future market for this technology.

Gandalf, an LLM prompting game (gandalf.lakera.ai)

Even with robust human training to keep specific information out of outputs, large language models can be fooled into generating dangerous or sensitive information if prompted cleverly. In this game, a wizard withholds eight passwords, which you must prompt it to reveal against ever-increasing LLM safeguards. Good luck!

NVIDIA Technical Blog

A guide to diffusion models (developer.nvidia.com)

While text-based AI tools rely on large language models to recognize patterns in text, diffusion models generate images, video, audio, and 3D assets by adding noise to data and learning to remove it. This article explores how these models use a reward system to generate media and how they can be customized to match a user’s style.

The model context protocol, explained (zdnet.com)

Large language models are confined to their data sets and unable to connect to external data sources and tools. Enter the model context protocol (MCP), which enables integration and simplifies the creation and maintenance of AI agents that complete complex, system-specific tasks. This article explains MCP and how it could lead to universal AI integration.
