Large Language Models

Large language models are sophisticated computer programs that process and generate natural language. Anyone using generative AI tools to produce text-based content interacts with an LLM, usually through a chatbot or other user interface.

By finding patterns in the sequences of words and subwords (e.g., prefixes) in massive amounts of text, these models can predict the most likely next word in a generated sequence and repeat this prediction until the output is complete. Predictions improve with more data, and more specialized outputs can be produced by training on specialized data.

Beyond data, engineers build and improve LLMs through refinements to architecture (the way data is processed) and training, where humans can manually refine LLM predictions. Despite these efforts, LLMs may hallucinate, generating false or misleading information, and may propagate stereotypes, discrimination, and bias in their outputs.
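The predict-and-repeat loop described above can be sketched in a few lines of Python. The bigram table and probabilities below are invented for illustration; a real LLM computes next-token probabilities with a neural network over tens of thousands of tokens.

```python
import random

# Toy "model": for each token, the possible next tokens and their probabilities.
# This table is invented for illustration only.
BIGRAMS = {
    "<s>": [("the", 0.6), ("a", 0.4)],
    "the": [("cat", 0.5), ("dog", 0.5)],
    "a":   [("cat", 0.5), ("dog", 0.5)],
    "cat": [("sat", 0.7), ("<eos>", 0.3)],
    "dog": [("sat", 0.7), ("<eos>", 0.3)],
    "sat": [("<eos>", 1.0)],
}

def generate(max_tokens=10, seed=0):
    """Repeatedly predict a likely next token until <eos> or a length cap."""
    random.seed(seed)
    tokens = ["<s>"]
    for _ in range(max_tokens):
        candidates, weights = zip(*BIGRAMS[tokens[-1]])
        nxt = random.choices(candidates, weights=weights)[0]
        if nxt == "<eos>":  # end-of-sequence token: the response is complete
            break
        tokens.append(nxt)
    return " ".join(tokens[1:])

print(generate())
```

The same two stop conditions (a special end-of-sequence token or a maximum length) are how real models know a response is finished.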
What we've found
Large language models struggle to play video games, despite coding them easily
As of 2025, given clear tasks, specific feedback, and test results to learn from, LLMs have successfully coded simple video games that resemble pre-existing ones. However, given the sheer variety of games, the lack of training datasets, and the need for spatial awareness, which LLMs find challenging, these models have far more difficulty navigating and "winning" games. IEEE Spectrum

Explore rankings of large language models
The SEAL LLM Leaderboards benchmark LLMs against agentic, frontier, and safety tasks, providing insights into each model's strengths, deficiencies, and failures. SEAL Leaderboard

Model Context Protocol allows large language models to connect to external data
Using a unified interface, MCP simplifies how AI systems securely connect to external data, tools, and services. The protocol enables AI agents to perform tasks without ongoing human refinement or manual integration of new data. ZDNET

Large language models are examples of foundation models
Such models are trained on massive amounts of data, which enables them to perform a variety of tasks rather than being fed task-specific data to complete a narrow function. They come with high compute costs and potential trust issues due to the unorganized nature of their training data. IBM

The evolution of the em dash
The curious punctuation mark was originally used to replicate humans' naturally broken—and sometimes scattered—way of speaking. It became especially popular with the rise of the novel in the 19th century and was used by writers such as Charles Dickens, Jane Austen, and Charlotte Brontë. After her death, Emily Dickinson's use of the punctuation mark became her poetry's most recognizable trait. But as large language models like ChatGPT began using the em dash, real humans began viewing it as a tell of non-human writing.
This podcast episode dives into the strange turn of events.

Understanding how Claude Code and other AI coding agents function
Under human oversight, a supervising large language model interprets user tasks and delegates work to subordinate LLMs, which can generate code, fix bugs, and run tests, often most effectively for proofs of concept. Incremental backups and versioning are crucial when using such agents, which can lose details mid-task as they compress their context history to work around memory limitations. Ars Technica

Anthropic's interpretability team discusses how AI models' thinking mimics biology
Just as humans engage in complex behaviors to survive and reproduce, the team of experts in neuroscience, virology, mathematics, and other disciplines argues that large language models develop complex mechanisms to achieve their goals. These mechanisms can be manipulated—much like stimulating individual neurons—to assess how they modify outputs and uncover how the LLM "thinks." Anthropic

How Anthropic's Frontier Red Team assesses the dangers of its AI tools
The team of researchers deliberately attempts to breach Claude's safeguards to determine whether the large language model could be used to commit cyberattacks, develop biological weapons, or enable other harmful behavior. Unlike many other red teams, Anthropic publicly shares its findings and how it resolved them, helping build transparency and credibility. Fortune

Watch how a vending machine run with Anthropic's AI went out of business
As a stress test of its large language models, a version of Claude—Claudius—managed the device as a small business for a month to identify potential challenges to AI business automation. Through successive prompts, Claudius was convinced to give away items for free to combat capitalism and to stock previously prohibited items, such as a PlayStation 5.
The Wall Street Journal

How Model Context Protocol allows AI systems to communicate with databases
Developed by Anthropic, the open-source protocol serves as an integration layer that standardizes how large language models connect to real-time information sources and tools. Just as a USB-C port serves as a universal standard for many peripherals, MCP streamlines these integrations and provides context about external sources in a form AI can understand. IBM Technology

Breaking down Anthropic's products and why their models are named after literary forms
Since March 2024, the Claude series of large language models has included three varieties—Haiku, Sonnet, and Opus—named to metaphorically illustrate increasing complexity and computing power. Claude Code can perform software engineering tasks for developers and, as of January 2026, writes nearly all of Anthropic's code alongside Opus 4.5. IBM

'Context rot' explains why AI conversations degrade the longer they run
Research shows that as a large language model's context window fills up, performance declines: models begin over-indexing on content at the start and end of a conversation while losing track of what's in the middle. The practical fix is simpler than you'd think: start a fresh chat more often. Product Talk

AI data center growth has created shortages and price hikes in computer parts
The parallel processing needed to power large language models for AI chatbots and other tools drove demand for high-speed graphics processing units to spike in 2025, enabling capital-rich technology companies to price out consumers. Similar demand for the computer memory used in storage drives is expected to keep shortages from easing through 2028. Ars Technica

Explore persistent 3D environments with Marble
Marble is a spatial AI system built by World Labs, a San Francisco startup focused on world models (rather than large language models).
Using a text prompt (or a single image or video), users can generate a "world" and move through it, edit it, or export it. The company and project are part of the broader spatial intelligence area of AI, in which AI simulates environments. Marble

AI scraping for LLM training data has significantly strained Wikipedia's infrastructure
From January 2024 to April 2025, the site's bandwidth increased by 50% as automated bots downloaded terabytes of data for the large language models powering AI tools. The Wikimedia Foundation found that bots accounted for 65% of the highest-demand requests (e.g., videos) despite representing just 35% of page views. Ars Technica

Transformer architecture can recognize and predict patterns in language
The underlying software powering text generation in AI tools associates each word or subword—called a token—with a set of values corresponding to how often it appears near other tokens. By recognizing these associations in prompts, the LLM can infer meaning. Financial Times

AI is learning to be funny
Experts consider humor a particular challenge for large language models, given the complex linguistic play the skill requires. In an experiment, a stand-up comedian performed half AI-produced jokes and half human-produced jokes, with no discernible difference in the audience's laughter. Undark Magazine

Technologies labeled "AI" have historically lost that title after widespread adoption
Generative AI is the latest entry in a recurring cycle in which emerging tools start as "AI" until they become common software, like databases or machine learning. Generative AI and large language models may be the next platform shift after smartphones and the web. SuperAI

LLMs explained simply and quickly
Large language models generate responses by predicting one token at a time, based on patterns they've learned from massive datasets. This explainer covers their core components—data, architecture, and training—and how transformer models use context to simulate conversation.
1440

Trick an LLM into revealing sensitive information
Even with robust training, large language models may be fooled into providing sensitive information if prompted cleverly. In this game, a wizard withholds eight passwords, which you must prompt it to reveal as the LLM safeguards tighten at each level. Lakera

Diffusion models generate nontext media by adding and removing noise from a dataset
Unlike text-based AI tools, which rely on large language models to recognize patterns in text, diffusion models learn to reverse gradually added noise in order to generate images, video, audio, and 3D models. They can also be customized to match a user's style. NVIDIA Technical Blog

Agentic AI systems can proactively achieve user goals without user direction
Unlike large language models, which do not take follow-up actions after generating their output, agentic AI uses an initial prompt to take multiple steps, learn from its environment and outcomes, and continue working without follow-up human prompting. IBM Technology

Temperature is a value that controls how predictable or creative an LLM's output is
In large language models, temperature adjusts the randomness of text generation by modifying how often lower-probability words are selected next in the sequence. Lower temperatures produce more predictable outputs, while higher temperatures produce more creative responses. IBM

EOS and maximum-length tokens tell an LLM when to stop generating a response
Tokens are words or subwords (e.g., prefixes) that a large language model processes to generate a response. During training, the model is taught that if it ever generates a special token called end-of-sequence, its response is complete. Louis Bouchard

LLMs are trained via modeling, imitation learning, and reinforcement learning
Training large language models begins with pretraining the model to predict the next word in a sequence by finding patterns in massive amounts of text.
The model is then fine-tuned to imitate human-written dialogue and to align its responses with user preferences. Ari Seff

Tokenization breaks text into pieces that LLMs use to process language
Instead of understanding the meaning of words, large language models break sentences into words and subwords called tokens, each assigned a set of numbers. Grammar rules are then "learned" by finding associations between these sets of numbers. Microsoft

LLMs are the backbone of AI tools that process and generate natural language
Large language models identify patterns in massive datasets of books, code, and unlabeled text, from which they generate coherent responses. GPT-3 was trained on about 45 terabytes of data and uses 175 billion parameters—each parameter being a value the model adjusts as it learns. IBM Technology

Sam Altman acknowledges the fears around creativity, copyright, and misuse of AI
OpenAI CEO Sam Altman popularized the use of large language models with the 2022 release of ChatGPT. He has emphasized the importance of involving society in shaping AI's safety frameworks before models begin to act independently online. TED

Visualize how transformer architecture generates an LLM's outputs
Transformer architecture is the neural network framework with which large language models predict and generate text based on user inputs. This interactive lets you explore the "thinking" a GPT does to generate responses to your input. Georgia Institute of Technology

Deepfakes use AI to produce manipulated media to mislead viewers
Unlike most content created with large language and diffusion models, deepfakes intentionally misrepresent real people or events with the intent to deceive. As the products of this technology become more realistic, they pose increasing risks to public trust, security, and privacy.
1440

Artificial general intelligence equals or surpasses human intelligence
AGI incorporates sensory perception, memory, and advanced logical inference to move beyond the narrow tasks handled by large language models and chatbots. One of the most significant hurdles in developing AGI is designing systems that can flexibly apply what they learn across domains. Lex Fridman

Generative AI tools excel at pattern recognition, not contextual accuracy
Large language models are trained on vast amounts of unstructured data, from which they develop parameters for grammar and associations between words. These connections can introduce errors when inapplicable reasoning is applied to new data in unfamiliar contexts. The Economist

Neural networks are modeled after an idealization of the brain
These networks—including the backbone of GPT-3, which has approximately 175 billion parameters—discover the implicit rules of grammar and syntax by identifying patterns in language. Deviations in word patterns add creativity to the text generation. Stephen Wolfram
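The temperature setting described above can be shown directly: dividing the model's raw scores (logits) by the temperature before applying softmax sharpens or flattens the probability distribution over next tokens. The logit values below are made up for illustration.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by temperature.

    Lower temperatures sharpen the distribution (the top token dominates);
    higher temperatures flatten it (unlikely tokens are picked more often).
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Invented logits for three candidate next tokens.
logits = [2.0, 1.0, 0.5]
print(softmax_with_temperature(logits, 0.5))  # low temperature: more predictable
print(softmax_with_temperature(logits, 2.0))  # high temperature: more varied
```

Sampling from the low-temperature distribution almost always picks the top token; at high temperature, the alternatives get a real chance.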
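The tokenization idea from the cards above can be sketched minimally: text is split into pieces from a fixed vocabulary and mapped to integer IDs. The greedy longest-match strategy and tiny vocabulary here are simplifications of real schemes such as byte-pair encoding.

```python
# Tiny invented vocabulary of words and subwords, mapped to integer IDs.
VOCAB = {"un": 0, "break": 1, "able": 2, "the": 3, "cat": 4, "s": 5}

def tokenize(text):
    """Greedily match the longest known piece at each position (a BPE-style sketch)."""
    ids = []
    for word in text.lower().split():
        i = 0
        while i < len(word):
            # Try the longest remaining substring first.
            for j in range(len(word), i, -1):
                if word[i:j] in VOCAB:
                    ids.append(VOCAB[word[i:j]])
                    i = j
                    break
            else:
                raise ValueError(f"no token for {word[i:]!r}")
    return ids

print(tokenize("unbreakable cats"))  # → [0, 1, 2, 4, 5]
```

Note how "unbreakable" never appears in the vocabulary, yet the model can still represent it as three familiar subwords.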
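The context-window limit behind "context rot" can be pictured as a sliding window: once a conversation exceeds the window, older material must be dropped or compressed. The helper below is a hypothetical illustration (the function name and word-count tokenizer are invented); real systems count tokens with the model's own tokenizer and usually keep a system prompt pinned at the start.

```python
def fit_context(messages, max_tokens, count_tokens=lambda m: len(m.split())):
    """Keep the most recent messages that fit within the token budget.

    Walks the history from newest to oldest, dropping whatever no longer fits.
    """
    kept, used = [], 0
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break  # everything older than this is dropped
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = ["a b", "c d e", "f"]
print(fit_context(history, max_tokens=4))  # → ['c d e', 'f']
```

Starting a fresh chat simply resets this window, which is why it is the practical fix the card recommends.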