By finding patterns in the sequences of words and subwords (e.g., prefixes) in massive amounts of text, these models can predict the most likely next word in a generated sequence, append it, and repeat the prediction on the extended sequence until the output is complete. Predictions improve as training data grows, and training on specialized data produces more specific outputs.
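To make this prediction loop concrete, here is a minimal sketch in Python. The hand-built probability table stands in for a trained model and is purely illustrative, as are the toy vocabulary and the `<end>` stopping token; a real LLM learns these distributions from massive text corpora and conditions on the entire sequence, not just the last word.

```python
# Minimal sketch of autoregressive next-word prediction.
# NEXT_WORD_PROBS is a hand-made toy stand-in for a trained model:
# it maps a word to the probabilities of possible next words.

NEXT_WORD_PROBS = {
    "the":  {"cat": 0.6, "dog": 0.4},
    "cat":  {"sat": 0.7, "ran": 0.3},
    "dog":  {"ran": 0.8, "sat": 0.2},
    "sat":  {"down": 0.9, "<end>": 0.1},
    "ran":  {"<end>": 1.0},
    "down": {"<end>": 1.0},
}

def generate(prompt: str, max_words: int = 10) -> str:
    words = prompt.split()
    for _ in range(max_words):
        # Predict the most likely next word given the last word.
        # (A real model conditions on the whole sequence so far.)
        candidates = NEXT_WORD_PROBS.get(words[-1])
        if not candidates:
            break
        next_word = max(candidates, key=candidates.get)
        if next_word == "<end>":
            break
        words.append(next_word)  # feed the prediction back in and repeat
    return " ".join(words)

print(generate("the"))  # -> "the cat sat down"
```

Each loop iteration appends the predicted word and predicts again from the longer sequence, which is the essence of the repeated-prediction process described above.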
Beyond data, engineers build and improve LLMs by refining the architecture (the way the model processes data) and the training process, in which humans can manually refine the model's predictions. Despite these efforts, LLMs may hallucinate, generating false or misleading information, and may propagate stereotypes, discrimination, and bias in their outputs.