AI scraping for LLM training data has significantly strained Wikipedia's infrastructure

From January 2024 to April 2025, the site's bandwidth use rose by 50% as automated bots downloaded terabytes of data to train the large language models that power AI tools. The Wikimedia Foundation found that bots accounted for 65% of the highest-demand requests (e.g., videos) despite generating just 35% of page views.
