Insufficient Infrastructure: Wikipedia Faces Escalating Data Siphoning by AI Bots

The Wikimedia Foundation, the organization that oversees Wikipedia, has released a report on how AI bots scraping the project's content are straining infrastructure built to serve human readers. Bot traffic has grown significantly since the beginning of 2024, and Wikimedia says the heaviest impact falls on multimedia content such as images, videos, and other files. Bandwidth used to download multimedia from its collection of 144 million files has grown by 50% over that period.

Wikimedia's infrastructure is already designed to absorb higher-than-normal traffic. As an example, the report cites the death of former U.S. President Jimmy Carter in December 2024, when readers flocked to his page and streamed a 1.5-hour video of his 1980 presidential debate with Ronald Reagan, doubling normal network traffic. Wikimedia had spare capacity for such surges, but it proved insufficient this time because sustained bot activity had already consumed much of it.

Wikimedia already handles popular content through caching. Bots, however, request far more of the rarely read content, which has to be fetched from the main system instead of the cache; Wikimedia classifies these requests as high-cost. Bots currently generate 65% of this high-cost traffic, which has led the engineering team to block it in some cases.
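
To see why crawlers can dominate the expensive requests even when they are a minority of all requests, here is a minimal sketch of the caching behaviour described above. It assumes a simple least-recently-used (LRU) cache; the page counts, cache size, and traffic mix are made-up illustrative numbers, and `LRUCache` and `simulate` are hypothetical names, not anything from Wikimedia's actual systems.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache (an assumed policy, for illustration only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def request(self, key):
        """Return True on a cache hit, False on a miss (a costly origin fetch)."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return True
        self._items[key] = True
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
        return False


def simulate(num_pages=1000, cache_size=100, rounds=50):
    """Mix human and crawler requests and count cache misses per source.

    All numbers are illustrative assumptions, not measurements.
    """
    cache = LRUCache(cache_size)
    requests = {"human": 0, "crawler": 0}
    misses = {"human": 0, "crawler": 0}
    next_crawl = 0  # the crawler walks the catalogue page by page

    for _ in range(rounds):
        # Humans: 20 requests per round, all to the 10 most popular pages.
        for i in range(20):
            requests["human"] += 1
            if not cache.request(i % 10):
                misses["human"] += 1
        # Crawler: 10 requests per round, each to the next page in the catalogue.
        for _ in range(10):
            requests["crawler"] += 1
            if not cache.request(next_crawl % num_pages):
                misses["crawler"] += 1
            next_crawl += 1

    return requests, misses


if __name__ == "__main__":
    requests, misses = simulate()
    total_requests = sum(requests.values())
    total_misses = sum(misses.values())
    print(f"crawler share of requests: {requests['crawler'] / total_requests:.0%}")
    print(f"crawler share of misses:   {misses['crawler'] / total_misses:.0%}")
```

In this toy setup the crawler sends only a third of the requests but causes nearly all of the cache misses, because it keeps asking for pages nobody else has read recently. Real traffic is far messier, so the exact ratios here say nothing about Wikimedia's own figures.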

Wikimedia maintains that the project's policy remains open: anyone can access its content for free. However, as such incidents become more frequent, the project has begun drafting guidelines to restore normalcy to the system's core infrastructure.

TLDR: Wikimedia reports that growing AI bot traffic, concentrated on multimedia content, is exceeding system capacity, and the project is drafting guidelines to restore normalcy.
