Reddit previously announced that it would block data scraping of its site, a move believed to be tied to the use of that data for training AI. The company has signed agreements with some service providers, such as Google, reportedly worth $60 million a year. Many other companies have no such agreement and therefore cannot access the data.
Other search services such as Bing have been found to return no results from Reddit when a site:reddit.com search is filtered to the past week. DuckDuckGo likewise shows no results. Meanwhile, Google continues to display the most recent results as usual.
A Reddit representative confirmed the block but did not provide details of the company's agreements with individual service providers. The representative said Reddit is blocking bots that scrape its data improperly, especially for training AI, in line with its public content guidelines, and that the robots.txt file has been updated accordingly. Reddit continues to work with service providers who follow the guidelines.
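For readers unfamiliar with robots.txt, the following is a minimal sketch of how a well-behaved crawler might interpret such a file, using Python's standard urllib.robotparser module. The rules, bot names, and URL below are hypothetical and chosen only for illustration; they are not a copy of Reddit's actual file, and the article does not say exactly what the update contained.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT Reddit's actual robots.txt.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Both bot names and the URL are made up for this example.
    url = "https://www.reddit.com/r/example/"
    for agent in ("ExampleSearchBot", "ExampleAIBot"):
        print(agent, is_allowed(EXAMPLE_ROBOTS_TXT, agent, url))
```

Note that robots.txt is advisory: it tells crawlers what the site operator permits, but honoring it is up to each crawler, so a site that wants to enforce a block has to rely on other measures as well.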
Engadget, citing sources, reported that Bing has not yet reached an agreement with Reddit because the two sides disagree on certain terms.
TLDR: Reddit is blocking data scraping and has signed agreements with some providers, including Google. Bing has not yet reached an agreement, so recent Reddit posts are missing from its search results.