One reason why many creators are migrating from platform X to Bluesky is due to the policy regarding data usage for artificial intelligence training. While Bluesky claims they will not use user-generated content to train AI like platform X, it is noted that Bluesky does not explicitly prohibit the extraction of post content data, which could potentially be used for AI training. This was evident when employees of Hugging Face recently published a dataset of one million posts obtained from Bluesky via API, before promptly removing it and apologizing for the breach of transparency and consent in data usage. Bluesky later clarified that their platform is public and data access follows guidelines set in robots.txt like other websites. Additionally, Bluesky will allow users to specify whether they consent to their data being used for AI training, but cannot enforce data usage beyond their platform. It ultimately falls on developers to respect the correct usage of such data. Currently, Bluesky is in discussions with engineering and legal teams to establish future practices.
TLDR: Creators are shifting to Bluesky from X due to AI data usage policies, though concerns arise regarding potential data extraction for training, leading to clarifications and discussions for future protocols.
Leave a Comment