Source URL: https://www.wired.com/story/applebot-extended-apple-ai-scraping/
Source: Wired
Title: Major Sites Are Saying No to Apple’s AI Scraping
Feedly Summary: This summer, Apple gave websites more control over whether the company could train its AI models on their data. Major publishers and platforms like The New York Times and Facebook have already opted out.
AI Summary and Description: Yes
Summary: The text discusses Apple’s introduction of Applebot-Extended, a tool that lets publishers opt out of having their data used to train Apple’s AI models. The response reflects changing perceptions of the role web crawlers play in AI data collection, particularly among major media outlets that are increasingly concerned about intellectual property rights.
Detailed Description:
The article provides critical insight into the evolving relationship between tech companies and content publishers regarding data usage for AI training. Here are the major points:
– **Introduction of Applebot-Extended**:
– Apple has launched a tool enabling publishers to prevent their data from being used in AI training, highlighting a proactive approach to address concerns about data ownership and rights.
– **Adoption by Major Outlets**:
– Several major publishers and platforms, including The New York Times and Facebook, have already opted out, signaling growing caution about how their content is used for AI training.
– **Historical Context**:
– Applebot, Apple’s original crawler introduced in 2015, has historically crawled websites for search features; the data it collects can now also be used to train Apple’s AI models unless a site opts out through Applebot-Extended.
– **Robots.txt Enforcement**:
– Publishers can block Applebot-Extended by adding a rule to their robots.txt files. This follows the long-standing convention for controlling crawlers, although compliance with robots.txt is voluntary and not legally enforceable.
– **Current Trends**:
– Recent analyses show that only a small share of high-traffic websites have blocked Applebot-Extended, indicating that while some sites are resistant, many are either unaware of the new option or indifferent to it.
– **Broader Implications**:
– This situation underscores the ongoing tension between AI development and content creators’ rights, an area likely to become increasingly contested as AI models continue to evolve.
– **Future Considerations**:
– As more publishers become aware of their rights to protect their data, there may be changes in how AI entities gather data, potentially leading to stricter compliance and governance in AI data practices.
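The robots.txt opt-out described above can be sketched in a few lines. The example below is only an illustration: the `Applebot-Extended` user-agent token comes from the article, while the robots.txt contents, the `example.com` URL, and the `is_allowed` helper are hypothetical. It uses Python’s standard `urllib.robotparser` to check what a given set of rules permits; this is not Apple’s own matching logic, and real crawlers may interpret rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example publisher: opt out of AI training
# use via Applebot-Extended while leaving every other crawler unrestricted.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/story/some-article"  # placeholder URL
    for agent in ("Applebot-Extended", "Applebot"):
        print(agent, "allowed:", is_allowed(ROBOTS_TXT, agent, page))
```

Under these assumed rules, the check reports that `Applebot-Extended` is disallowed while the regular `Applebot` search crawler remains allowed, which matches the opt-out model in which a site can stay in search results but decline AI training use.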
Overall, the emergence of this tool and the reactions it has drawn signal a crucial conversation at the intersection of AI, privacy, and intellectual property that security and compliance professionals should monitor closely.