Tag: robots.txt

  • Hacker News: Ask HN: Is there any license that is designed to exclude LLMs?

    Source URL: https://news.ycombinator.com/item?id=42170746
    Source: Hacker News
    Title: Ask HN: Is there any license that is designed to exclude LLMs?
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: The text expresses concerns about content harvesting by LLMs (Large Language Models) and discusses potential licensing solutions, highlighting the struggle to protect digital content. The insights are…

  • Hacker News: Bluesky says it won’t train AI on your posts

    Source URL: https://www.theverge.com/2024/11/15/24297442/bluesky-no-intention-train-generative-ai-posts
    Source: Hacker News
    Title: Bluesky says it won’t train AI on your posts
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: Bluesky has publicly committed to not using user content to train generative AI tools, in contrast with competitors such as X, which has updated its terms to allow such training. This distinction…

  • Hacker News: Nearly 90 % of our AI crawler traffic is from TikTok/ByteDance

    Source URL: https://www.haproxy.com/blog/nearly-90-of-our-ai-crawler-traffic-is-from-tiktok-parent-bytedance-lessons-learned
    Source: Hacker News
    Title: Nearly 90 % of our AI crawler traffic is from TikTok/ByteDance
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: The text highlights the significant and growing impact of AI crawlers on web traffic, specifically Bytespider from ByteDance, and discusses the implications of such activity for content-heavy businesses.…

  • The Register: Major publishers sue Perplexity AI for scraping without paying

    Source URL: https://www.theregister.com/2024/10/22/publishers_sue_perplexity_ai/
    Source: The Register
    Title: Major publishers sue Perplexity AI for scraping without paying
    Feedly Summary: We sell that to OpenAI – how dare you steal it and make stuff up
    Major US news publishers Dow Jones & Co and NYP Holdings have sued AI search engine startup Perplexity for scraping their content…

  • Wired: New Cloudflare Tools Let Sites Detect and Block AI Bots for Free

    Source URL: https://www.wired.com/story/cloudflare-tools-detect-block-ai-bots/
    Source: Wired
    Title: New Cloudflare Tools Let Sites Detect and Block AI Bots for Free
    Feedly Summary: “The path we’re on isn’t sustainable,” Cloudflare CEO Matthew Prince tells WIRED, in reference to rampant AI scraping. Here’s his plan to course-correct.
    AI Summary and Description: Yes
    Summary: Cloudflare is launching a suite of…

  • Hacker News: AI Has Created a Battle over Web Crawling

    Source URL: https://spectrum.ieee.org/web-crawling
    Source: Hacker News
    Title: AI Has Created a Battle over Web Crawling
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: The text addresses the evolving dynamics of data usage in generative AI, highlighting how restrictive data access policies affect AI model training and what they may mean for AI companies.…

  • Hacker News: Major Sites Are Saying No to Apple’s AI Scraping

    Source URL: https://www.wired.com/story/applebot-extended-apple-ai-scraping/
    Source: Hacker News
    Title: Major Sites Are Saying No to Apple’s AI Scraping
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: The article discusses Apple’s introduction of a tool, Applebot-Extended, which allows publishers to opt out of data usage for AI training. This change signals a shift in attitudes towards web…