Tag: ethical compliance

  • Simon Willison’s Weblog: Releasing the largest multilingual open pretraining dataset

    Source URL: https://simonwillison.net/2024/Nov/14/releasing-the-largest-multilingual-open-pretraining-dataset/#atom-everything
    Source: Simon Willison’s Weblog
    Title: Releasing the largest multilingual open pretraining dataset
    Feedly Summary: Common Corpus is a new “open and permissible licensed text dataset, comprising over 2 trillion tokens (2,003,039,184,047 tokens)” released by French AI Lab PleIAs. This appears to be the largest available…

  • Hacker News: A Summary of Ilya Sutskevers AI Reading List

    Source URL: https://tensorlabbet.com/
    Source: Hacker News
    Title: A Summary of Ilya Sutskevers AI Reading List
    Feedly Summary: Comments
    AI Summary and Description: Yes
    Summary: This text provides a detailed overview of a curated reading list from Ilya Sutskever that spans various foundational topics in machine learning, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs),…

  • Wired: A New Group Is Trying to Make AI Data Licensing Ethical

    Source URL: https://www.wired.com/story/dataset-providers-alliance-ethical-generative-ai-licensing/
    Source: Wired
    Title: A New Group Is Trying to Make AI Data Licensing Ethical
    Feedly Summary: The Dataset Providers Alliance calls for creators and rights holders to be able to opt in to having their material used for training purposes.
    AI Summary and Description: Yes
    Summary: The text discusses the evolving landscape…