Tag: open science
-
Simon Willison’s Weblog: Releasing the largest multilingual open pretraining dataset
Source URL: https://simonwillison.net/2024/Nov/14/releasing-the-largest-multilingual-open-pretraining-dataset/#atom-everything
Source: Simon Willison’s Weblog
Title: Releasing the largest multilingual open pretraining dataset
Feedly Summary: Common Corpus is a new “open and permissible licensed text dataset, comprising over 2 trillion tokens (2,003,039,184,047 tokens)” released by French AI Lab PleIAs. This appears to be the largest available…
-
Hacker News: Liquid Foundation Models: Our First Series of Generative AI Models
Source URL: https://www.liquid.ai/liquid-foundation-models
Source: Hacker News
Title: Liquid Foundation Models: Our First Series of Generative AI Models
Feedly Summary: Comments
AI Summary and Description: Yes
Summary: The text introduces Liquid Foundation Models (LFMs), a new generation of generative AI models, emphasizing their novel architectural design and performance efficiency compared to traditional transformer models. LFMs are…