Simon Willison’s Weblog: Weeknotes: asynchronous LLMs, synchronous embeddings, and I kind of started a podcast

Nov 22, 2024

—

Source URL: https://simonwillison.net/2024/Nov/22/weeknotes/#atom-everything
Source: Simon Willison’s Weblog
Title: Weeknotes: asynchronous LLMs, synchronous embeddings, and I kind of started a podcast

Feedly Summary: These past few weeks I’ve been bringing Datasette and LLM together and distracting myself with a new sort-of-podcast crossed with a live streaming experiment.

Project: interviewing people about their projects
Datasette Public Office Hours
Async LLM
Various embedding models
Blog entries
Releases
TILs

Project: interviewing people about their projects
My response to the recent US election was to stress-code, and then to stress-podcast. On the morning after the election I started a video series called Project (I guess you could call it a “vlog”?) where I interview people about their interesting data projects. The first episode was with Rajiv Sinclair talking about his project VERDAD, tracking misinformation on US broadcast radio. The second was with Philip James talking about Civic Band, his project to scrape and search PDF meeting minutes and agendas from US local municipalities.
I was a guest on another podcast-like thing too: an Ars Technica Live session with Benj Edwards, which I wrote about in Notes from Bing Chat—Our First Encounter With Manipulative AI.
Datasette Public Office Hours
I also started a new thing with Alex Garcia called Datasette Public Office Hours, which we plan to run approximately once every two weeks as a live-streamed Friday conversation about Datasette and related projects. I wrote up our first session in Visualizing local election results with Datasette, Observable and MapLibre GL. The Civic Band interview was part of our second session – I still need to write about the rest of that session about sqlite-vec, embeddings and some future Datasette AI features, but you can watch the full video on YouTube.
Async LLM
I need to write this up in full, but last weekend I quietly released LLM 0.18 with a huge new feature: plugins can now provide asynchronous versions of their models, ready to be used with Python’s asyncio. I built this for Datasette, which is built entirely around ASGI and needs to be able to run LLM models asynchronously to enable all sorts of interesting AI features.
LLM provides async OpenAI models, and I’ve also released versions of the llm-gemini, llm-claude-3 and llm-mistral plugins that enable async models as well.
Here’s the documentation, but the short version is that you can now do this:
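import asyncio
import llm

model = llm.get_async_model("claude-3.5-sonnet")

async def main():
    # Print each chunk of the response as soon as it arrives
    async for chunk in model.prompt(
        "Five surprising names for a pet pelican"
    ):
        print(chunk, end="", flush=True)

# "async for" only works inside a coroutine, so run it on an event loop
asyncio.run(main())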
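To illustrate why async models matter for an ASGI application, here is a minimal sketch of consuming two streams concurrently on one event loop. It does not call LLM at all: `fake_stream` is a hypothetical stand-in for an async model, and `collect` and `run_all` are names invented for this example.

```python
import asyncio


async def fake_stream(prompt, delay=0.01):
    """Hypothetical stand-in for an async model: yields the prompt's words as chunks."""
    for word in prompt.split():
        await asyncio.sleep(delay)  # simulates waiting on a network response
        yield word + " "


async def collect(prompt):
    # Consume a stream with "async for", the same pattern as the example above
    chunks = []
    async for chunk in fake_stream(prompt):
        chunks.append(chunk)
    return "".join(chunks).strip()


async def run_all(prompts):
    # gather() interleaves the streams on one event loop instead of
    # running them back to back; results come back in input order
    return await asyncio.gather(*(collect(p) for p in prompts))


if __name__ == "__main__":
    results = asyncio.run(run_all(["pet pelican names", "pet walrus names"]))
    print(results)
```

While one stream is waiting on its next chunk, the event loop is free to advance the other, which is the property a server handling many requests at once depends on.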
I’ve also been working on adding token accounting to LLM, to keep track of how many input and output tokens a prompt has used across multiple different models. I have an alpha release with that but it’s not yet fully stable.
The reason I want that is that I need it for both Datasette and Datasette Cloud. I want the ability to track token usage and grant users a free daily allowance of tokens that gets cut off once they’ve exhausted it. That’s an active project right now, more on that once it’s ready to ship in a release.
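A free daily allowance of that kind could be sketched roughly as follows. This is only an illustration of the idea, assuming an in-memory store: the `DailyAllowance` class and its method names are hypothetical and are not part of LLM, Datasette or Datasette Cloud.

```python
from datetime import date


class DailyAllowance:
    """Hypothetical in-memory tracker for a per-user daily token allowance."""

    def __init__(self, daily_limit):
        self.daily_limit = daily_limit
        # Maps (user, day) -> total tokens used by that user on that day
        self.used = {}

    def record(self, user, input_tokens, output_tokens, day=None):
        # Count both input and output tokens against the allowance
        key = (user, day or date.today())
        self.used[key] = self.used.get(key, 0) + input_tokens + output_tokens

    def remaining(self, user, day=None):
        key = (user, day or date.today())
        return max(0, self.daily_limit - self.used.get(key, 0))

    def allowed(self, user, day=None):
        # A user is cut off once the day's allowance is exhausted
        return self.remaining(user, day) > 0


if __name__ == "__main__":
    allowance = DailyAllowance(daily_limit=1000)
    allowance.record("alice", input_tokens=400, output_tokens=700)
    print(allowance.allowed("alice"))
```

Keying usage on the calendar day means the allowance resets on its own the next day, with no cleanup job needed for the sketch to behave correctly.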
Various embedding models
LLM doesn’t yet offer asynchronous embeddings (see issue #628) but I’ve found myself hacking on a few different embeddings plugins anyway:

llm-gguf now supports embedding models distributed as GGUF files. This means you can use the excitingly small (just 30.8MB) mxbai-embed-xsmall-v1 with LLM.

llm-nomic-api-embed added support for the Nomic Embed Vision models. These work like CLIP in that you can embed both images and text in the same space, allowing you to do similarity search of a text string against a collection of images.
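Searching images with a text string works because both kinds of input end up as vectors in the same space, so they can be compared directly. The sketch below shows that comparison using cosine similarity. The three-dimensional vectors and file names are made up for illustration; it does not call the Nomic API or any LLM plugin.

```python
import math


def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: closer to 1.0 means more similar
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(text_vector, image_vectors):
    """Return image names sorted from most to least similar to the text vector."""
    scored = [
        (cosine_similarity(text_vector, vector), name)
        for name, vector in image_vectors.items()
    ]
    return [name for score, name in sorted(scored, reverse=True)]


# Made-up vectors; a real embedding model produces hundreds of dimensions
image_vectors = {
    "pelican.jpg": [0.9, 0.1, 0.0],
    "walrus.jpg": [0.1, 0.9, 0.1],
    "bicycle.jpg": [0.0, 0.2, 0.9],
}
text_vector = [0.8, 0.2, 0.1]  # pretend embedding of the string "a pelican"

print(rank_images(text_vector, image_vectors))
```

With real embeddings the image vectors would be computed once and stored, and only the query string would need embedding at search time.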

Blog entries

Notes from Bing Chat—Our First Encounter With Manipulative AI
Project: Civic Band – scraping and searching PDF meeting minutes from hundreds of municipalities
Qwen2.5-Coder-32B is an LLM that can code well that runs on my Mac
Visualizing local election results with Datasette, Observable and MapLibre GL
Project: VERDAD – tracking misinformation in radio broadcasts using Gemini 1.5
Claude 3.5 Haiku

Releases

llm-gemini 0.4.2 – 2024-11-22: LLM plugin to access Google’s Gemini family of models

llm-nomic-api-embed 0.3 – 2024-11-21: Create embeddings for LLM using the Nomic API

llm-gguf 0.2 – 2024-11-21: Run models distributed as GGUF files using LLM

llm 0.19a2 – 2024-11-21: Access large language models from the command-line

llm-mistral 0.9a0 – 2024-11-20: LLM plugin providing access to Mistral models using the Mistral API

llm-claude-3 0.10a0 – 2024-11-20: LLM plugin for interacting with the Claude 3 family of models

asgi-csrf 0.11 – 2024-11-15: ASGI middleware for protecting against CSRF attacks

sqlite-utils 3.38a0 – 2024-11-08: Python CLI utility and library for manipulating SQLite databases

asgi-proxy-lib 0.2a0 – 2024-11-06: An ASGI function for proxying to a backend over HTTP

llm-lambda-labs 0.1a0 – 2024-11-04: Run prompts against LLMs hosted by lambdalabs.com

llm-groq-whisper 0.1a0 – 2024-11-01: Transcribe audio using the Groq.com Whisper API

TILs

Running cog automatically against GitHub pull requests – 2024-11-06

Generating documentation from tests using files-to-prompt and LLM – 2024-11-05

Tags: podcasts, projects, datasette, weeknotes, embeddings, llm

AI Summary and Description: Yes

**Summary:** The text details various initiatives and projects focused on integrating Large Language Models (LLMs) with tools like Datasette, emphasizing new features, podcasts, and interviews that enhance understanding and usage of AI technologies in practical applications. These insights are particularly valuable for professionals working in AI security, cloud computing, and infrastructure development, as they highlight the evolving landscape of AI tools and community engagement.

**Detailed Description:**

The text explores a series of projects merging Datasette, a platform for publishing data, with Large Language Models (LLMs). Key highlights include interviews with project creators, live-streamed public sessions, and significant upgrades to LLM functionalities. The author discusses:

– **Podcast and Interview Projects:**
  – A video series called Project, in which the author interviews people about their data projects, including:
    – **VERDAD:** Rajiv Sinclair’s project for tracking misinformation on US broadcast radio.
    – **Civic Band:** Philip James’s initiative to scrape and search PDF meeting minutes and agendas from US local municipalities.
  – A guest appearance in an Ars Technica Live session with Benj Edwards discussing manipulative AI.

– **Datasette Public Office Hours:**
  – A live-streamed Friday conversation with Alex Garcia about Datasette and related projects, planned for approximately once every two weeks.
  – The first session covered visualizing local election results with Datasette, Observable and MapLibre GL.

– **Async LLM Integration:**
  – LLM 0.18 lets plugins provide asynchronous versions of their models for use with Python’s asyncio, so that the ASGI-based Datasette can run models asynchronously.
  – LLM itself provides async OpenAI models, and versions of the llm-gemini, llm-claude-3 and llm-mistral plugins enable async models as well.

– **Token Accounting:**
  – An alpha release tracks input and output tokens per prompt, with the goal of granting Datasette and Datasette Cloud users a free daily token allowance.

– **Embedding Models:**
  – llm-gguf now supports embedding models distributed as GGUF files, and llm-nomic-api-embed supports the Nomic Embed Vision models, which embed images and text in the same space in the way CLIP does.

– **Recent Releases and Blog Entries:**
  – Recent releases include LLM plugins for Google’s Gemini models, the Nomic API, Mistral and Claude 3, along with ASGI and SQLite utilities.
  – Blog entries cover topics such as Bing Chat as an early encounter with manipulative AI and visualizing local election results.

**Key Points:**
– Integration of LLMs with user-facing tools pushes the envelope of what is possible in data interrogation and manipulation.
– Community engagement through podcasts and interactive sessions fosters collaboration and knowledge sharing in the tech community.
– New LLM capabilities such as token accounting and async models address the need for robust infrastructure in AI development.
– The focus on applications that track misinformation reflects a growing awareness of the societal implications of AI technologies.

This comprehensive overview of the ongoing projects and advancements provides practical insights for security and compliance professionals as they navigate the complexities of AI integration in organizational infrastructures.
