Simon Willison’s Weblog: LLM 0.19

Source URL: https://simonwillison.net/2024/Dec/1/llm-019/
Source: Simon Willison’s Weblog
Title: LLM 0.19

Feedly Summary: LLM 0.19
I just released version 0.19 of LLM, my Python library and CLI utility for working with Large Language Models.
I released 0.18 a couple of weeks ago adding support for calling models from Python asyncio code. 0.19 improves on that, and also adds a new mechanism for models to report their token usage.
LLM can log those usage numbers to a SQLite database, or make them available to custom Python code.
My eventual goal with these features is to implement token accounting as a Datasette plugin so I can offer AI features in my SaaS platform without worrying about customers spending unlimited LLM tokens.
Those 0.19 release notes in full:

Tokens used by a response are now logged to new input_tokens and output_tokens integer columns and a token_details JSON string column, for the default OpenAI models and models from other plugins that implement this feature. #610
llm prompt now takes a -u/--usage flag to display token usage at the end of the response.
llm logs -u/--usage shows token usage information for logged responses.
llm prompt … --async responses are now logged to the database. #641
llm.get_models() and llm.get_async_models() functions, documented here. #640
response.usage() and (for async responses) await response.usage() methods, returning a Usage(input=2, output=1, details=None) dataclass. #644
response.on_done(callback) and await response.on_done(callback) methods for specifying a callback to be executed when a response has completed, documented here. #653
Fix for bug running llm chat on Windows 11. Thanks, Sukhbinder Singh. #495
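
Taken together, the new Python APIs look like this. A minimal sketch based on the release documentation, assuming an OpenAI API key is configured; the model ID and prompt are illustrative:

```python
import llm

model = llm.get_model("gpt-4o-mini")
response = model.prompt("A two-line poem about a pelican")

# #653: register a callback to run once the response has completed;
# here it prints the Usage dataclass returned by response.usage() (#644)
response.on_done(lambda resp: print(resp.usage()))

print(response.text())
# Prints something shaped like: Usage(input=19, output=17, details=None)
```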

Tags: llm, releasenotes, generative-ai, projects, ai, llms

AI Summary and Description: Yes

Summary: Version 0.19 of the LLM Python library adds token usage reporting and logging for Large Language Models. These features matter to professionals building Generative AI features and SaaS platforms, particularly for managing the costs of token consumption.

Detailed Description:
The release of version 0.19 of the LLM library introduces several important enhancements that can be particularly relevant for security and compliance professionals working with Generative AI applications. Here are the major points outlined in the release:

– **Token Usage Reporting**: The new version can track and log token usage, helping developers manage costs when using language models (see the query sketch below).
– Tokens used by a response are logged to the new input_tokens and output_tokens integer columns and a token_details JSON column in LLM's SQLite database, making both input and output tokens easy to track.
– Usage numbers can also be surfaced to custom Python code, giving flexibility in how token accounting is implemented.
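
Because the numbers land in ordinary SQLite columns, they can be aggregated with a few lines of Python. A sketch, assuming the default log database (located via llm logs path) and the responses table LLM logs to; the table name is an assumption about the logging schema rather than something stated in the release notes:

```python
import sqlite3
import subprocess

# "llm logs path" prints the location of LLM's SQLite log database
db_path = subprocess.check_output(["llm", "logs", "path"], text=True).strip()

conn = sqlite3.connect(db_path)
# Sum the new 0.19 columns per model (assumes a "responses" table)
query = """
    select model, sum(input_tokens), sum(output_tokens)
    from responses
    group by model
"""
for model, input_tokens, output_tokens in conn.execute(query):
    print(f"{model}: {input_tokens} in / {output_tokens} out")
```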

– **Usage Flags**: With the new `-u/--usage` flag, users can display token usage directly at the end of model responses, giving immediate visibility into resource consumption.

– **Asynchronous Support**: Responses produced through Python's asyncio support (added in 0.18) are now logged to the database too, so applications that process responses concurrently still get token usage recorded (see the async sketch below).
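
A minimal async sketch, following the asyncio API introduced in LLM 0.18; the model ID and prompt are illustrative and an OpenAI API key is assumed:

```python
import asyncio
import llm

async def main():
    model = llm.get_async_model("gpt-4o-mini")
    response = await model.prompt("A two-line poem about a pelican")
    print(await response.text())
    # On async responses, usage() is awaited (#644)
    print(await response.usage())

asyncio.run(main())
```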

– **Use Cases in SaaS Platforms**: The ability to track token consumption is critical for developers planning to offer Generative AI features as part of a Software as a Service (SaaS) offering. This monitoring allows for better resource allocation and cost management, addressing the concern that customers could otherwise consume unlimited tokens.

– **Bug Fixes**: The new version also includes functionality improvements and bug fixes, notably a fix for a bug running llm chat on Windows 11 (thanks to Sukhbinder Singh), indicating ongoing development and responsiveness to user feedback.

In summary, these enhancements not only improve the technical capabilities of the LLM library but also align with compliance and cost management considerations essential for AI practitioners. As organizations increasingly adopt AI-driven solutions, the ability to monitor and control token usage becomes a vital component of operational sustainability and security compliance.