Source URL: https://simonwillison.net/2024/Oct/27/llm-jq/#atom-everything
Source: Simon Willison’s Weblog
Title: Run a prompt to generate and execute jq programs using llm-jq
Feedly Summary: llm-jq is a brand new plugin for LLM which lets you pipe JSON directly into the llm jq command along with a human-language description of how you’d like to manipulate that JSON and have a jq program generated and executed for you on the fly.
Thomas Ptacek on Twitter:
The JQ CLI should just BE a ChatGPT client, so there’s no pretense of actually understanding this syntax. Cut out the middleman, just look up what I’m trying to do, for me.
I couldn’t resist writing a plugin. Here’s an example of llm-jq in action:
```shell
llm install llm-jq
curl -s https://api.github.com/repos/simonw/datasette/issues | \
  llm jq 'count by user login, top 3'
```
This outputs the following:
```json
[
  {
    "login": "simonw",
    "count": 11
  },
  {
    "login": "king7532",
    "count": 5
  },
  {
    "login": "dependabot[bot]",
    "count": 2
  }
]
```
```
group_by(.user.login) | map({login: .[0].user.login, count: length}) | sort_by(-.count) | .[0:3]
```
The JSON result is sent to standard output; the jq program it generated and executed is sent to standard error. Add the `-s`/`--silent` option to tell it not to output the program, or the `-v`/`--verbose` option for verbose output that also shows the prompt it sent to the LLM.
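For readers who don't speak jq, the generated pipeline can be mirrored in plain Python. This is purely illustrative (the sample records below are made up, not real datasette issue data):

```python
from collections import Counter

# Made-up records mimicking the shape of the GitHub issues API response.
issues = [
    {"user": {"login": "simonw"}},
    {"user": {"login": "simonw"}},
    {"user": {"login": "king7532"}},
    {"user": {"login": "simonw"}},
    {"user": {"login": "dependabot[bot]"}},
]

# Equivalent of: group_by(.user.login) | map({login, count}) | sort_by(-.count) | .[0:3]
counts = Counter(issue["user"]["login"] for issue in issues)
top3 = [{"login": login, "count": n} for login, n in counts.most_common(3)]
```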
Under the hood it passes the first 1024 bytes of the piped JSON, plus the program description "count by user login, top 3", to the default LLM model (usually gpt-4o-mini, unless you set another with e.g. `llm models default claude-3.5-sonnet`) along with a system prompt. It then runs jq in a subprocess, piping in the full JSON that was passed to it.
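That flow can be sketched roughly as follows. This is an illustrative reconstruction, not the plugin's actual source; `call_llm` is a hypothetical stand-in for the model call:

```python
import subprocess

SNIPPET_BYTES = 1024  # only the head of the piped JSON goes to the model

def build_snippet(json_data: bytes) -> str:
    """Take the first 1024 bytes of the input as context for the model."""
    return json_data[:SNIPPET_BYTES].decode("utf-8", errors="replace")

def call_llm(snippet: str, query: str) -> str:
    # Hypothetical placeholder for invoking the configured default model
    # with the plugin's system prompt.
    raise NotImplementedError

def llm_jq(json_data: bytes, description: str) -> bytes:
    program = call_llm(snippet=build_snippet(json_data), query=description)
    # jq receives the *full* JSON on stdin, not just the 1024-byte snippet.
    return subprocess.check_output(["jq", program], input=json_data)
```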
Here’s the system prompt it uses, adapted from my llm-cmd plugin:
```
Based on the example JSON snippet and the desired query, write a jq program

Return only the jq program to be executed as a raw string, no string
delimiters wrapping it, no yapping, no markdown, no fenced code blocks,
what you return will be passed to subprocess.check_output('jq', […])
directly.

For example, if the user asks: extract the name of the first person

You return only: .people[0].name
```
I used Claude to figure out how to pipe content from the parent process to the child and detect and return the correct exit code.
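One common pattern for that (not necessarily what the plugin does, just a sketch of the technique) is to open the child with a stdin pipe, let it inherit stdout/stderr, and surface its return code so the wrapper can exit with it:

```python
import subprocess
import sys

def pipe_through(cmd: list, data: bytes) -> int:
    """Pipe data into cmd's stdin (the child inherits our stdout/stderr)
    and return the child's exit code so the caller can propagate it."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    proc.communicate(input=data)
    return proc.returncode

# A wrapper can then call e.g. sys.exit(pipe_through(["jq", program], data))
# so it behaves like jq itself in shell pipelines.
```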
Tags: plugins, projects, thomas-ptacek, ai, jq, generative-ai, ai-assisted-programming, llm
AI Summary and Description: Yes
Summary: The content discusses the new llm-jq plugin that integrates JSON handling with natural language descriptions, using a large language model (LLM) like GPT to generate jq commands. This presents a novel intersection of AI capabilities with data manipulation tasks, making it highly relevant for developers and professionals interested in AI-enhanced programming tools.
Detailed Description:
The llm-jq plugin represents an innovative approach to simplify JSON manipulation through AI assistance. Here are the major points:
– **Integration with LLMs**: The plugin allows users to input JSON alongside a natural language description to automatically generate and execute jq commands via the LLM.
– **User Interaction Example**: The demonstrative example showcases how to install the plugin and use it to summarize issue counts attributed to various GitHub users, illustrating its practical utility.
– **Command Construction**: The plugin does not require users to know jq syntax deeply, as it directly translates human language queries into executable jq scripts. This lowers the barrier for developers who might not be familiar with such command-line tools.
– **System Prompting**: The backend uses a well-defined system prompt to guide the LLM in returning precise jq commands. This interaction emphasizes the importance of crafting effective prompts for maximizing LLM utility.
– **Flexibility and Options**: Users can choose verbosity levels in outputs, demonstrating the plugin’s flexibility in providing information.
Key Insights:
– **AI-Assisted Programming**: The plugin is a clear example of how generative AI can assist in programming tasks, potentially speeding up workflow and reducing error rates during data manipulation.
– **DevSecOps and MLOps Relevance**: By speeding up data processing and analysis, tools like this can streamline development and operations workflows, including tasks such as inspecting logs or API responses during security review.
– **Future Potential**: As tools like llm-jq evolve, there may emerge new use cases, particularly for users requiring quick data manipulation without deep technical expertise in underlying data formats or command-line tools.
In summary, llm-jq bridges the gap between natural language processing and programming, illustrating significant advancements in user-friendly interfaces for data manipulation. This is crucial for professionals in AI and cloud computing who constantly seek efficient, security-conscious solutions in programming and infrastructure management.