Tag: audio processing

  • Simon Willison’s Weblog: LLM 0.18

    Source URL: https://simonwillison.net/2024/Nov/17/llm-018/#atom-everything Source: Simon Willison’s Weblog Title: LLM 0.18 Feedly Summary: LLM 0.18 New release of LLM. The big new feature is asynchronous model support – you can now use supported models in async Python code like this: import llm model = llm.get_async_model("gpt-4o") async for chunk in model.prompt( "Five surprising names for a pet…

  • Hacker News: Debugging Audio Artifacts Caused by a Serial Port?

    Source URL: https://www.recall.ai/post/debugging-audio-artifacts-caused-by-a-serial-port Source: Hacker News Title: Debugging Audio Artifacts Caused by a Serial Port? Summary: This text describes a complex troubleshooting experience following the migration of a large-scale infrastructure from Kubernetes to a self-managed solution, illustrating how an unexpected audio issue emerged due to logging configurations.…

  • Simon Willison’s Weblog: Experimenting with audio input and output for the OpenAI Chat Completion API

    Source URL: https://simonwillison.net/2024/Oct/18/openai-audio/#atom-everything Source: Simon Willison’s Weblog Title: Experimenting with audio input and output for the OpenAI Chat Completion API Feedly Summary: OpenAI promised this at DevDay a few weeks ago and now it’s here: their Chat Completion API can now accept audio as input and return it as output. OpenAI still recommend their WebSocket-based…

  • Hacker News: Ichigo: Local real-time voice AI

    Source URL: https://github.com/homebrewltd/ichigo Source: Hacker News Title: Ichigo: Local real-time voice AI Summary: The text discusses the launch of the open research project 🍓 Ichigo, which enhances a text-based large language model (LLM) with native listening capabilities through improved audio processing techniques. It highlights advancements in the…

  • Hacker News: Moshi: A speech-text foundation model for real time dialogue

    Source URL: https://github.com/kyutai-labs/moshi Source: Hacker News Title: Moshi: A speech-text foundation model for real time dialogue Summary: The text describes “Moshi,” a speech-text foundation model that enables real-time dialogue using advanced audio processing techniques. It introduces a new neural audio codec, “Mimi,” which supports fully streaming audio…
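
The LLM 0.18 entry above mentions consuming a model's output with `async for`. The snippet in that entry is truncated, so the following is only a minimal sketch of the same streaming pattern, not the library's own code. `FakeAsyncModel`, `collect`, and `run_demo` are hypothetical names introduced here so the example runs without network access or an API key; with the real package, the model object would presumably come from `llm.get_async_model("gpt-4o")` as the entry shows, and its response object may differ in detail from this stand-in.

```python
import asyncio
from typing import AsyncIterator, List


class FakeAsyncModel:
    """Hypothetical stand-in for an async model; NOT part of the llm package."""

    def __init__(self, canned_reply: str) -> None:
        self.canned_reply = canned_reply

    async def prompt(self, text: str) -> AsyncIterator[str]:
        # Async generator: yields the canned reply one word at a time,
        # handing control back to the event loop between chunks.
        for word in self.canned_reply.split(" "):
            await asyncio.sleep(0)
            yield word + " "


async def collect(model: FakeAsyncModel, text: str) -> List[str]:
    # Same loop shape as in the entry: iterate over chunks as they arrive.
    chunks = []
    async for chunk in model.prompt(text):
        chunks.append(chunk)
    return chunks


def run_demo() -> List[str]:
    # The prompt text is ignored by the stand-in; only the loop shape matters.
    model = FakeAsyncModel("alpha beta gamma")
    return asyncio.run(collect(model, "any prompt"))


if __name__ == "__main__":
    print("".join(run_demo()))
```

Because each chunk is handled as soon as it is yielded, the caller can print or forward partial output while other coroutines on the same event loop keep running.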