Release v0.0.40 · pipecat-ai/pipecat

Added

VAD parameters can now be dynamically updated using the VADParamsUpdateFrame.
ErrorFrame now has a fatal field to indicate the bot should exit if a fatal error is pushed upstream (false by default). A new FatalErrorFrame that sets this flag to true has been added.
AnthropicLLMService now supports function calling and has initial support for prompt caching.
(see https://www.anthropic.com/news/prompt-caching)
ElevenLabsTTSService can now specify ElevenLabs input parameters such as output_format.
TwilioFrameSerializer can now specify Twilio's and Pipecat's desired sample rates to use.
Added new on_participant_updated event to DailyTransport.
Added DailyRESTHelper.delete_room_by_name() and DailyRESTHelper.delete_room_by_url().
Added LLM and TTS usage metrics. Those are enabled when PipelineParams.enable_usage_metrics is True.
AudioRawFrames are now pushed downstream from the base output transport. This allows capturing the exact words the bot says by adding an STT service at the end of the pipeline.
Added new GStreamerPipelineSource. This processor can generate image or audio frames from a GStreamer pipeline (e.g. reading an MP4 file, an RTP stream or anything supported by GStreamer).
Added TransportParams.audio_out_is_live. This flag is False by default and it is useful to indicate we should not synchronize audio with sporadic images.
Added new BotStartedSpeakingFrame and BotStoppedSpeakingFrame control frames. These frames are pushed upstream and they should wrap BotSpeakingFrame.
Transports now allow you to register event handlers without decorators.
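
As a rough illustration of the last item above, the sketch below shows how a handler registry can accept both decorator-style and direct registration. SketchTransport, event_handler(), add_event_handler() and emit() are stand-ins written for this example, not Pipecat's own classes or methods; check the Pipecat source for the real names and signatures.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable, DefaultDict, List

Handler = Callable[..., Awaitable[None]]


class SketchTransport:
    """Stand-in transport (hypothetical, not Pipecat's implementation)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def add_event_handler(self, event_name: str, handler: Handler) -> None:
        # Direct registration: no decorator needed.
        self._handlers[event_name].append(handler)

    def event_handler(self, event_name: str) -> Callable[[Handler], Handler]:
        # Decorator-style registration, built on the direct method.
        def decorator(handler: Handler) -> Handler:
            self.add_event_handler(event_name, handler)
            return handler

        return decorator

    async def emit(self, event_name: str, *args) -> None:
        # Call every handler registered for this event, in registration order.
        for handler in self._handlers[event_name]:
            await handler(self, *args)


async def demo() -> list:
    seen = []
    transport = SketchTransport()

    @transport.event_handler("on_participant_updated")
    async def via_decorator(transport, participant):
        seen.append(("decorator", participant))

    async def plain_handler(transport, participant):
        seen.append(("direct", participant))

    # Same event, registered without a decorator.
    transport.add_event_handler("on_participant_updated", plain_handler)
    await transport.emit("on_participant_updated", "alice")
    return seen
```

Direct registration is handy when the handler is defined elsewhere (for example, a method on another object) and cannot easily be wrapped in a decorator at definition time.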

Changed

Support RTVI message protocol 0.1. This includes new messages and support for message responses, actions, configuration, webhooks and more.
(see https://docs.rtvi.ai/)
SileroVAD dependency is now imported via pip's silero-vad package.
ElevenLabsTTSService now uses eleven_turbo_v2_5 model by default.
BotSpeakingFrame is now a control frame.
StartFrame is now a control frame similar to EndFrame.
DeepgramTTSService is now more customizable. You can adjust the encoding and sample rate.

Fixed

TTSStartFrame and TTSStopFrame are now sent when TTS really starts and stops. This allows for knowing when the bot starts and stops speaking even with asynchronous services (like Cartesia).
Fixed AzureSTTService transcription frame timestamps.
Fixed an issue with DailyRESTHelper.create_room() expirations which would cause this function to stop working after the initial expiration elapsed.
Improved EndFrame and CancelFrame handling. EndFrame should end things gracefully while a CancelFrame should cancel all running tasks as soon as possible.
Fixed an issue in AIService that would cause a yielded None value to be processed.
RTVI's bot-ready message is now sent when the RTVI pipeline is ready and a first participant joins.
Fixed a BaseInputTransport issue that was causing incoming system frames to be queued instead of being pushed immediately.
Fixed a BaseInputTransport issue that was causing incoming start/stop interruption frames to not cancel tasks or be processed properly.
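
The EndFrame and CancelFrame distinction in the list above can be pictured with plain asyncio. This is only a sketch: the END sentinel and the worker function are hypothetical stand-ins, not Pipecat frames or processors. A graceful end lets everything queued before it drain, while cancellation stops the task at its next await.

```python
import asyncio

END = object()  # hypothetical sentinel playing the role of an end-of-stream frame


async def worker(queue: asyncio.Queue, processed: list) -> None:
    while True:
        item = await queue.get()
        if item is END:
            return  # graceful: everything queued before END was handled
        await asyncio.sleep(0.01)  # simulate work on the item
        processed.append(item)


async def graceful_end() -> list:
    queue, processed = asyncio.Queue(), []
    task = asyncio.create_task(worker(queue, processed))
    for item in ("a", "b", "c"):
        queue.put_nowait(item)
    queue.put_nowait(END)  # ask the worker to finish after draining
    await task
    return processed


async def cancel_now() -> list:
    queue, processed = asyncio.Queue(), []
    task = asyncio.create_task(worker(queue, processed))
    for item in ("a", "b", "c"):
        queue.put_nowait(item)
    await asyncio.sleep(0)  # let the worker start on the first item
    task.cancel()  # stop as soon as possible, leaving queued items unprocessed
    try:
        await task
    except asyncio.CancelledError:
        pass
    return processed
```

In this sketch graceful_end() processes all three items, whereas cancel_now() returns before the queue is drained.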

Other

Added studypal example (from the Cartesia folks!).
Most examples now use Cartesia.
Added examples foundational/19a-tools-anthropic.py, foundational/19b-tools-video-anthropic.py and foundational/19a-tools-togetherai.py.
Added examples foundational/18-gstreamer-filesrc.py and foundational/18a-gstreamer-videotestsrc.py that show how to use GStreamerPipelineSource.
Removed requests library usage.
Cleaned up examples to use DailyRESTHelper.
