
v0.0.40

@aconchillo aconchillo released this 20 Aug 18:52

Added

  • VAD parameters can now be dynamically updated using the VADParamsUpdateFrame.

  • ErrorFrame now has a fatal field to indicate that the bot should exit if a fatal error is pushed upstream (false by default). A new FatalErrorFrame that sets this flag to true has been added.

  • AnthropicLLMService now supports function calling and has initial support for prompt caching.
    (see https://www.anthropic.com/news/prompt-caching)

  • ElevenLabsTTSService can now specify ElevenLabs input parameters such as output_format.

  • TwilioFrameSerializer can now specify Twilio's and Pipecat's desired sample rates to use.

  • Added new on_participant_updated event to DailyTransport.

  • Added DailyRESTHelper.delete_room_by_name() and DailyRESTHelper.delete_room_by_url().

  • Added LLM and TTS usage metrics. These are enabled when PipelineParams.enable_usage_metrics is True.

  • AudioRawFrames are now pushed downstream from the base output transport. This allows capturing the exact words the bot says by adding an STT service at the end of the pipeline.

  • Added new GStreamerPipelineSource. This processor can generate image or audio frames from a GStreamer pipeline (e.g. reading an MP4 file, an RTP stream or anything supported by GStreamer).

  • Added TransportParams.audio_out_is_live. This flag is False by default and it is useful to indicate we should not synchronize audio with sporadic images.

  • Added new BotStartedSpeakingFrame and BotStoppedSpeakingFrame control frames. These frames are pushed upstream and they should wrap BotSpeakingFrame.

  • Transports now allow you to register event handlers without decorators.
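
The fatal flag on ErrorFrame described above can be pictured with a small sketch. The classes below are simplified stand-ins written for this note, not Pipecat's actual definitions, and should_exit() is a hypothetical helper; only the field name fatal and its defaults come from the entry above.

```python
from dataclasses import dataclass


@dataclass
class ErrorFrame:
    """Simplified stand-in for an error frame (not Pipecat's real class)."""
    error: str
    fatal: bool = False  # non-fatal by default, as stated in the note


@dataclass
class FatalErrorFrame(ErrorFrame):
    """Same payload, but the fatal flag defaults to True."""
    fatal: bool = True


def should_exit(frame: ErrorFrame) -> bool:
    """Hypothetical helper: the bot exits only when the error is fatal."""
    return frame.fatal


if __name__ == "__main__":
    for frame in (ErrorFrame("TTS timeout"), FatalErrorFrame("invalid API key")):
        print(type(frame).__name__, "-> exit" if should_exit(frame) else "-> keep running")
```

Modeling the fatal variant as a subclass that only changes a default keeps a single check (frame.fatal) valid for both frame types.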

Changed

  • Added support for RTVI message protocol 0.1. This includes new messages and support for message responses, actions, configuration, webhooks and more.
    (see https://docs.rtvi.ai/)

  • SileroVAD dependency is now imported via pip's silero-vad package.

  • ElevenLabsTTSService now uses eleven_turbo_v2_5 model by default.

  • BotSpeakingFrame is now a control frame.

  • StartFrame is now a control frame similar to EndFrame.

  • DeepgramTTSService is now more customizable. You can adjust the encoding and sample rate.

Fixed

  • TTSStartFrame and TTSStopFrame are now sent when TTS really starts and stops. This allows for knowing when the bot starts and stops speaking even with asynchronous services (like Cartesia).

  • Fixed AzureSTTService transcription frame timestamps.

  • Fixed an issue with DailyRESTHelper.create_room() expirations which would cause this function to stop working after the initial expiration elapsed.

  • Improved EndFrame and CancelFrame handling. EndFrame should end things gracefully while a CancelFrame should cancel all running tasks as soon as possible.

  • Fixed an issue in AIService that would cause a yielded None value to be processed.

  • RTVI's bot-ready message is now sent when the RTVI pipeline is ready and a first participant joins.

  • Fixed a BaseInputTransport issue that was causing incoming system frames to be queued instead of being pushed immediately.

  • Fixed a BaseInputTransport issue that was preventing incoming start/stop interruption frames from cancelling tasks and being processed properly.
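
The difference between EndFrame and CancelFrame handling noted above can be illustrated with plain asyncio. This is a minimal sketch of the general pattern, not Pipecat's implementation: the EndFrame and CancelFrame classes here are empty placeholders, and worker() and run() are hypothetical names introduced only for this example.

```python
import asyncio


class EndFrame:
    """Placeholder: asks the worker to finish queued work, then stop."""


class CancelFrame:
    """Placeholder: asks for the worker to be cancelled right away."""


async def worker(queue: asyncio.Queue, processed: list) -> None:
    # Consume items in order until an EndFrame is reached.
    while True:
        item = await queue.get()
        if isinstance(item, EndFrame):
            return
        await asyncio.sleep(0)  # simulate a unit of asynchronous work
        processed.append(item)


async def run(stop_frame) -> list:
    queue: asyncio.Queue = asyncio.Queue()
    processed: list = []
    for i in range(3):
        queue.put_nowait(i)
    task = asyncio.create_task(worker(queue, processed))
    if isinstance(stop_frame, CancelFrame):
        task.cancel()  # stop as soon as possible, pending items are dropped
    else:
        queue.put_nowait(stop_frame)  # queued behind pending items: graceful
    try:
        await task
    except asyncio.CancelledError:
        pass
    return processed


if __name__ == "__main__":
    print("EndFrame:", asyncio.run(run(EndFrame())))
    print("CancelFrame:", asyncio.run(run(CancelFrame())))
```

In this sketch the graceful path travels through the same queue as regular work, so everything queued earlier is still processed, while the cancel path bypasses the queue and acts on the task directly.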

Other

  • Added studypal example (from the Cartesia folks!).

  • Most examples now use Cartesia.

  • Added examples foundational/19a-tools-anthropic.py, foundational/19b-tools-video-anthropic.py and foundational/19a-tools-togetherai.py.

  • Added examples foundational/18-gstreamer-filesrc.py and foundational/18a-gstreamer-videotestsrc.py that show how to use GStreamerPipelineSource.

  • Removed requests library usage.

  • Cleaned up examples to use DailyRESTHelper.