Pipecat Voice & Multimodal Agent Tracing
Pipecat is an open-source Python framework for building real-time voice and multimodal conversational agents. It orchestrates the audio transport, speech-to-text, LLM, and text-to-speech stages of a conversation into a single streaming pipeline. Arize provides first-class support for instrumenting Pipecat pipelines using the openinference-instrumentation-pipecat package. Each pipeline run is captured as a session in Arize with LLM, STT, TTS, and tool spans tied to a conversation_id so you can review the full audio-in to audio-out path of any turn.
Quick Start: Pipecat Python Integration
Installation & Setup
Install the OpenInference Pipecat instrumentor along with Pipecat and the Arize OTel helper.
openinference-instrumentation-pipecat >=1.0 requires pipecat-ai >=1.0 and Python >=3.11. Pipecat 1.0 renamed observers, removed LLMMessagesFrame, and dropped Python 3.10 support. If you’re still on pipecat-ai <1.0, pin the instrumentor to a release below 1.0.
Instrumentation Setup
Configure the PipecatInstrumentor and tracer to send traces to Arize:
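A minimal sketch of this setup step follows. It assumes the Arize OTel helper is the arize-otel package (exposing `register`) and that the instrumentor is importable from `openinference.instrumentation.pipecat`, following the usual OpenInference layout. The environment variable names, the default project name, and both helper functions are hypothetical and only illustrate the flow.

```python
import os


def load_arize_settings(env=None):
    """Collect Arize credentials from the environment (variable names are assumed)."""
    env = os.environ if env is None else env
    missing = [name for name in ("ARIZE_SPACE_ID", "ARIZE_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "space_id": env["ARIZE_SPACE_ID"],
        "api_key": env["ARIZE_API_KEY"],
        # Hypothetical default project name, used only when none is configured.
        "project_name": env.get("ARIZE_PROJECT_NAME", "pipecat-voice-agent"),
    }


def setup_tracing(settings):
    """Register a tracer provider with Arize and instrument Pipecat."""
    # Imported lazily so this module still loads before the packages are installed.
    from arize.otel import register
    from openinference.instrumentation.pipecat import PipecatInstrumentor  # assumed path

    tracer_provider = register(**settings)
    PipecatInstrumentor().instrument(tracer_provider=tracer_provider)
    return tracer_provider
```

Calling `setup_tracing(load_arize_settings())` once at startup, before the pipeline is built, should be enough for spans to be emitted under these assumptions.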
Example: Basic Pipecat Pipeline Usage
Build your pipeline as usual. Pass a conversation_id to PipelineTask so each conversation is grouped into a session in Arize:
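One way to do this is sketched below. It assumes `PipelineTask` is importable from `pipecat.pipeline.task` and accepts a `conversation_id` keyword, as the text above describes. The `new_conversation_id` and `build_task` helpers and the ID format are hypothetical; any stable, unique string per conversation would serve.

```python
import uuid


def new_conversation_id(prefix="conv"):
    """Return a unique ID for one conversation (the format is illustrative only)."""
    return f"{prefix}-{uuid.uuid4().hex}"


def build_task(pipeline, conversation_id):
    """Wrap an already-built Pipecat pipeline in a task tagged with the conversation."""
    # Imported lazily so the helper above can be used without Pipecat installed.
    from pipecat.pipeline.task import PipelineTask

    # Every turn in this task shares the ID, so the turns group into one session.
    return PipelineTask(pipeline, conversation_id=conversation_id)
```

Generating the ID once per call or chat and reusing it for the whole task keeps every turn of that conversation in the same session.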
What is covered by the Instrumentation
Arize provides comprehensive observability for Pipecat’s real-time voice and multimodal pipelines, automatically tracing:
Conversation & Session Tracking
- Conversation Sessions: All turns sharing a conversation_id grouped into a single session
- Turn Boundaries: Each user-to-assistant exchange captured as a parent span
- End-to-End Latency: The full audio-in to audio-out path for every turn
Pipeline Stage Spans
- LLM Calls: Prompts, responses, token counts, and model metadata from your LLM service
- Speech-to-Text (STT): Input audio transcription with latency
- Text-to-Speech (TTS): Output audio synthesis with latency
- Tool / Function Calls: When the LLM service invokes tools, their inputs, outputs, and duration
Performance & Reliability Monitoring
- Stage Latency: Per-stage timing to identify bottlenecks in the audio pipeline
- Token Usage: Prompt, completion, and total tokens across the conversation
- Errors: Failures inside any pipeline stage surfaced as span errors
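To illustrate how per-stage timing can point to a bottleneck, here is a small sketch that aggregates latency and error counts by stage. The span records, their field names, and the numbers are made up for the example and do not reflect the Arize export schema.

```python
from collections import defaultdict

# Hypothetical span records; fields and values are illustrative only.
spans = [
    {"stage": "stt", "duration_ms": 180.0, "error": False},
    {"stage": "llm", "duration_ms": 620.0, "error": False},
    {"stage": "tts", "duration_ms": 240.0, "error": False},
    {"stage": "llm", "duration_ms": 580.0, "error": True},
]


def summarize_stages(records):
    """Return mean latency and error count for each pipeline stage."""
    totals = defaultdict(lambda: {"count": 0, "total_ms": 0.0, "errors": 0})
    for record in records:
        stats = totals[record["stage"]]
        stats["count"] += 1
        stats["total_ms"] += record["duration_ms"]
        stats["errors"] += int(record["error"])
    return {
        stage: {"mean_ms": s["total_ms"] / s["count"], "errors": s["errors"]}
        for stage, s in totals.items()
    }


summary = summarize_stages(spans)
# The stage with the highest mean latency is the likeliest bottleneck.
slowest = max(summary, key=lambda stage: summary[stage]["mean_ms"])
```

With the sample records above, the LLM stage has the highest mean latency and the only error, so it is where one would look first.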
Verify in Arize AX
Check from the skill, CLI, or SDK
Confirm spans are actually reaching your Arize AX project. Use whichever fits your workflow: the skill and CLI work for any framework; the SDK check is shown for each language.
- Arize skill (agent)
- AX CLI
- SDK
Install the Arize Skills plugin and let your coding agent check for you, then prompt your agent:
Use the arize-trace skill to export and analyze recent traces from my project. Confirm spans are arriving, and summarize any errors or latency issues.