The UserBotLatencyObserver measures the time between when a user stops speaking and when the bot starts responding. It emits events for custom handling and, when tracing is enabled, records the latency on OpenTelemetry spans. It also tracks first-bot-speech latency and provides detailed per-service latency breakdowns when metrics are enabled.

Features

  • Tracks user speech start/stop timing using VAD frames
  • Measures bot response latency from the actual moment the user stopped speaking (adjusted for the VAD stop_secs delay)
  • Measures first bot speech latency (client connection to first speech)
  • Provides detailed latency breakdown with per-service TTFB, text aggregation, user turn duration, and function call metrics
  • Emits on_latency_measured events for custom processing
  • Emits on_latency_breakdown events with detailed per-service metrics
  • Emits an on_first_bot_speech_latency event once per session for greeting latency measurement
  • Automatically records latency as OpenTelemetry span attributes when tracing is enabled
  • Automatically resets between conversation turns

Usage

Basic Latency Monitoring

Add latency monitoring to your pipeline and handle the event:
from pipecat.observers.user_bot_latency_observer import UserBotLatencyObserver
from pipecat.pipeline.task import PipelineParams, PipelineTask

latency_observer = UserBotLatencyObserver()

@latency_observer.event_handler("on_latency_measured")
async def on_latency_measured(observer, latency):
    # latency is the user-to-bot latency in seconds
    print(f"User-to-bot latency: {latency:.3f}s")

# `pipeline` is your existing Pipeline instance
task = PipelineTask(
    pipeline,
    params=PipelineParams(observers=[latency_observer]),
)

Detailed Latency Breakdown

Enable metrics to collect per-service latency breakdown:
from pipecat.observers.user_bot_latency_observer import UserBotLatencyObserver
from pipecat.pipeline.task import PipelineParams, PipelineTask

latency_observer = UserBotLatencyObserver()

@latency_observer.event_handler("on_latency_breakdown")
async def on_latency_breakdown(observer, breakdown):
    # Fetch the events once and reuse the list for the count and the loop
    events = breakdown.chronological_events()
    print(f"Latency breakdown ({len(events)} events):")
    for event in events:
        print(f"  {event}")

# `pipeline` is your existing Pipeline instance
task = PipelineTask(
    pipeline,
    params=PipelineParams(
        observers=[latency_observer],
        enable_metrics=True,  # Required for breakdown metrics
    ),
)

OpenTelemetry Integration

When tracing is enabled, latency measurements are automatically recorded as turn.user_bot_latency_seconds attributes on OpenTelemetry turn spans. No additional configuration is needed.

How It Works

The observer tracks conversation flow through these key events:
  1. Client connects (ClientConnectedFrame) → Records timestamp for first-bot-speech measurement
  2. User starts speaking (VADUserStartedSpeakingFrame) → Resets latency tracking
  3. User stops speaking (VADUserStoppedSpeakingFrame) → Records timestamp, accounting for VAD stop_secs delay
  4. Bot starts speaking (BotStartedSpeakingFrame) → Calculates latency and emits on_latency_measured and on_latency_breakdown events
When enable_metrics=True in PipelineParams, the observer also collects per-service metrics (TTFB, text aggregation, function call latency) from MetricsFrame instances and includes them in the latency breakdown.
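The four steps above can be sketched as a small state machine. This is a simplified stand-in, not Pipecat's implementation: the `LatencyTracker` class and its method names are hypothetical, timestamps are passed in explicitly, and the `stop_secs` adjustment is assumed to be a plain subtraction.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class LatencyTracker:
    """Hypothetical model of the observer's timing logic (not Pipecat code)."""

    stop_secs: float = 0.2  # assumed VAD silence window, in seconds
    connected_at: Optional[float] = None
    user_stopped_at: Optional[float] = None
    first_speech_reported: bool = False
    events: List[Tuple[str, float]] = field(default_factory=list)

    def on_client_connected(self, now: float) -> None:
        # Step 1: remember when the client connected.
        self.connected_at = now

    def on_user_started_speaking(self, now: float) -> None:
        # Step 2: a new user turn begins, so discard any pending measurement.
        self.user_stopped_at = None

    def on_user_stopped_speaking(self, now: float) -> None:
        # Step 3: the VAD reports the stop only after stop_secs of silence,
        # so the actual end of speech is earlier than the frame's arrival.
        self.user_stopped_at = now - self.stop_secs

    def on_bot_started_speaking(self, now: float) -> None:
        # Step 4: record measurements, then reset for the next turn.
        if not self.first_speech_reported and self.connected_at is not None:
            self.events.append(("first_bot_speech", now - self.connected_at))
            self.first_speech_reported = True
        if self.user_stopped_at is not None:
            self.events.append(("user_bot_latency", now - self.user_stopped_at))
            self.user_stopped_at = None
```

Subtracting `stop_secs` means the measured latency includes the VAD's silence window, which is part of the delay the user actually experiences.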

Event Handlers

on_latency_measured

Called each time a user-to-bot latency measurement is captured.
from loguru import logger

@latency_observer.event_handler("on_latency_measured")
async def on_latency_measured(observer, latency):
    # latency is a float representing seconds
    logger.info(f"Response latency: {latency:.3f}s")
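Because the handler receives a plain float, samples are easy to aggregate across a session. The following is a minimal sketch of one way to do that; `LatencyStats` is a hypothetical helper, not part of Pipecat, and the wiring to the observer is shown only as comments so the block runs on its own.

```python
import statistics
from typing import List


class LatencyStats:
    """Hypothetical helper that accumulates latency samples in seconds."""

    def __init__(self) -> None:
        self.samples: List[float] = []

    def record(self, latency: float) -> None:
        self.samples.append(latency)

    def summary(self) -> dict:
        # Return only a count when no samples have been recorded yet.
        if not self.samples:
            return {"count": 0}
        return {
            "count": len(self.samples),
            "mean": statistics.fmean(self.samples),
            "median": statistics.median(self.samples),
            "max": max(self.samples),
        }


stats = LatencyStats()

# Wiring (requires Pipecat and an existing latency_observer):
# @latency_observer.event_handler("on_latency_measured")
# async def on_latency_measured(observer, latency):
#     stats.record(latency)
```

Calling `stats.summary()` at the end of a session gives a quick view of typical and worst-case response latency.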

on_latency_breakdown

Called alongside on_latency_measured with detailed per-service metrics collected during the user→bot cycle. The breakdown includes TTFB from each service, text aggregation latency, user turn duration, and function call timings.
from loguru import logger

@latency_observer.event_handler("on_latency_breakdown")
async def on_latency_breakdown(observer, breakdown):
    # breakdown is a LatencyBreakdown object
    logger.info("Latency breakdown:")
    for event in breakdown.chronological_events():
        logger.info(f"  {event}")
LatencyBreakdown fields:
| Field | Type | Description |
| --- | --- | --- |
| ttfb | List[TTFBBreakdownMetrics] | Time-to-first-byte metrics from each service |
| text_aggregation | Optional[TextAggregationBreakdownMetrics] | First text aggregation measurement (sentence aggregation latency) |
| user_turn_start_time | Optional[float] | Unix timestamp when user turn started (adjusted for VAD stop_secs) |
| user_turn_secs | Optional[float] | User turn duration including VAD silence detection, STT finalization, and turn analyzer wait |
| function_calls | List[FunctionCallMetrics] | Latency for each function call executed during the cycle |
The breakdown.chronological_events() method returns a human-readable list of all metrics sorted by start time, useful for logging and debugging.
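For structured logging, the breakdown can also be condensed into a small dictionary. The sketch below reads only the top-level fields listed above and makes no assumptions about the contents of the nested metrics objects. `summarize_breakdown` is a hypothetical helper, and the `SimpleNamespace` object is a stand-in for a real LatencyBreakdown.

```python
from types import SimpleNamespace
from typing import Any, Dict


def summarize_breakdown(breakdown: Any) -> Dict[str, Any]:
    """Condense a LatencyBreakdown-like object using only documented fields."""
    return {
        "ttfb_count": len(breakdown.ttfb),
        "has_text_aggregation": breakdown.text_aggregation is not None,
        "user_turn_secs": breakdown.user_turn_secs,
        "function_call_count": len(breakdown.function_calls),
    }


# Stand-in object for illustration; a real breakdown arrives via the event.
fake_breakdown = SimpleNamespace(
    ttfb=[object(), object()],
    text_aggregation=None,
    user_turn_start_time=None,
    user_turn_secs=0.8,
    function_calls=[],
)
summary = summarize_breakdown(fake_breakdown)
```

Inside an on_latency_breakdown handler, the same function could be called on the `breakdown` argument.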

on_first_bot_speech_latency

Called once when the bot first speaks after client connection. Measures the time from ClientConnectedFrame to the first BotStartedSpeakingFrame. This is particularly useful for measuring greeting latency.
from loguru import logger

@latency_observer.event_handler("on_first_bot_speech_latency")
async def on_first_bot_speech_latency(observer, latency):
    # latency is a float representing seconds since the client connected
    logger.info(f"First bot speech latency: {latency:.3f}s")
The on_latency_breakdown event is also emitted for the first bot speech, allowing you to see the detailed breakdown of what contributed to the greeting latency.

Deprecated: UserBotLatencyLogObserver

UserBotLatencyLogObserver is deprecated. Use UserBotLatencyObserver directly with its on_latency_measured event handler instead.

Configuration

Constructor Parameters

max_frames (int, default: 100)
Maximum number of frame IDs to keep in history for duplicate detection. Prevents unbounded memory growth in long conversations.
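To illustrate what a bounded duplicate-detection history looks like, here is a minimal sketch. It is not the observer's actual code: `BoundedSeenSet` is a hypothetical class, and oldest-first eviction is an assumption about how such a history might be trimmed.

```python
from collections import deque
from typing import Deque, Set


class BoundedSeenSet:
    """Remembers at most max_frames IDs, evicting the oldest first."""

    def __init__(self, max_frames: int = 100) -> None:
        self._order: Deque[int] = deque()
        self._seen: Set[int] = set()
        self._max = max_frames

    def check_and_add(self, frame_id: int) -> bool:
        """Return True if frame_id was already seen, otherwise record it."""
        if frame_id in self._seen:
            return True
        self._order.append(frame_id)
        self._seen.add(frame_id)
        if len(self._order) > self._max:
            # Drop the oldest ID so memory stays bounded.
            self._seen.discard(self._order.popleft())
        return False
```

The trade-off is that an ID evicted from the history is no longer recognized as a duplicate, so the limit should comfortably exceed the number of frames that can be in flight at once.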

Limitations

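  • Requires the frames listed under How It Works to arrive in the expected order (user started speaking, user stopped speaking, bot started speaking); missing or out-of-order frames can produce inaccurate or missing measurements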
  • Per-service metrics are only collected when enable_metrics=True in PipelineParams