UserBotLatencyObserver measures the time between when a user stops speaking and when the bot starts responding, emitting events for custom handling and optional OpenTelemetry tracing integration. It also tracks first-bot-speech latency and provides detailed per-service latency breakdowns when metrics are enabled.
Features
- Tracks user speech start/stop timing using VAD frames
- Measures bot response latency from the actual moment the user stopped speaking
- Measures first bot speech latency (client connection to first speech)
- Provides detailed latency breakdown with per-service TTFB, text aggregation, user turn duration, and function call metrics
- Emits `on_latency_measured` events for custom processing
- Emits `on_latency_breakdown` events with detailed per-service metrics
- Emits `on_first_bot_speech_latency` event for greeting latency measurement
- Automatically records latency as OpenTelemetry span attributes when tracing is enabled
- Automatically resets between conversation turns
Usage
Basic Latency Monitoring
Add the observer to your pipeline and handle the `on_latency_measured` event.
Detailed Latency Breakdown
Set `enable_metrics=True` in `PipelineParams` to collect the per-service latency breakdown.
OpenTelemetry Integration
When tracing is enabled, latency measurements are automatically recorded as `turn.user_bot_latency_seconds` attributes on OpenTelemetry turn spans. No additional configuration is needed.
How It Works
The observer tracks conversation flow through these key events:
- Client connects (`ClientConnectedFrame`) → Records timestamp for first-bot-speech measurement
- User starts speaking (`VADUserStartedSpeakingFrame`) → Resets latency tracking
- User stops speaking (`VADUserStoppedSpeakingFrame`) → Records timestamp, accounting for VAD `stop_secs` delay
- Bot starts speaking (`BotStartedSpeakingFrame`) → Calculates latency and emits `on_latency_measured` and `on_latency_breakdown` events

When `enable_metrics=True` is set in `PipelineParams`, the observer also collects per-service metrics (TTFB, text aggregation, function call latency) from `MetricsFrame` instances and includes them in the latency breakdown.
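The event flow in this section can be sketched in plain Python. This is an illustrative stand-in, not Pipecat's implementation: the `LatencyTracker` class, its method names, and the choice to subtract `stop_secs` from the stop timestamp are all assumptions made to show the arithmetic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LatencyTracker:
    """Hypothetical model of the observer's timing logic (not Pipecat code)."""

    connected_at: Optional[float] = None      # set on ClientConnectedFrame
    user_stopped_at: Optional[float] = None   # set on VADUserStoppedSpeakingFrame
    first_speech_reported: bool = False

    def client_connected(self, now: float) -> None:
        self.connected_at = now

    def user_started_speaking(self) -> None:
        # A new user turn discards any pending measurement.
        self.user_stopped_at = None

    def user_stopped_speaking(self, now: float, stop_secs: float) -> None:
        # Assumption: the VAD reports the stop only after `stop_secs` of
        # silence, so the user actually went quiet that much earlier.
        self.user_stopped_at = now - stop_secs

    def bot_started_speaking(self, now: float) -> dict:
        """Return whichever latencies can be computed at this moment."""
        results = {}
        if self.connected_at is not None and not self.first_speech_reported:
            results["first_bot_speech_latency"] = now - self.connected_at
            self.first_speech_reported = True
        if self.user_stopped_at is not None:
            results["user_bot_latency"] = now - self.user_stopped_at
            self.user_stopped_at = None  # reset for the next turn
        return results


tracker = LatencyTracker()
tracker.client_connected(now=0.0)
greeting = tracker.bot_started_speaking(now=1.5)

tracker.user_started_speaking()
tracker.user_stopped_speaking(now=10.0, stop_secs=0.5)
reply = tracker.bot_started_speaking(now=11.0)
```

In this sketch the greeting yields only a first-bot-speech latency, while the later reply yields only a user-to-bot latency measured from the adjusted stop time.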
Event Handlers
on_latency_measured
Called each time a user-to-bot latency measurement is captured.
on_latency_breakdown
Called alongside `on_latency_measured` with detailed per-service metrics collected during the user→bot cycle. The breakdown includes TTFB from each service, text aggregation latency, user turn duration, and function call timings.
| Field | Type | Description |
|---|---|---|
| `ttfb` | `List[TTFBBreakdownMetrics]` | Time-to-first-byte metrics from each service |
| `text_aggregation` | `Optional[TextAggregationBreakdownMetrics]` | First text aggregation measurement (sentence aggregation latency) |
| `user_turn_start_time` | `Optional[float]` | Unix timestamp when user turn started (adjusted for VAD `stop_secs`) |
| `user_turn_secs` | `Optional[float]` | User turn duration including VAD silence detection, STT finalization, and turn analyzer wait |
| `function_calls` | `List[FunctionCallMetrics]` | Latency for each function call executed during the cycle |

The `breakdown.chronological_events()` method returns a human-readable list of all metrics sorted by start time, useful for logging and debugging.
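As a rough illustration of consuming a breakdown, the helper below builds log lines using only the fields listed in the table and the `chronological_events()` method. The `summarize_breakdown` function and the `fake_breakdown` stand-in are hypothetical and exist only for this example; in practice the object would be the one passed to an `on_latency_breakdown` handler.

```python
from types import SimpleNamespace


def summarize_breakdown(breakdown) -> list:
    """Build log lines from a breakdown-like object.

    Only attributes named in the field table are read; nothing is assumed
    about the element types beyond "list" and "optional float".
    """
    lines = [
        f"services reporting TTFB: {len(breakdown.ttfb)}",
        f"function calls: {len(breakdown.function_calls)}",
    ]
    if breakdown.user_turn_secs is not None:
        lines.append(f"user turn: {breakdown.user_turn_secs:.3f}s")
    # chronological_events() is described as returning human-readable entries.
    lines.extend(str(event) for event in breakdown.chronological_events())
    return lines


# Stand-in object for demonstration only; its contents are made up.
fake_breakdown = SimpleNamespace(
    ttfb=["stt", "llm", "tts"],
    function_calls=[],
    user_turn_secs=0.42,
    chronological_events=lambda: ["stt ttfb", "llm ttfb", "tts ttfb"],
)
lines = summarize_breakdown(fake_breakdown)
```

Keeping the formatting in a plain function like this makes it easy to test without a running pipeline.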
on_first_bot_speech_latency
Called once when the bot first speaks after client connection. Measures the time from `ClientConnectedFrame` to the first `BotStartedSpeakingFrame`. This is particularly useful for measuring greeting latency.
The `on_latency_breakdown` event is also emitted for the first bot speech, allowing you to see the detailed breakdown of what contributed to the greeting latency.
Deprecated: UserBotLatencyLogObserver
Configuration
Constructor Parameters
Maximum number of frame IDs to keep in history for duplicate detection. Prevents unbounded memory growth in long conversations.
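A bounded history of this kind can be sketched with a `deque` and a `set`. This is a generic pattern, assuming that the oldest ID is evicted once the limit is reached; the `SeenFrames` class and its `max_ids` argument are hypothetical names and do not reflect how the observer is actually implemented.

```python
from collections import deque


class SeenFrames:
    """Remember at most `max_ids` frame IDs (hypothetical helper, not Pipecat code)."""

    def __init__(self, max_ids: int):
        self._order = deque()   # insertion order, oldest on the left
        self._ids = set()       # fast membership checks
        self._max_ids = max_ids

    def is_duplicate(self, frame_id: int) -> bool:
        """Return True if `frame_id` was seen recently, otherwise record it."""
        if frame_id in self._ids:
            return True
        self._order.append(frame_id)
        self._ids.add(frame_id)
        if len(self._order) > self._max_ids:
            # Evict the oldest ID so memory stays bounded.
            self._ids.discard(self._order.popleft())
        return False


seen = SeenFrames(max_ids=2)
results = [seen.is_duplicate(i) for i in (1, 1, 2, 3, 1)]
```

With a limit of 2, the final `1` is treated as new because it was evicted when `3` arrived, which is the trade-off a small history makes in exchange for bounded memory.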
Limitations
- Requires proper frame sequencing to work accurately
- Per-service metrics are only collected when `enable_metrics=True` in `PipelineParams`