Observability¶
The observability package provides instrumentation interfaces for monitoring and debugging voice operations. It enables tracking of call lifecycle events, TTS synthesis metrics, and STT transcription performance.
Overview¶
OmniVoice observability consists of two main components:
- Voice Events - Call lifecycle events (initiated, answered, ended, etc.)
- Operation Hooks - TTS and STT instrumentation for latency, throughput, and error tracking
┌─────────────────────────────────────────────────────────────────┐
│                        Voice Application                        │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   ┌──────────────┐   ┌──────────────┐   ┌──────────────────┐    │
│   │     TTS      │   │     STT      │   │    CallSystem    │    │
│   │    + Hook    │   │    + Hook    │   │    + Observer    │    │
│   └──────┬───────┘   └──────┬───────┘   └────────┬─────────┘    │
│          │                  │                    │              │
│          ▼                  ▼                    ▼              │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                     Observability Layer                     │ │
│ ├──────────────────┬──────────────────┬───────────────────────┤ │
│ │ TTSHook          │ STTHook          │ VoiceObserver         │ │
│ │ - Latency        │ - Latency        │ - Call Events         │ │
│ │ - Audio Size     │ - Confidence     │ - Media Events        │ │
│ │ - Errors         │ - Errors         │ - DTMF Events         │ │
│ └──────────────────┴──────────────────┴───────────────────────┘ │
│                                │                                │
│                                ▼                                │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                          Backends                           │ │
│ │        Prometheus │ OpenTelemetry │ Logging │ Custom        │ │
│ └─────────────────────────────────────────────────────────────┘ │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Voice Events¶
Event Types¶
The package defines event types for the call lifecycle:
| Event | Description |
|---|---|
| call.initiated | Call started (outbound) or received (inbound) |
| call.ringing | Outbound call is ringing |
| call.answered | Call was answered |
| call.ended | Call ended normally |
| call.failed | Call failed |
| call.busy | Line was busy |
| call.no_answer | No answer |
| media.connected | Media streaming connected |
| media.disconnected | Media streaming disconnected |
| media.error | Media streaming error |
| dtmf.received | DTMF tones received |
VoiceEvent Structure¶
type VoiceEvent struct {
Type EventType // Event type (e.g., "call.answered")
Timestamp time.Time // When the event occurred
CallID string // Unique call identifier
Provider string // Provider name (e.g., "twilio")
Direction string // "inbound" or "outbound"
From string // Caller ID
To string // Called number
Duration time.Duration // Call duration (for ended events)
Error error // Error details (for failed events)
Metadata map[string]any // Provider-specific data
}
VoiceObserver Interface¶
Implement VoiceObserver to receive voice events:
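The interface definition itself is not reproduced on this page. Judging from the `OnEvent` method used in the Logging pattern and the `VoiceObserverFunc` adapter used in Basic Usage, it probably has the shape sketched here. The types are trimmed local stand-ins written for illustration, not the package source, so treat the exact declaration as an assumption and check the GoDoc.

```go
package main

import (
	"context"
	"time"
)

// EventType and VoiceEvent are trimmed stand-ins for the package types.
type EventType string

type VoiceEvent struct {
	Type      EventType
	Timestamp time.Time
	CallID    string
}

// VoiceObserver is the assumed shape of the interface: one OnEvent method.
type VoiceObserver interface {
	OnEvent(ctx context.Context, event VoiceEvent)
}

// VoiceObserverFunc adapts a plain function to VoiceObserver,
// in the same way http.HandlerFunc adapts a function to http.Handler.
type VoiceObserverFunc func(ctx context.Context, event VoiceEvent)

func (f VoiceObserverFunc) OnEvent(ctx context.Context, event VoiceEvent) {
	f(ctx, event)
}

// demoObserver delivers one event and returns the call ID the observer saw.
func demoObserver() string {
	var seen string
	var obs VoiceObserver = VoiceObserverFunc(func(_ context.Context, e VoiceEvent) {
		seen = e.CallID
	})
	obs.OnEvent(context.Background(), VoiceEvent{
		Type:      "call.answered",
		Timestamp: time.Now(),
		CallID:    "call-1",
	})
	return seen
}
```

A struct with an `OnEvent` method and a function wrapped in the adapter are interchangeable wherever an observer is accepted.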
Basic Usage¶
import "github.com/plexusone/omnivoice-core/observability"
// Create an observer using the function adapter
observer := observability.VoiceObserverFunc(func(ctx context.Context, event observability.VoiceEvent) {
log.Printf("[%s] %s: %s -> %s",
event.Type, event.CallID, event.From, event.To)
})
// Use with CallSystem
call, err := provider.MakeCall(ctx, "+15559876543",
callsystem.WithObserver(observer),
)
Emitting Events¶
Providers use EmitEvent to send events with functional options:
observability.EmitEvent(ctx, observer, observability.EventCallAnswered, callID, "twilio",
observability.WithDirection("outbound"),
observability.WithFrom("+15551234567"),
observability.WithTo("+15559876543"),
)
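The call above follows the functional options pattern: each `With...` helper returns a function that sets one field on the event before it is delivered. A minimal sketch of how such an emitter could be put together is shown here. `EventOption`, the nil-observer check, and the struct layout are assumptions made for illustration; the real package may differ.

```go
package main

import (
	"context"
	"time"
)

type EventType string

const EventCallAnswered EventType = "call.answered"

// VoiceEvent is a trimmed stand-in for the package type.
type VoiceEvent struct {
	Type      EventType
	Timestamp time.Time
	CallID    string
	Provider  string
	Direction string
	From      string
	To        string
}

type VoiceObserver interface {
	OnEvent(ctx context.Context, event VoiceEvent)
}

// EventOption is a hypothetical name for the option type.
type EventOption func(*VoiceEvent)

func WithDirection(d string) EventOption { return func(e *VoiceEvent) { e.Direction = d } }
func WithFrom(from string) EventOption   { return func(e *VoiceEvent) { e.From = from } }
func WithTo(to string) EventOption       { return func(e *VoiceEvent) { e.To = to } }

// EmitEvent fills the required fields, applies the options in order,
// and hands the finished event to the observer.
// A nil observer is treated as "observability disabled" (an assumption).
func EmitEvent(ctx context.Context, obs VoiceObserver, typ EventType, callID, provider string, opts ...EventOption) {
	if obs == nil {
		return
	}
	ev := VoiceEvent{Type: typ, Timestamp: time.Now(), CallID: callID, Provider: provider}
	for _, opt := range opts {
		opt(&ev)
	}
	obs.OnEvent(ctx, ev)
}

// recorder collects events so the demo can inspect them.
type recorder struct{ events []VoiceEvent }

func (r *recorder) OnEvent(_ context.Context, e VoiceEvent) { r.events = append(r.events, e) }

// demoEmit emits one event and returns what the observer received.
func demoEmit() VoiceEvent {
	rec := &recorder{}
	EmitEvent(context.Background(), rec, EventCallAnswered, "call-1", "twilio",
		WithDirection("outbound"),
		WithFrom("+15551234567"),
		WithTo("+15559876543"),
	)
	return rec.events[0]
}
```

Because optional fields are options rather than positional parameters, new event fields can be added later without changing existing call sites.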
Multi-Observer¶
Fan out events to multiple observers:
multi := observability.NewMultiObserver(
metricsObserver,
loggingObserver,
analyticsObserver,
)
call, err := provider.MakeCall(ctx, to, callsystem.WithObserver(multi))
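Conceptually, a multi-observer is itself an observer that forwards each event to the observers it wraps. The sketch below shows one way this could work, using local stand-in types. Skipping nil entries and delivering in argument order are assumptions, not documented behavior of `NewMultiObserver`.

```go
package main

import "context"

// VoiceEvent is a trimmed stand-in for the package type.
type VoiceEvent struct {
	Type   string
	CallID string
}

type VoiceObserver interface {
	OnEvent(ctx context.Context, event VoiceEvent)
}

type VoiceObserverFunc func(ctx context.Context, event VoiceEvent)

func (f VoiceObserverFunc) OnEvent(ctx context.Context, e VoiceEvent) { f(ctx, e) }

// multiObserver forwards each event to every wrapped observer, in order.
type multiObserver struct{ observers []VoiceObserver }

// NewMultiObserver drops nil entries so optional observers can be passed as-is.
func NewMultiObserver(observers ...VoiceObserver) VoiceObserver {
	m := &multiObserver{}
	for _, o := range observers {
		if o != nil {
			m.observers = append(m.observers, o)
		}
	}
	return m
}

func (m *multiObserver) OnEvent(ctx context.Context, e VoiceEvent) {
	for _, o := range m.observers {
		o.OnEvent(ctx, e)
	}
}

// demoFanOut sends one event through a multi-observer and returns
// the order in which the wrapped observers were invoked.
func demoFanOut() []string {
	var order []string
	tag := func(name string) VoiceObserver {
		return VoiceObserverFunc(func(_ context.Context, e VoiceEvent) {
			order = append(order, name+":"+e.Type)
		})
	}
	multi := NewMultiObserver(tag("metrics"), nil, tag("logging"))
	multi.OnEvent(context.Background(), VoiceEvent{Type: "call.ended", CallID: "call-1"})
	return order
}
```

With sequential delivery like this, one slow observer delays all the others, which is why each wrapped observer should stay lightweight.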
TTS Hooks¶
The TTSHook interface instruments text-to-speech operations:
type TTSHook interface {
// Called before synthesis
BeforeSynthesize(ctx context.Context, info TTSCallInfo, req TTSRequest) context.Context
// Called after synthesis completes
AfterSynthesize(ctx context.Context, info TTSCallInfo, req TTSRequest, resp *TTSResponse, err error)
// Wraps streaming audio for byte counting
WrapStream(ctx context.Context, info TTSCallInfo, req TTSRequest, stream <-chan []byte) <-chan []byte
}
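Because `BeforeSynthesize` returns a context, a hook can attach data before the operation and read it back in `AfterSynthesize`; the OpenTelemetry pattern further down relies on the same flow. The following is a small sketch of that idea that measures elapsed time. Method signatures are shortened (the info, request, and response parameters are omitted), and `timingHook` and its injectable clock are hypothetical names used only for illustration.

```go
package main

import (
	"context"
	"time"
)

// startKey is an unexported key type, so the stored value cannot
// collide with context values set by other packages.
type startKey struct{}

// timingHook records how long an operation took.
// The clock is injectable so the behavior can be tested deterministically.
type timingHook struct {
	now     func() time.Time
	elapsed time.Duration
}

// BeforeSynthesize stores the start time in the returned context.
func (h *timingHook) BeforeSynthesize(ctx context.Context) context.Context {
	return context.WithValue(ctx, startKey{}, h.now())
}

// AfterSynthesize reads the start time back and computes the elapsed time.
// If the context did not pass through BeforeSynthesize, it records nothing.
func (h *timingHook) AfterSynthesize(ctx context.Context, err error) {
	start, ok := ctx.Value(startKey{}).(time.Time)
	if !ok {
		return
	}
	h.elapsed = h.now().Sub(start)
}

// demoTiming simulates a 250 ms synthesis using a fake clock.
func demoTiming() time.Duration {
	clock := time.Unix(0, 0)
	h := &timingHook{now: func() time.Time { return clock }}
	ctx := h.BeforeSynthesize(context.Background())
	clock = clock.Add(250 * time.Millisecond) // time passes during synthesis
	h.AfterSynthesize(ctx, nil)
	return h.elapsed
}
```

This only works if the caller passes the context returned by the first method into the second, so a hook should tolerate a missing value rather than assume it is present.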
TTSCallInfo¶
type TTSCallInfo struct {
CallID string // Unique identifier for correlation
Provider string // Provider name (e.g., "elevenlabs")
StartTime time.Time // Operation start time
VoiceID string // Voice being used
Model string // TTS model
}
TTSRequest / TTSResponse¶
type TTSRequest struct {
Text string // Text to synthesize
TextLength int // Character count
OutputFormat string // Audio format (e.g., "mp3")
SampleRate int // Audio sample rate
}
type TTSResponse struct {
AudioSize int64 // Generated audio size in bytes
Duration time.Duration // Audio duration
Latency time.Duration // Time to first byte (streaming)
}
Using TTS Hooks¶
// Set hook on client (applies to all operations)
ttsClient.SetHook(myTTSHook)
// Or per-request via config
result, err := provider.Synthesize(ctx, text, tts.SynthesisConfig{
VoiceID: "voice-id",
Hook: myTTSHook,
})
Example: Metrics Hook¶
type MetricsTTSHook struct {
synthesizeLatency prometheus.Histogram
audioBytes prometheus.Counter
errors prometheus.Counter
}
func (h *MetricsTTSHook) BeforeSynthesize(ctx context.Context, info observability.TTSCallInfo, req observability.TTSRequest) context.Context {
return ctx // Could add trace span to context
}
func (h *MetricsTTSHook) AfterSynthesize(ctx context.Context, info observability.TTSCallInfo, req observability.TTSRequest, resp *observability.TTSResponse, err error) {
if err != nil {
h.errors.Inc()
return
}
if resp == nil {
return // Nothing to record; a hook should never panic on a nil response
}
h.synthesizeLatency.Observe(resp.Latency.Seconds())
h.audioBytes.Add(float64(resp.AudioSize))
}
func (h *MetricsTTSHook) WrapStream(ctx context.Context, info observability.TTSCallInfo, req observability.TTSRequest, stream <-chan []byte) <-chan []byte {
return stream // Could wrap to count bytes
}
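The `WrapStream` stub above returns the stream unchanged. One way to actually count bytes is to relay chunks through a second channel and total their sizes on the way, as sketched here. `countingStream` and its `onDone` callback are hypothetical helpers, not part of the package.

```go
package main

import "context"

// countingStream relays every chunk from in to the returned channel and
// totals the bytes that were delivered. onDone is called once with the
// total, before the returned channel is closed. If ctx is cancelled while
// the consumer is not reading, the relay stops instead of blocking forever.
func countingStream(ctx context.Context, in <-chan []byte, onDone func(total int64)) <-chan []byte {
	out := make(chan []byte)
	go func() {
		var total int64
		defer close(out)                 // runs last
		defer func() { onDone(total) }() // runs first (deferred calls are LIFO)
		for chunk := range in {
			select {
			case out <- chunk:
				total += int64(len(chunk))
			case <-ctx.Done():
				return
			}
		}
	}()
	return out
}

// demoCountingStream pushes two chunks through the wrapper and returns
// the bytes the consumer received and the total reported to onDone.
func demoCountingStream() (forwarded int, reported int64) {
	in := make(chan []byte, 2)
	in <- []byte("abc")
	in <- []byte("de")
	close(in)

	out := countingStream(context.Background(), in, func(n int64) { reported = n })
	for chunk := range out {
		forwarded += len(chunk)
	}
	return forwarded, reported
}
```

Inside `MetricsTTSHook.WrapStream`, the callback could be `func(n int64) { h.audioBytes.Add(float64(n)) }`. The extra goroutine and channel hop add a small per-chunk cost, which is the price of observing a stream without changing its type.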
STT Hooks¶
The STTHook interface instruments speech-to-text operations:
type STTHook interface {
// Called before transcription
BeforeTranscribe(ctx context.Context, info STTCallInfo, req STTRequest) context.Context
// Called after transcription completes
AfterTranscribe(ctx context.Context, info STTCallInfo, req STTRequest, resp *STTResponse, err error)
// Wraps audio writer for byte tracking
WrapStreamWriter(ctx context.Context, info STTCallInfo, req STTRequest, writer io.WriteCloser) io.WriteCloser
// Called for each streaming result
OnStreamResult(ctx context.Context, info STTCallInfo, resp STTResponse)
}
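`WrapStreamWriter` receives the `io.WriteCloser` that audio is written to and returns a replacement, which makes a decorator the natural implementation. The sketch below counts the bytes accepted by the underlying writer and reports the total on `Close`. `countingWriter` and `nopCloser` are hypothetical helpers; the sketch is not safe for concurrent writers.

```go
package main

import (
	"bytes"
	"io"
)

// countingWriter decorates an io.WriteCloser and tallies the bytes
// that the underlying writer actually accepted.
type countingWriter struct {
	io.WriteCloser
	total   int64
	onClose func(total int64)
}

func (w *countingWriter) Write(p []byte) (int, error) {
	n, err := w.WriteCloser.Write(p)
	w.total += int64(n) // count accepted bytes even on a short write
	return n, err
}

// Close closes the underlying writer, then reports the total once.
func (w *countingWriter) Close() error {
	err := w.WriteCloser.Close()
	if w.onClose != nil {
		w.onClose(w.total)
	}
	return err
}

// nopCloser turns a bytes.Buffer into an io.WriteCloser for the demo.
type nopCloser struct{ *bytes.Buffer }

func (nopCloser) Close() error { return nil }

// demoCountingWriter writes two audio chunks and returns the reported total.
func demoCountingWriter() int64 {
	var reported int64
	var w io.WriteCloser = &countingWriter{
		WriteCloser: nopCloser{&bytes.Buffer{}},
		onClose:     func(total int64) { reported = total },
	}
	w.Write([]byte("audio"))
	w.Write([]byte("frames"))
	w.Close()
	return reported
}
```

Embedding the wrapped `io.WriteCloser` keeps the decorator short, and callers see the same interface they passed in.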
STTCallInfo¶
type STTCallInfo struct {
CallID string // Unique identifier for correlation
Provider string // Provider name (e.g., "deepgram")
StartTime time.Time // Operation start time
Model string // STT model
Language string // Expected language
}
STTRequest / STTResponse¶
type STTRequest struct {
AudioSize int64 // Audio size in bytes
Encoding string // Audio encoding (e.g., "pcm")
SampleRate int // Sample rate
Channels int // Number of channels
IsStreaming bool // Streaming transcription
}
type STTResponse struct {
Transcript string // Transcribed text
TranscriptLength int // Character count
Confidence float64 // Confidence score (0-1)
AudioDuration time.Duration // Audio processed
Latency time.Duration // Processing latency
IsFinal bool // Final result (streaming)
}
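In streaming mode, `IsFinal` distinguishes interim hypotheses from final results, so an `OnStreamResult` implementation will usually treat the two differently. A minimal sketch, assuming results for one stream arrive on a single goroutine: interim results are only counted, and confidence is averaged over final results. `streamStats` is a hypothetical type and the context and call-info parameters are omitted for brevity.

```go
package main

// STTResponse is a trimmed stand-in for the package type.
type STTResponse struct {
	Transcript string
	Confidence float64
	IsFinal    bool
}

// streamStats accumulates counters for one transcription stream.
type streamStats struct {
	interim       int
	finals        int
	confidenceSum float64
}

// OnStreamResult counts interim results and accumulates the
// confidence of final results only.
func (s *streamStats) OnStreamResult(resp STTResponse) {
	if !resp.IsFinal {
		s.interim++
		return
	}
	s.finals++
	s.confidenceSum += resp.Confidence
}

// MeanFinalConfidence returns 0 when no final result has been seen.
func (s *streamStats) MeanFinalConfidence() float64 {
	if s.finals == 0 {
		return 0
	}
	return s.confidenceSum / float64(s.finals)
}

// demoStreamStats feeds two utterances, each with one interim and one final result.
func demoStreamStats() (interim, finals int, mean float64) {
	s := &streamStats{}
	results := []STTResponse{
		{Transcript: "hel", Confidence: 0.4},
		{Transcript: "hello", Confidence: 0.5, IsFinal: true},
		{Transcript: "wor", Confidence: 0.3},
		{Transcript: "world", Confidence: 1.0, IsFinal: true},
	}
	for _, r := range results {
		s.OnStreamResult(r)
	}
	return s.interim, s.finals, s.MeanFinalConfidence()
}
```

Leaving interim results out of the average keeps low-confidence partial hypotheses from skewing the metric.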
Using STT Hooks¶
// Set hook on client
sttClient.SetHook(mySTTHook)
// Or per-request
result, err := provider.Transcribe(ctx, audio, stt.TranscriptionConfig{
Model: "nova-2",
Hook: mySTTHook,
})
NoOp Implementations¶
For testing or optional observability, use the provided no-op implementations:
// These do nothing but satisfy the interfaces
var _ observability.TTSHook = observability.NoOpTTSHook{}
var _ observability.STTHook = observability.NoOpSTTHook{}
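A no-op type is also useful as an embedded base: a hook that cares about a single callback can embed it and override only that method. The sketch below uses local stand-ins for the hook types. It assumes the no-op methods use value receivers, which the `NoOpTTSHook{}` assignment above suggests but does not prove.

```go
package main

import (
	"context"
	"errors"
)

// Trimmed stand-ins for the package types.
type TTSCallInfo struct{ Provider string }
type TTSRequest struct{ TextLength int }
type TTSResponse struct{ AudioSize int64 }

type TTSHook interface {
	BeforeSynthesize(ctx context.Context, info TTSCallInfo, req TTSRequest) context.Context
	AfterSynthesize(ctx context.Context, info TTSCallInfo, req TTSRequest, resp *TTSResponse, err error)
	WrapStream(ctx context.Context, info TTSCallInfo, req TTSRequest, stream <-chan []byte) <-chan []byte
}

// NoOpTTSHook is a stand-in that satisfies TTSHook and does nothing.
type NoOpTTSHook struct{}

func (NoOpTTSHook) BeforeSynthesize(ctx context.Context, _ TTSCallInfo, _ TTSRequest) context.Context {
	return ctx
}

func (NoOpTTSHook) AfterSynthesize(context.Context, TTSCallInfo, TTSRequest, *TTSResponse, error) {}

func (NoOpTTSHook) WrapStream(_ context.Context, _ TTSCallInfo, _ TTSRequest, s <-chan []byte) <-chan []byte {
	return s
}

// errorCountHook embeds the no-op and overrides only AfterSynthesize.
type errorCountHook struct {
	NoOpTTSHook
	failures int
}

func (h *errorCountHook) AfterSynthesize(_ context.Context, _ TTSCallInfo, _ TTSRequest, _ *TTSResponse, err error) {
	if err != nil {
		h.failures++
	}
}

// demoEmbeddedNoOp runs one failed and one successful synthesis through the hook.
func demoEmbeddedNoOp() int {
	h := &errorCountHook{}
	var hook TTSHook = h // still satisfies the full interface

	ctx := hook.BeforeSynthesize(context.Background(), TTSCallInfo{}, TTSRequest{})
	hook.AfterSynthesize(ctx, TTSCallInfo{}, TTSRequest{}, nil, errors.New("synthesis failed"))
	hook.AfterSynthesize(ctx, TTSCallInfo{}, TTSRequest{}, &TTSResponse{AudioSize: 10}, nil)
	return h.failures
}
```

If a method is later added to the interface, hooks built this way pick up the no-op default instead of failing to compile.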
Integration Patterns¶
OpenTelemetry¶
type OTelTTSHook struct {
tracer trace.Tracer
}
func (h *OTelTTSHook) BeforeSynthesize(ctx context.Context, info observability.TTSCallInfo, req observability.TTSRequest) context.Context {
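// Start returns a context that carries the new span; AfterSynthesize
// retrieves it with trace.SpanFromContext, so the span value is not kept here.
ctx, _ = h.tracer.Start(ctx, "tts.synthesize",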
trace.WithAttributes(
attribute.String("tts.provider", info.Provider),
attribute.String("tts.voice", info.VoiceID),
attribute.Int("tts.text_length", req.TextLength),
),
)
return ctx
}
func (h *OTelTTSHook) AfterSynthesize(ctx context.Context, info observability.TTSCallInfo, req observability.TTSRequest, resp *observability.TTSResponse, err error) {
span := trace.SpanFromContext(ctx)
if err != nil {
span.RecordError(err)
span.SetStatus(codes.Error, err.Error())
} else if resp != nil {
span.SetAttributes(
attribute.Int64("tts.audio_bytes", resp.AudioSize),
attribute.Float64("tts.latency_ms", float64(resp.Latency.Milliseconds())),
)
}
span.End()
}
Logging¶
type LoggingObserver struct {
logger *slog.Logger
}
func (o *LoggingObserver) OnEvent(ctx context.Context, event observability.VoiceEvent) {
o.logger.Info("voice event",
"type", event.Type,
"call_id", event.CallID,
"provider", event.Provider,
"direction", event.Direction,
"from", event.From,
"to", event.To,
)
}
Best Practices¶
- Keep hooks lightweight - Observers are called synchronously; avoid blocking operations
- Handle errors internally - Hooks should not panic or return errors
- Use context for correlation - Pass trace IDs through context in BeforeSynthesize/BeforeTranscribe
- Aggregate metrics - Use counters and histograms rather than logging every event
- Filter events - Not all events need processing; filter by type as needed
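Two of these practices, filtering by type and keeping the synchronous path short, can be combined as small observer wrappers. The sketch below uses local stand-in types. `filterPrefix` and `asyncObserver` are hypothetical helpers, and dropping events when the queue is full is one possible policy, not something the package prescribes.

```go
package main

import (
	"context"
	"strings"
)

// VoiceEvent is a trimmed stand-in for the package type.
type VoiceEvent struct{ Type, CallID string }

type VoiceObserver interface {
	OnEvent(ctx context.Context, event VoiceEvent)
}

type VoiceObserverFunc func(context.Context, VoiceEvent)

func (f VoiceObserverFunc) OnEvent(ctx context.Context, e VoiceEvent) { f(ctx, e) }

// filterPrefix forwards only events whose type starts with prefix, e.g. "call.".
func filterPrefix(prefix string, next VoiceObserver) VoiceObserver {
	return VoiceObserverFunc(func(ctx context.Context, e VoiceEvent) {
		if strings.HasPrefix(e.Type, prefix) {
			next.OnEvent(ctx, e)
		}
	})
}

// queued pairs an event with the context it was emitted under.
type queued struct {
	ctx   context.Context
	event VoiceEvent
}

// asyncObserver hands events to a background goroutine so that OnEvent
// returns immediately, even when the wrapped observer is slow.
type asyncObserver struct {
	queue chan queued
	done  chan struct{}
}

func newAsyncObserver(next VoiceObserver, size int) *asyncObserver {
	a := &asyncObserver{queue: make(chan queued, size), done: make(chan struct{})}
	go func() {
		defer close(a.done)
		for q := range a.queue {
			next.OnEvent(q.ctx, q.event)
		}
	}()
	return a
}

func (a *asyncObserver) OnEvent(ctx context.Context, e VoiceEvent) {
	select {
	case a.queue <- queued{ctx: ctx, event: e}:
	default: // queue full: drop the event rather than stall the call path
	}
}

// Close stops accepting events and waits for queued ones to be delivered.
func (a *asyncObserver) Close() {
	close(a.queue)
	<-a.done
}

// demoFilterAsync emits four events and returns the types that reached the sink.
func demoFilterAsync() []string {
	var seen []string
	sink := VoiceObserverFunc(func(_ context.Context, e VoiceEvent) { seen = append(seen, e.Type) })
	async := newAsyncObserver(filterPrefix("call.", sink), 8)
	for _, t := range []string{"call.initiated", "media.connected", "dtmf.received", "call.ended"} {
		async.OnEvent(context.Background(), VoiceEvent{Type: t, CallID: "call-1"})
	}
	async.Close()
	return seen
}
```

Events are delivered after `OnEvent` has returned, so the context they carry may already be cancelled. The wrapped observer should read values such as trace IDs from it but not depend on its deadline.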
API Reference¶
See the GoDoc for complete API documentation.