Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning. Commits follow Conventional Commits, and this changelog is generated by Structured Changelog.

Unreleased

v0.6.0 - 2026-03-21

Highlights

  • Observability package for voice instrumentation with hooks and events
  • Registry package for provider discovery and registration
  • CallSystem client with multi-provider failover support
  • SMS messaging support via SMSProvider interface

Added

  • observability package with VoiceEvent, VoiceObserver, TTSHook, and STTHook interfaces (dda0212)
  • Event types for call lifecycle (initiated, ringing, answered, ended, failed) (dda0212)
  • NoOpTTSHook and NoOpSTTHook for optional instrumentation (dda0212)
  • registry package with Registry interface for provider discovery (0ec559f)
  • Factory types for TTS, STT, and CallSystem providers (0ec559f)
  • callsystem.Client for managing multiple CallSystem providers with automatic failover (c12a16a)
  • SMSProvider interface and SMSMessage type for SMS support (d7d8c63)
  • Hook field in SynthesisConfig for TTS observability (8b3c38a)
  • Hook field in TranscriptionConfig for STT observability (8b3c38a)
  • Observer field in CallSystemConfig and CallOptions for call events (8b3c38a)
  • WithObserver CallOption for per-call observability (8b3c38a)
  • ObservableCallSystem interface combining CallSystem with Observable (8b3c38a)
  • SetHook() and Hook() methods on TTS and STT clients (8b3c38a)

Fixed

  • Resolved gosec G120 warnings in Twilio webhook example by adding http.MaxBytesReader (ac115b7)

v0.5.0 - 2026-02-28

Highlights

  • Organization rename from agentplexus to plexusone

Changed

  • Breaking: Go module path changed from github.com/agentplexus/omnivoice to github.com/plexusone/omnivoice-core (bf46b07)

v0.4.3 - 2026-02-15

Highlights

  • Comprehensive tests for English and Chinese subtitle generation

Tests

  • TestWordsToSubtitleCues_EnglishWordGrouping for word-based cue grouping (0ddb8bc)
  • TestWordsToSubtitleCues_ChineseCharacters for character-by-character tokenization (0ddb8bc)
  • TestWordsToSubtitleCues_MixedChineseEnglish for mixed language content (0ddb8bc)
  • TestWordsToSubtitleCues_LongChineseText for multi-cue splitting (0ddb8bc)

v0.4.2 - 2026-02-15

Highlights

  • Fixed subtitle word cutoff at line boundaries

Fixed

  • Subtitle cue chunking now checks actual wrapped line count instead of total character count, preventing words from being cut off when they would appear on a third line (a301897)
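The difference between the two checks can be shown with a small sketch. The function names and the greedy wrapping strategy below are assumptions for illustration, not the subtitle package's actual implementation: a cue can fit the total character budget (max characters per line times lines per cue) and still need a third line once words are wrapped without being split.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// wrapLines greedily packs words into lines of at most maxChars runes.
// A single word longer than maxChars gets a line of its own, unbroken.
func wrapLines(words []string, maxChars int) []string {
	var lines []string
	current := ""
	for _, w := range words {
		switch {
		case current == "":
			current = w
		case utf8.RuneCountInString(current)+1+utf8.RuneCountInString(w) <= maxChars:
			current += " " + w
		default:
			lines = append(lines, current)
			current = w
		}
	}
	if current != "" {
		lines = append(lines, current)
	}
	return lines
}

// fitsByCharCount compares the total length against the character budget.
func fitsByCharCount(words []string, maxChars, maxLines int) bool {
	return utf8.RuneCountInString(strings.Join(words, " ")) <= maxChars*maxLines
}

// fitsByLineCount wraps the words and counts the resulting lines.
func fitsByLineCount(words []string, maxChars, maxLines int) bool {
	return len(wrapLines(words, maxChars)) <= maxLines
}

func main() {
	words := []string{"aaaaaaa", "bbbbbbb", "cccc"} // 20 runes including spaces
	fmt.Println(fitsByCharCount(words, 10, 2)) // true: 20 <= 10*2
	fmt.Println(fitsByLineCount(words, 10, 2)) // false: wrapping needs 3 lines
	fmt.Println(wrapLines(words, 10))          // [aaaaaaa bbbbbbb cccc]
}
```

In this sketch, a chunker that used only the character budget would accept the cue and then overflow onto a third line, which is the kind of cutoff the fix describes.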

Tests

  • TestWordsToSubtitleCues_LineCountLimit verifies cues split correctly at line boundaries (a301897)

v0.4.1 - 2026-02-14

Highlights

  • STT conformance tests for TranscribeFile and TranscribeURL batch transcription methods

Tests

  • TranscribeFile conformance test for local file transcription (c441944)
  • TranscribeURL conformance test for remote URL transcription (c441944)

v0.4.0 - 2026-02-14

Highlights

  • Subtitle generation from STT transcription results
  • Extensible config maps for provider-specific settings

Added

  • Subtitle package for SRT/VTT generation from transcription results (17730a7)
  • Configurable max characters per line and lines per cue for subtitles (17730a7)
  • Word-level timestamp-based cue splitting (17730a7)
  • Extensions map in TranscriptionConfig for provider-specific STT settings (84c37f5)
  • Extensions map in SynthesisConfig for provider-specific TTS settings (665c3be)

Fixed

  • Subtitle wrapText no longer clips words when text exceeds line limit (63144bb)

Documentation

  • Voice cloning guide with recording tips and phonetically balanced text (1f0cdd8)

Tests

  • Call system provider conformance tests (MakeCall, ListCalls, OnIncomingCall) (9683ca2)
  • Transport provider conformance tests (Listen, Connect, Protocol) (9683ca2)

v0.3.0 - 2026-01-24

Highlights

  • Provider conformance test suites for TTS and STT implementations

Added

  • TTS provider conformance test suite (Synthesize, SynthesizeStream, SynthesizeFromReader) (e3705c7)
  • Mock TTS provider for self-testing with configurable audio format responses (e3705c7)
  • STT provider conformance test suite (Transcribe, TranscribeStream) (69cfd20)
  • Mock STT provider with streaming transcription simulation (69cfd20)

Fixed

  • MCP session and tool handlers now log Close() errors instead of discarding (6099072)


Documentation

  • Provider conformance testing TRD describing test categories and API design (58a9697)

Build

v0.2.0 - 2026-01-18

Highlights

  • Audio codec package with PCM, mu-law, and a-law support for telephony
  • MCP server enabling Claude Code to make voice calls
  • Pipeline components connecting STT, TTS, and transport providers

Added

  • Audio codec package with PCM sample conversions (int16, float32, float64, bytes) (f64fe1e)
  • Mu-law encoding/decoding for Twilio Media Streams (f64fe1e)
  • A-law encoding/decoding for international telephony (f64fe1e)
  • Audio resampling, normalization, and analysis utilities (f64fe1e)
  • MCP server with stdio transport for voice interactions (721cbac)
  • Voice interaction tools: initiate_call, continue_call, speak_to_user, end_call (721cbac)
  • Session management for tracking active voice calls (721cbac)
  • TTSPipeline for streaming TTS output to transport connections (11c906d)
  • StreamingTTSPipeline for connecting streaming LLM text to TTS to transport (11c906d)
  • STTPipeline for streaming audio from transport to STT with event callbacks (11c906d)
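To make the mu-law item above concrete, here is a minimal sketch of the conventional G.711 mu-law companding of 16-bit PCM samples. The function and constant names are hypothetical; the codec package's real API and implementation (for example, whether it uses lookup tables) may differ.

```go
package main

import "fmt"

const (
	muLawBias = 0x84  // 132, added before the segment search
	muLawClip = 32635 // largest magnitude that still fits after adding the bias
)

// encodeMuLaw compresses one 16-bit linear PCM sample into an 8-bit mu-law byte.
func encodeMuLaw(sample int16) byte {
	s := int(sample) // widen first so -32768 can be negated safely
	sign := 0
	if s < 0 {
		s = -s
		sign = 0x80
	}
	if s > muLawClip {
		s = muLawClip
	}
	s += muLawBias

	// Find the segment: exponent 7 corresponds to bit 14, exponent 0 to bit 7.
	exponent := 7
	for mask := 0x4000; exponent > 0 && s&mask == 0; mask >>= 1 {
		exponent--
	}
	mantissa := (s >> (exponent + 3)) & 0x0F

	// Mu-law bytes are stored bit-inverted.
	return ^byte(sign | exponent<<4 | mantissa)
}

// decodeMuLaw expands one mu-law byte back to a 16-bit linear PCM sample.
func decodeMuLaw(b byte) int16 {
	u := ^b
	exponent := int(u>>4) & 0x07
	mantissa := int(u & 0x0F)
	magnitude := ((mantissa<<3)+muLawBias)<<exponent - muLawBias
	if u&0x80 != 0 {
		return int16(-magnitude)
	}
	return int16(magnitude)
}

func main() {
	fmt.Printf("0x%02X\n", encodeMuLaw(0))        // 0xFF
	fmt.Printf("0x%02X\n", encodeMuLaw(1000))     // 0xCE
	fmt.Println(decodeMuLaw(encodeMuLaw(1000)))   // 988
	fmt.Println(decodeMuLaw(encodeMuLaw(-1000)))  // -988
}
```

The round trip is lossy by design: 1000 comes back as 988 because each segment keeps only four mantissa bits.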

Documentation

  • Voice integration PRD outlining goals, user stories, and success metrics (fd86611)
  • Twilio integration TRD detailing Media Streams architecture (fd86611)

Tests

  • Comprehensive unit tests for audio codec functions (mu-law, a-law, PCM) (f64fe1e)

v0.1.0 - 2025-12-28

Highlights

  • Initial OmniVoice voice abstraction layer for multi-provider telephony

Added

  • Voice abstraction layer with provider-agnostic interfaces (8a54bc2)
  • STT (speech-to-text) provider interface with streaming support (8a54bc2)
  • TTS (text-to-speech) provider interface with streaming support (8a54bc2)
  • Transport interface for audio connections (Twilio, Zoom, etc.) (8a54bc2)
  • Export CallOptions for provider implementations (7e1b52d)

Documentation

  • README with project overview and shields (4f298df)
  • Marp presentation for OmniVoice (d2d67cf)

Build

  • GitHub Actions CI workflow (4bad35d)
  • golangci-lint configuration and fixes (3693297)