Overview

SupertonicTTSService is a Pipecat-compatible TTSService wrapper for the official Supertonic Python SDK. It runs the Supertonic model locally and outputs TTSAudioRawFrame audio. The package is an independent community integration and is not affiliated with Supertone or the Supertonic team.

Source Repository

Source code, examples, and issues for the Supertonic integration

PyPI Package

The pipecat-supertonic package on PyPI

Supertonic

The official Supertonic SDK and model

Installation

This is a community-maintained package distributed separately from pipecat-ai:
uv add pipecat-supertonic

Prerequisites

Supertonic runs locally; no account or API key is required. By default the service downloads and caches the Supertonic model on first use (controlled by the auto_download parameter). Call warmup() during application startup before the service is used in a live Pipecat pipeline. The service does not lazy-load Supertonic during active TTS requests and fails fast if used before warmup, which avoids first-request cold-start delays and keeps TTS frame ordering stable.
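
The warm-up-then-fail-fast behavior described above can be illustrated with a small stand-in. This is a minimal sketch, not the package's implementation: `LocalModelService`, `_load_model`, and `synthesize` are hypothetical names used only for illustration, and the real service produces audio frames rather than raw bytes.

```python
import asyncio


class LocalModelService:
    """Hypothetical stand-in for a locally run TTS service (not the real class)."""

    def __init__(self) -> None:
        self._ready = False

    async def warmup(self) -> None:
        # Run the slow, one-time model load in a worker thread so the
        # event loop stays responsive while assets are loaded.
        await asyncio.to_thread(self._load_model)
        self._ready = True

    def _load_model(self) -> None:
        pass  # a real service would download and load model assets here

    async def synthesize(self, text: str) -> bytes:
        # Fail fast instead of lazily loading during a live request.
        if not self._ready:
            raise RuntimeError("call warmup() before synthesizing")
        return text.encode("utf-8")  # placeholder for audio bytes


async def main() -> bytes:
    service = LocalModelService()
    await service.warmup()  # done once, during application startup
    return await service.synthesize("hello")
```

Loading up front moves the cost of the model load to startup, so the first live request does not pay for it, and a missing warm-up surfaces as an immediate error rather than as a delayed or reordered first utterance.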

Configuration

model (str, default: "supertonic-3"): Supertonic model name.
voice (str, default: "M1"): Supertonic voice name. Supported voices: F1, F2, F3, F4, F5, M1, M2, M3, M4, M5.
language (Language | str, default: Language.EN): Language for synthesis.
speed (float, default: 1.05): Speech speed multiplier.
total_steps (int, default: 5): Number of synthesis steps.
max_chunk_length (int, default: None): Maximum characters per synthesized chunk.
silence_duration (float, default: 0.3): Silence inserted between synthesized chunks.
auto_download (bool, default: True): Whether to download model assets automatically.
intra_op_num_threads (int, default: None): ONNX intra-op thread count.
inter_op_num_threads (int, default: None): ONNX inter-op thread count.
sample_rate (int, default: None): Output sample rate for generated audio.
settings (SupertonicTTSService.Settings, default: None): Runtime-configurable settings. When provided alongside direct parameters, settings values take precedence. See Settings below.
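
The precedence rule for settings can be pictured as a simple merge. The sketch below is an assumption about the general shape of that rule, not the package's code: `SketchSettings` and `resolve` are hypothetical, and `None` stands in for whatever "not provided" marker the real Settings class uses.

```python
from dataclasses import dataclass, fields
from typing import Any, Optional


@dataclass
class SketchSettings:
    """Hypothetical settings container; None means 'not provided'."""

    speed: Optional[float] = None
    total_steps: Optional[int] = None


def resolve(direct: dict[str, Any], settings: Optional[SketchSettings]) -> dict[str, Any]:
    """Start from the direct parameters, then let provided settings override them."""
    resolved = dict(direct)
    if settings is not None:
        for field in fields(settings):
            value = getattr(settings, field.name)
            if value is not None:
                resolved[field.name] = value
    return resolved


resolved = resolve({"speed": 1.05, "total_steps": 5}, SketchSettings(speed=0.9))
print(resolved)  # {'speed': 0.9, 'total_steps': 5}
```

Under this reading, a value passed both directly and through settings ends up with the settings value, while parameters that settings leaves unset keep their direct values.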

Settings

Runtime-configurable settings passed via the settings constructor argument using SupertonicTTSService.Settings(...).
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| speed | float | | Speech speed multiplier. |
| total_steps | int | | Number of synthesis steps. |
| max_chunk_length | int | | Maximum characters per synthesized chunk. |
| silence_duration | float | | Silence inserted between synthesized chunks. |
The Settings dataclass extends Pipecat’s TTSSettings. See the source repository for the authoritative, up-to-date list.
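
To give a feel for how max_chunk_length and silence_duration interact, here is a rough sketch of chunking text and sizing the gap between chunks. It is illustrative only and makes several assumptions: that silence_duration is in seconds, that audio is 16-bit mono PCM, and that chunks are split on whitespace. The Supertonic SDK may chunk text differently, and `split_into_chunks` and `silence_bytes` are hypothetical helpers, not part of the package.

```python
def split_into_chunks(text: str, max_chunk_length: int) -> list[str]:
    """Greedily pack whitespace-separated words into chunks.

    Chunks stay within max_chunk_length characters, except that a single
    word longer than the limit is kept whole rather than cut.
    """
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chunk_length or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def silence_bytes(silence_duration: float, sample_rate: int) -> bytes:
    """Return zeroed PCM for the gap (assumes 16-bit mono: 2 bytes per sample)."""
    return b"\x00\x00" * round(silence_duration * sample_rate)


chunks = split_into_chunks("one two three four", max_chunk_length=7)
gap = silence_bytes(0.5, sample_rate=16000)
```

With these assumptions, a smaller max_chunk_length yields more chunks and therefore more inserted gaps, so the two settings together affect both latency to first audio and the pacing of longer utterances.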

Usage

from pipecat.pipeline.pipeline import Pipeline
from pipecat_supertonic import SupertonicTTSService

tts = SupertonicTTSService(
    settings=SupertonicTTSService.Settings(
        voice="M1",
        language="en",
        total_steps=5,
        speed=1.05,
    )
)

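# Required before use in a live pipeline. Call during application startup,
# from inside an async function (for example, your bot's entry point).
await tts.warmup()

# transport, stt, and llm are assumed to be configured elsewhere in your application.
pipeline = Pipeline(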
    [
        transport.input(),
        stt,
        llm,
        tts,
        transport.output(),
    ]
)

Compatibility

Tested with pipecat-ai==1.2.0 and supertonic==1.2.1. Check the source repository for the latest tested version and changelog.