Smallest AI Voice Provider

Smallest AI provides low-latency text-to-speech (TTS) synthesis for real-time applications. The Beluga AI provider uses the Smallest AI v1 API and supports configurable voice, model, and speed settings.

Choose Smallest AI when you want a lightweight TTS option with fast synthesis and simple voice and model configuration. Its lightning model targets low-latency synthesis with minimal setup. For broader voice variety or voice cloning, consider ElevenLabs or PlayHT.

Installation

import _ "github.com/lookatitude/beluga-ai/voice/tts/providers/smallest"

The blank import registers the "smallest" provider with the TTS registry.

Configuration

Field	Type	Default	Description
`Voice`	`string`	`"emily"`	Voice identifier
`Model`	`string`	`"lightning"`	Model identifier
`Speed`	`float64`	—	Speech rate multiplier (1.0 = normal)
`Extra`	`map[string]any`	—	Provider-specific settings (see Extra Fields)

Extra Fields

Key	Type	Required	Description
`api_key`	`string`	Yes	Smallest AI API key
`base_url`	`string`	No	Override base URL
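
Because `api_key` is required, it can help to fail early when the environment variable is unset instead of passing an empty key to the provider. The following is a minimal sketch using only the standard library. The helper name `extraFromEnv` and the `SMALLEST_BASE_URL` variable are assumptions made for illustration and are not part of Beluga AI.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// extraFromEnv is a hypothetical helper (not part of Beluga AI). It builds the
// Extra map from a lookup function such as os.Getenv and returns an error when
// the required api_key value is missing.
func extraFromEnv(getenv func(string) string) (map[string]any, error) {
	key := getenv("SMALLEST_API_KEY")
	if key == "" {
		return nil, errors.New("SMALLEST_API_KEY is not set")
	}
	extra := map[string]any{"api_key": key}
	// SMALLEST_BASE_URL is an assumed variable name; base_url is optional.
	if base := getenv("SMALLEST_BASE_URL"); base != "" {
		extra["base_url"] = base
	}
	return extra, nil
}

func main() {
	extra, err := extraFromEnv(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	// Print only the key names so the API key itself is never logged.
	for name := range extra {
		fmt.Println("configured:", name)
	}
}
```

The returned map can then be passed as the `Extra` field of the configuration. Taking the lookup function as a parameter keeps the helper easy to test without touching the real environment.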

Basic Usage

package main

import (
    "context"
    "log"
    "os"

    "github.com/lookatitude/beluga-ai/voice/tts"
    _ "github.com/lookatitude/beluga-ai/voice/tts/providers/smallest"
)

func main() {
    ctx := context.Background()

    engine, err := tts.New("smallest", tts.Config{
        Voice: "emily",
        Extra: map[string]any{"api_key": os.Getenv("SMALLEST_API_KEY")},
    })
    if err != nil {
        log.Fatal(err)
    }

    audio, err := engine.Synthesize(ctx, "Hello, welcome to Beluga AI.")
    if err != nil {
        log.Fatal(err)
    }

    if err := os.WriteFile("output.wav", audio, 0644); err != nil {
        log.Fatal(err)
    }
}

Direct Construction

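import (
    "os"

    "github.com/lookatitude/beluga-ai/voice/tts"
    "github.com/lookatitude/beluga-ai/voice/tts/providers/smallest"
)

// Construct the engine directly instead of looking it up in the registry.
engine, err := smallest.New(tts.Config{
    Voice: "emily",
    Model: "lightning",
    Extra: map[string]any{"api_key": os.Getenv("SMALLEST_API_KEY")},
})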

Streaming

The streaming interface synthesizes each text chunk independently:

for chunk, err := range engine.SynthesizeStream(ctx, textStream) {
    if err != nil {
        log.Printf("error: %v", err)
        break
    }
    transport.Send(chunk)
}

FrameProcessor Integration

processor := tts.AsFrameProcessor(engine, 24000, tts.WithVoice("emily"))
pipeline := voice.Chain(sttProcessor, llmProcessor, processor)

Advanced Features

Per-Request Options

audio, err := engine.Synthesize(ctx, "Hello!",
    tts.WithVoice("different-voice"),
    tts.WithModel("lightning"),
    tts.WithSpeed(1.3),
)

Custom Endpoint

engine, err := tts.New("smallest", tts.Config{
    Voice: "emily",
    Extra: map[string]any{
        "api_key":  os.Getenv("SMALLEST_API_KEY"),
        "base_url": "https://smallest.internal.corp/v1",
    },
})
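
A malformed `base_url` tends to surface only as a failed request at synthesis time, so it can be worth checking the value before building the engine. This is a minimal sketch using the standard `net/url` package. `validateBaseURL` is a hypothetical helper, and the rule that the URL must be absolute with an `http` or `https` scheme is an assumption rather than a documented requirement of the provider.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateBaseURL is a hypothetical pre-flight check (not part of Beluga AI).
// It accepts only absolute http or https URLs that include a host.
func validateBaseURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse base_url: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("base_url %q must use http or https", raw)
	}
	if u.Host == "" {
		return fmt.Errorf("base_url %q has no host", raw)
	}
	return nil
}

func main() {
	candidates := []string{
		"https://smallest.internal.corp/v1", // absolute URL with a host
		"smallest.internal.corp/v1",         // missing scheme
	}
	for _, raw := range candidates {
		if err := validateBaseURL(raw); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Println("ok:", raw)
	}
}
```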