Fish Audio Voice Provider

Fish Audio provides text-to-speech synthesis with support for voice cloning via reference IDs. The Beluga AI provider uses the Fish Audio v1 API for synthesis, producing audio output in configurable formats.

Choose Fish Audio when you want an open-source-friendly TTS option with voice cloning capabilities. Fish Audio’s reference ID system allows you to use community-shared voices or create your own cloned voices. For the most polished commercial voice quality, consider ElevenLabs.

    import _ "github.com/lookatitude/beluga-ai/voice/tts/providers/fish"

The blank import registers the "fish" provider with the TTS registry.
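
Registration through a blank import is a common Go idiom (the standard library's database/sql drivers work the same way). As a rough illustration only, the sketch below shows how a provider package could register a factory from init() so that a later lookup by name succeeds. The Engine, Factory, Register, and New names are hypothetical stand-ins for this sketch, not Beluga AI's actual registry code.

```go
package main

import (
	"fmt"
	"sync"
)

// Engine is a hypothetical stand-in for a TTS engine interface.
type Engine interface {
	Name() string
}

// Factory builds an Engine. A real factory would also accept a config.
type Factory func() (Engine, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register stores a factory under a provider name.
// Provider packages would call this from their init() function.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[name] = f
}

// New looks up a registered factory by name and builds an engine.
func New(name string) (Engine, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("tts: unknown provider %q", name)
	}
	return f()
}

type fishEngine struct{}

func (fishEngine) Name() string { return "fish" }

// In a separate provider package, this init would run as a side effect
// of the blank import, before main starts.
func init() {
	Register("fish", func() (Engine, error) { return fishEngine{}, nil })
}

func main() {
	engine, err := New("fish")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("registered provider:", engine.Name())
}
```

Under this pattern, forgetting the blank import shows up as an "unknown provider" error at lookup time rather than as a compile error.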

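| Field  | Type           | Default   | Description                    |
|--------|----------------|-----------|--------------------------------|
| Voice  | string         | "default" | Reference ID for voice cloning |
| Format | AudioFormat    |           | Output audio format            |
| Extra  | map[string]any |           | See below                      |

Extra keys:

| Key      | Type   | Required | Description        |
|----------|--------|----------|--------------------|
| api_key  | string | Yes      | Fish Audio API key |
| base_url | string | No       | Override base URL  |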
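
To make the required and optional Extra keys concrete, here is a minimal sketch of how such a map might be validated. The parseExtra helper, the settings type, and the defaultBaseURL value are assumptions made for illustration; the provider's real parsing and default may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// defaultBaseURL is an assumed default for this sketch only.
const defaultBaseURL = "https://api.fish.audio/v1"

// settings holds the values extracted from the Extra map (hypothetical type).
type settings struct {
	APIKey  string
	BaseURL string
}

// parseExtra reads "api_key" (required) and "base_url" (optional) from extra.
// Missing or non-string values are treated as absent.
func parseExtra(extra map[string]any) (settings, error) {
	key, _ := extra["api_key"].(string)
	if key == "" {
		return settings{}, errors.New(`fish: "api_key" is required in Extra`)
	}
	base, _ := extra["base_url"].(string)
	if base == "" {
		base = defaultBaseURL
	}
	// Trim trailing slashes so request paths can be appended uniformly.
	return settings{APIKey: key, BaseURL: strings.TrimRight(base, "/")}, nil
}

func main() {
	s, err := parseExtra(map[string]any{
		"api_key":  "example-key",
		"base_url": "https://fish.internal.corp/v1/",
	})
	fmt.Println(s.BaseURL, err)

	_, err = parseExtra(map[string]any{})
	fmt.Println(err)
}
```

Because Extra is an untyped map, a misspelled key or a non-string value is only caught at runtime, which is why the sketch checks both.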

Basic usage:

    package main

    import (
        "context"
        "log"
        "os"

        "github.com/lookatitude/beluga-ai/voice/tts"
        _ "github.com/lookatitude/beluga-ai/voice/tts/providers/fish" // registers "fish"
    )

    func main() {
        ctx := context.Background()

        // Look up the "fish" provider in the registry and build an engine.
        engine, err := tts.New("fish", tts.Config{
            Voice: "default",
            Extra: map[string]any{"api_key": os.Getenv("FISH_API_KEY")},
        })
        if err != nil {
            log.Fatal(err)
        }

        // Synthesize the full text in a single call.
        audio, err := engine.Synthesize(ctx, "Hello, welcome to Beluga AI.")
        if err != nil {
            log.Fatal(err)
        }

        // Write the audio to disk; the extension should match the output format.
        if err := os.WriteFile("output.wav", audio, 0644); err != nil {
            log.Fatal(err)
        }
    }
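
To construct the engine directly, without going through the registry:

    import (
        "github.com/lookatitude/beluga-ai/voice/tts"
        "github.com/lookatitude/beluga-ai/voice/tts/providers/fish"
    )

    engine, err := fish.New(tts.Config{
        Voice: "custom-reference-id",
        Extra: map[string]any{"api_key": os.Getenv("FISH_API_KEY")},
    })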

The streaming interface synthesizes each text chunk independently:

    for chunk, err := range engine.SynthesizeStream(ctx, textStream) {
        if err != nil {
            log.Printf("error: %v", err)
            break
        }
        transport.Send(chunk)
    }

The engine can also be wrapped as a frame processor for use in a voice pipeline:

    processor := tts.AsFrameProcessor(engine, 24000)
    pipeline := voice.Chain(sttProcessor, llmProcessor, processor)

Fish Audio uses a reference ID system for voice cloning. Set the Voice field to the reference_id of a cloned voice to use it:

    engine, err := tts.New("fish", tts.Config{
        Voice: "your-cloned-voice-reference-id",
        Extra: map[string]any{"api_key": os.Getenv("FISH_API_KEY")},
    })

The voice and output format can also be overridden per request:

    audio, err := engine.Synthesize(ctx, "Hello!",
        tts.WithVoice("different-reference-id"),
        tts.WithFormat(tts.FormatMP3),
    )
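
Per-request overrides like these are typically built on Go's functional options pattern. The following is a minimal sketch of that pattern under assumed types: Config, Option, AudioFormat, and resolve are simplified stand-ins written for this example, and the real tts package may define them differently.

```go
package main

import "fmt"

// AudioFormat and its constants are simplified stand-ins for this sketch.
type AudioFormat string

const (
	FormatWAV AudioFormat = "wav"
	FormatMP3 AudioFormat = "mp3"
)

// Config holds the engine-level defaults (hypothetical, reduced to two fields).
type Config struct {
	Voice  string
	Format AudioFormat
}

// Option mutates a per-request copy of the config.
type Option func(*Config)

// WithVoice overrides the voice reference ID for one request.
func WithVoice(id string) Option {
	return func(c *Config) { c.Voice = id }
}

// WithFormat overrides the output format for one request.
func WithFormat(f AudioFormat) Option {
	return func(c *Config) { c.Format = f }
}

// resolve copies the defaults and applies the options in order,
// leaving the engine's own defaults untouched.
func resolve(defaults Config, opts ...Option) Config {
	cfg := defaults // struct copy, so defaults is not modified
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	defaults := Config{Voice: "default", Format: FormatWAV}
	req := resolve(defaults, WithVoice("different-reference-id"), WithFormat(FormatMP3))
	fmt.Printf("request: %+v\n", req)
	fmt.Printf("defaults: %+v\n", defaults)
}
```

Applying options to a copy means one request's override does not leak into later calls on the same engine.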

To send requests to a different endpoint, set base_url in Extra:

    engine, err := tts.New("fish", tts.Config{
        Voice: "default",
        Extra: map[string]any{
            "api_key":  os.Getenv("FISH_API_KEY"),
            "base_url": "https://fish.internal.corp/v1",
        },
    })