LLM Providers
Choosing an LLM provider involves tradeoffs between cost, latency, capability, data residency, and vendor lock-in. Beluga AI supports 22 LLM providers through a unified ChatModel interface so you can evaluate these tradeoffs without rewriting application code. All providers register via init() and are created through the same registry pattern — switching between providers requires changing only an import path and a configuration struct.
This design also enables multi-provider strategies: route complex queries to Claude or GPT-4o while sending simpler tasks to faster, cheaper models like Groq or local Ollama instances.
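Because every model satisfies the same interface, such routing can be expressed as ordinary Go. A minimal sketch, assuming the unified interface is exported as llm.ChatModel (name inferred from this page); the complexity score and threshold are illustrative, not part of Beluga:

```go
// pickModel returns the strong model for complex queries and the cheap
// model otherwise. Both values come from llm.New and share one interface.
func pickModel(strong, cheap llm.ChatModel, complexity float64) llm.ChatModel {
	if complexity > 0.7 { // illustrative threshold
		return strong
	}
	return cheap
}
```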
Provider Overview
| Provider | Registry Name | Models | Import Path |
|---|---|---|---|
| OpenAI | openai | GPT-4o, GPT-4, o1, o3 | llm/providers/openai |
| Anthropic | anthropic | Claude Opus 4, Sonnet 4.5, Haiku | llm/providers/anthropic |
| Google | google | Gemini 2.5, 2.0, 1.5 | llm/providers/google |
| Ollama | ollama | Llama 3, Mistral, Phi, any local | llm/providers/ollama |
| AWS Bedrock | bedrock | Claude, Titan, Llama via AWS | llm/providers/bedrock |
| Azure OpenAI | azure | GPT-4o, GPT-4 via Azure | llm/providers/azure |
| Groq | groq | Llama, Mixtral on Groq LPUs | llm/providers/groq |
| Together AI | together | Open-source models hosted | llm/providers/together |
| Fireworks | fireworks | Open-source models, fast inference | llm/providers/fireworks |
| Mistral | mistral | Mistral Large, Medium, Small | llm/providers/mistral |
| Cohere | cohere | Command R+, Command R | llm/providers/cohere |
| DeepSeek | deepseek | DeepSeek V3, R1 | llm/providers/deepseek |
| xAI | xai | Grok-2, Grok-3 | llm/providers/xai |
| Perplexity | perplexity | Sonar models with search | llm/providers/perplexity |
| OpenRouter | openrouter | Multi-provider routing | llm/providers/openrouter |
| Qwen | qwen | Qwen-2.5, QwQ | llm/providers/qwen |
| Cerebras | cerebras | Llama on Cerebras hardware | llm/providers/cerebras |
| SambaNova | sambanova | Llama, Mistral on SambaNova | llm/providers/sambanova |
| HuggingFace | huggingface | Inference API models | llm/providers/huggingface |
| LiteLLM | litellm | Proxy for 100+ providers | llm/providers/litellm |
| Llama.cpp | llama | Local GGUF models | llm/providers/llama |
| Bifrost | bifrost | Multi-provider gateway | llm/providers/bifrost |
Common Pattern
Every provider follows the same three-step pattern:
```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/lookatitude/beluga-ai/config"
	"github.com/lookatitude/beluga-ai/llm"
	"github.com/lookatitude/beluga-ai/schema"

	// 1. Import the provider — its init() registers with the llm registry
	_ "github.com/lookatitude/beluga-ai/llm/providers/openai"
)

func main() {
	ctx := context.Background()

	// 2. Create the model via the registry
	model, err := llm.New("openai", config.ProviderConfig{
		Model:  "gpt-4o",
		APIKey: "sk-...",
	})
	if err != nil {
		log.Fatal(err)
	}

	// 3. Use the unified ChatModel interface
	resp, err := model.Generate(ctx, []schema.Message{
		schema.NewUserMessage(schema.Text("What is Go?")),
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Content())
}
```
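To make the "only an import path and a configuration struct" claim concrete: pointing this same program at Claude instead of GPT-4o changes exactly two places (the model name is taken from the Anthropic section below). Everything after llm.New is unchanged.

```go
// Swap the blank import...
import _ "github.com/lookatitude/beluga-ai/llm/providers/anthropic"

// ...and the registry name plus config.
model, err := llm.New("anthropic", config.ProviderConfig{
	Model:  "claude-sonnet-4-5-20250929",
	APIKey: os.Getenv("ANTHROPIC_API_KEY"),
})
```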
OpenAI
OpenAI is the default provider for GPT-4o, o1, and o3 models.
```sh
export OPENAI_API_KEY="sk-..."
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/openai"

model, err := llm.New("openai", config.ProviderConfig{
	Model:  "gpt-4o",
	APIKey: os.Getenv("OPENAI_API_KEY"),
})
```

Available models: gpt-4o, gpt-4o-mini, gpt-4-turbo, o1, o1-mini, o3-mini
Options:
| Key | Type | Description |
|---|---|---|
| temperature | float64 | Sampling temperature (0.0-2.0) |
| max_tokens | float64 | Maximum response tokens |
| top_p | float64 | Nucleus sampling |
| frequency_penalty | float64 | Repetition penalty |
model, err := llm.New("openai", config.ProviderConfig{ Model: "gpt-4o", APIKey: os.Getenv("OPENAI_API_KEY"), Options: map[string]any{ "temperature": 0.7, "max_tokens": 4096.0, },})Anthropic
Anthropic
Anthropic provides the Claude family of models with native tool use and extended context windows.
```sh
export ANTHROPIC_API_KEY="sk-ant-..."
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/anthropic"

model, err := llm.New("anthropic", config.ProviderConfig{
	Model:  "claude-sonnet-4-5-20250929",
	APIKey: os.Getenv("ANTHROPIC_API_KEY"),
})
```

Available models: claude-opus-4-6, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001
Anthropic uses a dedicated SDK rather than the OpenAI-compatible wrapper, which provides native support for Claude’s extended thinking, tool use, and content block streaming.
Google Gemini
Google provides Gemini models with multimodal capabilities.
```sh
export GOOGLE_API_KEY="AIza..."
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/google"

model, err := llm.New("google", config.ProviderConfig{
	Model:  "gemini-2.5-pro",
	APIKey: os.Getenv("GOOGLE_API_KEY"),
})
```

Available models: gemini-2.5-pro, gemini-2.5-flash, gemini-2.0-flash, gemini-1.5-pro
Ollama (Local Models)
Ollama runs open-source models locally with no API key required.
```sh
# Start Ollama server
ollama serve
ollama pull llama3.1
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/ollama"

model, err := llm.New("ollama", config.ProviderConfig{
	Model:   "llama3.1",
	BaseURL: "http://localhost:11434/v1",
})
```

Ollama uses the OpenAI-compatible API format, so all standard options apply. No API key is needed for local usage.
AWS Bedrock
AWS Bedrock provides access to multiple model families through your AWS account.
```sh
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/bedrock"

model, err := llm.New("bedrock", config.ProviderConfig{
	Model: "anthropic.claude-sonnet-4-5-20250929-v1:0",
	Options: map[string]any{
		"region": "us-east-1",
	},
})
```

Bedrock reads AWS credentials from the standard AWS credential chain (environment variables, shared config, IAM role).
Azure OpenAI
Azure OpenAI provides GPT models via your Azure subscription with enterprise data residency.
```sh
export AZURE_OPENAI_API_KEY="..."
export AZURE_OPENAI_ENDPOINT="https://my-resource.openai.azure.com"
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/azure"

model, err := llm.New("azure", config.ProviderConfig{
	Model:   "gpt-4o",
	APIKey:  os.Getenv("AZURE_OPENAI_API_KEY"),
	BaseURL: os.Getenv("AZURE_OPENAI_ENDPOINT"),
	Options: map[string]any{
		"api_version":     "2024-06-01",
		"deployment_name": "my-gpt4o-deployment",
	},
})
```

Groq
Groq provides ultra-fast inference on custom LPU hardware.
```sh
export GROQ_API_KEY="gsk_..."
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/groq"

model, err := llm.New("groq", config.ProviderConfig{
	Model:  "llama-3.1-70b-versatile",
	APIKey: os.Getenv("GROQ_API_KEY"),
})
```
Together AI
Together AI hosts open-source models with an OpenAI-compatible API.
```sh
export TOGETHER_API_KEY="..."
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/together"

model, err := llm.New("together", config.ProviderConfig{
	Model:  "meta-llama/Llama-3.1-70B-Instruct-Turbo",
	APIKey: os.Getenv("TOGETHER_API_KEY"),
})
```
Fireworks
Fireworks provides fast inference for open-source models.
```sh
export FIREWORKS_API_KEY="..."
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/fireworks"

model, err := llm.New("fireworks", config.ProviderConfig{
	Model:  "accounts/fireworks/models/llama-v3p1-70b-instruct",
	APIKey: os.Getenv("FIREWORKS_API_KEY"),
})
```
Mistral
Mistral AI provides European-hosted models.
```sh
export MISTRAL_API_KEY="..."
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/mistral"

model, err := llm.New("mistral", config.ProviderConfig{
	Model:  "mistral-large-latest",
	APIKey: os.Getenv("MISTRAL_API_KEY"),
})
```
DeepSeek
DeepSeek provides high-performance reasoning models.
```sh
export DEEPSEEK_API_KEY="..."
```

```go
import _ "github.com/lookatitude/beluga-ai/llm/providers/deepseek"

model, err := llm.New("deepseek", config.ProviderConfig{
	Model:  "deepseek-chat",
	APIKey: os.Getenv("DEEPSEEK_API_KEY"),
})
```
Streaming
All providers support streaming via iter.Seq2:
```go
for chunk, err := range model.Stream(ctx, messages) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(chunk.Delta)
}
```
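A common wrapper drains the stream into a single string. A minimal sketch, assuming chunk.Delta carries the incremental text as in the loop above, and that the unified interface is exported as llm.ChatModel:

```go
import (
	"context"
	"strings"

	"github.com/lookatitude/beluga-ai/llm"
	"github.com/lookatitude/beluga-ai/schema"
)

// collect accumulates streamed deltas into one string.
func collect(ctx context.Context, model llm.ChatModel, msgs []schema.Message) (string, error) {
	var sb strings.Builder
	for chunk, err := range model.Stream(ctx, msgs) {
		if err != nil {
			return "", err // partial output is discarded on error
		}
		sb.WriteString(chunk.Delta)
	}
	return sb.String(), nil
}
```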
Middleware
Add cross-cutting concerns to any provider using middleware:
```go
model = llm.ApplyMiddleware(model,
	llm.WithLogging(slog.Default()),
	llm.WithRetry(3),
)
```

Middleware is composable and applies to Generate, Stream, and BindTools uniformly across all providers.
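Because middleware wraps the interface rather than a specific provider, the same stack can be applied to every model an application creates. A sketch using only the helpers shown above; the llm.ChatModel name is inferred from this page:

```go
// newResilient creates any registered provider and wraps it in a shared
// logging-and-retry stack before returning it.
func newResilient(name string, cfg config.ProviderConfig) (llm.ChatModel, error) {
	model, err := llm.New(name, cfg)
	if err != nil {
		return nil, err
	}
	return llm.ApplyMiddleware(model,
		llm.WithLogging(slog.Default()),
		llm.WithRetry(3),
	), nil
}
```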
Discovering Providers at Runtime
List all registered providers:
```go
providers := llm.List()
fmt.Println("Available providers:", providers)
```

This returns all providers whose import-side-effect init() has run. Use this for dynamic provider selection in configuration-driven applications.
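For example, an application can validate a provider name chosen at deploy time before constructing it. A sketch, assuming llm.List returns a []string of registry names; the BELUGA_PROVIDER variable is hypothetical:

```go
name := os.Getenv("BELUGA_PROVIDER") // hypothetical env var, e.g. "openai"
if !slices.Contains(llm.List(), name) {
	log.Fatalf("provider %q is not registered; is its package imported?", name)
}
model, err := llm.New(name, cfg)
```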
Configuration Reference
All providers accept config.ProviderConfig:
| Field | Type | Description |
|---|---|---|
| Provider | string | Registry name (e.g. "openai") |
| APIKey | string | Authentication key |
| Model | string | Model identifier |
| BaseURL | string | Override default API endpoint |
| Timeout | time.Duration | Request timeout (default: 30s) |
| Options | map[string]any | Provider-specific options |
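Putting the fields together (all values illustrative):

```go
cfg := config.ProviderConfig{
	Provider: "openai",
	APIKey:   os.Getenv("OPENAI_API_KEY"),
	Model:    "gpt-4o-mini",
	BaseURL:  "https://api.openai.com/v1", // optional endpoint override
	Timeout:  60 * time.Second,            // overrides the 30s default
	Options:  map[string]any{"temperature": 0.2},
}
```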
OpenAI-Compatible Providers
Most providers (18 of 22) use Beluga’s shared internal/openaicompat package, which means they accept the same Options keys: temperature, max_tokens, top_p, frequency_penalty, presence_penalty, stop, and response_format. The only providers with custom implementations are Anthropic, Google, Bedrock, and Bifrost.
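In practice, an Options map written for one compatible provider carries over to another unchanged. A sketch against Groq; the value shape for stop (and likewise response_format) follows common OpenAI-style conventions and is an assumption here:

```go
model, err := llm.New("groq", config.ProviderConfig{
	Model:  "llama-3.1-70b-versatile",
	APIKey: os.Getenv("GROQ_API_KEY"),
	Options: map[string]any{
		"temperature": 0.3,
		"max_tokens":  1024.0,
		"top_p":       0.9,
		"stop":        []string{"\n\n"}, // assumed value shape
	},
})
```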