LLM Providers — 22 Models

Beluga AI provides a unified llm.ChatModel interface across 22 LLM providers. Every provider registers itself via init(), so a blank import is sufficient to make it available through the registry.

This unified interface means you can switch providers by changing a single line of configuration — your application code, middleware, and hooks work identically across all providers. Start with any provider during development and switch to another for production without code changes.

All providers implement the same interface:

```go
type ChatModel interface {
	Generate(ctx context.Context, msgs []schema.Message, opts ...GenerateOption) (*schema.AIMessage, error)
	Stream(ctx context.Context, msgs []schema.Message, opts ...GenerateOption) iter.Seq2[schema.StreamChunk, error]
	BindTools(tools []schema.ToolDefinition) ChatModel
	ModelID() string
}
```

You can instantiate any provider in two ways:

Via the registry (recommended for dynamic configuration):

```go
import (
	"os"

	"github.com/lookatitude/beluga-ai/config"
	"github.com/lookatitude/beluga-ai/llm"
	_ "github.com/lookatitude/beluga-ai/llm/providers/openai"
)

model, err := llm.New("openai", config.ProviderConfig{
	Model:  "gpt-4o",
	APIKey: os.Getenv("OPENAI_API_KEY"),
})
```

Via direct construction (for compile-time type safety):

```go
import (
	"os"

	"github.com/lookatitude/beluga-ai/config"
	"github.com/lookatitude/beluga-ai/llm/providers/openai"
)

model, err := openai.New(config.ProviderConfig{
	Model:  "gpt-4o",
	APIKey: os.Getenv("OPENAI_API_KEY"),
})
```

All providers accept config.ProviderConfig:

| Field | Type | Description |
| --- | --- | --- |
| `Provider` | `string` | Registered provider name (e.g. `"openai"`) |
| `APIKey` | `string` | Authentication key |
| `Model` | `string` | Model identifier (e.g. `"gpt-4o"`) |
| `BaseURL` | `string` | Override the default API endpoint |
| `Timeout` | `time.Duration` | Maximum request duration (default: 30s) |
| `Options` | `map[string]any` | Provider-specific key-value configuration |
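As an illustration, a config that points a provider at a private proxy endpoint and raises the default timeout might look like the fragment below. The `BaseURL` and the `Options` key shown are hypothetical, purely to illustrate the shape of the struct:

```go
cfg := config.ProviderConfig{
	Provider: "openai",
	Model:    "gpt-4o",
	APIKey:   os.Getenv("OPENAI_API_KEY"),
	BaseURL:  "https://llm-proxy.internal.example.com/v1", // hypothetical proxy endpoint
	Timeout:  60 * time.Second,                            // raise the 30s default
	Options: map[string]any{
		// provider-specific knobs; this key is illustrative only
		"organization": "org-example",
	},
}
```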

All providers support the same set of generation options passed via functional options:

```go
resp, err := model.Generate(ctx, msgs,
	llm.WithTemperature(0.7),
	llm.WithMaxTokens(1024),
	llm.WithTopP(0.9),
	llm.WithStopSequences("END"),
	llm.WithToolChoice(llm.ToolChoiceAuto),
	llm.WithResponseFormat(llm.ResponseFormat{Type: "json_object"}),
)
```
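The `With*` helpers follow Go's functional-options pattern: each returns a closure that mutates a private config struct, and `Generate` folds them over the defaults. A self-contained sketch of how such options are typically implemented (names and defaults are simplified stand-ins, not Beluga's actual internals):

```go
package main

import "fmt"

// generateConfig collects the settings applied by the options below.
type generateConfig struct {
	Temperature float64
	MaxTokens   int
	Stop        []string
}

// GenerateOption mutates a generateConfig; each With* helper returns one.
type GenerateOption func(*generateConfig)

func WithTemperature(t float64) GenerateOption {
	return func(c *generateConfig) { c.Temperature = t }
}

func WithMaxTokens(n int) GenerateOption {
	return func(c *generateConfig) { c.MaxTokens = n }
}

func WithStopSequences(s ...string) GenerateOption {
	return func(c *generateConfig) { c.Stop = append(c.Stop, s...) }
}

// applyOptions folds the options over defaults, as a Generate call would.
func applyOptions(opts ...GenerateOption) generateConfig {
	cfg := generateConfig{Temperature: 1.0, MaxTokens: 2048} // illustrative defaults
	for _, o := range opts {
		o(&cfg)
	}
	return cfg
}

func main() {
	cfg := applyOptions(WithTemperature(0.7), WithMaxTokens(1024), WithStopSequences("END"))
	fmt.Printf("%+v\n", cfg)
}
```

Because options are plain values, they can be built up in slices and reused across calls.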

These providers use their vendor’s native SDK and offer the deepest feature integration, including provider-specific capabilities like vision, prompt caching, and extended context windows:

| Provider | Registry Name | Description |
| --- | --- | --- |
| OpenAI | `openai` | GPT-4o, GPT-4, o1, o3 models |
| Anthropic | `anthropic` | Claude 4.5, Claude 4 models |
| Google | `google` | Gemini 2.5, Gemini 2.0 models |
| Azure OpenAI | `azure` | OpenAI models hosted on Azure |
| AWS Bedrock | `bedrock` | Multi-provider models via AWS |
| Mistral | `mistral` | Mistral Large, Codestral models |
| Cohere | `cohere` | Command R+ models |

These providers expose an OpenAI-compatible API and share a common implementation layer via Beluga’s internal/openaicompat package. They all support streaming, tool calling, and structured output through the same code path:

| Provider | Registry Name | Description |
| --- | --- | --- |
| Groq | `groq` | Ultra-fast inference with LPU hardware |
| Together AI | `together` | Open-source model hosting |
| Fireworks AI | `fireworks` | Fast inference for open models |
| DeepSeek | `deepseek` | DeepSeek V3, R1 reasoning models |
| OpenRouter | `openrouter` | Multi-provider routing gateway |
| Perplexity | `perplexity` | Search-augmented generation |
| HuggingFace | `huggingface` | Inference API for hosted models |
| xAI | `xai` | Grok models |
| Qwen | `qwen` | Alibaba Qwen models via DashScope |
| SambaNova | `sambanova` | High-throughput inference |
| Cerebras | `cerebras` | Wafer-scale inference |
| Ollama | `ollama` | Local model serving |
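For example, local development against Ollama needs no API key. The fragment below assumes Ollama's default endpoint (`http://localhost:11434`) and a locally pulled model name; the provider import path is an assumption that follows the same pattern as the OpenAI provider shown earlier:

```go
import (
	"github.com/lookatitude/beluga-ai/config"
	"github.com/lookatitude/beluga-ai/llm"
	_ "github.com/lookatitude/beluga-ai/llm/providers/ollama"
)

model, err := llm.New("ollama", config.ProviderConfig{
	Model:   "llama3.1",               // any model pulled locally via `ollama pull`
	BaseURL: "http://localhost:11434", // Ollama's default endpoint
})
```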

These providers delegate to other providers or gateways, adding a routing or abstraction layer between your application and the underlying LLM service:

| Provider | Registry Name | Description |
| --- | --- | --- |
| Llama | `llama` | Meta Llama models via any backend |
| Bifrost | `bifrost` | LLM gateway with load balancing |
| LiteLLM | `litellm` | Universal LLM proxy (100+ models) |

Recommended providers by use case:

| Use Case | Recommended Provider | Why |
| --- | --- | --- |
| General-purpose default | OpenAI | Broadest ecosystem, mature tooling |
| Strong reasoning and safety | Anthropic | Large context, prompt caching |
| Multimodal (text + images + video) | Google | Long context, Google Cloud integration |
| Enterprise Azure compliance | Azure OpenAI | Private networking, AAD, SLAs |
| AWS-native deployment | AWS Bedrock | IAM roles, multi-provider catalog |
| Lowest inference latency | Groq, Cerebras | Custom hardware, fastest tokens/sec |
| Local/offline development | Ollama | No API key, no network required |
| Search-augmented answers | Perplexity | Built-in web search |
| Model comparison and evaluation | OpenRouter | Single API key, hundreds of models |
| Infrastructure-level LLM management | LiteLLM | Spend tracking, rate limiting, proxy |

All providers support the same middleware for cross-cutting concerns:

```go
model = llm.ApplyMiddleware(model,
	llm.WithLogging(logger),
	llm.WithFallback(backupModel),
)
```

See the LLM middleware guide for details.