The Go Framework for
Production AI Agents
Build, orchestrate, and deploy agentic AI systems with streaming-first design,
22+ LLM providers, and enterprise-grade infrastructure.
One go get away.
model, _ := llm.New("openai", llm.ProviderConfig{Model: "gpt-4o"})
agent := agent.New("researcher",
agent.WithLLM(model),
agent.WithTools(webSearch, calculator),
agent.WithMemory(memory.NewSemantic(embedder, store)),
)
for event, err := range agent.Stream(ctx, "Analyze Q4 earnings") {
fmt.Print(event.Text())
}

AI agent frameworks were built for Python. Your production stack runs Go.
Go teams building AI products are forced to wrap Python services, use incomplete Go libraries, or build from scratch. The Go AI ecosystem is fragmented — no single framework covers the full stack from LLM abstraction to production deployment.
One framework. Every building block. Pure Go.
Beluga AI provides streaming LLM abstraction, agent runtimes with pluggable reasoning, RAG pipelines, voice processing, guardrails, durable workflows, and protocol interoperability (MCP + A2A) — all idiomatic Go, all composable, all streaming-first.
Core capabilities
Everything you need to build, deploy, and operate agentic AI systems.
Agent Runtime & Reasoning
Build agents with pluggable reasoning — ReAct, Tree-of-Thought, LATS, Reflexion, and more. Handoffs-as-tools enable multi-agent collaboration with zero boilerplate.
LLM Abstraction & Routing
Unified ChatModel interface across 22+ providers with intelligent routing, structured output, context window management, and provider-aware rate limiting.
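The routing idea is easy to picture in plain Go. A minimal sketch — stdlib only, with a hypothetical `ChatModel` interface and mock providers, not Beluga's actual API — tries providers in priority order and falls through when one is rate limited:

```go
package main

import (
	"errors"
	"fmt"
)

// ChatModel is a minimal stand-in for a unified provider interface.
type ChatModel interface {
	Name() string
	Generate(prompt string) (string, error)
}

// mockModel simulates a provider that either answers or fails.
type mockModel struct {
	name string
	fail bool
}

func (m mockModel) Name() string { return m.name }
func (m mockModel) Generate(prompt string) (string, error) {
	if m.fail {
		return "", errors.New(m.name + ": rate limited")
	}
	return m.name + " says: pong", nil
}

// route tries each provider in priority order and falls through on error.
func route(models []ChatModel, prompt string) (string, error) {
	var lastErr error
	for _, m := range models {
		out, err := m.Generate(prompt)
		if err == nil {
			return out, nil
		}
		lastErr = err
	}
	return "", lastErr
}

func main() {
	models := []ChatModel{
		mockModel{name: "openai", fail: true}, // simulate a rate-limited primary
		mockModel{name: "anthropic"},          // healthy fallback
	}
	out, err := route(models, "ping")
	fmt.Println(out, err) // anthropic says: pong <nil>
}
```

A production router would also weigh cost, latency, and context-window fit per provider; the fall-through loop above is just the skeleton.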
RAG Pipeline
Hybrid retrieval combining dense vectors, BM25, and graph traversal with RRF. Advanced strategies: CRAG, Adaptive RAG, HyDE, GraphRAG.
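Reciprocal Rank Fusion itself is simple: each ranked list contributes 1/(k + rank) to a document's score, conventionally with k = 60. A stdlib-only sketch, independent of any Beluga API:

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges ranked result lists with Reciprocal Rank Fusion:
// score(doc) = sum over lists of 1 / (k + rank), with rank starting at 1.
func rrfFuse(k float64, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for rank, doc := range list {
			scores[doc] += 1.0 / (k + float64(rank+1))
		}
	}
	docs := make([]string, 0, len(scores))
	for doc := range scores {
		docs = append(docs, doc)
	}
	// Highest fused score first.
	sort.Slice(docs, func(i, j int) bool { return scores[docs[i]] > scores[docs[j]] })
	return docs
}

func main() {
	dense := []string{"docA", "docB", "docC"} // dense-vector ranking
	bm25 := []string{"docB", "docD", "docA"}  // keyword (BM25) ranking
	fmt.Println(rrfFuse(60, dense, bm25))     // → [docB docA docD docC]
}
```

Because RRF only uses ranks, not raw scores, it fuses vector similarity, BM25, and graph-traversal results without any score normalization — which is why it pairs well with hybrid retrieval.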
Voice Pipeline
Frame-based STT to LLM to TTS processing with sub-800ms latency. Speech-to-speech modes, Silero VAD, semantic turn detection, WebRTC/LiveKit transports.
Guardrails & Safety
Three-stage guard pipeline with prompt injection detection, PII filtering, and capability-based sandboxing with default-deny policies.
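The shape of such a pipeline can be sketched in plain Go. The guard and allowlist below are toy illustrations (stdlib only, not Beluga's API) — real detectors are far more involved — but they show the two key ideas: guards that can veto content at each stage, and default-deny tool policies:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Guard inspects content at one stage of the pipeline and may reject it.
type Guard func(content string) error

// runGuards applies each guard in order; the first error blocks the request.
func runGuards(guards []Guard, content string) error {
	for _, g := range guards {
		if err := g(content); err != nil {
			return err
		}
	}
	return nil
}

// piiGuard is a toy PII filter: block anything that mentions an SSN.
func piiGuard(content string) error {
	if strings.Contains(content, "SSN") {
		return errors.New("blocked: possible PII")
	}
	return nil
}

// toolAllowlist implements a default-deny policy for tool calls.
func toolAllowlist(allowed ...string) func(tool string) error {
	set := map[string]bool{}
	for _, t := range allowed {
		set[t] = true
	}
	return func(tool string) error {
		if !set[tool] { // default-deny: anything not explicitly allowed is blocked
			return fmt.Errorf("tool %q denied", tool)
		}
		return nil
	}
}

func main() {
	inputGuards := []Guard{piiGuard}
	fmt.Println(runGuards(inputGuards, "My SSN is 123-45-6789")) // blocked: possible PII
	check := toolAllowlist("web_search")
	fmt.Println(check("shell_exec")) // denied: not on the allowlist
}
```

In a three-stage design the same `Guard` shape runs over the user input, the model output, and each tool invocation, so one veto anywhere stops the request.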
Protocols & Interop
First-class MCP server/client for tool ecosystems and A2A for agent-to-agent communication. REST, SSE, gRPC, and WebSocket transports.
Layered architecture. Clean separation of concerns.
Dependencies flow downward. Every layer is independently extensible.
Foundation Core primitives and shared types
Core primitives, shared schema types, configuration loading with hot-reload, and OpenTelemetry observability using GenAI semantic conventions. No external dependencies beyond the standard library and OTel.
Capability LLM, Agents, RAG, Voice, and more
LLM abstraction with router and structured output, agent runtime with planners and handoffs, tool system, MemGPT 3-tier memory, RAG pipeline with hybrid retrieval, and frame-based voice processing.
Infrastructure Safety, resilience, and operations
Three-stage guard pipeline for input/output/tool safety, circuit breaker and retry resilience, exact and semantic caching, RBAC/ABAC auth, human-in-the-loop approval, and durable workflow execution.
Protocol Interoperability and transport
MCP server and client for tool interoperability, A2A for agent-to-agent communication, REST/SSE endpoints, gRPC services, and HTTP framework adapters for Gin, Fiber, Echo, and Chi.
See it in action
From simple agents to multi-agent systems and voice pipelines, all in a few lines of Go.

Streaming agent
model, _ := llm.New("openai", llm.ProviderConfig{
APIKey: os.Getenv("OPENAI_API_KEY"),
Model: "gpt-4o",
})
agent := agent.New("assistant",
agent.WithLLM(model),
agent.WithTools(webSearch, calculator),
)
for event, err := range agent.Stream(ctx, "Research GPU trends") {
fmt.Print(event.Text())
}

RAG pipeline

embedder, _ := embedding.New("openai", embedding.Config{Model: "text-embedding-3-small"})
store, _ := vectorstore.New("pgvector", vectorstore.Config{DSN: pgDSN})
retriever := retriever.NewHybrid(store, retriever.WithBM25(), retriever.WithRRF(60))
docs := loader.LoadDir("./knowledge-base/")
splitter := splitter.NewRecursive(splitter.WithChunkSize(512))
chunks := splitter.Split(docs)
store.Add(ctx, embedder.EmbedDocuments(ctx, chunks))
agent := agent.New("researcher", agent.WithLLM(model), agent.WithRetriever(retriever))
for event, err := range agent.Stream(ctx, "What are our Q4 results?") {
fmt.Print(event.Text())
}

Multi-agent handoffs

billing := agent.New("billing-specialist",
agent.WithLLM(model),
agent.WithTools(lookupInvoice, processRefund),
)
shipping := agent.New("shipping-specialist",
agent.WithLLM(model),
agent.WithTools(trackPackage, updateAddress),
)
triage := agent.New("triage",
agent.WithLLM(model),
agent.WithHandoffs(billing, shipping),
agent.WithInstructions("Route customer requests to the right specialist."),
)
for event, err := range triage.Stream(ctx, "Where is my order #12345?") {
fmt.Print(event.Text())
}

Voice pipeline

stt, _ := stt.New("deepgram", stt.Config{Model: "nova-3"})
tts, _ := tts.New("elevenlabs", tts.Config{Voice: "aria"})
vad := vad.NewSilero(vad.WithThreshold(0.5))
pipeline := voice.NewPipeline(
voice.WithSTT(stt),
voice.WithLLM(model),
voice.WithTTS(tts),
voice.WithVAD(vad),
)
transport := transport.NewWebSocket(":8080")
pipeline.Start(ctx, transport)

Extensive provider ecosystem
108 integrations across 12 categories. All pluggable via the registry pattern.
Go is the production language for AI infrastructure
15,000+ RPS
with p95 < 30ms. Compiled to machine code. No GIL.
Single binary
No runtime dependencies to manage. Minimal container images.
Born for K8s
Kubernetes, Docker, and Terraform are all written in Go. Your AI agents should be too.
Start building AI agents in 5 minutes
go get github.com/lookatitude/beluga-ai
export OPENAI_API_KEY=sk-...
go run main.go