What Beluga can do.
A complete agent stack organized around three concerns. Build is where agents come from. Know is what agents remember and retrieve. Ship is how agents behave in production.
Agents that reason, act, and recover.
The runtime runs a Plan → Act → Observe → Replan loop on every turn. Eight planning strategies share one Planner interface — swap them with a config change. LLM calls route across 22 providers through a unified ChatModel. A tool is any Go function wrapped with a schema, and handoffs between agents are auto-generated tools.
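The loop is easy to picture in plain Go. A minimal, illustrative sketch (the `Planner`, `Step`, and `Run` names here are stand-ins, not Beluga's actual API):

```go
package main

import "fmt"

// Step is what a planner decides to do next: call a tool, or finish.
type Step struct {
	Tool  string // empty means the plan is complete
	Input string
	Final string
}

// Planner maps the conversation so far to the next step.
// Multiple strategies can sit behind an interface like this.
type Planner interface {
	Plan(history []string) Step
}

// echoPlanner is a stub: call the "search" tool once, then finish.
type echoPlanner struct{}

func (echoPlanner) Plan(history []string) Step {
	if len(history) == 0 {
		return Step{Tool: "search", Input: "beluga patterns"}
	}
	return Step{Final: "done: " + history[len(history)-1]}
}

// Run drives the Plan -> Act -> Observe -> Replan loop.
func Run(p Planner, tools map[string]func(string) string) string {
	var history []string
	for {
		step := p.Plan(history) // Plan
		if step.Tool == "" {
			return step.Final // plan complete
		}
		obs := tools[step.Tool](step.Input) // Act
		history = append(history, obs)      // Observe, then loop to Replan
	}
}

func main() {
	tools := map[string]func(string) string{
		"search": func(q string) string { return "results for " + q },
	}
	fmt.Println(Run(echoPlanner{}, tools)) // done: results for beluga patterns
}
```

Swapping strategies means swapping the `Planner` value; the loop never changes.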
import (
    "github.com/lookatitude/beluga-ai/agent"
    "github.com/lookatitude/beluga-ai/llm"
    "github.com/lookatitude/beluga-ai/tool"

    _ "github.com/lookatitude/beluga-ai/llm/providers/anthropic"
)
model, _ := llm.New("anthropic", llm.Config{Model: "claude-sonnet-4-6"})
researchAgent, _ := agent.New(ctx,
    agent.WithLLM(model),
    agent.WithPersona("senior researcher, cites sources"),
    agent.WithPlanner("react"),
    agent.WithTools(tool.Must(tool.HTTPFetch()), tool.Must(tool.MarkdownParser())),
)
stream, _ := researchAgent.Stream(ctx, "summarise streaming-first patterns in Beluga")
for ev, err := range stream.Range {
    if err != nil {
        break
    }
    ev.Render()
}
Memory that persists. Retrieval that finds the right thing.
Three-tier memory — working, recall, archival — with graph-store support. RAG uses hybrid retrieval: BM25, dense vector, and graph traversal, fused with Reciprocal Rank Fusion. Strategies include CRAG, Adaptive RAG, HyDE, and GraphRAG. Thirteen vector-store backends, nine embedding providers, eight memory stores.
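Reciprocal Rank Fusion itself is only a few lines: each document scores the sum of 1/(k + rank) over the lists that return it, so agreement between rankers beats a high position in any single one. A self-contained sketch in plain Go (illustrative, not Beluga's internal implementation):

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges ranked result lists with Reciprocal Rank Fusion.
// Each document scores sum(1 / (k + rank)) across the lists that
// return it; k damps the influence of any single ranker (60 is the
// value from the original RRF paper).
func rrfFuse(k int, lists ...[]string) []string {
	scores := map[string]float64{}
	for _, list := range lists {
		for rank, doc := range list {
			scores[doc] += 1.0 / float64(k+rank+1) // ranks are 1-based
		}
	}
	docs := make([]string, 0, len(scores))
	for d := range scores {
		docs = append(docs, d)
	}
	sort.Slice(docs, func(i, j int) bool { return scores[docs[i]] > scores[docs[j]] })
	return docs
}

func main() {
	bm25 := []string{"doc-a", "doc-b", "doc-c"}  // keyword ranking
	dense := []string{"doc-c", "doc-a", "doc-d"} // vector ranking
	fmt.Println(rrfFuse(60, bm25, dense)) // [doc-a doc-c doc-b doc-d]
}
```

doc-a wins because both rankers return it near the top, even though dense retrieval alone preferred doc-c.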
import (
    "github.com/lookatitude/beluga-ai/rag/retriever"
    "github.com/lookatitude/beluga-ai/rag/vectorstore"

    _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/openai"
    _ "github.com/lookatitude/beluga-ai/rag/vectorstore/providers/pgvector"
)
store, _ := vectorstore.New("pgvector", vectorstore.Config{
    DSN:       "postgres://beluga@db:5432/kb",
    Dimension: 1536,
})
// Hybrid retrieval: BM25 + dense vector, fused with RRF.
hybrid := retriever.Hybrid(
    retriever.BM25(store, retriever.WithK(40)),
    retriever.Dense(store, retriever.WithK(40)),
    retriever.RRF(60),
)
docs, err := hybrid.Retrieve(ctx, "how does crash-durable workflow replay work?")
Production defaults, not production afterthoughts.
The guard pipeline runs three stages — Input, Output, Tool — around every LLM interaction. Circuit breakers, rate limits, and retries are middleware on the same interface as your LLM call. OTel GenAI spans emit from 17 packages at every boundary. Durable workflows replay from an event log. Cost tracking attributes every token to a team.
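The composition pattern is the standard Go decorator over a single interface: every middleware takes a model and returns a model. A self-contained sketch with illustrative names (Beluga's real middleware signatures may differ):

```go
package main

import (
	"errors"
	"fmt"
)

// ChatModel is the one interface every middleware wraps.
type ChatModel interface {
	Chat(prompt string) (string, error)
}

// Middleware decorates a ChatModel with extra behavior.
type Middleware func(ChatModel) ChatModel

// modelFunc lets a plain function satisfy ChatModel.
type modelFunc func(string) (string, error)

func (f modelFunc) Chat(p string) (string, error) { return f(p) }

// WithRetry re-issues the call up to n times on error.
func WithRetry(n int) Middleware {
	return func(next ChatModel) ChatModel {
		return modelFunc(func(p string) (string, error) {
			var err error
			for i := 0; i < n; i++ {
				var out string
				if out, err = next.Chat(p); err == nil {
					return out, nil
				}
			}
			return "", err
		})
	}
}

// Apply wraps outside-in: the first middleware listed runs first.
func Apply(base ChatModel, mws ...Middleware) ChatModel {
	for i := len(mws) - 1; i >= 0; i-- {
		base = mws[i](base)
	}
	return base
}

func main() {
	calls := 0
	flaky := modelFunc(func(p string) (string, error) {
		calls++
		if calls < 3 {
			return "", errors.New("transient")
		}
		return "ok: " + p, nil
	})
	out, err := Apply(flaky, WithRetry(3)).Chat("hello")
	fmt.Println(out, err, calls) // ok: hello <nil> 3
}
```

Guardrails, tracing, and rate limiting slot in as more `Middleware` values; the caller still sees a plain `ChatModel`.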
// Resilience + observability + safety compose on the same interface.
// Read outside-in: guardrails wrap tracing wraps rate-limit wraps retry.
safeModel := llm.ApplyMiddleware(base,
    llm.WithGuardrails(pipeline),     // input + output + tool guards
    llm.WithTracing(),                // gen_ai.* OTel spans at every boundary
    llm.WithRateLimit(60, 150000),    // 60 req/min, 150k tok/min
    llm.WithRetry(3),                 // respects core.IsRetryable()
    llm.WithCostTracking(costCenter),
)
Frame-based voice, built in.
STT → LLM → TTS as a typed pipeline. Six STT providers, seven TTS providers, three speech-to-speech providers. LiveKit, Daily, and Pipecat transports. Silero and WebRTC VAD. No other Go agent framework includes this.
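"Frame-based" means typed units of audio and text flow through composable stages. The shape of such a pipeline, sketched with plain channels and stub stages (these frame types are illustrative, not Beluga's):

```go
package main

import (
	"fmt"
	"strings"
)

// Frames are the typed units flowing through a voice pipeline.
type AudioFrame struct{ Samples []int16 }
type TextFrame struct{ Text string }

// stt converts audio frames to text frames (stubbed here; a real
// stage would call a speech-to-text provider).
func stt(in <-chan AudioFrame) <-chan TextFrame {
	out := make(chan TextFrame)
	go func() {
		defer close(out)
		for range in {
			out <- TextFrame{Text: "hello"}
		}
	}()
	return out
}

// llmStage uppercases as a stand-in for model inference; a TTS
// stage on the end would map TextFrame back to AudioFrame.
func llmStage(in <-chan TextFrame) <-chan TextFrame {
	out := make(chan TextFrame)
	go func() {
		defer close(out)
		for f := range in {
			out <- TextFrame{Text: strings.ToUpper(f.Text)}
		}
	}()
	return out
}

func main() {
	audio := make(chan AudioFrame, 1)
	audio <- AudioFrame{Samples: make([]int16, 160)}
	close(audio)
	for f := range llmStage(stt(audio)) {
		fmt.Println(f.Text) // HELLO
	}
}
```

Because every stage speaks frames, barge-in is just a control frame flushing the downstream stages.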
import (
    "github.com/lookatitude/beluga-ai/voice"

    _ "github.com/lookatitude/beluga-ai/voice/stt/providers/deepgram"
    _ "github.com/lookatitude/beluga-ai/voice/transport/providers/livekit"
    _ "github.com/lookatitude/beluga-ai/voice/tts/providers/cartesia"
    _ "github.com/lookatitude/beluga-ai/voice/vad/providers/silero"
)
pipeline, _ := voice.NewPipeline(voice.Config{
    Transport: "livekit",
    VAD:       "silero",
    STT:       "deepgram",
    LLM:       model,
    TTS:       "cartesia",
})
// Frame-based — pipeline.Run handles barge-in, turn detection,
// and sub-800ms glass-to-glass latency.
pipeline.Run(ctx)
Agents meet the network.
Expose the same Agent over MCP for tools, A2A for inter-agent discovery, REST/SSE for streaming clients, gRPC for low-latency internal calls, or WebSocket for bidirectional voice. Pick one or ship them all — the Runner is the deployment boundary, not the agent.
import (
    "github.com/lookatitude/beluga-ai/protocol/mcp"
    "github.com/lookatitude/beluga-ai/runtime"
)
runner, _ := runtime.NewRunner(runtime.Config{
    Agent: myAgent,
    Expose: runtime.ExposeAll{
        MCP:  mcp.Server(":8080"), // Model Context Protocol
        A2A:  true,                // /.well-known/agent.json
        REST: ":3000",             // REST + SSE streaming
        GRPC: ":50051",            // protobuf contracts
    },
})
runner.Serve(ctx)
Everything you just read is open source.
MIT licensed. 110 providers. Zero paid tiers.