Observability Providers

Beluga AI v2 provides a unified observability layer built on OpenTelemetry with support for exporting LLM trace data to external platforms. The o11y package handles tracing, metrics, structured logging, and health checks. Observability providers implement the TraceExporter interface to send LLM call data to platforms such as Langfuse, LangSmith, Opik, and Phoenix.

All observability providers implement the TraceExporter interface:

```go
type TraceExporter interface {
    ExportLLMCall(ctx context.Context, data LLMCallData) error
}
```

The LLMCallData struct captures all details of an LLM invocation:

```go
type LLMCallData struct {
    Model        string
    Provider     string
    InputTokens  int
    OutputTokens int
    Duration     time.Duration
    Cost         float64
    Messages     []schema.Message
    Response     string
    Error        string
    Metadata     map[string]any
}
```

The o11y package provides native OpenTelemetry tracing with GenAI semantic conventions:

```go
import (
    "log"

    "github.com/lookatitude/beluga-ai/o11y"
)

// Initialize the tracer with the default OTLP exporter
shutdown, err := o11y.InitTracer("my-service")
if err != nil {
    log.Fatal(err)
}
defer shutdown()

// Create spans for AI operations
ctx, span := o11y.StartSpan(ctx, "llm.generate", o11y.Attrs{
    o11y.AttrRequestModel: "gpt-4o",
    o11y.AttrSystem:       "openai",
})
defer span.End()

// Record errors on the span
if err != nil {
    span.RecordError(err)
    span.SetStatus(o11y.StatusError, err.Error())
}
```

The tracing layer uses standardized attribute names from the OpenTelemetry GenAI conventions:

| Attribute | Constant | Description |
| --- | --- | --- |
| `gen_ai.agent.name` | `AttrAgentName` | Agent identifier |
| `gen_ai.operation.name` | `AttrOperationName` | Operation type |
| `gen_ai.tool.name` | `AttrToolName` | Tool identifier |
| `gen_ai.request.model` | `AttrRequestModel` | Requested model |
| `gen_ai.response.model` | `AttrResponseModel` | Actual model used |
| `gen_ai.usage.input_tokens` | `AttrInputTokens` | Input token count |
| `gen_ai.usage.output_tokens` | `AttrOutputTokens` | Output token count |
| `gen_ai.system` | `AttrSystem` | Provider system name |

Record token usage, latency, and cost metrics through the o11y package:

```go
// Initialize the meter
err := o11y.InitMeter("my-service")
if err != nil {
    log.Fatal(err)
}

// Record token usage
o11y.TokenUsage(ctx, 500, 150)

// Record operation duration (milliseconds)
o11y.OperationDuration(ctx, 1250.0)

// Record estimated cost (USD)
o11y.Cost(ctx, 0.003)

// Custom counters and histograms
o11y.Counter(ctx, "tool.calls", 1)
o11y.Histogram(ctx, "retrieval.latency_ms", 45.2)
```
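The value passed to `o11y.Cost` is typically derived from token counts and the provider's pricing. A minimal helper for that calculation (the prices used here are illustrative placeholders, not real provider rates):

```go
package main

import "fmt"

// estimateCost converts token counts into an estimated USD cost.
// Prices are expressed per one million tokens.
func estimateCost(inputTokens, outputTokens int, inPricePerM, outPricePerM float64) float64 {
	return float64(inputTokens)/1e6*inPricePerM + float64(outputTokens)/1e6*outPricePerM
}

func main() {
	// Hypothetical prices: $2.50/M input tokens, $10.00/M output tokens.
	cost := estimateCost(500, 150, 2.50, 10.00)
	fmt.Printf("%.5f\n", cost) // value to pass to o11y.Cost
}
```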

The Logger type wraps slog with context-aware methods:

```go
logger := o11y.NewLogger(
    o11y.WithLogLevel("debug"),
    o11y.WithJSON(),
)

// Attach to context for propagation
ctx = o11y.WithLogger(ctx, logger)

// Use from context
log := o11y.FromContext(ctx)
log.Info(ctx, "generation complete",
    "model", "gpt-4o",
    "tokens", 150,
)
```

Register health checkers for infrastructure components:

```go
registry := o11y.NewHealthRegistry()
registry.Register("database", o11y.HealthCheckerFunc(func(ctx context.Context) o11y.HealthResult {
    return o11y.HealthResult{
        Status:    o11y.Healthy,
        Component: "database",
        Message:   "connection pool active",
    }
}))

results := registry.CheckAll(ctx)
for _, r := range results {
    fmt.Printf("%s: %s (%s)\n", r.Component, r.Status, r.Message)
}
```

Fan out trace data to multiple platforms simultaneously:

```go
import (
    "log"
    "os"
    "time"

    "github.com/lookatitude/beluga-ai/o11y"
    "github.com/lookatitude/beluga-ai/o11y/providers/langfuse"
    "github.com/lookatitude/beluga-ai/o11y/providers/phoenix"
)

lfExporter, err := langfuse.New(
    langfuse.WithPublicKey(os.Getenv("LANGFUSE_PUBLIC_KEY")),
    langfuse.WithSecretKey(os.Getenv("LANGFUSE_SECRET_KEY")),
)
if err != nil {
    log.Fatal(err)
}

pxExporter, err := phoenix.New(
    phoenix.WithBaseURL("http://localhost:6006"),
)
if err != nil {
    log.Fatal(err)
}

multi := o11y.NewMultiExporter(lfExporter, pxExporter)
err = multi.ExportLLMCall(ctx, o11y.LLMCallData{
    Model:        "gpt-4o",
    Provider:     "openai",
    InputTokens:  500,
    OutputTokens: 150,
    Duration:     1200 * time.Millisecond,
    Response:     "The capital of France is Paris.",
})
if err != nil {
    log.Printf("trace export failed: %v", err)
}
```
Tracer initialization accepts the following options:

| Option | Description |
| --- | --- |
| `WithSpanExporter(exp)` | Use a custom OTel span exporter |
| `WithSampler(s)` | Use a custom OTel sampler |
| `WithSyncExport()` | Synchronous export (useful for testing) |
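These follow Go's functional-options pattern. A self-contained sketch of how such options compose, using local stand-in types (the real option types live in the o11y package, and the default shown is hypothetical):

```go
package main

import "fmt"

// tracerConfig and Option are local stand-ins illustrating the
// functional-options pattern behind WithSampler, WithSyncExport, etc.
type tracerConfig struct {
	syncExport bool
	sampler    string
}

type Option func(*tracerConfig)

func WithSyncExport() Option {
	return func(c *tracerConfig) { c.syncExport = true }
}

func WithSampler(name string) Option {
	return func(c *tracerConfig) { c.sampler = name }
}

// applyOptions starts from defaults and applies each option in order.
func applyOptions(opts ...Option) tracerConfig {
	cfg := tracerConfig{sampler: "always_on"} // hypothetical default
	for _, o := range opts {
		o(&cfg)
	}
	return cfg
}

func main() {
	cfg := applyOptions(WithSyncExport(), WithSampler("ratio"))
	fmt.Println(cfg.syncExport, cfg.sampler) // true ratio
}
```

Because options are plain functions, callers pass only the settings they care about and everything else keeps its default.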
The following provider integrations are available:

| Provider | Description |
| --- | --- |
| Langfuse | Open-source LLM observability with trace and generation tracking |
| LangSmith | LangChain's observability platform for tracing LLM runs |
| Opik | Comet's LLM observability platform with workspace management |
| Phoenix | Arize's open-source LLM observability with OTel-native spans |