Foundation Layer

Observability

OpenTelemetry GenAI semantic conventions at every boundary. Token counting, cost tracking, structured logging, health checks, and adapter exports to Langfuse, Phoenix, and any OTel backend.

OTel GenAI · Cost Tracking · Structured Logging · Health Checks · Dev Dashboard

Overview

Observability in AI systems requires more than standard application monitoring. Beluga AI implements the OpenTelemetry GenAI semantic conventions (v1.37+) at every boundary -- LLM calls, tool executions, agent decisions, and memory operations all produce structured spans with the gen_ai.* attribute namespace.

Every LLM call is automatically instrumented with token counts (prompt, completion, total), model parameters, latency, and cost estimates. Costs are tracked per-model with configurable pricing tables, so you can monitor spend across providers in real time. Structured logging via slog ensures that every event is machine-parseable and correlatable with traces.

For production deployments, Beluga includes Kubernetes-ready health check endpoints (liveness and readiness probes) and exports telemetry to any OTel-compatible backend. Dedicated adapters for Langfuse, Arize Phoenix, LangSmith, and other AI-specific platforms provide deeper insights into prompt performance, evaluation metrics, and agent behavior. A built-in dev dashboard gives you instant visibility during development.

Capabilities

OpenTelemetry GenAI Conventions

All instrumentation follows the official OTel GenAI semantic conventions (v1.37+). Spans use the gen_ai.* attribute namespace with standardized names for model, provider, token counts, and finish reasons. Agent spans, tool spans, and model spans are automatically linked in a trace hierarchy.
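To make the convention concrete, the sketch below shows the kind of gen_ai.* attributes a model-call span carries. The attribute names come from the OTel GenAI semantic conventions the section cites; the helper function itself is illustrative, not part of Beluga's API.

```go
package main

import "fmt"

// modelSpanAttributes sketches the gen_ai.* attributes attached to a
// model-call span under the OTel GenAI semantic conventions. The keys are
// standardized names from the spec; the helper is illustrative only.
func modelSpanAttributes(provider, model string, inputTokens, outputTokens int, finish string) map[string]any {
	return map[string]any{
		"gen_ai.operation.name":          "chat",
		"gen_ai.provider.name":           provider,
		"gen_ai.request.model":           model,
		"gen_ai.usage.input_tokens":      inputTokens,
		"gen_ai.usage.output_tokens":     outputTokens,
		"gen_ai.response.finish_reasons": []string{finish},
	}
}

func main() {
	attrs := modelSpanAttributes("openai", "gpt-4o", 120, 48, "stop")
	fmt.Println(attrs["gen_ai.request.model"], attrs["gen_ai.usage.input_tokens"])
}
```

Because every span uses the same attribute names regardless of provider, any OTel backend can aggregate token usage across OpenAI, Anthropic, and other models without custom mapping.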

Structured Logging

Every boundary in the framework emits structured log entries via Go's slog package. Logs include trace IDs for correlation, model parameters, tool names, and timing information. Log levels are configurable per-component so you can increase verbosity where needed.

Metrics

Automatic collection of token usage (prompt, completion, total), latency histograms per model and provider, cost tracking with configurable pricing tables, and error rates. All metrics are exported as OTel metrics compatible with Prometheus, Grafana, and Datadog.
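The cost side of this can be sketched as a pure function over the pricing table: per-call cost is token count scaled by the per-1K rate for each side of the exchange. The type and function names are illustrative; the real tracker accumulates these values per model.

```go
package main

import "fmt"

// ModelPricing mirrors a per-1K-token entry in a configurable pricing
// table (names illustrative, not Beluga API).
type ModelPricing struct {
	PromptPer1K     float64
	CompletionPer1K float64
}

// estimateCost derives a single call's cost from its token counts:
// each side is billed at its own per-1K rate.
func estimateCost(p ModelPricing, promptTokens, completionTokens int) float64 {
	return float64(promptTokens)/1000*p.PromptPer1K +
		float64(completionTokens)/1000*p.CompletionPer1K
}

func main() {
	gpt4o := ModelPricing{PromptPer1K: 0.0025, CompletionPer1K: 0.01}
	// A call with 1,200 prompt tokens and 300 completion tokens:
	fmt.Printf("$%.4f\n", estimateCost(gpt4o, 1200, 300)) // prints $0.0060
}
```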

Health Checks

Built-in HTTP endpoints for Kubernetes liveness and readiness probes. Health checks verify LLM provider connectivity, memory store availability, tool service reachability, and custom application-level checks. Configurable timeouts and degraded-state reporting.

Observability Adapters

Export telemetry to AI-specific observability platforms via dedicated adapters. Each adapter translates Beluga's OTel spans into the platform's native format, preserving all GenAI attributes. Switch platforms by changing a single configuration line.
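The adapter seam can be pictured as a small interface that every platform adapter implements. Everything below is a sketch: the Span and Exporter types are stand-ins for whatever Beluga uses internally, not its actual API.

```go
package main

import "fmt"

// Span is a minimal stand-in for a finished OTel span; a real adapter
// receives full span data including timing and parent links.
type Span struct {
	Name  string
	Attrs map[string]any
}

// Exporter sketches the adapter seam: each platform adapter translates
// spans into its native payload while preserving gen_ai.* attributes.
type Exporter interface {
	Export(spans []Span) error
}

// logExporter is a trivial adapter that prints spans; a Langfuse or
// Phoenix adapter would map them to that platform's API instead.
type logExporter struct{}

func (logExporter) Export(spans []Span) error {
	for _, s := range spans {
		fmt.Printf("%s model=%v\n", s.Name, s.Attrs["gen_ai.request.model"])
	}
	return nil
}

func main() {
	var e Exporter = logExporter{}
	e.Export([]Span{{
		Name:  "chat gpt-4o",
		Attrs: map[string]any{"gen_ai.request.model": "gpt-4o"},
	}})
}
```

An interface boundary like this is what makes "switch platforms by changing a single configuration line" possible: the instrumentation layer never knows which adapter is behind it.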

Built-in Dev Dashboard

An embedded web UI available during development that shows real-time traces, cost breakdowns, tool call timelines, and a prompt playground. No external services required -- just enable it and open your browser. Useful for debugging agent behavior and optimizing prompt performance.

Architecture

LLM Calls | Tool Exec | Agent Decisions | Memory Ops
                        |
                        v
            OTel GenAI Instrumentation
                        |
                        v
             Traces / Metrics / Logs
                        |
                        v
Langfuse | Phoenix | Jaeger | Grafana | Dev Dashboard

Providers & Implementations

Name                 | Priority | Key Differentiator
---------------------|----------|-------------------------------------------------------------------------------
OTel Collector       | P0       | Standard OTel export to any compatible backend (Jaeger, Zipkin, etc.)
Langfuse             | P0       | LLM-native observability with prompt management, evaluation, and cost tracking
Arize Phoenix        | P1       | Open-source LLM tracing with embedding analysis and retrieval evaluation
LangSmith            | P1       | LangChain ecosystem tracing with dataset management and evaluation
Jaeger               | P1       | Distributed tracing with service dependency visualization
Grafana + Prometheus | P1       | Metrics dashboards with alerting, long-term storage, and PromQL queries
Datadog              | P2       | Full-stack APM with LLM monitoring, cost analytics, and anomaly detection

Full Example

Set up OpenTelemetry tracing with cost tracking and export to Langfuse:

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/lookatitude/beluga-ai/agent"
    "github.com/lookatitude/beluga-ai/llm"
    "github.com/lookatitude/beluga-ai/o11y"
    "github.com/lookatitude/beluga-ai/o11y/adapters/langfuse"
)

func main() {
    ctx := context.Background()

    // Initialize OTel with GenAI semantic conventions
    shutdown, err := o11y.Init(ctx,
        o11y.WithServiceName("my-ai-service"),
        o11y.WithGenAIConventions(true), // gen_ai.* attributes

        // Cost tracking with per-model pricing
        o11y.WithCostTracking(o11y.PricingTable{
            "gpt-4o":         {PromptPer1K: 0.0025, CompletionPer1K: 0.01},
            "claude-sonnet":  {PromptPer1K: 0.003, CompletionPer1K: 0.015},
        }),

        // Export to Langfuse for LLM-specific analytics
        o11y.WithExporter(langfuse.NewExporter(
            langfuse.WithPublicKey("pk-..."),
            langfuse.WithSecretKey("sk-..."),
        )),

        // Structured logging via slog
        o11y.WithStructuredLogging(o11y.LogConfig{
            Level:        "info",
            Format:       "json",
            AddSource:    true,
            IncludeTrace: true,
        }),

        // Health checks for Kubernetes
        o11y.WithHealthChecks(":8080",
            o11y.LivenessPath("/healthz"),
            o11y.ReadinessPath("/readyz"),
        ),
    )
    if err != nil {
        log.Fatal(err)
    }
    defer shutdown(ctx)

    // Create model - automatically instrumented
    model, err := llm.New("openai",
        llm.WithModel("gpt-4o"),
    )
    if err != nil {
        log.Fatal(err)
    }

    // Create agent - all operations produce OTel spans
    myAgent, err := agent.New("traced-agent",
        agent.WithModel(model),
    )
    if err != nil {
        log.Fatal(err)
    }

    // Run agent - traces, metrics, and logs emitted automatically
    result, err := myAgent.Run(ctx, "Summarize the quarterly report")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(result)

    // Access cost summary programmatically
    summary := o11y.GetCostSummary(ctx)
    fmt.Printf("Total cost: $%.4f (%d tokens)\n",
        summary.TotalCost, summary.TotalTokens)
}

Related Features