Foundation Layer

Observability

OpenTelemetry GenAI semantic conventions at every boundary. Token counting, cost tracking, structured logging, health checks, and adapter exports to Langfuse, Phoenix, and any OTel backend.

OTel GenAI · Cost Tracking · Structured Logging · Health Checks · Dev Dashboard

Overview

Observability in AI systems requires more than standard application monitoring. Beluga AI implements the OpenTelemetry GenAI semantic conventions (v1.37+) at every boundary -- LLM calls, tool executions, agent decisions, and memory operations all produce structured spans with the gen_ai.* attribute namespace.

Every LLM call is automatically instrumented with token counts (prompt, completion, total), model parameters, latency, and cost estimates. Costs are tracked per-model with configurable pricing tables, so you can monitor spend across providers in real time. Structured logging via slog ensures that every event is machine-parseable and correlatable with traces.

For production deployments, Beluga includes Kubernetes-ready health check endpoints (liveness and readiness probes) and exports telemetry to any OTel-compatible backend. Dedicated adapters for Langfuse, Arize Phoenix, LangSmith, and other AI-specific platforms provide deeper insights into prompt performance, evaluation metrics, and agent behavior. A built-in dev dashboard gives you instant visibility during development.

Capabilities

OpenTelemetry GenAI Conventions

All instrumentation follows the official OTel GenAI semantic conventions (v1.37+). Spans use the gen_ai.* attribute namespace with standardized names for model, provider, token counts, and finish reasons. Agent spans, tool spans, and model spans are automatically linked in a trace hierarchy.
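To make the convention concrete, the sketch below shows the kind of gen_ai.* attributes a model-call span carries. The attribute names come from the OTel GenAI semantic conventions the section cites; the helper function itself is illustrative, not part of Beluga's API.

```go
package main

import "fmt"

// modelSpanAttributes sketches the gen_ai.* attributes attached to a
// model-call span under the OTel GenAI semantic conventions. The keys are
// standardized names from the spec; the helper is illustrative only.
func modelSpanAttributes(provider, model string, inputTokens, outputTokens int, finish string) map[string]any {
	return map[string]any{
		"gen_ai.operation.name":          "chat",
		"gen_ai.provider.name":           provider,
		"gen_ai.request.model":           model,
		"gen_ai.usage.input_tokens":      inputTokens,
		"gen_ai.usage.output_tokens":     outputTokens,
		"gen_ai.response.finish_reasons": []string{finish},
	}
}

func main() {
	attrs := modelSpanAttributes("openai", "gpt-4o", 120, 48, "stop")
	fmt.Println(attrs["gen_ai.request.model"], attrs["gen_ai.usage.input_tokens"])
}
```

Because every span uses the same attribute names regardless of provider, any OTel backend can aggregate token usage across OpenAI, Anthropic, and other models without custom mapping.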

Structured Logging

Every boundary in the framework emits structured log entries via Go's slog package. Logs include trace IDs for correlation, model parameters, tool names, and timing information. Log levels are configurable per-component so you can increase verbosity where needed.

Metrics

Automatic collection of token usage (prompt, completion, total), latency histograms per model and provider, cost tracking with configurable pricing tables, and error rates. All metrics are exported as OTel metrics compatible with Prometheus, Grafana, and Datadog.
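The cost side of this can be sketched as a pure function over the pricing table: per-call cost is token count scaled by the per-1K rate for each side of the exchange. The type and function names are illustrative; the real tracker accumulates these values per model.

```go
package main

import "fmt"

// ModelPricing mirrors a per-1K-token entry in a configurable pricing
// table (names illustrative, not Beluga API).
type ModelPricing struct {
	PromptPer1K     float64
	CompletionPer1K float64
}

// estimateCost derives a single call's cost from its token counts:
// each side is billed at its own per-1K rate.
func estimateCost(p ModelPricing, promptTokens, completionTokens int) float64 {
	return float64(promptTokens)/1000*p.PromptPer1K +
		float64(completionTokens)/1000*p.CompletionPer1K
}

func main() {
	gpt4o := ModelPricing{PromptPer1K: 0.0025, CompletionPer1K: 0.01}
	// A call with 1,200 prompt tokens and 300 completion tokens:
	fmt.Printf("$%.4f\n", estimateCost(gpt4o, 1200, 300)) // prints $0.0060
}
```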

Health Checks

Built-in HTTP endpoints for Kubernetes liveness and readiness probes. Health checks verify LLM provider connectivity, memory store availability, tool service reachability, and custom application-level checks. Configurable timeouts and degraded-state reporting.

Observability Adapters

Export telemetry to AI-specific observability platforms via dedicated adapters. Each adapter translates Beluga's OTel spans into the platform's native format, preserving all GenAI attributes. Switch platforms by changing a single configuration line.
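The adapter seam can be pictured as a small interface that every platform adapter implements. Everything below is a sketch: the Span and Exporter types are stand-ins for whatever Beluga uses internally, not its actual API.

```go
package main

import "fmt"

// Span is a minimal stand-in for a finished OTel span; a real adapter
// receives full span data including timing and parent links.
type Span struct {
	Name  string
	Attrs map[string]any
}

// Exporter sketches the adapter seam: each platform adapter translates
// spans into its native payload while preserving gen_ai.* attributes.
type Exporter interface {
	Export(spans []Span) error
}

// logExporter is a trivial adapter that prints spans; a Langfuse or
// Phoenix adapter would map them to that platform's API instead.
type logExporter struct{}

func (logExporter) Export(spans []Span) error {
	for _, s := range spans {
		fmt.Printf("%s model=%v\n", s.Name, s.Attrs["gen_ai.request.model"])
	}
	return nil
}

func main() {
	var e Exporter = logExporter{}
	e.Export([]Span{{
		Name:  "chat gpt-4o",
		Attrs: map[string]any{"gen_ai.request.model": "gpt-4o"},
	}})
}
```

An interface boundary like this is what makes "switch platforms by changing a single configuration line" possible: the instrumentation layer never knows which adapter is behind it.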

Built-in Dev Dashboard

An embedded web UI available during development that shows real-time traces, cost breakdowns, tool call timelines, and a prompt playground. No external services required -- just enable it and open your browser. Useful for debugging agent behavior and optimizing prompt performance.

Architecture

LLM Calls | Tool Exec | Agent Decisions | Memory Ops
                        |
                        v
            OTel GenAI Instrumentation
                        |
                        v
             Traces / Metrics / Logs
                        |
                        v
Langfuse | Phoenix | Jaeger | Grafana | Dev Dashboard

Providers & Implementations

Name                 | Priority | Key Differentiator
---------------------|----------|-------------------------------------------------------------------------------
OTel Collector       | P0       | Standard OTel export to any compatible backend (Jaeger, Zipkin, etc.)
Langfuse             | P0       | LLM-native observability with prompt management, evaluation, and cost tracking
Arize Phoenix        | P1       | Open-source LLM tracing with embedding analysis and retrieval evaluation
LangSmith            | P1       | LangChain ecosystem tracing with dataset management and evaluation
Jaeger               | P1       | Distributed tracing with service dependency visualization
Grafana + Prometheus | P1       | Metrics dashboards with alerting, long-term storage, and PromQL queries
Datadog              | P2       | Full-stack APM with LLM monitoring, cost analytics, and anomaly detection

Full Example

Set up OpenTelemetry tracing with cost tracking and export to Langfuse:

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/lookatitude/beluga-ai/agent"
    "github.com/lookatitude/beluga-ai/llm"
    "github.com/lookatitude/beluga-ai/o11y"
    "github.com/lookatitude/beluga-ai/o11y/adapters/langfuse"
)

func main() {
    ctx := context.Background()

    // Initialize OTel with GenAI semantic conventions
    shutdown, err := o11y.Init(ctx,
        o11y.WithServiceName("my-ai-service"),
        o11y.WithGenAIConventions(true), // gen_ai.* attributes

        // Cost tracking with per-model pricing
        o11y.WithCostTracking(o11y.PricingTable{
            "gpt-4o":         {PromptPer1K: 0.0025, CompletionPer1K: 0.01},
            "claude-sonnet":  {PromptPer1K: 0.003, CompletionPer1K: 0.015},
        }),

        // Export to Langfuse for LLM-specific analytics
        o11y.WithExporter(langfuse.NewExporter(
            langfuse.WithPublicKey("pk-..."),
            langfuse.WithSecretKey("sk-..."),
        )),

        // Structured logging via slog
        o11y.WithStructuredLogging(o11y.LogConfig{
            Level:        "info",
            Format:       "json",
            AddSource:    true,
            IncludeTrace: true,
        }),

        // Health checks for Kubernetes
        o11y.WithHealthChecks(":8080",
            o11y.LivenessPath("/healthz"),
            o11y.ReadinessPath("/readyz"),
        ),
    )
    if err != nil {
        log.Fatal(err)
    }
    defer shutdown(ctx)

    // Create model - automatically instrumented
    model, err := llm.New("openai",
        llm.WithModel("gpt-4o"),
    )
    if err != nil {
        log.Fatal(err)
    }

    // Create agent - all operations produce OTel spans
    myAgent, err := agent.New("traced-agent",
        agent.WithModel(model),
    )
    if err != nil {
        log.Fatal(err)
    }

    // Run agent - traces, metrics, and logs emitted automatically
    result, err := myAgent.Run(ctx, "Summarize the quarterly report")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(result)

    // Access cost summary programmatically
    summary := o11y.GetCostSummary(ctx)
    fmt.Printf("Total cost: $%.4f (%d tokens)\n",
        summary.TotalCost, summary.TotalTokens)
}

Related Features