Errors
Beluga uses structured, typed errors throughout. Every public function that
can fail returns a *core.Error or wraps one in its chain. Raw
errors.New and fmt.Errorf strings exist in internal paths — not in any
function that crosses a package boundary.
The reason is operational: when an LLM provider returns a 429, the retry
middleware needs to know whether to retry, not just that something failed. A
string error cannot carry that signal reliably across package boundaries.
A typed ErrorCode can.
Two error paths coexist at runtime: application errors (tool failures, bad schemas) become Observations that the planner can recover from; infrastructure errors (network blips, rate limits) are intercepted by middleware and retried transparently.
```mermaid
graph TD
    Tool[Tool.Execute fails] --> Err[core.Error + ErrorCode]
    Err --> Obs[Observation with error]
    Obs --> Replan[Planner.Replan]
    Replan --> Try2[Try a different tool or strategy]
    LLM[LLM transient error] --> MW[Middleware retry]
    MW --> Retry[Exponential backoff]
    Retry --> LLM
```
ErrorCode — the classification
Source: core/errors.go:12-39
```go
type ErrorCode string

const (
	ErrRateLimit       ErrorCode = "rate_limit"
	ErrAuth            ErrorCode = "auth_error"
	ErrTimeout         ErrorCode = "timeout"
	ErrInvalidInput    ErrorCode = "invalid_input"
	ErrToolFailed      ErrorCode = "tool_failed"
	ErrProviderDown    ErrorCode = "provider_unavailable"
	ErrGuardBlocked    ErrorCode = "guard_blocked"
	ErrBudgetExhausted ErrorCode = "budget_exhausted"
	ErrNotFound        ErrorCode = "not_found"
)
```

These codes are the vocabulary for programmatic error handling. Every package
in the capability layer maps its provider-specific failures to one of these
codes. A networking timeout from the OpenAI SDK becomes ErrTimeout; a 429
becomes ErrRateLimit.
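As a rough illustration of that mapping step, the sketch below translates HTTP status codes into codes from the list above. codeFromHTTPStatus is a hypothetical helper, not part of Beluga, and apart from 429 the status-to-code choices are assumptions made for this example. ErrorCode and its constants are redeclared locally so the snippet compiles on its own.

```go
package main

import (
	"fmt"
	"net/http"
)

// ErrorCode and the constants below are local stand-ins that mirror
// the values listed in core/errors.go.
type ErrorCode string

const (
	ErrRateLimit    ErrorCode = "rate_limit"
	ErrAuth         ErrorCode = "auth_error"
	ErrTimeout      ErrorCode = "timeout"
	ErrInvalidInput ErrorCode = "invalid_input"
	ErrProviderDown ErrorCode = "provider_unavailable"
	ErrNotFound     ErrorCode = "not_found"
)

// codeFromHTTPStatus is a hypothetical helper. It expects a failure
// status (4xx or 5xx); only the 429 mapping comes from the text above,
// the remaining branches are assumptions.
func codeFromHTTPStatus(status int) ErrorCode {
	switch {
	case status == http.StatusTooManyRequests: // 429
		return ErrRateLimit
	case status == http.StatusUnauthorized || status == http.StatusForbidden:
		return ErrAuth
	case status == http.StatusRequestTimeout || status == http.StatusGatewayTimeout:
		return ErrTimeout
	case status == http.StatusNotFound:
		return ErrNotFound
	case status >= 400 && status < 500:
		// Any other client error is treated as a bad request.
		return ErrInvalidInput
	default:
		// Everything else (typically 5xx) is treated as an outage.
		return ErrProviderDown
	}
}

func main() {
	for _, s := range []int{429, 401, 504, 404, 422, 503} {
		fmt.Printf("%d -> %s\n", s, codeFromHTTPStatus(s))
	}
}
```

Keeping the mapping in one function per provider means the rest of the code only ever branches on ErrorCode, never on provider-specific status codes.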
core.Error — the structure
Source: core/errors.go:50-62
```go
type Error struct {
	Op      string    // operation that failed: "llm.generate", "tool.execute"
	Code    ErrorCode // classification
	Message string    // human-readable description
	Err     error     // wrapped cause (participates in errors.Is / errors.As)
}
```

Op is the dotted operation name: llm.generate, tool.execute,
memory.load. It survives stack unwinding and appears verbatim in
observability backends as the failing operation.

*Error implements Unwrap(), which returns Err, so errors.Is and errors.As
traverse the full chain (core/errors.go:103-105).
Creating errors
Two constructors:
core.NewError takes explicit fields:
Source: core/errors.go:66-73

```go
return core.NewError("tool.execute", core.ErrToolFailed, "fetch returned 404", err)
```

core.Errorf takes a formatted message and supports %w wrapping:
Source: core/errors.go:82-90

```go
return core.Errorf(core.ErrTimeout, "llm.generate: deadline exceeded after %d tokens: %w", n, err)
```

core.Errorf delegates to fmt.Errorf, so %w works as expected. The wrapped
error is preserved in Error.Err and participates in errors.Is /
errors.As traversal.
IsRetryable — the retry decision
Source: core/errors.go:120-126
```go
func IsRetryable(err error) bool {
	var e *Error
	if errors.As(err, &e) {
		return retryableCodes[e.Code]
	}
	return false
}
```

Three codes are retryable (core/errors.go:42-46): ErrRateLimit,
ErrTimeout, ErrProviderDown. All others are not. IsRetryable uses
errors.As, so it traverses wrapped errors correctly: a *core.Error
nested inside a fmt.Errorf("%w", ...) still matches.
Full example — wrapping a provider error
```go
import (
	"context"
	"errors"
	"fmt"
	"time"

	"github.com/lookatitude/beluga-ai/core"
)

// callWithRetry calls fn up to maxAttempts times, retrying on retryable errors.
func callWithRetry(ctx context.Context, maxAttempts int, fn func(context.Context) error) error {
	var lastErr error
	for attempt := range maxAttempts {
		lastErr = fn(ctx)
		if lastErr == nil {
			return nil
		}
		if !core.IsRetryable(lastErr) {
			// auth failures, invalid input, guard blocks: don't retry
			return lastErr
		}
		// back off proportional to attempt number
		select {
		case <-ctx.Done():
			return fmt.Errorf("retry loop cancelled: %w", ctx.Err())
		case <-time.After(time.Duration(attempt+1) * 200 * time.Millisecond):
		}
	}
	return fmt.Errorf("all %d attempts failed: %w", maxAttempts, lastErr)
}

// wrapProviderError maps a raw SDK error to a core.Error.
func wrapProviderError(op string, sdkErr error) error {
	if sdkErr == nil {
		return nil
	}
	// In practice, inspect sdkErr for status codes, message text, etc.
	var code core.ErrorCode
	switch {
	case errors.Is(sdkErr, context.DeadlineExceeded):
		code = core.ErrTimeout
	default:
		code = core.ErrProviderDown
	}
	return core.NewError(op, code, sdkErr.Error(), sdkErr)
}
```

Error string format
Error.Error() returns: "<op> [<code>]: <message>: <cause>" when a cause
is present, or "<op> [<code>]: <message>" without one.
Source: core/errors.go:94-99
```
llm.generate [rate_limit]: provider returned 429: http: status 429
```

This format is readable in logs and slog structured output. The op field
scopes the error to its origin without requiring a stack trace.
Error matching with errors.Is
core.Error.Is compares by Code only: two errors with the same code
match, regardless of message or cause (core/errors.go:109-115). This lets
you write:
```go
import (
	"errors"

	"github.com/lookatitude/beluga-ai/core"
)

func handleErr(err error) {
	sentinel := &core.Error{Code: core.ErrGuardBlocked}
	if errors.Is(err, sentinel) {
		// guard rejected the request: do not retry, surface to user
	}
}
```

Common mistakes
- Returning raw fmt.Errorf from a public function. IsRetryable returns false for all unclassified errors. Middleware cannot retry what it cannot classify. C-003 in .wiki/corrections.md documents 190 occurrences of this pattern found across 50+ files during the v2 migration.
- Using error message strings for branching. String-matching on errors breaks as soon as provider SDK messages change. Use ErrorCode or errors.As.
- Swallowing errors silently. Every error must be returned, logged, or explicitly discarded with a comment explaining why. The rule is in .claude/rules/go-packages.md.
- Panicking for recoverable errors. panic is for programmer errors (nil pointer, out-of-bounds). Provider failures, tool timeouts, and guard blocks are recoverable: return a *core.Error.
- Leaking internal details in errors returned to callers. Error.Message is for operators; strip SQL query text, file paths, and internal IDs before the error crosses a trust boundary.
Related reading
- Resilience: circuit breakers and retry middleware that consume IsRetryable.
- Extensibility: where core.Error fits in the middleware chain.
- core/errors.go: canonical source for all error types and codes.
- .wiki/patterns/error-handling.md: patterns and anti-patterns with file:line references.