Capability Layer

Memory Systems

Three-tier MemGPT memory (Core, Recall, Archival) with 6 strategies: buffer, window, summary, entity, semantic, and graph. Self-editable via agent tools for persistent personality and knowledge.


Overview

Memory is what transforms a stateless LLM into a persistent, context-aware agent. Beluga AI implements the MemGPT three-tier memory model, giving agents a structured approach to managing information across conversations and sessions. Rather than dumping everything into a context window, agents decide what to remember, what to search for, and what to archive — just as humans manage working memory, episodic recall, and long-term knowledge differently.

The three tiers are: Core Memory (always present in the LLM context — personality, user preferences, active goals), Recall Memory (searchable conversation history with multiple retrieval strategies), and Archival Memory (long-term vector and graph storage for knowledge that persists indefinitely). Agents can combine these tiers through a composite memory system that unifies working, episodic, semantic, and graph memory behind a single interface.

A key differentiator is self-editable core memory: agents are given tools to read and write their own core memory blocks. This means an agent can update its understanding of a user's preferences, record important decisions, or adjust its own personality traits — all without external intervention. Seven store backends (from in-memory for development to Neo4j for graph relationships) ensure the right storage for every deployment scenario.

Capabilities

Three-Tier Memory (MemGPT)

Core Memory lives permanently in the agent's system prompt — it holds personality traits, user preferences, and active goals. Recall Memory is searchable conversation history, indexed by time, topic, and relevance, allowing agents to retrieve past interactions on demand. Archival Memory provides long-term storage via vector embeddings and graph relationships, persisting knowledge across sessions indefinitely. Each tier has distinct read/write characteristics optimized for its access pattern.

Memory Strategies

Six strategies control how conversation context flows into memory:

  • Buffer — Keeps all messages in context up to a token limit, then truncates oldest.
  • Window — Sliding window of the last N messages, simple and predictable.
  • Summary — Periodically summarizes older messages, compressing history while preserving key points.
  • Entity — Extracts and tracks entities (people, places, concepts) mentioned across conversations.
  • Semantic — Embeds messages and retrieves by semantic similarity to the current query.
  • Graph — Builds a knowledge graph from conversations, capturing relationships between entities.
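The two simplest strategies can be shown concretely. The sketch below is illustrative rather than the library's implementation: Window keeps the last N messages, while Buffer keeps the newest messages that fit a token budget (using a crude word-count estimate) and truncates the oldest.

```go
package main

import (
	"fmt"
	"strings"
)

// window keeps only the last n messages.
func window(messages []string, n int) []string {
	if len(messages) <= n {
		return messages
	}
	return messages[len(messages)-n:]
}

// buffer keeps the newest messages whose combined token estimate
// fits the budget, truncating the oldest first.
func buffer(messages []string, tokenBudget int) []string {
	total := 0
	i := len(messages)
	for i > 0 {
		cost := len(strings.Fields(messages[i-1])) // crude token estimate
		if total+cost > tokenBudget {
			break
		}
		total += cost
		i--
	}
	return messages[i:]
}

func main() {
	msgs := []string{"hi", "how are you", "fine thanks", "what is MemGPT", "a memory model"}
	fmt.Println(window(msgs, 2)) // [what is MemGPT a memory model]
	fmt.Println(buffer(msgs, 6)) // newest messages within a 6-token budget
}
```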

Composite Memory

The composite memory system combines multiple memory types behind a unified interface. A typical production setup might use Working Memory (buffer strategy for immediate context), Episodic Memory (window + summary for conversation history), Semantic Memory (vector-based for knowledge retrieval), and Graph Memory (Neo4j or Memgraph for relationship queries). The composite layer handles routing queries to the appropriate memory tier and merging results.
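The routing-and-merging idea can be sketched as follows. The `Memory` interface and tier names here are assumptions chosen to mirror the text, not Beluga AI's API: each tier answers a query with scored results, and the composite layer merges them highest-score first.

```go
package main

import (
	"fmt"
	"sort"
)

// Result is one scored hit from a memory tier.
type Result struct {
	Text  string
	Score float64
}

// Memory is a hypothetical common interface for all tiers.
type Memory interface {
	Query(q string) []Result
}

// staticTier returns canned results, standing in for a real tier.
type staticTier struct{ results []Result }

func (s staticTier) Query(q string) []Result { return s.results }

// queryAll fans the query out to every tier and merges results by score.
func queryAll(q string, tiers map[string]Memory) []Result {
	var merged []Result
	for _, t := range tiers {
		merged = append(merged, t.Query(q)...)
	}
	sort.Slice(merged, func(i, j int) bool { return merged[i].Score > merged[j].Score })
	return merged
}

func main() {
	tiers := map[string]Memory{
		"episodic": staticTier{[]Result{{"Alex asked about Go generics", 0.71}}},
		"semantic": staticTier{[]Result{{"Alex prefers concise answers", 0.92}}},
	}
	for _, r := range queryAll("what does Alex like?", tiers) {
		fmt.Printf("%.2f %s\n", r.Score, r.Text)
	}
}
```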

Self-Editable Core Memory

Agents are given specialized tools — core_memory_read, core_memory_write, and core_memory_replace — that let them modify their own core memory blocks. When an agent learns a user's name, discovers a preference, or needs to update a goal, it can persist that information directly. This creates agents that genuinely learn and adapt over time, maintaining personality consistency across thousands of interactions.
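The semantics of those three tools can be sketched as plain functions over named blocks. In practice they are exposed to the LLM as tool calls; the types and method names below are illustrative assumptions, not the library's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// CoreMemory holds named, agent-editable text blocks.
type CoreMemory struct{ blocks map[string]string }

// Read returns a block's current content (core_memory_read).
func (m *CoreMemory) Read(block string) string { return m.blocks[block] }

// Write appends content to a block (core_memory_write).
func (m *CoreMemory) Write(block, content string) {
	if m.blocks[block] == "" {
		m.blocks[block] = content
		return
	}
	m.blocks[block] += "\n" + content
}

// Replace swaps an exact phrase inside a block (core_memory_replace).
func (m *CoreMemory) Replace(block, oldText, newText string) {
	m.blocks[block] = strings.ReplaceAll(m.blocks[block], oldText, newText)
}

func main() {
	mem := &CoreMemory{blocks: map[string]string{"user_preferences": "name unknown"}}
	mem.Replace("user_preferences", "name unknown", "name: Alex")
	mem.Write("user_preferences", "prefers concise answers")
	fmt.Println(mem.Read("user_preferences"))
}
```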

Store Backends

Seven store backends cover the full range of deployment scenarios:

  • In-Memory — Zero-config for development and testing.
  • Redis — Fast key-value access for session-scoped memory.
  • PostgreSQL — Durable relational storage with pgvector for embeddings.
  • SQLite — Embedded storage for single-node or edge deployments.
  • Neo4j — Native graph database for relationship-heavy memory.
  • DragonflyDB — Redis-compatible with better memory efficiency at scale.
  • Memgraph — In-memory graph database for low-latency graph queries.
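What makes the backends interchangeable is a common store interface. The sketch below assumes a minimal key-value `Store` contract with an in-memory implementation; the interface name and methods are illustrative, not Beluga AI's actual definitions.

```go
package main

import "fmt"

// Store is a hypothetical minimal contract every backend could satisfy.
type Store interface {
	Put(key, value string) error
	Get(key string) (string, bool)
}

// InMemoryStore is the zero-config development backend.
type InMemoryStore struct{ data map[string]string }

func NewInMemoryStore() *InMemoryStore {
	return &InMemoryStore{data: map[string]string{}}
}

func (s *InMemoryStore) Put(key, value string) error {
	s.data[key] = value
	return nil
}

func (s *InMemoryStore) Get(key string) (string, bool) {
	v, ok := s.data[key]
	return v, ok
}

func main() {
	// Swap this constructor for a Redis/PostgreSQL/Neo4j-backed
	// implementation in production; callers only see the interface.
	var store Store = NewInMemoryStore()
	store.Put("user:alex:pref", "concise answers")
	if v, ok := store.Get("user:alex:pref"); ok {
		fmt.Println(v)
	}
}
```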

Architecture

MemGPT Three-Tier Memory Model:

  • Core — Always in context; personality, preferences, goals. Self-editable by agent.
  • Recall — Searchable history; past conversations indexed by time, topic, relevance.
  • Archival — Long-term storage; vector embeddings + graph relationships, persistent across sessions.

Six memory strategies feed these tiers: Buffer, Window, Summary, Entity, Semantic, and Graph.

Composite memory flow: Agent Query → Composite Router → (Working | Episodic | Semantic | Graph) → Merged Context.

Providers & Implementations

Memory Store Backends

Name          Priority   Key Differentiator
In-Memory     P0         Zero-config, ideal for development and testing
Redis         P0         Fast key-value access, TTL support, session-scoped memory
PostgreSQL    P0         Durable relational storage with pgvector for embeddings
SQLite        P1         Embedded single-file storage, edge and mobile deployments
Neo4j         P1         Native graph database for relationship-heavy memory patterns
DragonflyDB   P2         Redis-compatible with 25x memory efficiency at scale
Memgraph      P2         In-memory graph for sub-millisecond relationship queries

Full Example

An agent configured with composite memory combining all four memory types:

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/lookatitude/beluga-ai/agent"
    "github.com/lookatitude/beluga-ai/llm"
    "github.com/lookatitude/beluga-ai/memory"
    "github.com/lookatitude/beluga-ai/memory/stores/inmemory"
    "github.com/lookatitude/beluga-ai/memory/stores/postgres"
    _ "github.com/lookatitude/beluga-ai/llm/providers/openai"
)

func main() {
    ctx := context.Background()

    model, err := llm.New("openai", llm.ProviderConfig{
        Model: "gpt-4o",
    })
    if err != nil {
        log.Fatal(err)
    }

    // Configure store backends
    devStore := inmemory.New()
    pgStore, err := postgres.New(
        postgres.WithDSN("postgres://localhost:5432/beluga"),
        postgres.WithEmbeddingDimension(1536),
    )
    if err != nil {
        log.Fatal(err)
    }

    // Build composite memory with all four tiers
    mem := memory.NewComposite(
        // Working memory: buffer strategy, always in context
        memory.WithWorking(memory.NewCore(
            memory.WithStore(devStore),
            memory.WithBlocks("personality", "user_preferences", "active_goals"),
            memory.WithSelfEditable(true), // Agent can modify via tools
        )),

        // Episodic memory: window + summary for conversation history
        memory.WithEpisodic(memory.NewRecall(
            memory.WithStore(pgStore),
            memory.WithStrategy(memory.StrategySummary),
            memory.WithWindowSize(20),
            memory.WithSummaryThreshold(50),
        )),

        // Semantic memory: embedding-based retrieval
        memory.WithSemantic(memory.NewArchival(
            memory.WithStore(pgStore),
            memory.WithStrategy(memory.StrategySemantic),
            memory.WithTopK(5),
        )),

        // Graph memory: entity and relationship tracking
        memory.WithGraph(memory.NewArchival(
            memory.WithStore(pgStore),
            memory.WithStrategy(memory.StrategyGraph),
        )),
    )

    // Create agent with composite memory
    assistant := agent.New("memory-agent",
        agent.WithModel(model),
        agent.WithMemory(mem),
        agent.WithSystemPrompt(`You are a helpful assistant with persistent memory.
You remember user preferences and past conversations.
Use your core_memory_write tool to save important information.`),
    )

    // First conversation — agent learns preferences
    result, err := assistant.Run(ctx, "My name is Alex and I prefer concise answers.")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(result)

    // Later conversation — agent recalls preferences from memory
    result, err = assistant.Run(ctx, "What do you remember about me?")
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(result)
}

Related Features