Memory Systems
Three-tier MemGPT memory (Core, Recall, Archival) with six strategies: buffer, window, summary, entity, semantic, and graph. Self-editable via agent tools for persistent personality and knowledge.
Overview
Memory is what transforms a stateless LLM into a persistent, context-aware agent. Beluga AI implements the MemGPT three-tier memory model, giving agents a structured approach to managing information across conversations and sessions. Rather than dumping everything into a context window, agents decide what to remember, what to search for, and what to archive — just as humans manage working memory, episodic recall, and long-term knowledge differently.
The three tiers are: Core Memory (always present in the LLM context — personality, user preferences, active goals), Recall Memory (searchable conversation history with multiple retrieval strategies), and Archival Memory (long-term vector and graph storage for knowledge that persists indefinitely). Agents can combine these tiers through a composite memory system that unifies working, episodic, semantic, and graph memory behind a single interface.
A key differentiator is self-editable core memory: agents are given tools to read and write their own core memory blocks. This means an agent can update its understanding of a user's preferences, record important decisions, or adjust its own personality traits — all without external intervention. Seven store backends (from in-memory for development to Neo4j for graph relationships) ensure the right storage for every deployment scenario.
Capabilities
Three-Tier Memory (MemGPT)
Core Memory lives permanently in the agent's system prompt — it holds personality traits, user preferences, and active goals. Recall Memory is searchable conversation history, indexed by time, topic, and relevance, allowing agents to retrieve past interactions on demand. Archival Memory provides long-term storage via vector embeddings and graph relationships, persisting knowledge across sessions indefinitely. Each tier has distinct read/write characteristics optimized for its access pattern.
Memory Strategies
Six strategies control how conversation context flows into memory:
- Buffer — Keeps all messages in context up to a token limit, then truncates the oldest.
- Window — Sliding window of the last N messages, simple and predictable.
- Summary — Periodically summarizes older messages, compressing history while preserving key points.
- Entity — Extracts and tracks entities (people, places, concepts) mentioned across conversations.
- Semantic — Embeds messages and retrieves by semantic similarity to the current query.
- Graph — Builds a knowledge graph from conversations, capturing relationships between entities.
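To make the window strategy concrete, here is a minimal sketch in plain Go. The types and method names (`windowMemory`, `Add`, `Context`) are illustrative assumptions for this example, not Beluga AI's actual API:

```go
package main

import "fmt"

// windowMemory keeps only the last n messages: a minimal sketch of the
// Window strategy. Names here are illustrative, not the framework's API.
type windowMemory struct {
	n        int
	messages []string
}

// Add appends a message, evicting the oldest once the window is full.
func (w *windowMemory) Add(msg string) {
	w.messages = append(w.messages, msg)
	if len(w.messages) > w.n {
		w.messages = w.messages[len(w.messages)-w.n:]
	}
}

// Context returns the messages that would be sent to the LLM.
func (w *windowMemory) Context() []string {
	return w.messages
}

func main() {
	w := &windowMemory{n: 3}
	for _, m := range []string{"m1", "m2", "m3", "m4", "m5"} {
		w.Add(m)
	}
	fmt.Println(w.Context()) // prints [m3 m4 m5]
}
```

The same shape generalizes: buffer swaps the message count for a token budget, and summary replaces eviction with compression of the evicted messages.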
Composite Memory
The composite memory system combines multiple memory types behind a unified interface. A typical production setup might use Working Memory (buffer strategy for immediate context), Episodic Memory (window + summary for conversation history), Semantic Memory (vector-based for knowledge retrieval), and Graph Memory (Neo4j or Memgraph for relationship queries). The composite layer handles routing queries to the appropriate memory tier and merging results.
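The routing-and-merging idea can be sketched in a few lines of plain Go. The `Memory` interface and `composite` type below are assumptions made for illustration; Beluga AI's real composite interfaces may differ:

```go
package main

import "fmt"

// Memory is an illustrative retrieval interface, not the framework's
// real API: each tier answers a query with zero or more results.
type Memory interface {
	Retrieve(query string) []string
}

// composite fans a query out to every registered tier and merges results.
type composite struct {
	tiers []Memory
}

func (c *composite) Retrieve(query string) []string {
	var merged []string
	for _, t := range c.tiers {
		merged = append(merged, t.Retrieve(query)...)
	}
	return merged
}

// staticMemory is a stand-in tier that always returns fixed results.
type staticMemory []string

func (s staticMemory) Retrieve(query string) []string { return s }

func main() {
	c := &composite{tiers: []Memory{
		staticMemory{"episodic: user asked about Go last week"},
		staticMemory{"semantic: Go is a statically typed language"},
	}}
	fmt.Println(c.Retrieve("Go"))
}
```

A production composite would also rank and deduplicate the merged results, and could route by query type instead of querying every tier unconditionally.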
Self-Editable Core Memory
Agents are given specialized tools — core_memory_read, core_memory_write,
and core_memory_replace — that let them modify their own core memory blocks. When an
agent learns a user's name, discovers a preference, or needs to update a goal, it can persist that
information directly. This creates agents that genuinely learn and adapt over time, maintaining
personality consistency across thousands of interactions.
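The semantics behind those three tools can be sketched as follows. The tool names come from the text above, but the Go types (`coreMemory` and its methods) are assumptions for illustration only:

```go
package main

import (
	"fmt"
	"strings"
)

// coreMemory models named, self-editable blocks such as "personality"
// and "user_preferences". This is an illustrative sketch, not the
// framework's actual implementation.
type coreMemory struct {
	blocks map[string]string
}

// Read mirrors core_memory_read: return a block's current contents.
func (c *coreMemory) Read(block string) string { return c.blocks[block] }

// Write mirrors core_memory_write: append new information to a block.
func (c *coreMemory) Write(block, content string) {
	if c.blocks[block] == "" {
		c.blocks[block] = content
		return
	}
	c.blocks[block] += "\n" + content
}

// Replace mirrors core_memory_replace: swap old text for new in place.
func (c *coreMemory) Replace(block, old, repl string) {
	c.blocks[block] = strings.ReplaceAll(c.blocks[block], old, repl)
}

func main() {
	mem := &coreMemory{blocks: map[string]string{}}
	mem.Write("user_preferences", "name: Alex")
	mem.Write("user_preferences", "style: concise answers")
	mem.Replace("user_preferences", "concise", "very concise")
	fmt.Println(mem.Read("user_preferences"))
}
```

Because the blocks live in the system prompt, every edit the agent makes is visible on its very next turn, which is what makes the learning stick.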
Store Backends
Seven store backends cover the full range of deployment scenarios:
- In-Memory — Zero-config for development and testing.
- Redis — Fast key-value access for session-scoped memory.
- PostgreSQL — Durable relational storage with pgvector for embeddings.
- SQLite — Embedded storage for single-node or edge deployments.
- Neo4j — Native graph database for relationship-heavy memory.
- DragonflyDB — Redis-compatible with better memory efficiency at scale.
- Memgraph — In-memory graph database for low-latency graph queries.
Architecture
Providers & Implementations
Memory Store Backends
| Name | Priority | Key Differentiator |
|---|---|---|
| In-Memory | P0 | Zero-config, ideal for development and testing |
| Redis | P0 | Fast key-value access, TTL support, session-scoped memory |
| PostgreSQL | P0 | Durable relational storage with pgvector for embeddings |
| SQLite | P1 | Embedded single-file storage, edge and mobile deployments |
| Neo4j | P1 | Native graph database for relationship-heavy memory patterns |
| DragonflyDB | P2 | Redis-compatible with 25x memory efficiency at scale |
| Memgraph | P2 | In-memory graph for sub-millisecond relationship queries |
Full Example
An agent configured with composite memory combining all four memory types:
```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/lookatitude/beluga-ai/agent"
	"github.com/lookatitude/beluga-ai/llm"
	"github.com/lookatitude/beluga-ai/memory"
	"github.com/lookatitude/beluga-ai/memory/stores/inmemory"
	"github.com/lookatitude/beluga-ai/memory/stores/postgres"

	_ "github.com/lookatitude/beluga-ai/llm/providers/openai"
)

func main() {
	ctx := context.Background()

	model, err := llm.New("openai", llm.ProviderConfig{
		Model: "gpt-4o",
	})
	if err != nil {
		log.Fatal(err)
	}

	// Configure store backends
	devStore := inmemory.New()
	pgStore, err := postgres.New(
		postgres.WithDSN("postgres://localhost:5432/beluga"),
		postgres.WithEmbeddingDimension(1536),
	)
	if err != nil {
		log.Fatal(err)
	}

	// Build composite memory with all four tiers
	mem := memory.NewComposite(
		// Working memory: buffer strategy, always in context
		memory.WithWorking(memory.NewCore(
			memory.WithStore(devStore),
			memory.WithBlocks("personality", "user_preferences", "active_goals"),
			memory.WithSelfEditable(true), // Agent can modify via tools
		)),
		// Episodic memory: window + summary for conversation history
		memory.WithEpisodic(memory.NewRecall(
			memory.WithStore(pgStore),
			memory.WithStrategy(memory.StrategySummary),
			memory.WithWindowSize(20),
			memory.WithSummaryThreshold(50),
		)),
		// Semantic memory: embedding-based retrieval
		memory.WithSemantic(memory.NewArchival(
			memory.WithStore(pgStore),
			memory.WithStrategy(memory.StrategySemantic),
			memory.WithTopK(5),
		)),
		// Graph memory: entity and relationship tracking
		memory.WithGraph(memory.NewArchival(
			memory.WithStore(pgStore),
			memory.WithStrategy(memory.StrategyGraph),
		)),
	)

	// Create agent with composite memory
	assistant := agent.New("memory-agent",
		agent.WithModel(model),
		agent.WithMemory(mem),
		agent.WithSystemPrompt(`You are a helpful assistant with persistent memory.
You remember user preferences and past conversations.
Use your core_memory_write tool to save important information.`),
	)

	// First conversation — agent learns preferences
	result, err := assistant.Run(ctx, "My name is Alex and I prefer concise answers.")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result)

	// Later conversation — agent recalls preferences from memory
	result, err = assistant.Run(ctx, "What do you remember about me?")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(result)
}
```