
AI IDE Extension with Project Memory

Developers waste significant time repeatedly explaining project context to AI assistants that have no memory across sessions. Every new conversation starts from zero: “We use React with TypeScript, our API is in Go, the database is PostgreSQL, and we follow this naming convention…” This repeated context-setting costs 5-10 minutes per session and produces generic suggestions that miss project-specific patterns, naming conventions, and architectural decisions.

A context-aware IDE extension uses persistent memory to understand project patterns, maintain conversation history, and provide intelligent assistance grounded in the actual codebase. The key insight is using vector-backed memory: code is embedded into a semantic vector space where similarity search retrieves the most relevant code context for each developer query, rather than relying on keyword matching or fixed context windows.

Beluga AI provides memory abstractions for storing and retrieving context, vector stores for semantic code search, and embedding models for understanding code semantics. The system indexes code changes incrementally, retrieves relevant context for developer queries, and maintains conversation history across sessions.

The architecture uses vector memory (not buffer or window memory) because code context retrieval is fundamentally a semantic search problem. A developer asking “how do we handle authentication?” needs code from auth-related files regardless of when those files were last edited. Vector similarity search finds semantically relevant code across the entire project, while buffer memory would only recall the most recent interactions.

┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│     Code     │───▶│   Context    │───▶│   Project    │
│   Changes    │    │   Indexer    │    │    Memory    │
└──────────────┘    └──────────────┘    └──────┬───────┘
┌──────────────┐    ┌──────────────┐    ┌──────▼───────┐
│   Context-   │◀───│      AI      │◀───│   Context    │
│    aware     │    │  Assistant   │    │  Retriever   │
│   Response   │    └──────────────┘    └──────────────┘
└──────────────┘           ▲
                    ┌──────┴───────┐
                    │  Developer   │
                    │    Query     │
                    └──────────────┘

The context manager maintains project-specific memory with semantic indexing. Each project gets its own namespace in the vector store, providing isolation between projects. The text-embedding-3-small model is chosen for IDE performance — it produces embeddings fast enough for real-time indexing without blocking the editor, while still providing sufficient semantic quality for code retrieval.

package main

import (
    "context"
    "fmt"
    "time"

    "github.com/lookatitude/beluga-ai/memory"
    "github.com/lookatitude/beluga-ai/rag/embedding"
    "github.com/lookatitude/beluga-ai/rag/vectorstore"
    "github.com/lookatitude/beluga-ai/schema"

    _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/openai"
    _ "github.com/lookatitude/beluga-ai/rag/vectorstore/providers/pgvector"
)

type IDEContextManager struct {
    projectMemories map[string]memory.Memory
    embedder        embedding.Embedder
    store           vectorstore.VectorStore
}

func NewIDEContextManager(ctx context.Context) (*IDEContextManager, error) {
    embedder, err := embedding.New("openai", embedding.ProviderConfig{
        Model: "text-embedding-3-small", // Smaller model for IDE performance
    })
    if err != nil {
        return nil, fmt.Errorf("create embedder: %w", err)
    }

    store, err := vectorstore.New("pgvector", vectorstore.ProviderConfig{
        ConnectionString: "postgresql://localhost/ide_context",
    })
    if err != nil {
        return nil, fmt.Errorf("create vector store: %w", err)
    }

    return &IDEContextManager{
        projectMemories: make(map[string]memory.Memory),
        embedder:        embedder,
        store:           store,
    }, nil
}

func (m *IDEContextManager) GetProjectMemory(ctx context.Context, projectID string) (memory.Memory, error) {
    if mem, exists := m.projectMemories[projectID]; exists {
        return mem, nil
    }

    // Create vector-backed memory scoped to this project's namespace
    mem := memory.NewVectorMemory(m.store,
        memory.WithNamespace(fmt.Sprintf("project_%s", projectID)),
        memory.WithEmbedder(m.embedder),
    )
    m.projectMemories[projectID] = mem
    return mem, nil
}
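
A minimal wiring sketch: the manager is created once at extension startup and per-project memory is fetched lazily. The project ID "acme-api", the surrounding function, and the log import are illustrative assumptions, not part of the API above.

// At extension startup: one manager shared by all open projects.
mgr, err := NewIDEContextManager(ctx)
if err != nil {
    log.Printf("init context manager: %v", err)
    return
}

// Per project: memory is created on first use and reused on later calls.
projectMem, err := mgr.GetProjectMemory(ctx, "acme-api")
if err != nil {
    log.Printf("project memory: %v", err)
    return
}
_ = projectMem // conversation history is loaded from this in ProvideAssistance below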

Index code changes incrementally for semantic search. Each file is embedded and stored with metadata (project ID, file path, type, timestamp) that enables filtered retrieval — when the developer queries a specific project, only that project’s code is searched.

func (m *IDEContextManager) IndexCode(ctx context.Context, projectID, filePath, code string) error {
    // Generate an embedding for the code
    embeddings, err := m.embedder.Embed(ctx, []string{code})
    if err != nil {
        return fmt.Errorf("embed code: %w", err)
    }

    // Create a document with metadata that enables filtered retrieval
    doc := schema.Document{
        Content: code,
        Metadata: map[string]interface{}{
            "project_id": projectID,
            "file_path":  filePath,
            "type":       "code",
            "indexed_at": time.Now(),
        },
    }

    // Store in the vector database
    if err := m.store.Add(ctx, []schema.Document{doc}, [][]float64{embeddings[0]}); err != nil {
        return fmt.Errorf("store code: %w", err)
    }
    return nil
}

func (m *IDEContextManager) IndexFileChanges(ctx context.Context, projectID string, changes []FileChange) error {
    for _, change := range changes {
        if err := m.IndexCode(ctx, projectID, change.Path, change.Content); err != nil {
            // Log the error but continue indexing the remaining files
            continue
        }
    }
    return nil
}

type FileChange struct {
    Path    string
    Content string
}
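
For illustration only (the project ID, file paths, and source strings are placeholders), a save hook could hand the changed buffers to the indexer like this:

changes := []FileChange{
    {Path: "internal/auth/middleware.go", Content: middlewareSource},
    {Path: "internal/auth/token.go", Content: tokenSource},
}
if err := mgr.IndexFileChanges(ctx, "acme-api", changes); err != nil {
    log.Printf("index changes: %v", err)
}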

Retrieve relevant code context for developer queries:

func (m *IDEContextManager) GetRelevantContext(ctx context.Context, projectID, query string, topK int) (string, error) {
    // Generate the query embedding
    queryEmbeddings, err := m.embedder.Embed(ctx, []string{query})
    if err != nil {
        return "", fmt.Errorf("embed query: %w", err)
    }

    // Search the vector store, filtered to this project
    results, err := m.store.SimilaritySearch(ctx, queryEmbeddings[0],
        vectorstore.WithTopK(topK),
        vectorstore.WithMetadataFilter(map[string]interface{}{
            "project_id": projectID,
        }),
    )
    if err != nil {
        return "", fmt.Errorf("similarity search: %w", err)
    }

    // Build a context string from the retrieved documents
    var contextText string
    for _, result := range results {
        filePath := result.Metadata["file_path"].(string)
        contextText += fmt.Sprintf("File: %s\n```\n%s\n```\n\n", filePath, result.Content)
    }
    return contextText, nil
}
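
With this in place, the example query from earlier finds auth-related code regardless of when those files were last edited. The project ID and error handling below are illustrative:

relevantCode, err := mgr.GetRelevantContext(ctx, "acme-api", "how do we handle authentication?", 5)
if err != nil {
    log.Printf("retrieve context: %v", err)
    return
}
fmt.Println(relevantCode) // top-5 snippets, each labelled with its file path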

Provide AI assistance grounded in project context:

import (
    "github.com/lookatitude/beluga-ai/llm"

    _ "github.com/lookatitude/beluga-ai/llm/providers/openai"
)

func (m *IDEContextManager) ProvideAssistance(ctx context.Context, projectID, query string) (string, error) {
    // Get relevant code context
    codeContext, err := m.GetRelevantContext(ctx, projectID, query, 5)
    if err != nil {
        return "", fmt.Errorf("get context: %w", err)
    }

    // Get the project's conversation history
    projectMem, err := m.GetProjectMemory(ctx, projectID)
    if err != nil {
        return "", fmt.Errorf("get memory: %w", err)
    }
    history, err := projectMem.Load(ctx)
    if err != nil {
        return "", fmt.Errorf("load history: %w", err)
    }

    // Build the message list: system prompt with code context, then history, then the query
    messages := []schema.Message{
        &schema.SystemMessage{Parts: []schema.ContentPart{
            schema.TextPart{Text: `You are an AI assistant for a software development project.
Project Context:
` + codeContext + `
Provide helpful, context-aware assistance based on the project code.`},
        }},
    }
    messages = append(messages, history...)
    messages = append(messages, &schema.HumanMessage{Parts: []schema.ContentPart{
        schema.TextPart{Text: query},
    }})

    // Generate the response
    model, err := llm.New("openai", llm.ProviderConfig{Model: "gpt-4o"})
    if err != nil {
        return "", fmt.Errorf("create model: %w", err)
    }
    resp, err := model.Generate(ctx, messages)
    if err != nil {
        return "", fmt.Errorf("generate response: %w", err)
    }
    response := resp.Parts[0].(schema.TextPart).Text

    // Save the exchange to memory
    if err := projectMem.Save(ctx, []schema.Message{
        &schema.HumanMessage{Parts: []schema.ContentPart{schema.TextPart{Text: query}}},
        resp,
    }); err != nil {
        // Log error but return response
    }
    return response, nil
}
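
From the extension's chat panel, a single call now combines retrieved code, prior conversation, and the new question. The identifiers below are placeholders:

answer, err := mgr.ProvideAssistance(ctx, "acme-api", "how should I add refresh-token support?")
if err != nil {
    log.Printf("assist: %v", err)
    return
}
fmt.Println(answer)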

Index only changed files to maintain IDE performance:

type FileWatcher struct {
    manager   *IDEContextManager
    projectID string
}

func (w *FileWatcher) OnFileChange(ctx context.Context, filePath, content string) error {
    // Index the changed file asynchronously to avoid blocking the IDE
    go func() {
        if err := w.manager.IndexCode(context.Background(), w.projectID, filePath, content); err != nil {
            // Log error
        }
    }()
    return nil
}
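
Editors emit change events on every keystroke, so indexing each event directly would thrash the embedder. One way to smooth this out, sketched below as a hypothetical helper (not part of Beluga AI, and assuming sync and time are imported), is to debounce edits and flush them through IndexFileChanges once typing pauses:

type DebouncedWatcher struct {
    mu        sync.Mutex
    manager   *IDEContextManager
    projectID string
    delay     time.Duration
    pending   map[string]string // file path -> latest content
    timer     *time.Timer
}

func (w *DebouncedWatcher) OnFileChange(filePath, content string) {
    w.mu.Lock()
    defer w.mu.Unlock()

    if w.pending == nil {
        w.pending = make(map[string]string)
    }
    w.pending[filePath] = content // keep only the newest version of each file

    // Restart the countdown; indexing runs once edits quiet down for `delay`
    if w.timer != nil {
        w.timer.Stop()
    }
    w.timer = time.AfterFunc(w.delay, w.flush)
}

func (w *DebouncedWatcher) flush() {
    w.mu.Lock()
    changes := make([]FileChange, 0, len(w.pending))
    for path, content := range w.pending {
        changes = append(changes, FileChange{Path: path, Content: content})
    }
    w.pending = nil
    w.mu.Unlock()

    if err := w.manager.IndexFileChanges(context.Background(), w.projectID, changes); err != nil {
        // Log and move on; files edited again are re-indexed on the next flush
    }
}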

Limit context size to fit LLM context windows:

func (m *IDEContextManager) GetRelevantContextWithLimit(ctx context.Context, projectID, query string, maxTokens int) (string, error) {
    contextText, err := m.GetRelevantContext(ctx, projectID, query, 10)
    if err != nil {
        return "", err
    }

    // Truncate to the token limit (approximation: 1 token ≈ 4 characters)
    maxChars := maxTokens * 4
    if len(contextText) > maxChars {
        contextText = contextText[:maxChars]
    }
    return contextText, nil
}
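
Plain truncation can cut a snippet in the middle of a code block. A hypothetical variant (GetRelevantContextWithBudget is not from the source; it reuses only calls shown above) drops whole snippets once the budget is exhausted instead:

func (m *IDEContextManager) GetRelevantContextWithBudget(ctx context.Context, projectID, query string, maxTokens int) (string, error) {
    queryEmbeddings, err := m.embedder.Embed(ctx, []string{query})
    if err != nil {
        return "", fmt.Errorf("embed query: %w", err)
    }
    results, err := m.store.SimilaritySearch(ctx, queryEmbeddings[0],
        vectorstore.WithTopK(10),
        vectorstore.WithMetadataFilter(map[string]interface{}{"project_id": projectID}),
    )
    if err != nil {
        return "", fmt.Errorf("similarity search: %w", err)
    }

    maxChars := maxTokens * 4 // same 1 token ≈ 4 characters approximation
    var contextText string
    for _, result := range results {
        snippet := fmt.Sprintf("File: %s\n```\n%s\n```\n\n", result.Metadata["file_path"], result.Content)
        if len(contextText)+len(snippet) > maxChars {
            break // skip remaining results rather than truncate a snippet mid-code-block
        }
        contextText += snippet
    }
    return contextText, nil
}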

Store sensitive code locally rather than in cloud vector stores:

import _ "github.com/lookatitude/beluga-ai/rag/vectorstore/providers/sqlite"

func NewLocalIDEContextManager(ctx context.Context, dbPath string) (*IDEContextManager, error) {
    embedder, err := embedding.New("openai", embedding.ProviderConfig{
        Model: "text-embedding-3-small",
    })
    if err != nil {
        return nil, fmt.Errorf("create embedder: %w", err)
    }

    // Use a local SQLite vector store so indexed code stays on the developer's machine
    store, err := vectorstore.New("sqlite", vectorstore.ProviderConfig{
        Path: dbPath,
    })
    if err != nil {
        return nil, fmt.Errorf("create vector store: %w", err)
    }

    return &IDEContextManager{
        projectMemories: make(map[string]memory.Memory),
        embedder:        embedder,
        store:           store,
    }, nil
}
Several techniques keep indexing overhead out of the editor's critical path:

  • Lazy indexing: Index files on demand rather than scanning the entire project
  • Background processing: Run indexing and embedding in background goroutines
  • Caching: Cache embeddings for frequently accessed code (see the sketch after this list)
  • Batch updates: Batch file changes for efficient indexing
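
For the caching point, one sketch keys cached vectors by a content hash so unchanged code is never re-embedded. CachingEmbedder is hypothetical, the sync and crypto/sha256 imports are assumed, and the Embedder interface is assumed to consist of the Embed method used throughout this page:

type CachingEmbedder struct {
    inner embedding.Embedder
    mu    sync.RWMutex
    cache map[string][]float64 // content hash -> embedding
}

func (c *CachingEmbedder) Embed(ctx context.Context, texts []string) ([][]float64, error) {
    out := make([][]float64, len(texts))
    var missing []string
    var missingIdx []int

    // Serve previously embedded content straight from the cache
    c.mu.RLock()
    for i, text := range texts {
        key := fmt.Sprintf("%x", sha256.Sum256([]byte(text)))
        if emb, ok := c.cache[key]; ok {
            out[i] = emb
        } else {
            missing = append(missing, text)
            missingIdx = append(missingIdx, i)
        }
    }
    c.mu.RUnlock()

    if len(missing) == 0 {
        return out, nil
    }

    // Only embed the texts we have not seen before
    fresh, err := c.inner.Embed(ctx, missing)
    if err != nil {
        return nil, err
    }

    c.mu.Lock()
    defer c.mu.Unlock()
    if c.cache == nil {
        c.cache = make(map[string][]float64)
    }
    for j, emb := range fresh {
        i := missingIdx[j]
        key := fmt.Sprintf("%x", sha256.Sum256([]byte(texts[i])))
        c.cache[key] = emb
        out[i] = emb
    }
    return out, nil
}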

After implementing the context-aware IDE extension, the team achieved:

| Metric | Before | After | Improvement |
|---|---|---|---|
| Repetitive Explanations | 100% | 18% | 82% reduction |
| Context Retention | 0% | 92% | New capability |
| Assistance Relevance | 60% | 91% | 52% improvement |
| Developer Time Saved | Baseline | +25% | 25% productivity gain |
| Satisfaction Score | 6.0/10 | 9.0/10 | 50% improvement |
| Code Quality | Baseline | +17% | 17% improvement |