Embedding Providers — 9 Options

Beluga AI v2 provides a unified embedding.Embedder interface for converting text into dense vector representations. All providers register via init() and are instantiated through the global registry.

The unified interface means your RAG pipeline, vector store integration, and retrieval logic work identically regardless of which embedding provider you choose. Switch providers by changing a single import and configuration value.

type Embedder interface {
	Embed(ctx context.Context, texts []string) ([][]float32, error)
	EmbedSingle(ctx context.Context, text string) ([]float32, error)
	Dimensions() int
}

package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/lookatitude/beluga-ai/config"
	"github.com/lookatitude/beluga-ai/rag/embedding"

	// Register the provider you need via blank import.
	_ "github.com/lookatitude/beluga-ai/rag/embedding/providers/openai"
)

func main() {
	emb, err := embedding.New("openai", config.ProviderConfig{
		APIKey: os.Getenv("OPENAI_API_KEY"),
	})
	if err != nil {
		log.Fatal(err)
	}

	vectors, err := emb.Embed(context.Background(), []string{"hello world"})
	if err != nil {
		log.Fatal(err)
	}

	// vectors holds one embedding per input text.
	fmt.Printf("Vectors: %d\n", len(vectors))
	fmt.Printf("Dimensions: %d\n", emb.Dimensions())
}

| Provider | Registry Name | Default Model | Default Dimensions |
|---|---|---|---|
| OpenAI | openai | text-embedding-3-small | 1536 |
| Cohere | cohere | embed-english-v3.0 | 1024 |
| Google | google | text-embedding-004 | 768 |
| Ollama | ollama | nomic-embed-text | 768 |
| Jina | jina | jina-embeddings-v2-base-en | 768 |
| Voyage | voyage | voyage-2 | 1024 |
| Mistral | mistral | mistral-embed | 1024 |
| Sentence Transformers | sentence_transformers | all-MiniLM-L6-v2 | 384 |
| In-Memory | inmemory | N/A (hash-based) | 128 |

List all registered providers at runtime:

names := embedding.List()
// Returns sorted list: ["cohere", "google", "inmemory", "jina", ...]
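
The registration flow behind `embedding.New` and `embedding.List` can be pictured with the sketch below. It is a simplified stand-in, not Beluga's source: the `Factory` type, the one-method `Embedder`, and the `fixedDims` helper are assumptions made for illustration. In the real library each provider package would call a registration function from its own `init()`, which is why a blank import is enough to make a provider available.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Embedder is reduced to one method to keep the sketch short.
type Embedder interface{ Dimensions() int }

// Factory builds an Embedder. The name and signature are hypothetical.
type Factory func() (Embedder, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register associates a provider name with its factory.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[name] = f
}

// New looks up a provider by name and builds it.
func New(name string) (Embedder, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("embedding: unknown provider %q", name)
	}
	return f()
}

// List returns the registered provider names in sorted order.
func List() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(factories))
	for name := range factories {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// fixedDims is a placeholder provider that only reports a dimension count.
type fixedDims int

func (d fixedDims) Dimensions() int { return int(d) }

// In a real setup each provider package would hold its own init().
func init() {
	Register("openai", func() (Embedder, error) { return fixedDims(1536), nil })
	Register("cohere", func() (Embedder, error) { return fixedDims(1024), nil })
}

func main() {
	fmt.Println(List()) // prints "[cohere openai]"

	emb, err := New("openai")
	if err != nil {
		panic(err)
	}
	fmt.Println(emb.Dimensions()) // prints "1536"

	// Requesting a provider that was never registered returns an error.
	_, err = New("missing")
	fmt.Println(err != nil) // prints "true"
}
```

A practical consequence of this design is that forgetting the blank import shows up as an "unknown provider" style error at construction time rather than as a compile error.
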

| Use Case | Recommended Provider |
|---|---|
| General-purpose, best cost/quality ratio | OpenAI |
| Asymmetric search (separate doc/query embeddings) | Cohere, Voyage |
| Code retrieval | Voyage (voyage-code-2) |
| Multilingual (100+ languages) | Cohere (embed-multilingual-v3.0) |
| Local/offline development | Ollama |
| Testing and unit tests | In-Memory |
| Self-hosted with open-source models | Sentence Transformers |
| Google Cloud integration | Google |

All embedders support middleware for cross-cutting concerns such as logging, caching, and tracing:

emb := embedding.ApplyMiddleware(baseEmb,
	loggingMiddleware,
	cachingMiddleware,
)

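Hooks let you observe embedding operations without writing a custom wrapper type. They are attached through embedding.WithHooks, which is passed to ApplyMiddleware like any other middleware: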

emb = embedding.ApplyMiddleware(baseEmb,
	embedding.WithHooks(embedding.Hooks{
		BeforeEmbed: func(ctx context.Context, texts []string) error {
			log.Printf("Embedding %d texts", len(texts))
			return nil
		},
		AfterEmbed: func(ctx context.Context, embeddings [][]float32, err error) {
			if err != nil {
				log.Printf("Embedding failed: %v", err)
			}
		},
	}),
)