# Embedding Providers

Embeddings convert text into numerical vectors that capture semantic meaning, enabling similarity search, clustering, and retrieval-augmented generation (RAG). The choice of embedding provider affects retrieval quality, latency, cost, and data privacy — there is no single best option for all use cases.
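For intuition, similarity between two embedding vectors is most often measured with cosine similarity. A minimal, library-free sketch (the toy 3-dimensional vectors stand in for real embeddings):

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between two vectors:
// 1.0 for identical directions, 0.0 for orthogonal vectors.
func cosineSimilarity(a, b []float32) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		normA += float64(a[i]) * float64(a[i])
		normB += float64(b[i]) * float64(b[i])
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosineSimilarity([]float32{1, 0, 0}, []float32{1, 0, 0})) // 1
	fmt.Println(cosineSimilarity([]float32{1, 0, 0}, []float32{0, 1, 0})) // 0
}
```

In a retrieval pipeline, the query vector is compared against stored document vectors and the highest-scoring documents are returned; vector stores do this comparison for you at scale.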

Beluga AI provides a unified Embedder interface for converting text into vector embeddings. All embedding providers register via init() and follow the same registry pattern used across the framework, so you can evaluate different providers without changing your pipeline code.

| Provider | Registry Name | Dimensions | Import Path |
| --- | --- | --- | --- |
| OpenAI | `openai` | 1536, 3072 | `rag/embedding/providers/openai` |
| Cohere | `cohere` | 384–1024 | `rag/embedding/providers/cohere` |
| Google | `google` | 768 | `rag/embedding/providers/google` |
| Ollama | `ollama` | Varies by model | `rag/embedding/providers/ollama` |
| Jina | `jina` | 768–1024 | `rag/embedding/providers/jina` |
| Voyage | `voyage` | 1024 | `rag/embedding/providers/voyage` |
| Mistral | `mistral` | 1024 | `rag/embedding/providers/mistral` |
| Sentence Transformers | `sentence_transformers` | Varies | `rag/embedding/providers/sentence_transformers` |
| In-Memory | `inmemory` | Configurable | `rag/embedding/providers/inmemory` |

## Quick Start

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/lookatitude/beluga-ai/config"
	"github.com/lookatitude/beluga-ai/rag/embedding"

	// Import the provider to register it.
	_ "github.com/lookatitude/beluga-ai/rag/embedding/providers/openai"
)

func main() {
	ctx := context.Background()

	// Create an embedder via the registry.
	emb, err := embedding.New("openai", config.ProviderConfig{
		APIKey: "sk-...",
		Model:  "text-embedding-3-small",
	})
	if err != nil {
		log.Fatal(err)
	}

	// Embed a batch of texts.
	vectors, err := emb.Embed(ctx, []string{"Hello, world!", "How are you?"})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("Generated %d embeddings of dimension %d\n", len(vectors), emb.Dimensions())
}
```

## The Embedder Interface

All providers implement the `Embedder` interface:

```go
type Embedder interface {
	// Embed produces embeddings for a batch of texts.
	Embed(ctx context.Context, texts []string) ([][]float32, error)

	// EmbedSingle embeds a single text and returns its vector.
	EmbedSingle(ctx context.Context, text string) ([]float32, error)

	// Dimensions returns the vector dimensionality.
	Dimensions() int
}
```

## OpenAI

OpenAI provides `text-embedding-3-small` (1536 dimensions) and `text-embedding-3-large` (3072 dimensions).

```shell
export OPENAI_API_KEY="sk-..."
```

```go
import _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/openai"

emb, err := embedding.New("openai", config.ProviderConfig{
	APIKey: os.Getenv("OPENAI_API_KEY"),
	Model:  "text-embedding-3-small",
})
```
| Model | Dimensions | Use Case |
| --- | --- | --- |
| `text-embedding-3-small` | 1536 | General purpose, cost-effective |
| `text-embedding-3-large` | 3072 | Higher accuracy, more expensive |
| `text-embedding-ada-002` | 1536 | Legacy, widely compatible |

## Cohere

Cohere provides multilingual embeddings across 100+ languages.

```shell
export COHERE_API_KEY="..."
```

```go
import _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/cohere"

emb, err := embedding.New("cohere", config.ProviderConfig{
	APIKey: os.Getenv("COHERE_API_KEY"),
	Model:  "embed-multilingual-v3.0",
})
```
| Model | Dimensions | Languages |
| --- | --- | --- |
| `embed-multilingual-v3.0` | 1024 | 100+ languages |
| `embed-english-v3.0` | 1024 | English only |
| `embed-english-light-v3.0` | 384 | English only, lightweight |

Cohere embeddings are particularly suited for cross-language retrieval tasks where queries and documents may be in different languages.

## Google

Google Vertex AI provides embeddings via the Gemini API.

```shell
export GOOGLE_API_KEY="AIza..."
```

```go
import _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/google"

emb, err := embedding.New("google", config.ProviderConfig{
	APIKey: os.Getenv("GOOGLE_API_KEY"),
	Model:  "text-embedding-004",
})
```

## Ollama

Run embedding models locally with no external API calls.

```shell
ollama pull nomic-embed-text
```

```go
import _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/ollama"

emb, err := embedding.New("ollama", config.ProviderConfig{
	Model:   "nomic-embed-text",
	BaseURL: "http://localhost:11434",
})
```

Local embeddings are useful for air-gapped environments, reducing latency, and avoiding per-token costs. Popular models include `nomic-embed-text`, `mxbai-embed-large`, and `all-minilm`.

## Jina

Jina AI provides embeddings optimized for retrieval tasks.

```shell
export JINA_API_KEY="..."
```

```go
import _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/jina"

emb, err := embedding.New("jina", config.ProviderConfig{
	APIKey: os.Getenv("JINA_API_KEY"),
	Model:  "jina-embeddings-v3",
})
```

## Voyage

Voyage AI provides high-quality embeddings for code and text.

```shell
export VOYAGE_API_KEY="..."
```

```go
import _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/voyage"

emb, err := embedding.New("voyage", config.ProviderConfig{
	APIKey: os.Getenv("VOYAGE_API_KEY"),
	Model:  "voyage-3",
})
```

Voyage embeddings are popular for code-related retrieval. Use `voyage-code-3` for code search applications.

## Mistral

Mistral provides European-hosted embeddings.

```shell
export MISTRAL_API_KEY="..."
```

```go
import _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/mistral"

emb, err := embedding.New("mistral", config.ProviderConfig{
	APIKey: os.Getenv("MISTRAL_API_KEY"),
	Model:  "mistral-embed",
})
```

## Sentence Transformers

Use sentence-transformer models via a local inference server.

```go
import _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/sentence_transformers"

emb, err := embedding.New("sentence_transformers", config.ProviderConfig{
	BaseURL: "http://localhost:8080",
	Model:   "all-MiniLM-L6-v2",
})
```

## In-Memory

The in-memory embedder generates deterministic hash-based vectors for testing and development.

```go
import _ "github.com/lookatitude/beluga-ai/rag/embedding/providers/inmemory"

emb, err := embedding.New("inmemory", config.ProviderConfig{
	Options: map[string]any{
		"dimensions": 384.0,
	},
})
```

Do not use the in-memory embedder for production — it does not produce semantically meaningful vectors.

## Middleware

Add caching, logging, or tracing to any embedder:

```go
emb = embedding.ApplyMiddleware(emb,
	embedding.WithLogging(slog.Default()),
)
```

## Hooks

Attach callbacks to embedding operations:

```go
hooks := embedding.Hooks{
	BeforeEmbed: func(ctx context.Context, texts []string) error {
		slog.Info("embedding", "count", len(texts))
		return nil
	},
	AfterEmbed: func(ctx context.Context, embeddings [][]float32, err error) {
		if err == nil {
			slog.Info("embedded", "vectors", len(embeddings))
		}
	},
}
```

## Batch Embedding

For large document sets, embed in batches to manage memory and rate limits:

```go
batchSize := 100
for i := 0; i < len(texts); i += batchSize {
	end := min(i+batchSize, len(texts))
	batch := texts[i:end]
	vectors, err := emb.Embed(ctx, batch)
	if err != nil {
		return fmt.Errorf("batch %d: %w", i/batchSize, err)
	}
	// Store vectors...
}
```
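The slicing logic above can be factored into a reusable helper. This is a sketch assuming Go 1.21+ (for the `min` builtin and generics); `chunks` is not a framework API:

```go
package main

import "fmt"

// chunks splits items into consecutive batches of at most size elements.
// The returned slices share backing storage with items (no copying).
func chunks[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var out [][]T
	for i := 0; i < len(items); i += size {
		out = append(out, items[i:min(i+size, len(items))])
	}
	return out
}

func main() {
	texts := []string{"a", "b", "c", "d", "e"}
	for i, batch := range chunks(texts, 2) {
		// Each batch could be passed to emb.Embed(ctx, batch).
		fmt.Println(i, batch)
	}
}
```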
## Choosing a Provider

| Need | Recommended Provider |
| --- | --- |
| General purpose | OpenAI `text-embedding-3-small` |
| Multilingual | Cohere `embed-multilingual-v3.0` |
| Code search | Voyage `voyage-code-3` |
| Local/offline | Ollama `nomic-embed-text` |
| Cost-sensitive | Ollama or Sentence Transformers |
| European data residency | Mistral `mistral-embed` |

Finally, match the embedding provider to your vector store: the embedder's output dimension must equal the dimension configured in your vector store index.
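A cheap startup guard catches dimension mismatches before any documents are indexed. The helper below is illustrative, not a framework function:

```go
package main

import "fmt"

// checkDimensions fails fast when the embedder's output size does not
// match the vector store index configuration.
func checkDimensions(embedderDims, indexDims int) error {
	if embedderDims != indexDims {
		return fmt.Errorf("dimension mismatch: embedder produces %d-dimensional vectors, index expects %d",
			embedderDims, indexDims)
	}
	return nil
}

func main() {
	// e.g. text-embedding-3-small (1536) against a 1024-dimensional index.
	if err := checkDimensions(1536, 1024); err != nil {
		fmt.Println(err)
	}
}
```

In practice you would call something like `checkDimensions(emb.Dimensions(), indexDims)` once at startup, before ingesting documents.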