
Pinecone Serverless Vector Store

Traditional vector databases require provisioning pods, managing replicas, and predicting capacity. Pinecone Serverless removes this operational complexity entirely — indexes scale automatically based on usage, and you pay only for the operations you perform. This makes it well suited for applications with unpredictable traffic patterns or teams that want to focus on application logic rather than infrastructure management.

Choose Pinecone Serverless when you want zero-ops vector storage and pay-per-use pricing, and do not need the advanced filtering capabilities of databases like Qdrant.

Beluga AI’s VectorStore interface abstracts vector database operations behind a unified API. The Pinecone provider registers via init() and supports add, search, and delete operations against Pinecone Serverless indexes.

Key capabilities:

  • Auto-scaling serverless indexes with no pod management
  • Pay-per-use pricing based on operations and storage
  • Metadata filtering for scoped similarity search
  • OpenTelemetry instrumentation for observability

Prerequisites:

  • Go 1.23 or later
  • Beluga AI framework installed
  • Pinecone account with API key (pinecone.io)
  • A serverless index created in the Pinecone console

Install the Beluga AI module:

go get github.com/lookatitude/beluga-ai

Next, create a serverless index in the Pinecone console:
  1. Sign in to the Pinecone console
  2. Create a new index with type Serverless
  3. Set the dimension to match your embedding model (e.g., 1536 for OpenAI text-embedding-ada-002)
  4. Note your API key, environment, and project ID

Set the required environment variables:

export PINECONE_API_KEY="your-api-key"
export PINECONE_ENVIRONMENT="us-west1-gcp"
export PINECONE_PROJECT_ID="your-project-id"
export PINECONE_INDEX_NAME="my-index"
export OPENAI_API_KEY="your-openai-key"

The Pinecone provider accepts the following options:

| Option | Description | Default | Required |
| --- | --- | --- | --- |
| api_key | Pinecone API key | - | Yes |
| environment | Pinecone environment (e.g., us-west1-gcp) | - | Yes |
| project_id | Pinecone project ID | - | Yes |
| index_name | Name of the serverless index | - | Yes |
| embedding_dim | Embedding vector dimension | - | Yes |
| timeout | Request timeout | 30s | No |

Create a Pinecone vector store, add documents, and run similarity search:

package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/lookatitude/beluga-ai/rag/embedding"
	"github.com/lookatitude/beluga-ai/rag/vectorstore"
	"github.com/lookatitude/beluga-ai/schema"
)

func main() {
	ctx := context.Background()

	// Create an OpenAI embedder
	embedder, err := embedding.New("openai",
		embedding.WithAPIKey(os.Getenv("OPENAI_API_KEY")),
		embedding.WithModel("text-embedding-ada-002"),
	)
	if err != nil {
		log.Fatalf("Failed to create embedder: %v", err)
	}

	// Create a Pinecone vector store
	store, err := vectorstore.New("pinecone",
		vectorstore.WithEmbedder(embedder),
		vectorstore.WithOption("api_key", os.Getenv("PINECONE_API_KEY")),
		vectorstore.WithOption("environment", os.Getenv("PINECONE_ENVIRONMENT")),
		vectorstore.WithOption("project_id", os.Getenv("PINECONE_PROJECT_ID")),
		vectorstore.WithOption("index_name", os.Getenv("PINECONE_INDEX_NAME")),
		vectorstore.WithOption("embedding_dim", 1536),
	)
	if err != nil {
		log.Fatalf("Failed to create store: %v", err)
	}

	// Embed and add documents
	docs := []schema.Document{
		{PageContent: "Machine learning is transforming industries.", Metadata: map[string]any{"category": "tech"}},
		{PageContent: "Go is known for its simplicity and performance.", Metadata: map[string]any{"category": "programming"}},
	}
	embeddings, err := embedder.EmbedDocuments(ctx, docs)
	if err != nil {
		log.Fatalf("Failed to embed documents: %v", err)
	}
	if err := store.Add(ctx, docs, embeddings); err != nil {
		log.Fatalf("Failed to add documents: %v", err)
	}
	fmt.Printf("Added %d documents\n", len(docs))

	// Search by query
	queryVec, err := embedder.EmbedQuery(ctx, "programming languages")
	if err != nil {
		log.Fatalf("Failed to embed query: %v", err)
	}
	results, err := store.Search(ctx, queryVec, 5)
	if err != nil {
		log.Fatalf("Search failed: %v", err)
	}
	fmt.Printf("Found %d results\n", len(results))
	for i, result := range results {
		fmt.Printf("  %d. %s\n", i+1, result.PageContent)
	}
}

Use metadata filters to scope searches to specific document categories:

results, err := store.Search(ctx, queryVec, 5,
	vectorstore.WithFilter(map[string]any{"category": "tech"}),
)
if err != nil {
	log.Fatalf("Filtered search failed: %v", err)
}

Serverless indexes differ from pod-based indexes:

  • Auto-scaling: No pod type or replica configuration required
  • Pay-per-use: Charges based on read/write operations and storage
  • Cold start: Infrequently accessed indexes may have higher initial latency

No additional configuration is needed beyond the standard options above.

Add tracing to Pinecone operations for production observability:

package main

import (
	"context"
	"fmt"
	"log"
	"os"
	"time"

	"github.com/lookatitude/beluga-ai/rag/embedding"
	"github.com/lookatitude/beluga-ai/rag/vectorstore"
	"github.com/lookatitude/beluga-ai/schema"
	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/trace"
)

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 60*time.Second)
	defer cancel()

	embedder, err := embedding.New("openai",
		embedding.WithAPIKey(os.Getenv("OPENAI_API_KEY")),
		embedding.WithModel("text-embedding-ada-002"),
	)
	if err != nil {
		log.Fatalf("Failed to create embedder: %v", err)
	}

	// Start a span covering the full ingest operation
	tracer := otel.Tracer("beluga.vectorstore.pinecone")
	ctx, span := tracer.Start(ctx, "pinecone.ingest",
		trace.WithAttributes(
			attribute.String("provider", "pinecone"),
			attribute.String("index", os.Getenv("PINECONE_INDEX_NAME")),
		),
	)
	defer span.End()

	store, err := vectorstore.New("pinecone",
		vectorstore.WithEmbedder(embedder),
		vectorstore.WithOption("api_key", os.Getenv("PINECONE_API_KEY")),
		vectorstore.WithOption("environment", os.Getenv("PINECONE_ENVIRONMENT")),
		vectorstore.WithOption("project_id", os.Getenv("PINECONE_PROJECT_ID")),
		vectorstore.WithOption("index_name", os.Getenv("PINECONE_INDEX_NAME")),
		vectorstore.WithOption("embedding_dim", 1536),
	)
	if err != nil {
		span.RecordError(err)
		log.Fatalf("Failed to create store: %v", err)
	}

	docs := []schema.Document{
		{PageContent: "AI is revolutionizing technology.", Metadata: map[string]any{"category": "ai", "source": "doc1"}},
		{PageContent: "Serverless computing scales automatically.", Metadata: map[string]any{"category": "cloud", "source": "doc2"}},
	}
	embeddings, err := embedder.EmbedDocuments(ctx, docs)
	if err != nil {
		span.RecordError(err)
		log.Fatalf("Failed to embed documents: %v", err)
	}
	if err := store.Add(ctx, docs, embeddings); err != nil {
		span.RecordError(err)
		log.Fatalf("Failed to add documents: %v", err)
	}
	span.SetAttributes(attribute.Int("documents_added", len(docs)))
	fmt.Printf("Successfully added %d documents\n", len(docs))
}

When using Pinecone Serverless in production:

  • Batch operations: Group document additions into batches to reduce per-operation costs.
  • Rate limiting: Implement backoff and retry logic for API rate limits. Beluga’s resilience package provides built-in retry and circuit breaker support.
  • Index dimension limits: Verify that your embedding model dimension matches the index configuration exactly.
  • Monitoring: Track usage and costs in the Pinecone dashboard. Combine with OpenTelemetry traces for end-to-end visibility.
  • Backup: Pinecone Serverless does not provide built-in backup. Maintain a source-of-truth data pipeline for re-indexing.
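The batching advice above can be sketched as a small helper that splits a document slice into fixed-size groups, each of which would then go into a single store.Add call. The chunk helper is illustrative, not part of the Beluga API:

```go
package main

import "fmt"

// chunk splits items into batches of at most n elements, preserving order.
// Sending one store.Add per batch reduces per-operation overhead compared
// with upserting documents one at a time.
func chunk[T any](items []T, n int) [][]T {
	var batches [][]T
	for len(items) > 0 {
		end := n
		if len(items) < n {
			end = len(items)
		}
		batches = append(batches, items[:end])
		items = items[end:]
	}
	return batches
}

func main() {
	docs := make([]string, 250)
	batches := chunk(docs, 100)
	// 250 documents in batches of 100 -> sizes 100, 100, 50
	fmt.Println(len(batches), len(batches[0]), len(batches[2]))
}
```

Each batch would then be embedded and added inside a loop, ideally combined with the retry logic suggested above for rate limits.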

Index not found: The index name does not exist or is misspelled. Verify available indexes:

curl -X GET "https://api.pinecone.io/indexes" \
  -H "Api-Key: $PINECONE_API_KEY"

Authentication errors: The API key or environment does not match. Confirm that PINECONE_API_KEY and PINECONE_ENVIRONMENT correspond to the same Pinecone project.

Dimension mismatch: The embedding vector dimension does not match the index configuration. Ensure embedding_dim matches the dimension set when the index was created (e.g., 1536 for text-embedding-ada-002).
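A client-side check along these lines can surface dimension mismatches before the upsert reaches Pinecone, turning a server-side rejection into a clear local error. checkDim is an illustrative helper, not a Beluga API:

```go
package main

import "fmt"

// checkDim verifies that every embedding has the dimension the index was
// created with, so mismatches fail fast before any network call.
func checkDim(embeddings [][]float32, dim int) error {
	for i, e := range embeddings {
		if len(e) != dim {
			return fmt.Errorf("embedding %d has dimension %d, index expects %d", i, len(e), dim)
		}
	}
	return nil
}

func main() {
	ok := [][]float32{make([]float32, 1536)}
	bad := [][]float32{make([]float32, 768)} // e.g., a different embedding model
	fmt.Println(checkDim(ok, 1536), checkDim(bad, 1536) != nil)
}
```

Calling this on the output of EmbedDocuments before store.Add would catch a model/index mismatch early.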