
Meta Llama LLM Provider

The Llama provider is a meta-provider that enables running Meta’s Llama models through any of several hosting backends. Since Meta does not offer a direct inference API, this provider delegates to Together AI, Fireworks AI, Groq, SambaNova, Cerebras, or Ollama depending on the selected backend.

Use the Llama provider when you want to standardize your code around Llama models while retaining flexibility to switch between hosting backends. This lets you optimize for cost, speed, or locality (via Ollama) without changing your application code — only the backend configuration changes.

Install the provider package:

```shell
go get github.com/lookatitude/beluga-ai/llm/providers/llama
```

You also need to import the backend provider you intend to use:

```go
import (
	_ "github.com/lookatitude/beluga-ai/llm/providers/llama"
	_ "github.com/lookatitude/beluga-ai/llm/providers/together" // or your chosen backend
)
```
| Field | Required | Default | Description |
| --- | --- | --- | --- |
| Model | Yes | | Llama model ID (format depends on backend) |
| APIKey | Varies | | API key for the backend provider |
| BaseURL | No | Backend default | Override the API endpoint |
| Timeout | No | 30s | Request timeout |

Provider-specific options (via Options map):

| Key | Default | Description |
| --- | --- | --- |
| backend | "together" | Backend provider: together, fireworks, groq, sambanova, cerebras, ollama |
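The defaulting behavior described above can be sketched as follows. This is an illustrative helper, not the provider's internal code; the function name `resolveBackend` is an assumption:

```go
package main

import "fmt"

// resolveBackend mirrors the documented default: when the Options map
// has no "backend" key, the provider falls back to "together".
func resolveBackend(opts map[string]any) string {
	if b, ok := opts["backend"].(string); ok && b != "" {
		return b
	}
	return "together"
}

func main() {
	fmt.Println(resolveBackend(map[string]any{"backend": "groq"}))
	fmt.Println(resolveBackend(nil))
}
```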

Supported backends and their default base URLs:

| Backend | Default Base URL |
| --- | --- |
| together | https://api.together.xyz/v1 |
| fireworks | https://api.fireworks.ai/inference/v1 |
| groq | https://api.groq.com/openai/v1 |
| sambanova | https://api.sambanova.ai/v1 |
| cerebras | https://api.cerebras.ai/v1 |
| ollama | http://localhost:11434/v1 |
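When BaseURL is set, it takes precedence over the backend's default. A minimal sketch of that resolution, using the defaults from the table above (the `effectiveBaseURL` helper is illustrative, not part of the library):

```go
package main

import "fmt"

// Default base URLs per backend, as listed in the table above.
var defaultBaseURLs = map[string]string{
	"together":  "https://api.together.xyz/v1",
	"fireworks": "https://api.fireworks.ai/inference/v1",
	"groq":      "https://api.groq.com/openai/v1",
	"sambanova": "https://api.sambanova.ai/v1",
	"cerebras":  "https://api.cerebras.ai/v1",
	"ollama":    "http://localhost:11434/v1",
}

// effectiveBaseURL returns the explicit BaseURL override when set,
// otherwise the backend's default endpoint.
func effectiveBaseURL(backend, override string) string {
	if override != "" {
		return override
	}
	return defaultBaseURLs[backend]
}

func main() {
	fmt.Println(effectiveBaseURL("groq", ""))
	fmt.Println(effectiveBaseURL("ollama", "http://gpu-box:11434/v1"))
}
```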
```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/lookatitude/beluga-ai/config"
	"github.com/lookatitude/beluga-ai/llm"
	"github.com/lookatitude/beluga-ai/schema"

	_ "github.com/lookatitude/beluga-ai/llm/providers/llama"
	_ "github.com/lookatitude/beluga-ai/llm/providers/together"
)

func main() {
	model, err := llm.New("llama", config.ProviderConfig{
		Model:  "meta-llama/Llama-3.3-70B-Instruct",
		APIKey: os.Getenv("TOGETHER_API_KEY"),
		Options: map[string]any{
			"backend": "together",
		},
	})
	if err != nil {
		log.Fatal(err)
	}

	msgs := []schema.Message{
		schema.NewSystemMessage("You are a helpful assistant."),
		schema.NewHumanMessage("What is the capital of France?"),
	}

	resp, err := model.Generate(context.Background(), msgs)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Text())
}
```
To use Groq as the backend instead:

```go
model, err := llm.New("llama", config.ProviderConfig{
	Model:  "llama-3.3-70b-versatile",
	APIKey: os.Getenv("GROQ_API_KEY"),
	Options: map[string]any{
		"backend": "groq",
	},
})
```

For local inference with Ollama (no API key required):

```go
model, err := llm.New("llama", config.ProviderConfig{
	Model: "llama3.2",
	Options: map[string]any{
		"backend": "ollama",
	},
})
```

For Fireworks AI:

```go
model, err := llm.New("llama", config.ProviderConfig{
	Model:  "accounts/fireworks/models/llama-v3p1-70b-instruct",
	APIKey: os.Getenv("FIREWORKS_API_KEY"),
	Options: map[string]any{
		"backend": "fireworks",
	},
})
```
Streaming works the same way regardless of the chosen backend:

```go
for chunk, err := range model.Stream(context.Background(), msgs) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(chunk.Delta)
}
fmt.Println()
```

All features available on the underlying backend provider are supported: tool calling, structured output, generation options, etc. The Llama provider simply delegates to the chosen backend.

```go
resp, err := model.Generate(ctx, msgs,
	llm.WithTemperature(0.7),
	llm.WithMaxTokens(2048),
)
```
Errors are surfaced from the underlying backend provider:

```go
resp, err := model.Generate(ctx, msgs)
if err != nil {
	// The error prefix depends on the backend provider
	log.Fatal(err)
}
```
The provider can also be constructed directly through its package:

```go
import (
	"github.com/lookatitude/beluga-ai/llm/providers/llama"

	_ "github.com/lookatitude/beluga-ai/llm/providers/together"
)

model, err := llama.New(config.ProviderConfig{
	Model:   "meta-llama/Llama-3.3-70B-Instruct",
	APIKey:  os.Getenv("TOGETHER_API_KEY"),
	Options: map[string]any{"backend": "together"},
})
```

Note that Llama model IDs vary by backend:

| Backend | Example Model ID |
| --- | --- |
| Together | meta-llama/Llama-3.3-70B-Instruct |
| Fireworks | accounts/fireworks/models/llama-v3p1-70b-instruct |
| Groq | llama-3.3-70b-versatile |
| SambaNova | Meta-Llama-3.3-70B-Instruct |
| Cerebras | llama-3.3-70b |
| Ollama | llama3.2 |
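One way to keep application code backend-agnostic despite these differing IDs is to centralize them in a lookup table. A sketch using the example IDs from the table above (the `llamaModelIDs` map is an illustration, not a library feature):

```go
package main

import "fmt"

// Example Llama model IDs per backend, taken from the table above.
// Centralizing them lets the rest of the code switch backends by key.
var llamaModelIDs = map[string]string{
	"together":  "meta-llama/Llama-3.3-70B-Instruct",
	"fireworks": "accounts/fireworks/models/llama-v3p1-70b-instruct",
	"groq":      "llama-3.3-70b-versatile",
	"sambanova": "Meta-Llama-3.3-70B-Instruct",
	"cerebras":  "llama-3.3-70b",
	"ollama":    "llama3.2",
}

func main() {
	backend := "groq"
	fmt.Println(llamaModelIDs[backend])
}
```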