# HuggingFace LLM Provider
The HuggingFace provider connects Beluga AI to HuggingFace's Inference API, which provides hosted access to thousands of open-source models. The API exposes an OpenAI-compatible chat completions endpoint, so this provider supports all standard features, including streaming and tool calling.
Choose HuggingFace when you need access to specialized or fine-tuned models from the HuggingFace ecosystem. The free Inference API is suitable for prototyping, while Dedicated Inference Endpoints provide production-grade hosting with guaranteed compute for your chosen model.
## Installation

```bash
go get github.com/lookatitude/beluga-ai/llm/providers/huggingface
```

## Configuration
| Field | Required | Default | Description |
|---|---|---|---|
| `Model` | Yes | — | Model ID (HuggingFace repo format, e.g. `org/model-name`) |
| `APIKey` | Yes | — | HuggingFace token (`hf_...`) |
| `BaseURL` | No | `https://api-inference.huggingface.co/v1` | Override the API endpoint |
| `Timeout` | No | `30s` | Request timeout |
Environment variables:
| Variable | Maps to |
|---|---|
| `HUGGINGFACE_API_KEY` | `APIKey` |
| `HF_TOKEN` | `APIKey` |
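For example, the following sketch relies on the environment mapping instead of setting `APIKey` explicitly, and tightens the default timeout. Treat the details as assumptions to verify against the provider's godoc: that the provider falls back to `HF_TOKEN` when `APIKey` is empty, and that `Timeout` is a `time.Duration`:

```go
// APIKey omitted: per the table above, the token is expected to be
// picked up from HUGGINGFACE_API_KEY or HF_TOKEN (assumed behavior).
model, err := llm.New("huggingface", config.ProviderConfig{
	Model:   "meta-llama/Meta-Llama-3.1-8B-Instruct",
	Timeout: 10 * time.Second, // assumes a time.Duration field
})
if err != nil {
	log.Fatal(err)
}
```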
## Basic Usage

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os"

	"github.com/lookatitude/beluga-ai/config"
	"github.com/lookatitude/beluga-ai/llm"
	"github.com/lookatitude/beluga-ai/schema"

	// Blank import registers the provider with the llm registry.
	_ "github.com/lookatitude/beluga-ai/llm/providers/huggingface"
)

func main() {
	model, err := llm.New("huggingface", config.ProviderConfig{
		Model:  "meta-llama/Meta-Llama-3.1-70B-Instruct",
		APIKey: os.Getenv("HF_TOKEN"),
	})
	if err != nil {
		log.Fatal(err)
	}

	msgs := []schema.Message{
		schema.NewSystemMessage("You are a helpful assistant."),
		schema.NewHumanMessage("What is the capital of France?"),
	}

	resp, err := model.Generate(context.Background(), msgs)
	if err != nil {
		log.Fatal(err)
	}

	fmt.Println(resp.Text())
}
```

## Streaming
```go
for chunk, err := range model.Stream(context.Background(), msgs) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(chunk.Delta)
}
fmt.Println()
```
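If you also need the complete text once the stream ends (for logging or caching, say), accumulate the deltas as they arrive. A minimal sketch using only the standard library (add `strings` to the imports):

```go
// Render the stream incrementally while collecting the full text.
var sb strings.Builder
for chunk, err := range model.Stream(context.Background(), msgs) {
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(chunk.Delta)
	sb.WriteString(chunk.Delta)
}
full := sb.String()
fmt.Printf("\nreceived %d bytes\n", len(full))
```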
## Advanced Features

### Tool Calling

Tool calling support depends on the model: many instruction-tuned models served through the OpenAI-compatible endpoint can emit tool calls, but not all do, so check the model card before depending on it.
```go
modelWithTools := model.BindTools(tools)
resp, err := modelWithTools.Generate(ctx, msgs,
	llm.WithToolChoice(llm.ToolChoiceAuto),
)
```
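Here `tools` is a slice of tool definitions prepared elsewhere. What a tool call looks like in the response depends on Beluga AI's `schema` package, which this page does not show; the sketch below is hypothetical, and the `ToolCalls`, `Name`, and `Arguments` identifiers are assumptions rather than confirmed API:

```go
// Hypothetical: the ToolCalls field and its Name/Arguments members are
// assumed for illustration and may not match beluga-ai's actual types.
for _, tc := range resp.ToolCalls {
	fmt.Printf("model requested %s(%s)\n", tc.Name, tc.Arguments)
}
```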
### Structured Output

```go
resp, err := model.Generate(ctx, msgs,
	llm.WithResponseFormat(llm.ResponseFormat{Type: "json_object"}),
)
```
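With `json_object` mode the model is constrained to return a JSON object (most models still need the prompt to describe the shape you want). Decoding it is then plain `encoding/json`; the `CityInfo` struct below is purely illustrative:

```go
// Decode the model's JSON reply. resp.Text() is the same accessor used
// in Basic Usage above; the CityInfo shape is a hypothetical example.
type CityInfo struct {
	City    string `json:"city"`
	Country string `json:"country"`
}

var info CityInfo
if err := json.Unmarshal([]byte(resp.Text()), &info); err != nil {
	log.Fatal(err) // the model returned malformed or unexpected JSON
}
fmt.Println(info.City, info.Country)
```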
### Generation Options

```go
resp, err := model.Generate(ctx, msgs,
	llm.WithTemperature(0.7),
	llm.WithMaxTokens(2048),
	llm.WithTopP(0.9),
	llm.WithStopSequences("END"),
)
```
## Dedicated Inference Endpoints

To use a HuggingFace Dedicated Inference Endpoint, set the `BaseURL`:
```go
model, err := llm.New("huggingface", config.ProviderConfig{
	Model:   "meta-llama/Meta-Llama-3.1-70B-Instruct",
	APIKey:  os.Getenv("HF_TOKEN"),
	BaseURL: "https://your-endpoint.us-east-1.aws.endpoints.huggingface.cloud/v1",
})
```

## Error Handling
```go
resp, err := model.Generate(ctx, msgs)
if err != nil {
	log.Fatal(err)
}
```
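`log.Fatal` keeps the examples short, but production code usually wants to distinguish failure modes. A sketch using only standard-library error checks, assuming the provider wraps context errors the way well-behaved Go clients do (Beluga AI may also expose richer typed errors of its own; add `errors` and `time` to the imports):

```go
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

resp, err := model.Generate(ctx, msgs)
switch {
case errors.Is(err, context.DeadlineExceeded):
	log.Println("request timed out; consider retrying with backoff")
case errors.Is(err, context.Canceled):
	log.Println("request was canceled")
case err != nil:
	log.Printf("generation failed: %v", err)
default:
	fmt.Println(resp.Text())
}
```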
Section titled “Direct Construction”import "github.com/lookatitude/beluga-ai/llm/providers/huggingface"
model, err := huggingface.New(config.ProviderConfig{ Model: "meta-llama/Meta-Llama-3.1-70B-Instruct", APIKey: os.Getenv("HF_TOKEN"),})Available Models
HuggingFace hosts thousands of models. Popular choices for the Inference API include:
| Model ID | Description |
|---|---|
| `meta-llama/Meta-Llama-3.1-70B-Instruct` | Llama 3.1 70B |
| `meta-llama/Meta-Llama-3.1-8B-Instruct` | Llama 3.1 8B |
| `mistralai/Mixtral-8x7B-Instruct-v0.1` | Mixtral 8x7B |
| `microsoft/Phi-3-medium-4k-instruct` | Phi-3 Medium |
Refer to the HuggingFace model hub for the full catalog.