Skip to content
Docs

Confluence Document Loader

The Confluence loader implements the loader.DocumentLoader interface for loading pages from Atlassian Confluence. It fetches page content via the Confluence REST API and extracts plain text from the HTML storage representation.

Choose Confluence when your organization’s knowledge base lives in Atlassian Confluence and you want to feed wiki pages into a RAG pipeline. The loader supports both Atlassian Cloud (Basic auth) and Data Center/Server (Bearer token) authentication. For other knowledge platforms, consider Notion or Google Drive.

Terminal window
go get github.com/lookatitude/beluga-ai/rag/loader/providers/confluence
package main
import (
"context"
"fmt"
"log"
"os"
"github.com/lookatitude/beluga-ai/config"
"github.com/lookatitude/beluga-ai/rag/loader"
_ "github.com/lookatitude/beluga-ai/rag/loader/providers/confluence"
)
func main() {
l, err := loader.New("confluence", config.ProviderConfig{
APIKey: os.Getenv("CONFLUENCE_API_TOKEN"),
BaseURL: "https://your-domain.atlassian.net/wiki",
Options: map[string]any{
"user": "user@example.com",
},
})
if err != nil {
log.Fatal(err)
}
docs, err := l.Load(context.Background(), "12345")
if err != nil {
log.Fatal(err)
}
fmt.Printf("Title: %s\n", docs[0].Metadata["title"])
fmt.Printf("Content: %s\n", docs[0].Content)
}
ParameterTypeDefaultDescription
APIKeystring(required)Confluence API token
BaseURLstring(required)Confluence instance URL (e.g., https://your-domain.atlassian.net/wiki)
Timeouttime.Duration30sHTTP request timeout
Options["user"]string""Confluence username for Basic auth; omit for Bearer token auth

The loader supports two authentication modes:

When Options["user"] is set, the loader uses HTTP Basic authentication with the user email and API token:

l, err := loader.New("confluence", config.ProviderConfig{
APIKey: os.Getenv("CONFLUENCE_API_TOKEN"),
BaseURL: "https://your-domain.atlassian.net/wiki",
Options: map[string]any{
"user": "user@example.com",
},
})

When Options["user"] is omitted, the loader uses Bearer token authentication:

l, err := loader.New("confluence", config.ProviderConfig{
APIKey: os.Getenv("CONFLUENCE_PAT"),
BaseURL: "https://confluence.internal.corp/wiki",
})

The source parameter accepts:

  • Page ID: A numeric string like "12345"
  • Space/Page ID: A path like "ENGINEERING/12345" (the space key prefix is ignored; only the page ID is used for the API call)
FieldTypeDescription
sourcestringOriginal source string
loaderstringAlways "confluence"
page_idstringConfluence page ID
titlestringPage title
spacestringSpace key the page belongs to

The loader fetches pages using the Confluence REST API with body.storage expansion and strips HTML tags to produce plain text content. This means formatting information (bold, italic, links) is removed, but all text content is preserved.

docs, err := l.Load(ctx, "12345")
if err != nil {
// Possible errors:
// - "confluence: base URL is required" (missing BaseURL)
// - "confluence: API key/token is required" (missing APIKey)
// - "confluence: fetch page ...: ..." (API request failure)
log.Fatal(err)
}