Modeling Multi-Turn Conversations
LLMs are stateless — to maintain a conversation, you must send the full message history (or a relevant subset) with every request. This is a fundamental architectural constraint, not a limitation of any particular provider. Beluga AI’s schema package provides structured message types that ensure correct role tagging, multimodal content support, and clean integration with any LLM provider. Understanding how to build and manage message history is essential for any application that goes beyond single-shot queries.
What You Will Build
A structured multi-turn conversation using typed messages, demonstrating how to build, manage, and constrain conversation history.
Prerequisites
- Go 1.23+
- Understanding of Go interfaces
Message Roles
Every message in a conversation has a role that tells the LLM how to interpret its content. These roles map to the fundamental participants in an AI interaction: the system (which sets rules), the human (who asks questions), the AI (which responds), and tools (which provide external data). The typed message constructors enforce correct role assignment, preventing common errors like accidentally tagging a system prompt as user input.
| Role | Constant | Usage |
|---|---|---|
| System | schema.RoleSystem | Sets behavior, persona, or rules |
| Human | schema.RoleHuman | User input |
| AI | schema.RoleAI | Model responses |
| Tool | schema.RoleTool | Results from tool executions |
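As a rough illustration of how typed constructors can pin each message to its role, here is a minimal, self-contained sketch. The types below are simplified stand-ins written for this example, not Beluga AI's actual definitions: the string values of the role constants and the private `text` field are assumptions, and only the role names and the `GetRole()` and `Text()` method names are taken from this tutorial.

```go
package main

import "fmt"

// Role identifies who produced a message. Stand-in for the schema.Role*
// constants; the string values here are assumed for illustration.
type Role string

const (
	RoleSystem Role = "system"
	RoleHuman  Role = "human"
	RoleAI     Role = "ai"
	RoleTool   Role = "tool"
)

// Message is the minimal interface every message type satisfies in this sketch.
type Message interface {
	GetRole() Role
	Text() string
}

// SystemMessage and HumanMessage are simplified stand-in types.
type SystemMessage struct{ text string }
type HumanMessage struct{ text string }

// The role is derived from the concrete type, so it cannot be set incorrectly.
func (m *SystemMessage) GetRole() Role { return RoleSystem }
func (m *SystemMessage) Text() string  { return m.text }
func (m *HumanMessage) GetRole() Role  { return RoleHuman }
func (m *HumanMessage) Text() string   { return m.text }

// Constructors return concrete types that still satisfy Message.
func NewSystemMessage(text string) *SystemMessage { return &SystemMessage{text: text} }
func NewHumanMessage(text string) *HumanMessage   { return &HumanMessage{text: text} }

func main() {
	history := []Message{
		NewSystemMessage("You are a concise data analyst."),
		NewHumanMessage("What is the average rainfall in Seattle?"),
	}
	for _, m := range history {
		fmt.Printf("%s: %s\n", m.GetRole(), m.Text())
	}
}
```

Because the role comes from a method on the type rather than from a field the caller fills in, there is no way in this sketch to build a system prompt that reports itself as user input.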
Step 1: Creating Messages
Use the factory functions to create typed messages. These functions return concrete types (*schema.SystemMessage, *schema.HumanMessage, etc.) rather than the schema.Message interface, giving you access to type-specific fields when needed while still satisfying the interface for inclusion in message slices.
```go
package main

import (
	"fmt"

	"github.com/lookatitude/beluga-ai/schema"
)

func main() {
	// System message: sets the AI's persona
	sysMsg := schema.NewSystemMessage("You are a concise data analyst.")

	// Human message: the user's query
	humanMsg := schema.NewHumanMessage("What is the average rainfall in Seattle?")

	// AI message: the model's response (in practice, this comes from the LLM)
	aiMsg := schema.NewAIMessage("Seattle receives an average of 37 inches of rain per year.")

	// Tool message: result of a tool execution, linked to the call ID
	toolMsg := schema.NewToolMessage("call_123", `{"result": 37.49}`)

	// Access content via the Text() helper
	fmt.Printf("System: %s\n", sysMsg.Text())
	fmt.Printf("User: %s\n", humanMsg.Text())
	fmt.Printf("AI: %s\n", aiMsg.Text())
	fmt.Printf("Tool: %s\n", toolMsg.Text())
}
```

Each message type stores content as []schema.ContentPart, supporting multimodal content (text, images, audio, video, files). The Text() helper extracts and concatenates all TextPart values.
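To make the relationship between content parts and `Text()` concrete, the sketch below shows one way such a helper could work. `ContentPart`, `TextPart`, and `ImagePart` here are simplified local stand-ins (the real Beluga AI types may differ), `textOf` is a hypothetical name, and joining the text parts without a separator is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// ContentPart is a stand-in marker interface for one piece of message content.
type ContentPart interface{ isContentPart() }

// TextPart and ImagePart are simplified stand-ins for the schema types.
type TextPart struct{ Text string }
type ImagePart struct{ URL, MimeType string }

func (TextPart) isContentPart()  {}
func (ImagePart) isContentPart() {}

// textOf concatenates every TextPart in order and skips non-text parts.
func textOf(parts []ContentPart) string {
	var b strings.Builder
	for _, p := range parts {
		if tp, ok := p.(TextPart); ok {
			b.WriteString(tp.Text)
		}
	}
	return b.String()
}

func main() {
	parts := []ContentPart{
		TextPart{Text: "What's in "},
		ImagePart{URL: "https://example.com/photo.jpg", MimeType: "image/jpeg"},
		TextPart{Text: "this image?"},
	}
	fmt.Println(textOf(parts)) // prints: What's in this image?
}
```

Non-text parts are ignored rather than rendered as placeholders, so a message that carries only an image yields an empty string in this sketch.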
Step 2: Building a Conversation
A conversation is a []schema.Message slice passed to the model. The slice ordering matters: models process messages sequentially and expect a natural conversation flow, with system instructions first, then alternating human/AI turns. This slice-based representation keeps conversations simple to construct, inspect, and serialize.
```go
func buildConversation() []schema.Message {
	return []schema.Message{
		schema.NewSystemMessage("You are a helpful assistant."),
		schema.NewHumanMessage("Hi, I'm building a workflow."),
		schema.NewAIMessage("That sounds great! What kind of workflow?"),
		schema.NewHumanMessage("A data processing pipeline."),
	}
}
```

Step 3: Sending to a ChatModel
Pass the conversation history to any ChatModel. The model receives the full history and generates a response that accounts for all prior context. The Usage field on the response reports token consumption, which is important for cost tracking and context window management.
```go
// chat needs the "context" and "fmt" imports plus Beluga AI's llm and
// schema packages.
func chat(ctx context.Context, model llm.ChatModel) error {
	messages := buildConversation()

	// Generate a response
	resp, err := model.Generate(ctx, messages)
	if err != nil {
		return fmt.Errorf("generate failed: %w", err)
	}

	fmt.Printf("AI: %s\n", resp.Text())
	fmt.Printf("Tokens used: %d\n", resp.Usage.TotalTokens)

	// Append the response to history for the next turn. The slice is local
	// to this function, so a real application would return or store it.
	messages = append(messages, resp)

	return nil
}
```

Step 4: Managing a Conversation Loop
Implement a multi-turn conversation by maintaining the message history across turns. Each iteration appends the user query and AI response to the same slice, building up the full conversation context. The model sees the entire history with each call, which allows it to reference earlier parts of the conversation and maintain coherence across turns.
```go
func conversationLoop(ctx context.Context, model llm.ChatModel) error {
	messages := []schema.Message{
		schema.NewSystemMessage("You are a helpful coding assistant."),
	}

	queries := []string{
		"How do I read a file in Go?",
		"Can you show me error handling?",
		"What about concurrent file reading?",
	}

	for _, query := range queries {
		// Add user message
		messages = append(messages, schema.NewHumanMessage(query))

		// Generate response with full history
		resp, err := model.Generate(ctx, messages)
		if err != nil {
			return fmt.Errorf("generate failed: %w", err)
		}

		fmt.Printf("User: %s\n", query)
		fmt.Printf("AI: %s\n\n", resp.Text())

		// Append AI response to maintain context
		messages = append(messages, resp)
	}

	return nil
}
```

Step 5: Context Window Management
As conversations grow, they consume more tokens. Every LLM has a finite context window, and a request that exceeds it typically fails with an API error. The sliding window approach below preserves system messages (which define the agent’s behavior) while trimming older conversation turns. System messages are always kept because removing them would change the agent’s persona mid-conversation, leading to inconsistent behavior.
```go
func trimHistory(messages []schema.Message, maxMessages int) []schema.Message {
	if len(messages) <= maxMessages {
		return messages
	}

	// Separate system messages (always keep) from conversation messages
	var system []schema.Message
	var conversation []schema.Message

	for _, msg := range messages {
		if msg.GetRole() == schema.RoleSystem {
			system = append(system, msg)
		} else {
			conversation = append(conversation, msg)
		}
	}

	// Keep only the last N conversation messages
	maxConv := maxMessages - len(system)
	if maxConv < 0 {
		maxConv = 0
	}
	if len(conversation) > maxConv {
		conversation = conversation[len(conversation)-maxConv:]
	}

	// Reassemble: system messages first, then recent conversation
	result := make([]schema.Message, 0, len(system)+len(conversation))
	result = append(result, system...)
	result = append(result, conversation...)
	return result
}
```

Usage:
```go
// Before sending to the model, trim to at most 20 messages
// (system messages count toward the limit and are always kept)
messages = trimHistory(messages, 20)
resp, err := model.Generate(ctx, messages)
```

Working with Multimodal Content
Messages can carry mixed content types. The ContentPart slice allows combining text and binary data in a single message, which is how vision-capable models receive images alongside text queries.
```go
// Create a message with text and an image
humanMsg := &schema.HumanMessage{
	Parts: []schema.ContentPart{
		schema.TextPart{Text: "What's in this image?"},
		schema.ImagePart{
			URL:      "https://example.com/photo.jpg",
			MimeType: "image/jpeg",
		},
	},
}
```

AI Messages with Tool Calls
When a model requests tool invocations, the AIMessage carries ToolCalls. This is the mechanism behind agentic behavior: the model decides which tools to call and with what arguments, and your code executes them and feeds results back as ToolMessage entries. The ID field links each tool result back to the specific call that requested it.
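The execution half of that loop can be sketched as below. This is a self-contained illustration under assumptions: `ToolCall` mirrors the `ID`, `Name`, and `Arguments` fields used in this tutorial, while `ToolResult`, `Handler`, `errorJSON`, and `runToolCalls` are hypothetical names invented for the example and are not part of Beluga AI.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall mirrors the fields this tutorial uses on schema.ToolCall.
type ToolCall struct {
	ID        string
	Name      string
	Arguments string // JSON-encoded arguments
}

// ToolResult is a hypothetical stand-in for a tool message: it carries the
// ID of the call it answers plus the tool's output.
type ToolResult struct {
	CallID  string
	Content string
}

// Handler executes one tool given its JSON arguments.
type Handler func(args string) (string, error)

// errorJSON encodes an error message as a small JSON object.
func errorJSON(msg string) string {
	b, _ := json.Marshal(map[string]string{"error": msg})
	return string(b)
}

// runToolCalls executes each requested call and returns one result per call,
// in the same order, tagged with the originating call ID.
func runToolCalls(calls []ToolCall, handlers map[string]Handler) []ToolResult {
	results := make([]ToolResult, 0, len(calls))
	for _, c := range calls {
		h, ok := handlers[c.Name]
		if !ok {
			// Report unknown tools back to the model instead of dropping the call.
			results = append(results, ToolResult{CallID: c.ID, Content: errorJSON("unknown tool: " + c.Name)})
			continue
		}
		out, err := h(c.Arguments)
		if err != nil {
			out = errorJSON(err.Error())
		}
		results = append(results, ToolResult{CallID: c.ID, Content: out})
	}
	return results
}

func main() {
	handlers := map[string]Handler{
		"search": func(args string) (string, error) {
			return `{"results": ["goroutines", "channels"]}`, nil
		},
	}
	calls := []ToolCall{{ID: "call_1", Name: "search", Arguments: `{"query": "Go concurrency"}`}}
	for _, r := range runToolCalls(calls, handlers) {
		fmt.Printf("%s -> %s\n", r.CallID, r.Content)
	}
}
```

In this tutorial's terms, each result would then be turned into a message with `schema.NewToolMessage(r.CallID, r.Content)` and appended to the history before the next `Generate` call.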
```go
// AI response with tool calls (returned by the model)
aiResp := &schema.AIMessage{
	Parts: []schema.ContentPart{schema.TextPart{Text: "Let me look that up."}},
	ToolCalls: []schema.ToolCall{
		{ID: "call_1", Name: "search", Arguments: `{"query": "Go concurrency"}`},
	},
	Usage: schema.Usage{InputTokens: 50, OutputTokens: 20, TotalTokens: 70},
}

// After executing the tool, provide the result
toolResult := schema.NewToolMessage("call_1", `{"results": ["goroutines", "channels"]}`)
```

Troubleshooting
Messages appear to lose context: Ensure you append each AI response to the message history before generating the next turn. LLMs are stateless: they only know what you send them.
Token limits exceeded: Implement context window management (Step 5) to keep message counts or token counts within the model’s limits. Consider summarization strategies for long-running conversations.
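For the token-count variant, one possible sketch is shown below. It uses a local stand-in `Message` type rather than Beluga AI's, and `estimateTokens` is a deliberately crude assumption (roughly four characters per token). A real application would use the provider's tokenizer or the Usage figures returned with each response. `trimToBudget` is a hypothetical helper name.

```go
package main

import "fmt"

// Message is a simplified stand-in for a conversation message.
type Message struct {
	Role string // "system", "human", "ai", or "tool" in this sketch
	Text string
}

// estimateTokens is a rough heuristic (about 4 characters per token),
// not a real tokenizer.
func estimateTokens(m Message) int {
	return len(m.Text)/4 + 1
}

// trimToBudget keeps every system message, then keeps the most recent
// non-system messages that fit in the remaining token budget.
func trimToBudget(messages []Message, budget int) []Message {
	var system, conversation []Message
	remaining := budget
	for _, m := range messages {
		if m.Role == "system" {
			system = append(system, m)
			remaining -= estimateTokens(m)
		} else {
			conversation = append(conversation, m)
		}
	}

	// Walk backwards from the newest message until the budget is exhausted.
	start := len(conversation)
	for start > 0 {
		cost := estimateTokens(conversation[start-1])
		if cost > remaining {
			break
		}
		remaining -= cost
		start--
	}

	result := make([]Message, 0, len(system)+len(conversation)-start)
	result = append(result, system...)
	return append(result, conversation[start:]...)
}

func main() {
	history := []Message{
		{Role: "system", Text: "You are a helpful coding assistant."},
		{Role: "human", Text: "How do I read a file in Go?"},
		{Role: "ai", Text: "Use os.ReadFile for small files."},
		{Role: "human", Text: "What about concurrent file reading?"},
	}
	for _, m := range trimToBudget(history, 20) {
		fmt.Printf("%s: %s\n", m.Role, m.Text)
	}
}
```

This sketch cuts purely by estimated size. A production version would likely also avoid cutting between an AI message that requests tool calls and the tool results that answer them, since some provider APIs reject a tool result whose originating call is missing from the history.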
Next Steps
- Custom Message Types: Extend the message system with structured data
- Redis Memory Persistence: Persist conversation history across restarts