OmniLLM - Unified Go SDK for Large Language Models

OmniLLM is a unified Go SDK that provides a consistent interface for interacting with multiple Large Language Model (LLM) providers including OpenAI, Anthropic (Claude), Google Gemini, X.AI (Grok), and Ollama. It implements the Chat Completions API pattern and offers both synchronous and streaming capabilities. Additional providers like AWS Bedrock are available as external modules.

✨ Features

  • 🔌 Multi-Provider Support: OpenAI, Anthropic (Claude), Google Gemini, X.AI (Grok), Ollama, plus external providers (AWS Bedrock, etc.)
  • 🎯 Unified API: Same interface across all providers
  • 📡 Streaming Support: Real-time response streaming for all providers
  • 🧠 Conversation Memory: Persistent conversation history using Key-Value Stores
  • 📊 Observability Hooks: Extensible hooks for tracing, logging, and metrics without modifying core library
  • 🔄 Retry with Backoff: Automatic retries for transient failures (rate limits, 5xx errors)
  • 🧪 Comprehensive Testing: Unit tests, integration tests, and mock implementations included
  • 🔧 Extensible: Easy to add new LLM providers
  • 📦 Modular: Provider-specific implementations in separate packages
  • 🏗️ Reference Architecture: Internal providers serve as reference implementations for external providers
  • 🔌 3rd Party Friendly: External providers can be injected without modifying core library
  • ⚡ Type Safe: Full Go type safety with comprehensive error handling

πŸ—οΈ Architecture

OmniLLM uses a clean, modular architecture that separates concerns and enables easy extensibility:

omnillm/
├── client.go            # Main ChatClient wrapper
├── providers.go         # Factory functions for built-in providers
├── types.go             # Type aliases for backward compatibility
├── memory.go            # Conversation memory management
├── observability.go     # ObservabilityHook interface for tracing/logging/metrics
├── errors.go            # Unified error handling
├── *_test.go            # Comprehensive unit tests
├── provider/            # 🎯 Public interface package for external providers
│   ├── interface.go     # Provider interface that all providers must implement
│   └── types.go         # Unified request/response types
├── providers/           # 📦 Individual provider packages (reference implementations)
│   ├── openai/          # OpenAI implementation
│   │   ├── openai.go    # HTTP client
│   │   ├── types.go     # OpenAI-specific types
│   │   ├── adapter.go   # provider.Provider implementation
│   │   └── *_test.go    # Provider tests
│   ├── anthropic/       # Anthropic implementation
│   │   ├── anthropic.go # HTTP client (SSE streaming)
│   │   ├── types.go     # Anthropic-specific types
│   │   ├── adapter.go   # provider.Provider implementation
│   │   └── *_test.go    # Provider and integration tests
│   ├── gemini/          # Google Gemini implementation
│   ├── xai/             # X.AI Grok implementation
│   └── ollama/          # Ollama implementation
└── testing/             # 🧪 Test utilities
    └── mock_kvs.go      # Mock KVS for memory testing

Key Architecture Benefits

  • 🎯 Public Interface: The provider package exports the Provider interface that external packages can implement
  • 🏗️ Reference Implementation: Internal providers follow the exact same structure that external providers should use
  • 🔌 Direct Injection: External providers are injected via ClientConfig.CustomProvider without modifying core code
  • 📦 Modular Design: Each provider is self-contained with its own HTTP client, types, and adapter
  • 🧪 Testable: Clean interfaces that can be easily mocked and tested
  • 🔧 Extensible: New providers can be added without touching existing code
  • ⚡ Native Implementation: Uses standard net/http for direct API communication (no official SDK dependencies)

🚀 Quick Start

Installation

go get github.com/agentplexus/omnillm

Basic Usage

package main

import (
    "context"
    "fmt"
    "log"
    
    "github.com/agentplexus/omnillm"
)

func main() {
    // Create a client for OpenAI
    client, err := omnillm.NewClient(omnillm.ClientConfig{
        Provider: omnillm.ProviderNameOpenAI,
        APIKey:   "your-openai-api-key",
    })
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // Create a chat completion request
    response, err := client.CreateChatCompletion(context.Background(), &omnillm.ChatCompletionRequest{
        Model: omnillm.ModelGPT4o,
        Messages: []omnillm.Message{
            {
                Role:    omnillm.RoleUser,
                Content: "Hello! How can you help me today?",
            },
        },
        MaxTokens:   &[]int{150}[0],
        Temperature: &[]float64{0.7}[0],
    })
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Response: %s\n", response.Choices[0].Message.Content)
    fmt.Printf("Tokens used: %d\n", response.Usage.TotalTokens)
}
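
The &[]int{150}[0] and &[]float64{0.7}[0] expressions are just one way to take the address of a literal for the optional pointer fields. A small generic helper, defined in your own code (it is not part of the omnillm API), reads more cleanly:

// ptr returns a pointer to v. Local convenience helper, not provided by omnillm.
func ptr[T any](v T) *T { return &v }

// Then, in the request above:
MaxTokens:   ptr(150),
Temperature: ptr(0.7),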

🔧 Supported Providers

OpenAI

  • Models: GPT-5, GPT-4.1, GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo
  • Features: Chat completions, streaming, function calling

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameOpenAI,
    APIKey:   "your-openai-api-key",
    BaseURL:  "https://api.openai.com/v1", // optional
})

Anthropic (Claude)

  • Models: Claude-Opus-4.1, Claude-Opus-4, Claude-Sonnet-4, Claude-3.7-Sonnet, Claude-3.5-Haiku, Claude-3-Opus, Claude-3-Sonnet, Claude-3-Haiku
  • Features: Chat completions, streaming, system message support

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameAnthropic,
    APIKey:   "your-anthropic-api-key",
    BaseURL:  "https://api.anthropic.com", // optional
})

Google Gemini

  • Models: Gemini-2.5-Pro, Gemini-2.5-Flash, Gemini-1.5-Pro, Gemini-1.5-Flash
  • Features: Chat completions, streaming

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameGemini,
    APIKey:   "your-gemini-api-key",
})

AWS Bedrock (External Provider)

AWS Bedrock is available as an external module to avoid pulling AWS SDK dependencies for users who don't need it.

go get github.com/agentplexus/omnillm-bedrock

import (
    "github.com/agentplexus/omnillm"
    "github.com/agentplexus/omnillm-bedrock"
)

// Create the Bedrock provider
bedrockProvider, err := bedrock.NewProvider("us-east-1")
if err != nil {
    log.Fatal(err)
}

// Use it with omnillm via CustomProvider
client, err := omnillm.NewClient(omnillm.ClientConfig{
    CustomProvider: bedrockProvider,
})

See External Providers for more details.

X.AI (Grok)

  • Models: Grok-4.1-Fast (Reasoning/Non-Reasoning), Grok-4 (0709), Grok-4-Fast (Reasoning/Non-Reasoning), Grok-Code-Fast, Grok-3, Grok-3-Mini, Grok-2, Grok-2-Vision
  • Features: Chat completions, streaming, OpenAI-compatible API, 2M context window (4.1/4-Fast models)

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameXAI,
    APIKey:   "your-xai-api-key",
    BaseURL:  "https://api.x.ai/v1", // optional
})

Ollama (Local Models)

  • Models: Llama 3, Mistral, CodeLlama, Gemma, Qwen2.5, DeepSeek-Coder
  • Features: Local inference, no API keys required, optimized for Apple Silicon

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameOllama,
    BaseURL:  "http://localhost:11434", // default Ollama endpoint
})

🔌 External Providers

Some providers with heavy SDK dependencies are available as separate modules to keep the core library lightweight. These are injected via ClientConfig.CustomProvider.

Provider | Module | Why External
AWS Bedrock | github.com/agentplexus/omnillm-bedrock | AWS SDK v2 adds 17+ transitive dependencies

Using External Providers

import (
    "github.com/agentplexus/omnillm"
    "github.com/agentplexus/omnillm-bedrock"  // or your custom provider
)

// Create the external provider
provider, err := bedrock.NewProvider("us-east-1")
if err != nil {
    log.Fatal(err)
}

// Inject via CustomProvider
client, err := omnillm.NewClient(omnillm.ClientConfig{
    CustomProvider: provider,
})

Creating Your Own External Provider

External providers implement the provider.Provider interface:

import "github.com/agentplexus/omnillm/provider"

type MyProvider struct{}

func (p *MyProvider) Name() string { return "myprovider" }
func (p *MyProvider) Close() error { return nil }

func (p *MyProvider) CreateChatCompletion(ctx context.Context, req *provider.ChatCompletionRequest) (*provider.ChatCompletionResponse, error) {
    // Your implementation
}

func (p *MyProvider) CreateChatCompletionStream(ctx context.Context, req *provider.ChatCompletionRequest) (provider.ChatCompletionStream, error) {
    // Your streaming implementation
}

See the omnillm-bedrock source code as a reference implementation.

📡 Streaming Example

stream, err := client.CreateChatCompletionStream(context.Background(), &omnillm.ChatCompletionRequest{
    Model: omnillm.ModelGPT4o,
    Messages: []omnillm.Message{
        {
            Role:    omnillm.RoleUser,
            Content: "Tell me a short story about AI.",
        },
    },
    MaxTokens:   &[]int{200}[0],
    Temperature: &[]float64{0.8}[0],
})
if err != nil {
    log.Fatal(err)
}
defer stream.Close()

fmt.Print("AI Response: ")
for {
    chunk, err := stream.Recv()
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    
    if len(chunk.Choices) > 0 && chunk.Choices[0].Delta != nil {
        fmt.Print(chunk.Choices[0].Delta.Content)
    }
}
fmt.Println()

🧠 Conversation Memory

OmniLLM supports persistent conversation memory using any Key-Value Store that implements the Sogo KVS interface. This enables multi-turn conversations that persist across application restarts.

Memory Configuration

// Configure memory settings
memoryConfig := omnillm.MemoryConfig{
    MaxMessages: 50,                    // Keep last 50 messages per session
    TTL:         24 * time.Hour,       // Messages expire after 24 hours
    KeyPrefix:   "myapp:conversations", // Custom key prefix
}

// Create client with memory (using Redis, DynamoDB, etc.)
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider:     omnillm.ProviderNameOpenAI,
    APIKey:       "your-api-key",
    Memory:       kvsClient,          // Your KVS implementation
    MemoryConfig: &memoryConfig,
})

Memory-Aware Completions

// Create a session with system message
err = client.CreateConversationWithSystemMessage(ctx, "user-123", 
    "You are a helpful assistant that remembers our conversation history.")

// Use memory-aware completion - automatically loads conversation history
response, err := client.CreateChatCompletionWithMemory(ctx, "user-123", &omnillm.ChatCompletionRequest{
    Model: omnillm.ModelGPT4o,
    Messages: []omnillm.Message{
        {Role: omnillm.RoleUser, Content: "What did we discuss last time?"},
    },
    MaxTokens: &[]int{200}[0],
})

// The response will include context from previous conversations in this session

Memory Management

// Load conversation history
conversation, err := client.LoadConversation(ctx, "user-123")

// Get just the messages
messages, err := client.GetConversationMessages(ctx, "user-123")

// Manually append messages
err = client.AppendMessage(ctx, "user-123", omnillm.Message{
    Role:    omnillm.RoleUser,
    Content: "Remember this important fact: I prefer JSON responses.",
})

// Delete conversation
err = client.DeleteConversation(ctx, "user-123")

KVS Backend Support

Memory works with any KVS implementation:

  • Redis: For high-performance, distributed memory
  • DynamoDB: For AWS-native storage
  • In-Memory: For testing and development
  • Custom: Any implementation of the Sogo KVS interface

// Example with Redis (using a hypothetical Redis KVS implementation)
redisKVS := redis.NewKVSClient("localhost:6379")
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameOpenAI,
    APIKey:   "your-key",
    Memory:   redisKVS,
})

📊 Observability Hooks

OmniLLM supports observability hooks that allow you to add tracing, logging, and metrics to LLM calls without modifying the core library. This is useful for integrating with observability platforms like OpenTelemetry, Datadog, or custom monitoring solutions.

ObservabilityHook Interface

// LLMCallInfo provides metadata about the LLM call
type LLMCallInfo struct {
    CallID       string    // Unique identifier for correlating BeforeRequest/AfterResponse
    ProviderName string    // e.g., "openai", "anthropic"
    StartTime    time.Time // When the call started
}

// ObservabilityHook allows external packages to observe LLM calls
type ObservabilityHook interface {
    // BeforeRequest is called before each LLM call.
    // Returns a new context for trace/span propagation.
    BeforeRequest(ctx context.Context, info LLMCallInfo, req *provider.ChatCompletionRequest) context.Context

    // AfterResponse is called after each LLM call completes (success or failure).
    AfterResponse(ctx context.Context, info LLMCallInfo, req *provider.ChatCompletionRequest, resp *provider.ChatCompletionResponse, err error)

    // WrapStream wraps a stream for observability of streaming responses.
    // Note: AfterResponse is only called if stream creation fails. For streaming
    // completion timing, handle Close() or EOF detection in your wrapper.
    WrapStream(ctx context.Context, info LLMCallInfo, req *provider.ChatCompletionRequest, stream provider.ChatCompletionStream) provider.ChatCompletionStream
}

Basic Usage

// Create a simple logging hook
type LoggingHook struct{}

func (h *LoggingHook) BeforeRequest(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest) context.Context {
    log.Printf("[%s] LLM call started: provider=%s model=%s", info.CallID, info.ProviderName, req.Model)
    return ctx
}

func (h *LoggingHook) AfterResponse(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest, resp *omnillm.ChatCompletionResponse, err error) {
    duration := time.Since(info.StartTime)
    if err != nil {
        log.Printf("[%s] LLM call failed: provider=%s duration=%v error=%v", info.CallID, info.ProviderName, duration, err)
    } else {
        log.Printf("[%s] LLM call completed: provider=%s duration=%v tokens=%d", info.CallID, info.ProviderName, duration, resp.Usage.TotalTokens)
    }
}

func (h *LoggingHook) WrapStream(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest, stream omnillm.ChatCompletionStream) omnillm.ChatCompletionStream {
    return stream // Return unwrapped for simple logging, or wrap for streaming metrics
}

// Use the hook when creating a client
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider:          omnillm.ProviderNameOpenAI,
    APIKey:            "your-api-key",
    ObservabilityHook: &LoggingHook{},
})

OpenTelemetry Integration Example

type OTelHook struct {
    tracer trace.Tracer
}

func (h *OTelHook) BeforeRequest(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest) context.Context {
    // The span is carried in the returned context; AfterResponse retrieves it
    // with trace.SpanFromContext, so the span value itself is not needed here.
    ctx, _ = h.tracer.Start(ctx, "llm.chat_completion",
        trace.WithAttributes(
            attribute.String("llm.provider", info.ProviderName),
            attribute.String("llm.model", req.Model),
        ),
    )
    return ctx
}

func (h *OTelHook) AfterResponse(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest, resp *omnillm.ChatCompletionResponse, err error) {
    span := trace.SpanFromContext(ctx)
    defer span.End()

    if err != nil {
        span.RecordError(err)
        span.SetStatus(codes.Error, err.Error())
    } else if resp != nil {
        span.SetAttributes(
            attribute.Int("llm.tokens.total", resp.Usage.TotalTokens),
            attribute.Int("llm.tokens.prompt", resp.Usage.PromptTokens),
            attribute.Int("llm.tokens.completion", resp.Usage.CompletionTokens),
        )
    }
}

func (h *OTelHook) WrapStream(ctx context.Context, info omnillm.LLMCallInfo, req *omnillm.ChatCompletionRequest, stream omnillm.ChatCompletionStream) omnillm.ChatCompletionStream {
    return &observableStream{stream: stream, ctx: ctx, info: info}
}
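
The observableStream wrapper above is not part of the SDK; you define it yourself. Below is a minimal sketch. The method set is an assumption: it presumes the stream exposes Recv() (*omnillm.ChatCompletionStreamResponse, error) and Close() error, matching how streams are consumed in the streaming example, so check the actual provider.ChatCompletionStream interface before relying on these signatures. It uses context, errors, io, and sync from the standard library plus the trace package already imported above.

type observableStream struct {
    stream  omnillm.ChatCompletionStream
    ctx     context.Context
    info    omnillm.LLMCallInfo
    endOnce sync.Once
}

// endSpan ends the span started in BeforeRequest exactly once.
func (s *observableStream) endSpan() {
    s.endOnce.Do(func() { trace.SpanFromContext(s.ctx).End() })
}

func (s *observableStream) Recv() (*omnillm.ChatCompletionStreamResponse, error) {
    chunk, err := s.stream.Recv()
    if errors.Is(err, io.EOF) {
        s.endSpan() // stream exhausted: close out the span
    }
    return chunk, err
}

func (s *observableStream) Close() error {
    s.endSpan() // also end the span if the caller stops reading early
    return s.stream.Close()
}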

Key Benefits

  • Non-Invasive: Add observability without modifying core library code
  • Provider Agnostic: Works with all LLM providers (OpenAI, Anthropic, Gemini, etc.)
  • Streaming Support: Wrap streams to observe streaming responses
  • Context Propagation: Pass trace context through the entire call chain
  • Flexible: Methods you don't need can simply return the context or stream unchanged, as in the logging example above

🔄 Provider Switching

The unified interface makes it easy to switch between providers:

// The request shape is identical across providers; only the model constant changes
request := &omnillm.ChatCompletionRequest{
    Model: omnillm.ModelGPT4o, // use omnillm.ModelClaude3Sonnet for Anthropic, a Gemini model for Gemini, etc.
    Messages: []omnillm.Message{
        {Role: omnillm.RoleUser, Content: "Hello, world!"},
    },
    MaxTokens: &[]int{100}[0],
}

// OpenAI
openaiClient, _ := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameOpenAI,
    APIKey:   "openai-key",
})

// Anthropic
anthropicClient, _ := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameAnthropic,
    APIKey:   "anthropic-key",
})

// Gemini
geminiClient, _ := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameGemini,
    APIKey:   "gemini-key",
})

// Same API call for all providers
response1, _ := openaiClient.CreateChatCompletion(ctx, request)
response2, _ := anthropicClient.CreateChatCompletion(ctx, request)
response3, _ := geminiClient.CreateChatCompletion(ctx, request)

🧪 Testing

OmniLLM includes a comprehensive test suite with both unit tests and integration tests.

Running Tests

# Run all unit tests (no API keys required)
go test ./... -short

# Run with coverage
go test ./... -short -cover

# Run integration tests (requires API keys)
ANTHROPIC_API_KEY=your-key go test ./providers/anthropic -v
OPENAI_API_KEY=your-key go test ./providers/openai -v
XAI_API_KEY=your-key go test ./providers/xai -v

# Run all tests including integration
ANTHROPIC_API_KEY=your-key OPENAI_API_KEY=your-key XAI_API_KEY=your-key go test ./... -v

Test Coverage

  • Unit Tests: Mock-based tests that run without external dependencies
  • Integration Tests: Real API tests that skip gracefully when API keys are not set
  • Memory Tests: Comprehensive conversation memory management tests
  • Provider Tests: Adapter logic, message conversion, and streaming tests

Writing Tests

The clean interface design makes testing straightforward:

// Mock the Provider interface for testing
type mockProvider struct{}

func (m *mockProvider) CreateChatCompletion(ctx context.Context, req *omnillm.ChatCompletionRequest) (*omnillm.ChatCompletionResponse, error) {
    return &omnillm.ChatCompletionResponse{
        Choices: []omnillm.ChatCompletionChoice{
            {
                Message: omnillm.Message{
                    Role:    omnillm.RoleAssistant,
                    Content: "Mock response",
                },
            },
        },
    }, nil
}

func (m *mockProvider) CreateChatCompletionStream(ctx context.Context, req *omnillm.ChatCompletionRequest) (omnillm.ChatCompletionStream, error) {
    return nil, nil
}

func (m *mockProvider) Close() error { return nil }
func (m *mockProvider) Name() string { return "mock" }
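
The mock plugs into a client the same way an external provider does, via CustomProvider:

// Inject the mock; no API key or network access is needed.
client, err := omnillm.NewClient(omnillm.ClientConfig{
    CustomProvider: &mockProvider{},
})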

Conditional Integration Tests

Integration tests automatically skip when API keys are not available:

func TestAnthropicIntegration_Streaming(t *testing.T) {
    apiKey := os.Getenv("ANTHROPIC_API_KEY")
    if apiKey == "" {
        t.Skip("Skipping integration test: ANTHROPIC_API_KEY not set")
    }
    // Test code here...
}

Mock KVS for Memory Testing

OmniLLM provides a mock KVS implementation for testing memory functionality:

import omnillmtest "github.com/agentplexus/omnillm/testing"

// Create mock KVS for testing
mockKVS := omnillmtest.NewMockKVS()

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameOpenAI,
    APIKey:   "test-key",
    Memory:   mockKVS,
})

📚 Examples

The repository includes comprehensive examples:

  • Basic Usage: Simple chat completions with each provider
  • Streaming: Real-time response handling
  • Conversation: Multi-turn conversations with context
  • Memory Demo: Persistent conversation memory with KVS backend
  • Architecture Demo: Overview of the provider architecture
  • Custom Provider: How to create and use 3rd party providers

Run examples:

go run examples/basic/main.go
go run examples/streaming/main.go
go run examples/anthropic_streaming/main.go
go run examples/conversation/main.go
go run examples/memory_demo/main.go
go run examples/providers_demo/main.go
go run examples/xai/main.go
go run examples/ollama/main.go
go run examples/ollama_streaming/main.go
go run examples/gemini/main.go
go run examples/custom_provider/main.go

🔧 Configuration

Environment Variables

  • OPENAI_API_KEY: Your OpenAI API key
  • ANTHROPIC_API_KEY: Your Anthropic API key
  • GEMINI_API_KEY: Your Google Gemini API key
  • XAI_API_KEY: Your X.AI API key
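
The examples in this README pass keys explicitly; a minimal way to wire them in from the environment:

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameOpenAI,
    APIKey:   os.Getenv("OPENAI_API_KEY"), // use the matching variable for other providers
})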

Advanced Configuration

config := omnillm.ClientConfig{
    Provider: omnillm.ProviderNameOpenAI,
    APIKey:   "your-api-key",
    BaseURL:  "https://custom-endpoint.com/v1",
    Extra: map[string]any{
        "timeout": 60, // Custom provider-specific settings
    },
}

Logging Configuration

OmniLLM supports injectable logging via Go's standard log/slog package. If no logger is provided, a null logger is used (no output).

import (
    "log/slog"
    "os"

    "github.com/agentplexus/omnillm"
)

// Use a custom logger
logger := slog.New(slog.NewJSONHandler(os.Stderr, &slog.HandlerOptions{
    Level: slog.LevelDebug,
}))

client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameOpenAI,
    APIKey:   "your-api-key",
    Logger:   logger, // Optional: defaults to null logger if not provided
})

// Access the logger if needed
client.Logger().Info("client initialized", slog.String("provider", "openai"))

The logger is used internally for non-critical errors (e.g., memory save failures) that shouldn't interrupt the main request flow.

Context-Aware Logging

OmniLLM supports request-scoped logging via context. This allows you to attach trace IDs, user IDs, or other request-specific attributes to all log output within a request:

import (
    "log/slog"

    "github.com/agentplexus/omnillm"
    "github.com/grokify/mogo/log/slogutil"
)

// Create a request-scoped logger with trace/user context
reqLogger := slog.Default().With(
    slog.String("trace_id", traceID),
    slog.String("user_id", userID),
    slog.String("request_id", requestID),
)

// Attach logger to context
ctx = slogutil.ContextWithLogger(ctx, reqLogger)

// All internal logging will now include trace_id, user_id, and request_id
response, err := client.CreateChatCompletionWithMemory(ctx, sessionID, req)

The context-aware logger is retrieved using slogutil.LoggerFromContext(ctx, fallback), which returns the context logger if present, or falls back to the client's configured logger.
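
For example, code that has access to both the request context and the client can resolve the effective logger like this:

// Prefer the request-scoped logger, falling back to the client's configured logger.
logger := slogutil.LoggerFromContext(ctx, client.Logger())
logger.Info("sending chat completion", slog.String("session_id", sessionID))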

Retry with Backoff

OmniLLM supports automatic retries for transient failures (rate limits, 5xx errors) via a custom HTTP client. This uses the retryhttp package from github.com/grokify/mogo.

import (
    "log"
    "net/http"
    "os"
    "time"

    "github.com/agentplexus/omnillm"
    "github.com/grokify/mogo/net/http/retryhttp"
)

// Create retry transport with exponential backoff
rt := retryhttp.NewWithOptions(
    retryhttp.WithMaxRetries(5),                           // Max 5 retries
    retryhttp.WithInitialBackoff(500 * time.Millisecond),  // Start with 500ms
    retryhttp.WithMaxBackoff(30 * time.Second),            // Cap at 30s
    retryhttp.WithOnRetry(func(attempt int, req *http.Request, resp *http.Response, err error, backoff time.Duration) {
        log.Printf("Retry attempt %d, waiting %v", attempt, backoff)
    }),
)

// Create client with retry-enabled HTTP client
client, err := omnillm.NewClient(omnillm.ClientConfig{
    Provider: omnillm.ProviderNameOpenAI,
    APIKey:   os.Getenv("OPENAI_API_KEY"),
    HTTPClient: &http.Client{
        Transport: rt,
        Timeout:   2 * time.Minute, // Allow time for retries
    },
})

Retry Transport Features:

Feature | Default | Description
Max Retries | 3 | Maximum retry attempts
Initial Backoff | 1s | Starting backoff duration
Max Backoff | 30s | Cap on backoff duration
Backoff Multiplier | 2.0 | Exponential growth factor
Jitter | 10% | Randomness to prevent thundering herd
Retryable Status Codes | 429, 500, 502, 503, 504 | Rate limits + 5xx errors

Additional Options:

  • WithRetryableStatusCodes(codes) - Custom status codes to retry
  • WithShouldRetry(fn) - Custom retry decision function
  • WithLogger(logger) - Structured logging for retry events
  • Respects Retry-After headers from API responses

Provider Support: Works with OpenAI, Anthropic, X.AI, and Ollama providers. Gemini and Bedrock use SDK clients with their own retry mechanisms.

πŸ—οΈ Adding New Providers

🎯 3rd Party Providers (Recommended)

External packages can create providers without modifying the core library. This is the recommended approach for most use cases:

Step 1: Create Your Provider Package

// In your external package (e.g., github.com/yourname/omnillm-gemini)
package gemini

import (
    "context"
    "github.com/agentplexus/omnillm/provider"
)

// Step 1: HTTP Client (like providers/openai/openai.go)
type Client struct {
    apiKey string
    // your HTTP client implementation
}

func New(apiKey string) *Client {
    return &Client{apiKey: apiKey}
}

// Close releases any resources held by the HTTP client, so that
// Provider.Close below has something to call.
func (c *Client) Close() error { return nil }

// Step 2: Provider Adapter (like providers/openai/adapter.go)
type Provider struct {
    client *Client
}

func NewProvider(apiKey string) provider.Provider {
    return &Provider{client: New(apiKey)}
}

func (p *Provider) CreateChatCompletion(ctx context.Context, req *provider.ChatCompletionRequest) (*provider.ChatCompletionResponse, error) {
    // Convert provider.ChatCompletionRequest to your API format
    // Make HTTP call via p.client
    // Convert response back to provider.ChatCompletionResponse
}

func (p *Provider) CreateChatCompletionStream(ctx context.Context, req *provider.ChatCompletionRequest) (provider.ChatCompletionStream, error) {
    // Your streaming implementation
}

func (p *Provider) Close() error { return p.client.Close() }
func (p *Provider) Name() string { return "gemini" }

Step 2: Use Your Provider

import (
    "context"
    "fmt"
    "log"

    "github.com/agentplexus/omnillm"
    "github.com/yourname/omnillm-gemini"
)

func main() {
    // Create your custom provider
    customProvider := gemini.NewProvider("your-api-key")

    // Inject it directly into omnillm - no core modifications needed!
    client, err := omnillm.NewClient(omnillm.ClientConfig{
        CustomProvider: customProvider,
    })
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    // Use the same omnillm API
    response, err := client.CreateChatCompletion(context.Background(), &omnillm.ChatCompletionRequest{
        Model:    "gemini-pro",
        Messages: []omnillm.Message{{Role: omnillm.RoleUser, Content: "Hello!"}},
    })
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(response.Choices[0].Message.Content)
}

🔧 Built-in Providers (For Core Contributors)

To add a built-in provider to the core library, follow the same structure as existing providers:

  1. Create Provider Package: providers/newprovider/

    • newprovider.go - HTTP client implementation
    • types.go - Provider-specific request/response types
    • adapter.go - provider.Provider interface implementation
  2. Update Core Files:

    • Add factory function in providers.go
    • Add provider constant in constants.go
    • Add model constants if needed
  3. Reference Implementation: Look at any existing provider (e.g., providers/openai/); they all follow the same pattern that external providers should use

🎯 Why This Architecture?

  • 🔌 No Core Changes: External providers don't require modifying the core library
  • 🏗️ Reference Pattern: Internal providers demonstrate the exact structure external providers should follow
  • 🧪 Easy Testing: Both internal and external providers use the same provider.Provider interface
  • 📦 Self-Contained: Each provider manages its own HTTP client, types, and adapter logic
  • 🔧 Direct Injection: Clean dependency injection via ClientConfig.CustomProvider

📊 Model Support

Provider | Models | Features
OpenAI | GPT-5, GPT-4.1, GPT-4o, GPT-4o-mini, GPT-4-turbo, GPT-3.5-turbo | Chat, Streaming, Functions
Anthropic | Claude-Opus-4.1, Claude-Opus-4, Claude-Sonnet-4, Claude-3.7-Sonnet, Claude-3.5-Haiku | Chat, Streaming, System messages
Gemini | Gemini-2.5-Pro, Gemini-2.5-Flash, Gemini-1.5-Pro, Gemini-1.5-Flash | Chat, Streaming
X.AI | Grok-4.1-Fast, Grok-4, Grok-4-Fast, Grok-Code-Fast, Grok-3, Grok-3-Mini, Grok-2 | Chat, Streaming, 2M context, Tool calling
Ollama | Llama 3, Mistral, CodeLlama, Gemma, Qwen2.5, DeepSeek-Coder | Chat, Streaming, Local inference
Bedrock* | Claude models, Titan models | Chat, Multiple model families

*Available as external module

🚨 Error Handling

OmniLLM provides comprehensive error handling with provider-specific context:

response, err := client.CreateChatCompletion(ctx, request)
if err != nil {
    var apiErr *omnillm.APIError
    if errors.As(err, &apiErr) { // errors.As also matches wrapped errors
        fmt.Printf("Provider: %s, Status: %d, Message: %s\n",
            apiErr.Provider, apiErr.StatusCode, apiErr.Message)
    }
}

🤝 Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Run tests to ensure everything works:
    go test ./... -short        # Run unit tests
    go build ./...              # Verify build
    go vet ./...                # Run static analysis
  5. Commit your changes (git commit -m 'Add some amazing feature')
  6. Push to the branch (git push origin feature/amazing-feature)
  7. Open a Pull Request

Adding Tests

When contributing new features:

  • Add unit tests for core logic
  • Add integration tests for provider implementations (with API key checks)
  • Ensure tests pass without API keys using -short flag
  • Mock external dependencies when possible
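
For example, a unit test can drive the client end to end against the mockProvider from the Writing Tests section, with no API key required. A sketch (adapt the model name and assertions to your feature):

func TestChatCompletionWithMock(t *testing.T) {
    client, err := omnillm.NewClient(omnillm.ClientConfig{
        CustomProvider: &mockProvider{},
    })
    if err != nil {
        t.Fatal(err)
    }
    defer client.Close()

    resp, err := client.CreateChatCompletion(context.Background(), &omnillm.ChatCompletionRequest{
        Model:    "mock-model",
        Messages: []omnillm.Message{{Role: omnillm.RoleUser, Content: "hi"}},
    })
    if err != nil {
        t.Fatal(err)
    }
    if got := resp.Choices[0].Message.Content; got != "Mock response" {
        t.Errorf("unexpected content: %q", got)
    }
}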

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🔗 Related Projects


Made with ❤️ for the Go and AI community
