
LLM Providers

SpoonOS provides a unified interface to multiple LLM providers. Write your code once, then switch between OpenAI, Anthropic, Google, DeepSeek, or OpenRouter by changing a single parameter. No code rewrites, no API differences to handle.

Why Multi-Provider?

Relying on a single LLM provider is risky:

  • Outages: OpenAI goes down, your app goes down
  • Rate limits: hit the ceiling and requests fail
  • Cost: different models have different pricing
  • Capabilities: some models excel at code, others at analysis

SpoonOS solves this with a single, provider-agnostic interface: the same ChatBot class works with every provider, and fallback chains, streaming, and tool calling behave consistently across them.

Provider Comparison

| Provider | Best For | Context | Strengths |
|---|---|---|---|
| OpenAI | General purpose, code | 128K | Fastest iteration, best tool calling |
| Anthropic | Long documents, analysis | 200K | Prompt caching, safety features |
| Google | Multimodal, cost-sensitive | 1M | Longest context, fast inference |
| DeepSeek | Complex reasoning, code | 64K | Best cost/performance for code |
| OpenRouter | Experimentation | Varies | 100+ models, automatic routing |

Key Features

| Feature | What It Does |
|---|---|
| Unified API | Same ChatBot class for all providers |
| Auto-fallback | Chain providers: GPT-4 → Claude → Gemini |
| Streaming | Real-time responses across all providers |
| Tool calling | Consistent function calling interface |
| Token tracking | Automatic counting and cost monitoring |

Quick Start

pip install spoon-ai
export OPENAI_API_KEY="your-key"

import asyncio
from spoon_ai.chat import ChatBot

# Same interface for all providers - just change model_name and llm_provider
llm = ChatBot(model_name="gpt-5.1-chat-latest", llm_provider="openai")

async def main():
    response = await llm.ask([{"role": "user", "content": "Explain quantum computing in one sentence"}])
    print(response)

asyncio.run(main())

Supported Providers

OpenAI

  • Models: GPT-5.1, GPT-4o, o1, o3, etc. (See latest models)
  • Features: Function calling, streaming, embeddings, reasoning models
  • Best for: General-purpose tasks, reasoning, code generation
from spoon_ai.chat import ChatBot

# OpenAI configuration with default model
llm = ChatBot(
    model_name="gpt-5.1-chat-latest",  # Check docs for latest model names
    llm_provider="openai",
    temperature=0.7
)

Anthropic (Claude)

  • Models: Claude 4.5 Opus, Claude 4.5 Sonnet, etc. (See latest models)
  • Features: Large context windows, prompt caching, safety features
  • Best for: Long documents, analysis, safety-critical applications
# Anthropic configuration with default model
llm = ChatBot(
    model_name="claude-sonnet-4-20250514",  # Check docs for latest model names
    llm_provider="anthropic",
    temperature=0.1
)

Google (Gemini)

  • Models: Gemini 3 Pro, Gemini 2.5 Flash, etc. (See latest models)
  • Features: Multimodal capabilities, fast inference, large context
  • Best for: Multimodal tasks, cost-effective solutions, long context
# Google configuration with default model
llm = ChatBot(
    model_name="gemini-3-pro-preview",  # Check docs for latest model names
    llm_provider="gemini",
    temperature=0.1
)

DeepSeek

  • Models: DeepSeek-V3, DeepSeek-Reasoner, etc. (See latest models)
  • Features: Advanced reasoning, code-specialized models, cost-effective
  • Best for: Complex reasoning, code generation, technical tasks
# DeepSeek configuration with default model
llm = ChatBot(
    model_name="deepseek-reasoner",  # Check docs for latest model names
    llm_provider="deepseek",
    temperature=0.2
)

OpenRouter

  • Models: Access to multiple providers through one API
  • Features: Model routing, cost optimization
  • Best for: Experimentation, cost optimization
# OpenRouter configuration
llm = ChatBot(
    model_name="anthropic/claude-3-opus",
    llm_provider="openrouter",
    temperature=0.7
)

Unified LLM Manager

The LLM Manager provides provider-agnostic access with automatic fallback:

from spoon_ai.llm.manager import LLMManager
from spoon_ai.schema import Message
import asyncio

# Initialize LLM Manager
llm_manager = LLMManager()

# Clear default_provider so fallback_chain takes precedence
llm_manager.default_provider = None

# Set fallback chain (primary provider first, then fallbacks)
llm_manager.set_fallback_chain(["gemini", "openai"])

async def main():
    # Create messages
    messages = [Message(role="user", content="Explain quantum computing in one sentence")]
    response = await llm_manager.chat(messages)
    print(f"Response: {response.content}")
    print(f"Provider used: {response.provider}")

if __name__ == "__main__":
    asyncio.run(main())

Configuration

Environment Variables

# Provider API Keys
OPENAI_API_KEY=sk-your_openai_key_here
ANTHROPIC_API_KEY=sk-ant-your_anthropic_key_here
GEMINI_API_KEY=your_gemini_key_here
DEEPSEEK_API_KEY=your_deepseek_key_here
OPENROUTER_API_KEY=sk-or-your_openrouter_key_here

# Default Settings
DEFAULT_LLM_PROVIDER=openai
DEFAULT_MODEL=gpt-5.1-chat-latest
DEFAULT_TEMPERATURE=0.3
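
Providers pick up these keys from the environment, as in the Quick Start above. As a startup sanity check, you can verify which keys are actually set and build a fallback chain only from those providers. The following is a minimal sketch using the standard library plus the LLMManager API shown earlier; the variable names follow the list above.

import os

from spoon_ai.llm.manager import LLMManager

# Map provider names (as used by llm_provider / fallback chains) to their API key variables
PROVIDER_KEYS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}

# Keep only providers whose keys are actually set in the environment
available = [name for name, var in PROVIDER_KEYS.items() if os.getenv(var)]

llm_manager = LLMManager()
llm_manager.default_provider = None
llm_manager.set_fallback_chain(available)  # e.g. ["openai", "gemini"]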

Runtime Configuration

{
  "llm": {
    "provider": "openai",
    "model": "gpt-5.1-chat-latest",
    "temperature": 0.3,
    "max_tokens": 32768,
    "fallback_providers": ["anthropic", "deepseek", "gemini"]
  }
}
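
If you keep these settings in a JSON file, one way to apply them is to read the file yourself and map the fields onto the constructors shown earlier. This is an illustrative sketch only: the file name config.json is hypothetical, and SpoonOS may ship its own configuration loader, so check the configuration docs before relying on this wiring.

import json

from spoon_ai.chat import ChatBot
from spoon_ai.llm.manager import LLMManager

# Hypothetical file name - point this at wherever your runtime configuration lives
with open("config.json") as f:
    cfg = json.load(f)["llm"]

# Map the JSON fields onto the ChatBot constructor used throughout this page
# ("max_tokens" is left to provider defaults here; wire it through if your version accepts it)
llm = ChatBot(
    model_name=cfg["model"],
    llm_provider=cfg["provider"],
    temperature=cfg["temperature"],
)

# Feed the fallback list into the LLM Manager, primary provider first
llm_manager = LLMManager()
llm_manager.default_provider = None
llm_manager.set_fallback_chain([cfg["provider"], *cfg["fallback_providers"]])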

Advanced Features

Response Caching

from spoon_ai.llm.cache import LLMResponseCache, CachedLLMManager
from spoon_ai.llm.manager import LLMManager
from spoon_ai.schema import Message
import asyncio


# Enable response caching to avoid redundant API calls
cache = LLMResponseCache()
llm_manager = LLMManager()
cached_manager = CachedLLMManager(llm_manager, cache=cache)

async def main():
    messages = [Message(role="user", content="Explain quantum computing in one sentence")]
    response1 = await cached_manager.chat(messages)
    print(response1)

if __name__ == "__main__":
    asyncio.run(main())
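
Repeating an identical request should then be answered from the cache instead of a second provider call. Below is a minimal follow-up sketch, assuming CachedLLMManager keys cached entries on the request contents:

import asyncio

from spoon_ai.llm.cache import LLMResponseCache, CachedLLMManager
from spoon_ai.llm.manager import LLMManager
from spoon_ai.schema import Message

cached_manager = CachedLLMManager(LLMManager(), cache=LLMResponseCache())

async def main():
    messages = [Message(role="user", content="Explain quantum computing in one sentence")]
    first = await cached_manager.chat(messages)   # goes to the provider
    second = await cached_manager.chat(messages)  # identical request; expected to be served from the cache
    print(first.content == second.content)

asyncio.run(main())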

Streaming Responses

# Stream responses for real-time interaction
import asyncio
from spoon_ai.chat import ChatBot

async def main():
    # Create a ChatBot instance
    llm = ChatBot(
        model_name="gpt-5.1-chat-latest",
        llm_provider="openai",
        temperature=0.7
    )

    # Prepare messages
    messages = [{"role": "user", "content": "Write a long story about AI"}]

    # Stream the response chunk by chunk
    async for chunk in llm.astream(messages):
        # chunk.delta contains the text content of this chunk
        print(chunk.delta, end="", flush=True)

if __name__ == "__main__":
    asyncio.run(main())

Function Calling

import asyncio
from spoon_ai.chat import ChatBot

llm = ChatBot(model_name="gpt-5.1-chat-latest", llm_provider="openai")

# Define functions for the model to call
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city and state, e.g. San Francisco, CA"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

async def main():
    messages = [{"role": "user", "content": "What's the weather in San Francisco?"}]
    response = await llm.ask_tool(
        messages=messages,
        tools=tools
    )
    print(response)

asyncio.run(main())

Model Selection Guide

Task-Based Recommendations

Choose the right model for your use case, and check official documentation for the latest model capabilities. A small selection sketch follows the recommendations below.

Code Generation

  • Recommended: DeepSeek (cost-effective), OpenAI GPT models (fast iteration)
  • Alternative: Anthropic Claude (strong reasoning)

Analysis & Reasoning

  • Recommended: OpenAI o-series models, DeepSeek Reasoner, Claude
  • Alternative: Gemini Pro

Cost-Sensitive Tasks

  • Recommended: DeepSeek models, Gemini models
  • Alternative: OpenRouter for provider comparison

Long Context Tasks

  • Recommended: Gemini (largest context), Claude (large context)
  • Alternative: Check each provider's latest context window limits
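
To turn these recommendations into code, one option is a small task-to-model map that feeds the ChatBot constructor. This is a sketch, not an official SpoonOS helper; the model identifiers are the examples used elsewhere on this page, so verify them against each provider's documentation.

from spoon_ai.chat import ChatBot

# Example mapping from task type to (provider, model); adjust to your own benchmarks
TASK_MODELS = {
    "code": ("deepseek", "deepseek-reasoner"),
    "analysis": ("anthropic", "claude-sonnet-4-20250514"),
    "long_context": ("gemini", "gemini-2.5-pro"),
    "general": ("openai", "gpt-5.1-chat-latest"),
}

def chatbot_for(task: str) -> ChatBot:
    provider, model = TASK_MODELS.get(task, TASK_MODELS["general"])
    return ChatBot(model_name=model, llm_provider=provider, temperature=0.2)

llm = chatbot_for("code")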

Performance Comparison

Note: Model capabilities and pricing change frequently. Always check the official documentation for the latest information:

| Provider | Model Example | Context Window | Best For |
|---|---|---|---|
| OpenAI | gpt-5.1-chat-latest | Check docs | General purpose, tool calling |
| Anthropic | claude-sonnet-4-20250514 | Check docs | Analysis, long documents |
| Google | gemini-2.5-pro | Check docs | Multimodal, cost-effective |
| DeepSeek | deepseek-reasoner | Check docs | Reasoning, code generation |
| OpenRouter | Various | Varies | Access multiple providers |

Error Handling & Fallbacks

Automatic Fallback

The framework provides built-in error handling with automatic fallback between providers:

"""
LLMManager with fallback chain demo - demonstrates automatic provider fallback.
"""
from spoon_ai.llm.manager import LLMManager
from spoon_ai.schema import Message
import asyncio

# Initialize LLM Manager
llm_manager = LLMManager()
# Clear default_provider so fallback_chain takes precedence
llm_manager.default_provider = None
# The manager will try providers in order: gemini -> openai -> anthropic
llm_manager.set_fallback_chain(["gemini", "openai", "anthropic"])

async def main():
    # Create messages
    messages = [Message(role="user", content="Hello world")]
    response = await llm_manager.chat(messages)
    print(response.content)

if __name__ == "__main__":
    asyncio.run(main())

Error Types & Recovery

The framework uses structured error types for clean error handling:

import asyncio
from spoon_ai.chat import ChatBot
from spoon_ai.llm.errors import RateLimitError, AuthenticationError, ModelNotFoundError

llm = ChatBot(
    model_name="gemini-2.5-flash",
    llm_provider="gemini",
    temperature=0.1
)

async def main():
    # Simple call - the framework surfaces structured error types when something goes wrong
    response = await llm.ask([{"role": "user", "content": "Hello world"}])
    print(response)

asyncio.run(main())

# Framework handles common errors automatically:
# - Rate limits: automatic retry with backoff
# - Network issues: automatic retry with fallback
# - Authentication: clear error messages
# - Model availability: fallback to alternative models

Graceful Degradation

import asyncio
from spoon_ai.llm.manager import LLMManager
from spoon_ai.schema import Message

# Framework provides graceful degradation patterns
llm_manager = LLMManager()
llm_manager.default_provider = "openai"
llm_manager.set_fallback_chain(["openai", "deepseek", "gemini"]) # Cost-effective fallbacks

# If primary fails, automatically uses fallback
# No manual error handling required
async def main():
    messages = [Message(role="user", content="Complex reasoning task: Explain quantum computing and its applications")]
    response = await llm_manager.chat(messages)
    print(response.content)

asyncio.run(main())

Monitoring & Metrics

Usage Tracking

import asyncio
from spoon_ai.chat import ChatBot
from spoon_ai.llm.monitoring import get_metrics_collector

# Get the global metrics collector
collector = get_metrics_collector()

llm = ChatBot(model_name="gemini-2.5-flash", llm_provider="gemini")

async def main():
    # Metrics are automatically tracked during LLM calls
    response = await llm.ask([{"role": "user", "content": "Hello"}])

    # Get collected stats per provider
    stats = collector.get_provider_stats("gemini")
    print(f"Total requests: {stats.total_requests}")
    print(f"Success rate: {stats.success_rate:.2f}%")
    print(f"Average duration: {stats.average_duration:.3f}s")
    print(f"Total tokens: {stats.total_tokens}")
    print(f"Total cost: ${stats.total_cost:.6f}")

asyncio.run(main())

Performance Monitoring

import asyncio
from spoon_ai.chat import ChatBot
from spoon_ai.llm.monitoring import get_metrics_collector

# The MetricsCollector automatically tracks:
# - Request counts and success/failure rates
# - Token usage (input/output)
# - Latency statistics (average, min, max)
# - Error tracking per provider

collector = get_metrics_collector()
llm = ChatBot(model_name="gemini-2.5-flash", llm_provider="gemini")

async def main():
    # Make some requests to generate metrics
    await llm.ask([{"role": "user", "content": "Hello"}])

    # Access provider-specific stats
    for provider in ["openai", "anthropic", "gemini"]:
        stats = collector.get_provider_stats(provider)
        if stats and stats.total_requests > 0:
            print(f"{provider}: {stats.total_requests} requests, {stats.failed_requests} errors")

    # Access all provider stats
    all_stats = collector.get_all_stats()
    if all_stats:
        print("\nAll Providers Summary:")
        for provider_name, provider_stats in all_stats.items():
            print(f"{provider_name}: {provider_stats.total_requests} requests, "
                  f"{provider_stats.success_rate:.1f}% success rate")

asyncio.run(main())

Best Practices

Provider Selection

  • Test multiple providers for your specific use case
  • Consider cost vs. quality trade-offs
  • Use fallbacks for production reliability

Configuration Management

  • Store API keys securely in environment variables
  • Use configuration files for easy switching
  • Monitor usage and costs regularly

Performance Optimization

  • Cache responses when appropriate
  • Use streaming for long responses
  • Batch requests when possible (see the sketch below)
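
For request batching, independent prompts can be issued concurrently with asyncio.gather, so total latency is roughly that of the slowest call rather than the sum. Here is a minimal sketch built on the ChatBot interface from the Quick Start; watch your provider's rate limits when widening the batch.

import asyncio

from spoon_ai.chat import ChatBot

llm = ChatBot(model_name="gpt-5.1-chat-latest", llm_provider="openai")

async def main():
    prompts = [
        "Summarize HTTP/2 in one sentence",
        "Summarize HTTP/3 in one sentence",
        "Summarize QUIC in one sentence",
    ]
    # Issue all requests concurrently instead of awaiting them one by one
    responses = await asyncio.gather(
        *(llm.ask([{"role": "user", "content": p}]) for p in prompts)
    )
    for prompt, response in zip(prompts, responses):
        print(f"{prompt}\n{response}\n")

asyncio.run(main())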

Error Handling Philosophy

The SpoonOS framework follows a "fail-fast, recover-gracefully" approach:

  • Automatic Recovery: Common errors (rate limits, network issues) are handled automatically
  • Structured Errors: Use specific error types instead of generic exceptions
  • Fallback Chains: Configure multiple providers for automatic failover
  • Minimal Try-Catch: Let the framework handle errors; only catch when you need custom logic
# Preferred: Let framework handle errors
import asyncio
from spoon_ai.chat import ChatBot
from spoon_ai.llm.errors import RateLimitError, AuthenticationError, ModelNotFoundError

llm = ChatBot(model_name="gemini-2.5-flash", llm_provider="gemini")

async def main():
    try:
        response = await llm.ask([{"role": "user", "content": "Hello world"}])
        print(response)
    except AuthenticationError as e:
        print(f"Authentication failed: {e.message}")
    except RateLimitError as e:
        print(f"Rate limit exceeded: {e.message}")
    except ModelNotFoundError as e:
        print(f"Model not found: {e.model}")

asyncio.run(main())

Next Steps