Module spoon_ai.llm.cache

LLM Response Caching - Cache LLM responses to avoid redundant API calls.

LLMResponseCache Objects

class LLMResponseCache()

Cache for LLM responses to avoid redundant API calls.

__init__

def __init__(default_ttl: int = 3600, max_size: int = 1000)

Initialize the cache.

Arguments:

  • default_ttl - Default time-to-live in seconds (default: 1 hour)
  • max_size - Maximum number of cached entries (default: 1000)
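
A minimal construction sketch. The import path follows the module name at the top of this page; the keyword values simply override the documented defaults.

from spoon_ai.llm.cache import LLMResponseCache

# Keep entries for 10 minutes and cap the cache at 500 responses.
cache = LLMResponseCache(default_ttl=600, max_size=500)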

get

def get(messages: List[Message],
        provider: Optional[str] = None,
        **kwargs) -> Optional[LLMResponse]

Get cached response if available.

Arguments:

  • messages - List of conversation messages
  • provider - Provider name (optional)
  • **kwargs - Additional parameters

Returns:

  • Optional[LLMResponse] - Cached response if found and not expired, None otherwise
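
A lookup sketch against the cache built above. The provider name and the extra temperature kwarg are illustrative assumptions; messages is the same List[Message] you would send to the LLM manager.

cached = cache.get(messages, provider="openai", temperature=0.0)
if cached is None:
    # Miss or expired entry: fall back to a real provider call (see set below).
    ...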

set

def set(messages: List[Message],
        response: LLMResponse,
        provider: Optional[str] = None,
        ttl: Optional[int] = None,
        **kwargs) -> None

Store response in cache.

Arguments:

  • messages - List of conversation messages
  • response - LLM response to cache
  • provider - Provider name (optional)
  • ttl - Time-to-live in seconds (optional, uses default if not provided)
  • **kwargs - Additional parameters
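
And the matching store after a miss. Passing the same provider and extra kwargs as the get call above presumably keeps the cache key consistent (an assumption, not documented here); ttl=300 overrides the one-hour default for this entry only.

# `response` is the LLMResponse returned by the underlying provider call.
cache.set(messages, response, provider="openai", ttl=300, temperature=0.0)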

clear

def clear() -> None

Clear all cached entries.

get_stats

def get_stats() -> Dict[str, Any]

Get cache statistics.

Returns:

  • Dict[str, Any] - Cache statistics including size, max_size, etc.
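
A quick inspection sketch; only the size and max_size keys mentioned above are assumed to be present.

stats = cache.get_stats()
print(f"cache usage: {stats['size']}/{stats['max_size']}")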

CachedLLMManager Objects

class CachedLLMManager()

Wrapper around LLMManager that adds response caching.

__init__

def __init__(llm_manager: LLMManager,
             cache: Optional[LLMResponseCache] = None)

Initialize cached LLM manager.

Arguments:

  • llm_manager - The underlying LLMManager instance
  • cache - Optional cache instance (creates new one if not provided)
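
A wiring sketch. The LLMManager import path and its bare constructor call are assumptions not covered by this page; the cache argument can be omitted to let the wrapper create its own.

from spoon_ai.llm.cache import CachedLLMManager, LLMResponseCache
from spoon_ai.llm import LLMManager  # import path is an assumption

# Reuse a preconfigured cache, or drop `cache=` to get a default one.
cached_manager = CachedLLMManager(LLMManager(), cache=LLMResponseCache(default_ttl=600))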

chat

async def chat(messages: List[Message],
               provider: Optional[str] = None,
               use_cache: bool = True,
               cache_ttl: Optional[int] = None,
               **kwargs) -> LLMResponse

Send chat request with caching support.

Arguments:

  • messages - List of conversation messages
  • provider - Specific provider to use (optional)
  • use_cache - Whether to use cache (default: True)
  • cache_ttl - Custom TTL for this request (optional)
  • **kwargs - Additional parameters

Returns:

  • LLMResponse - LLM response (from cache or API)
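
A usage sketch for cached chat. The Message import path and its role/content fields are assumptions about the schema; cached_manager is the wrapper constructed above.

import asyncio

from spoon_ai.schema import Message  # import path is an assumption

async def main():
    messages = [Message(role="user", content="Explain LLM response caching.")]
    # First call goes to the provider and is stored for 10 minutes.
    await cached_manager.chat(messages, use_cache=True, cache_ttl=600)
    # An identical request inside the TTL should be served from the cache.
    await cached_manager.chat(messages, use_cache=True, cache_ttl=600)

asyncio.run(main())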

chat_stream

async def chat_stream(messages: List[Message],
                      provider: Optional[str] = None,
                      callbacks: Optional[List] = None,
                      **kwargs)

Send streaming chat request (caching not supported for streaming).

Arguments:

  • messages - List of conversation messages
  • provider - Specific provider to use (optional)
  • callbacks - Optional callback handlers
  • **kwargs - Additional parameters

Yields:

  • LLMResponseChunk - Streaming response chunks
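
A streaming sketch; cached_manager and messages are the objects from the chat sketch above, and chunk fields are not documented on this page, so each chunk is printed as-is. Streamed responses bypass the cache.

async def stream_demo(messages):
    async for chunk in cached_manager.chat_stream(messages):
        print(chunk)  # LLMResponseChunk; exact fields not documented here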

clear_cache

def clear_cache() -> None

Clear the response cache.

get_cache_stats

def get_cache_stats() -> Dict[str, Any]

Get cache statistics.

Returns:

  • Dict[str, Any] - Cache statistics from the underlying LLMResponseCache
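
A maintenance sketch using the wrapper's own helpers from this page.

stats = cached_manager.get_cache_stats()
print(stats)
cached_manager.clear_cache()  # drop every cached response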