Module spoon_ai.llm.cache
LLM Response Caching - Cache LLM responses to avoid redundant API calls.
LLMResponseCache Objects
class LLMResponseCache()
Cache for LLM responses to avoid redundant API calls.
__init__
def __init__(default_ttl: int = 3600, max_size: int = 1000)
Initialize the cache.
Arguments:
- default_ttl - Default time-to-live in seconds (default: 1 hour)
- max_size - Maximum number of cached entries (default: 1000)
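A minimal construction sketch; the import path follows the module name above, and the argument values are arbitrary:

```python
from spoon_ai.llm.cache import LLMResponseCache

# Entries expire after 10 minutes; at most 500 responses are kept.
cache = LLMResponseCache(default_ttl=600, max_size=500)
```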
get
def get(messages: List[Message],
provider: Optional[str] = None,
**kwargs) -> Optional[LLMResponse]
Get cached response if available.
Arguments:
- messages - List of conversation messages
- provider - Provider name (optional)
- **kwargs - Additional parameters
Returns:
Optional[LLMResponse] - Cached response if found and not expired, None otherwise
set
def set(messages: List[Message],
response: LLMResponse,
provider: Optional[str] = None,
ttl: Optional[int] = None,
**kwargs) -> None
Store response in cache.
Arguments:
- messages - List of conversation messages
- response - LLM response to cache
- provider - Provider name (optional)
- ttl - Time-to-live in seconds (optional, uses default if not provided)
- **kwargs - Additional parameters
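Together, get and set form a cache-aside flow: look the conversation up first and call the provider only on a miss. The sketch below assumes a Message built from a role and content, an assumed import path for it, and a hypothetical call_provider stand-in for the real API call; the extra keyword arguments are passed identically to both calls on the assumption that they feed the cache key.

```python
from spoon_ai.llm.cache import LLMResponseCache
from spoon_ai.schema import Message  # assumed import path for Message

cache = LLMResponseCache()
messages = [Message(role="user", content="Summarize the release notes.")]

# Look up the conversation first.
response = cache.get(messages, provider="openai", temperature=0.2)
if response is None:
    response = call_provider(messages)  # hypothetical stand-in for the real API call
    # Store the fresh response for 30 minutes, overriding the default TTL.
    cache.set(messages, response, provider="openai", ttl=1800, temperature=0.2)
```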
clear
def clear() -> None
Clear all cached entries.
get_stats
def get_stats() -> Dict[str, Any]
Get cache statistics.
Returns:
Dict[str, Any]: Cache statistics including size, max_size, etc.
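A quick way to inspect usage, continuing with the cache from the sketches above; only size and max_size are documented here, so the loop simply prints whatever keys the dictionary contains:

```python
stats = cache.get_stats()
for key, value in stats.items():
    print(f"{key}: {value}")

cache.clear()  # drop every cached entry once it is no longer needed
```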
CachedLLMManager Objects
class CachedLLMManager()
Wrapper around LLMManager that adds response caching.
__init__
def __init__(llm_manager: LLMManager,
cache: Optional[LLMResponseCache] = None)
Initialize cached LLM manager.
Arguments:
- llm_manager - The underlying LLMManager instance
- cache - Optional cache instance (creates a new one if not provided)
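Wiring the wrapper around an existing manager might look like the sketch below. The LLMManager import path and its no-argument constructor are assumptions; omitting the cache argument would let the wrapper create its own LLMResponseCache.

```python
from spoon_ai.llm.cache import CachedLLMManager, LLMResponseCache
from spoon_ai.llm.manager import LLMManager  # assumed import path for LLMManager

llm_manager = LLMManager()  # assumed no-argument construction
cached_manager = CachedLLMManager(
    llm_manager,
    cache=LLMResponseCache(default_ttl=1800),  # 30-minute default TTL
)
```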
chat
async def chat(messages: List[Message],
provider: Optional[str] = None,
use_cache: bool = True,
cache_ttl: Optional[int] = None,
**kwargs) -> LLMResponse
Send chat request with caching support.
Arguments:
- messages - List of conversation messages
- provider - Specific provider to use (optional)
- use_cache - Whether to use the cache (default: True)
- cache_ttl - Custom TTL for this request (optional)
- **kwargs - Additional parameters
Returns:
LLMResponse - LLM response (from cache or API)
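A hedged usage sketch, continuing with cached_manager from above: the repeated, identical request should be served from the cache, and use_cache=False forces a fresh API call. The Message import path and fields are assumptions, and whether cache_ttl participates in the cache key is not specified, so it is kept identical across the repeated calls.

```python
import asyncio
from spoon_ai.schema import Message  # assumed import path for Message

async def main():
    messages = [Message(role="user", content="What does this cache do?")]

    # First call reaches the provider; the response is cached for 5 minutes.
    first = await cached_manager.chat(messages, provider="openai", cache_ttl=300)

    # An identical request within the TTL is answered from the cache.
    second = await cached_manager.chat(messages, provider="openai", cache_ttl=300)

    # Bypass the cache entirely for this request.
    fresh = await cached_manager.chat(messages, provider="openai", use_cache=False)

asyncio.run(main())
```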
chat_stream
async def chat_stream(messages: List[Message],
provider: Optional[str] = None,
callbacks: Optional[List] = None,
**kwargs)
Send streaming chat request (caching not supported for streaming).
Arguments:
- messages - List of conversation messages
- provider - Specific provider to use (optional)
- callbacks - Optional callback handlers
- **kwargs - Additional parameters
Yields:
LLMResponseChunk - Streaming response chunks
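Because streaming bypasses the cache, the call is simply consumed as an async generator. The chunk attribute name below is an assumption about LLMResponseChunk, so the sketch falls back to printing the chunk itself:

```python
async def stream_answer():
    messages = [Message(role="user", content="Explain response caching briefly.")]

    async for chunk in cached_manager.chat_stream(messages, provider="openai"):
        # "content" is an assumed attribute name on LLMResponseChunk.
        print(getattr(chunk, "content", chunk), end="", flush=True)
```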
clear_cache
def clear_cache() -> None
Clear the response cache.
get_cache_stats
def get_cache_stats() -> Dict[str, Any]
Get cache statistics.
Returns:
Dict[str, Any]: Cache statistics
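The wrapper's housekeeping helpers mirror the underlying cache; size and max_size are the only keys documented above, so they are read defensively:

```python
stats = cached_manager.get_cache_stats()
print(f"cached entries: {stats.get('size')} / {stats.get('max_size')}")

cached_manager.clear_cache()  # start over with an empty cache
```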