Available Components
- InMemory Cache: simple in-memory caching of LLM responses within a single session
- InMemory Embedding Cache: in-memory caching of embeddings within a single session
- Momento Cache: distributed, serverless caching of LLM responses using the Momento service
- Redis Cache: caching of LLM responses in Redis, suited to multi-process or multi-server setups
- Redis Embeddings Cache: caching of embeddings in Redis for embedding-heavy applications
- Upstash Redis Cache: serverless Redis caching of LLM responses, ideal for edge computing and serverless environments
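All of these components share the same core idea: key a cached response on the prompt (and any model parameters that affect the output), return the stored result on a hit, and fall through to the model on a miss. A minimal sketch of a single-session in-memory LLM response cache, with hypothetical class and method names chosen for illustration rather than taken from any particular library:

```python
import hashlib


class InMemoryLLMCache:
    """Minimal sketch of an in-memory cache for LLM responses.

    The key combines the prompt with model parameters so that the same
    prompt sent with different settings does not collide. Illustrative
    only; real caching components expose richer, backend-specific APIs.
    """

    def __init__(self):
        self._store = {}  # lives only for the current process/session

    def _key(self, prompt, model, temperature):
        raw = f"{model}|{temperature}|{prompt}".encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def lookup(self, prompt, model, temperature=0.0):
        """Return a cached response, or None on a miss."""
        return self._store.get(self._key(prompt, model, temperature))

    def update(self, prompt, response, model, temperature=0.0):
        """Store a response under the derived key."""
        self._store[self._key(prompt, model, temperature)] = response


cache = InMemoryLLMCache()
if cache.lookup("What is caching?", "demo-model") is None:
    # Simulate an expensive LLM call, then store the result for reuse.
    response = "Caching stores results of expensive calls for reuse."
    cache.update("What is caching?", response, "demo-model")
print(cache.lookup("What is caching?", "demo-model"))
```

The distributed variants (Momento, Redis, Upstash Redis) follow the same lookup/update shape but store entries in an external service instead of a local dict, which is what allows the cache to be shared across processes, servers, or serverless invocations.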
Use Cases:
- Improving response times for frequently asked questions or similar queries
- Reducing API costs by minimizing redundant LLM calls or embedding generations
- Enhancing user experience in chatbots or AI assistants with quicker responses
- Optimizing performance in scenarios with repetitive queries or embedding requests
- Sharing cache across multiple processes, servers, or serverless function invocations
- Implementing efficient caching in distributed, edge, or serverless computing environments
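The cost-reduction use case is easiest to see with embeddings, where the same texts are often re-embedded many times. A small sketch, using a hypothetical wrapper class and a stand-in embedding function (not any specific library's API), that memoizes embeddings by input text:

```python
class CachedEmbedder:
    """Sketch of an embedding cache: wraps any embedding function and
    memoizes results by input text, so each distinct text is embedded
    only once per session. Names here are illustrative."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._cache = {}
        self.backend_calls = 0  # counts how often the real embedder ran

    def embed(self, text):
        if text not in self._cache:
            self.backend_calls += 1
            self._cache[text] = self._embed_fn(text)
        return self._cache[text]


# Stand-in for a paid embedding API: a toy deterministic "embedding".
def fake_embed(text):
    return [float(ord(c)) for c in text[:4]]


embedder = CachedEmbedder(fake_embed)
for text in ["hello", "world", "hello", "hello"]:
    embedder.embed(text)
print(embedder.backend_calls)  # 2 backend calls for 4 requests
```

Swapping the dict for a Redis or Upstash-backed store gives the shared, multi-process behavior described above without changing the calling code.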