Article link: https://blog.csdn.net/aesgga/article/details/144414025

# Mastering AI Caching: How to Cache Large Language Model Call Results

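## Introduction

When working with large language models (LLMs), response time can become a bottleneck. Caching the results of LLM calls lets a repeated, identical request return without another round trip to the model, which cuts latency for that request. This article covers several ways to cache LLM results and provides practical code examples.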

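## Main Content

### 1. In-Memory Cache

An in-memory cache lives in the memory of the running application. It is very fast and suited to short-term caching, but its contents are lost when the process exits.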

```python
from langchain_community.cache import InMemoryCache
from langchain.globals import set_llm_cache

# Register a process-wide cache held in memory.
set_llm_cache(InMemoryCache())
```
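
To make the idea concrete, here is a minimal sketch of what an in-memory cache does conceptually: a dictionary keyed on the prompt plus a string describing the model configuration. The class `TinyPromptCache` and the `llm_config` strings are hypothetical and only illustrate the principle; this is not LangChain's actual implementation.

```python
from typing import Dict, Optional, Tuple


class TinyPromptCache:
    """Hypothetical stand-in for an in-memory LLM cache (not LangChain's class)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], str] = {}

    def lookup(self, prompt: str, llm_config: str) -> Optional[str]:
        # Return the cached completion, or None on a cache miss.
        return self._store.get((prompt, llm_config))

    def update(self, prompt: str, llm_config: str, completion: str) -> None:
        # Store the completion under the (prompt, model configuration) key.
        self._store[(prompt, llm_config)] = completion


cache = TinyPromptCache()
cache.update("Tell me a joke", "model=demo,temperature=0", "A cached joke")

# Same prompt and configuration: cache hit.
hit = cache.lookup("Tell me a joke", "model=demo,temperature=0")
# Same prompt, different configuration: cache miss.
miss = cache.lookup("Tell me a joke", "model=demo,temperature=1")
print(hit, miss)
```

Keying on the configuration as well as the prompt keeps a result produced with one set of model settings from being returned for a request made with different settings.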

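### 2. SQLite Cache

The SQLite cache persists data to disk, so cached results remain available after the application restarts.

```python
from langchain.globals import set_llm_cache
from langchain_community.cache import SQLiteCache

# Cached results are written to a local SQLite database file.
set_llm_cache(SQLiteCache(database_path=".langchain.db"))
```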
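
The persistence property can be illustrated with Python's standard `sqlite3` module alone. The sketch below is an assumption-laden illustration, not LangChain's schema: the table name `llm_cache`, its columns, and the temporary file path are all hypothetical. It writes one entry, closes the connection to simulate a restart, and reads the entry back through a fresh connection.

```python
import os
import sqlite3
import tempfile

# Hypothetical file and table names, used only for this illustration.
db_path = os.path.join(tempfile.mkdtemp(), "demo_cache.db")


def open_cache(path: str) -> sqlite3.Connection:
    # Open the database file and make sure the cache table exists.
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS llm_cache ("
        "prompt TEXT, llm_config TEXT, completion TEXT, "
        "PRIMARY KEY (prompt, llm_config))"
    )
    return conn


# First "run": store one completion, then close the connection.
conn = open_cache(db_path)
conn.execute(
    "INSERT OR REPLACE INTO llm_cache VALUES (?, ?, ?)",
    ("Tell me a joke", "model=demo", "A cached joke"),
)
conn.commit()
conn.close()

# Second "run": a fresh connection still finds the entry on disk.
conn = open_cache(db_path)
row = conn.execute(
    "SELECT completion FROM llm_cache WHERE prompt = ? AND llm_config = ?",
    ("Tell me a joke", "model=demo"),
).fetchone()
conn.close()
print(row)
```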

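### 3. Redis Cache

Redis is a high-performance, scalable in-memory data store. It suits applications that need a cache shared across processes or machines, with optional persistence.

```python
from langchain.globals import set_llm_cache
from langchain_community.cache import RedisCache
from redis import Redis

# Redis() with no arguments connects to localhost:6379.
set_llm_cache(RedisCache(redis_=Redis()))
```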
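
A cache shared by several processes needs keys that every process computes identically. Python's built-in `hash()` is randomized per process for strings, so a stable digest is the usual choice. The sketch below shows one possible scheme; the `cache_key` helper and the `llm-cache:` prefix are hypothetical and do not describe how `RedisCache` builds its keys internally.

```python
import hashlib


def cache_key(prompt: str, llm_config: str) -> str:
    # Hash both parts so the key has a fixed length and is the same in every process.
    digest = hashlib.sha256(f"{llm_config}\n{prompt}".encode("utf-8")).hexdigest()
    return f"llm-cache:{digest}"


key_a = cache_key("Tell me a joke", "model=demo")
key_b = cache_key("Tell me a joke", "model=demo")
key_c = cache_key("Tell me a joke", "model=other")

# Identical inputs give identical keys; a different configuration gives a different key.
print(key_a == key_b, key_a == key_c)
```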

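### 4. Upstash Redis Cache

Upstash offers a serverless Redis that is accessed over an HTTP API, which suits cases that call for quick setup and low operational overhead.

```python
from langchain.globals import set_llm_cache
from langchain_community.cache import UpstashRedisCache
from upstash_redis import Redis

# Replace the placeholders with the REST URL and token of your Upstash database.
set_llm_cache(UpstashRedisCache(redis_=Redis(url="{UPSTASH_REDIS_REST_URL}", token="{UPSTASH_REDIS_REST_TOKEN}")))
```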

## Code Example

The following example uses the SQLite cache to store the results of LLM calls:

```python
from langchain.globals import set_llm_cache
from langchain_community.cache import SQLiteCache
from langchain_openai import OpenAI

# Optionally point the client at an API proxy for more stable access (not configured here).
llm = OpenAI(model="gpt-3.5-turbo-instruct", n=2, best_of=2)

set_llm_cache(SQLiteCache(database_path=".langchain.db"))

# The first call is slower because the cache does not hold a result yet.
result = llm.invoke("Tell me a joke")
print(result)

# The second call with the same request returns quickly from the cache.
result_cached = llm.invoke("Tell me a joke")
print(result_cached)
```
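
To see the difference between the two calls without an API key, the effect can be reproduced with a stand-in model. In this sketch, `slow_fake_llm` is a hypothetical function that sleeps to simulate network latency, and the cache is a plain dictionary; the timings it prints depend on the machine and are illustrative only.

```python
import time
from typing import Dict, Tuple


def slow_fake_llm(prompt: str) -> str:
    # Stand-in for a network call to a model; sleeps to simulate latency.
    time.sleep(0.2)
    return f"response to: {prompt}"


_cache: Dict[str, str] = {}


def cached_call(prompt: str) -> str:
    # Call the slow function only on a cache miss.
    if prompt not in _cache:
        _cache[prompt] = slow_fake_llm(prompt)
    return _cache[prompt]


def timed(prompt: str) -> Tuple[str, float]:
    # Return the result together with the elapsed wall-clock time in seconds.
    start = time.perf_counter()
    result = cached_call(prompt)
    return result, time.perf_counter() - start


first_result, first_seconds = timed("Tell me a joke")
second_result, second_seconds = timed("Tell me a joke")
print(f"first: {first_seconds:.3f}s, second: {second_seconds:.3f}s")
```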