Python Application Development: Building Apps with the Streamlit Python Package in 30 Days (15): Optimizing Performance and Adding State to Your App

此星光明

Published on 2024-07-14 16:00:00

Category: AI. Tags: learning, python, web, visual application interface, app

Article link: https://blog.csdn.net/qq_31988139/article/details/140107343

Caching and state

Optimize performance and add state to your app!

Caching


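Streamlit provides powerful caching primitives for data and global resources. They keep your app performant even when it loads data from the web, processes large datasets, or performs expensive computations.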

This page only covers the st.cache_data API. For a deeper look at caching and how to use it, see the Caching guide in the Streamlit documentation.

st.cache_data

A decorator for caching functions that return data (e.g. dataframe transforms, database queries, ML inference).

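Cached objects are stored in "pickled" form, which means that the return value of a cached function must be pickleable. Each caller of the cached function gets its own copy of the cached data.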
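To make the "own copy" behaviour concrete, here is a minimal standard-library sketch. It is a simplified model and not Streamlit's implementation: the decorator name pickled_cache and the function load_rows are hypothetical, and the toy cache is keyed on positional arguments only.

```python
import functools
import pickle


def pickled_cache(func):
    """Toy cache: store pickled results, return a fresh copy on every call."""
    store = {}  # maps the argument tuple to the pickled return value

    @functools.wraps(func)
    def wrapper(*args):
        if args not in store:
            # Cache miss: run the function and keep only the pickled bytes.
            store[args] = pickle.dumps(func(*args))
        # Hit or miss, the caller receives its own unpickled copy.
        return pickle.loads(store[args])

    return wrapper


@pickled_cache
def load_rows(n):
    # Hypothetical data-producing function used only for illustration.
    return [{"id": i} for i in range(n)]


first = load_rows(3)
first.append({"id": 99})  # mutate this caller's copy only
second = load_rows(3)     # served from the cache, unaffected by the mutation
print(len(first), len(second))  # prints "4 3"
```

Because only the pickled bytes are stored, mutating one returned object cannot corrupt what later callers receive. The trade-off is that the return value has to be pickleable.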

You can clear a function's cache with func.clear(), or clear the entire cache with st.cache_data.clear().

To cache global resources, use st.cache_resource instead. For more information about caching, see https://docs.streamlit.io/develop/concepts/architecture/caching.

Function signature
st.cache_data(func=None, *, ttl, max_entries, show_spinner, persist, experimental_allow_widgets, hash_funcs=None)
Parameters
func (callable)	The function to cache. Streamlit hashes the function's source code.
ttl (float, timedelta, str, or None)	The maximum time to keep an entry in the cache. Can be one of: None if cache entries should never expire (default). A number specifying the time in seconds. A string specifying the time in a format supported by Pandas's Timedelta constructor, e.g. "1d", "1.5 days", or "1h23s". A timedelta object from Python's built-in datetime library, e.g. timedelta(days=1). Note that ttl will be ignored if persist="disk" or persist=True.
max_entries (int or None)	The maximum number of entries to keep in the cache, or None for an unbounded cache. When a new entry is added to a full cache, the oldest cached entry will be removed. Defaults to None.
show_spinner (bool or str)	Enable the spinner. Default is True to show a spinner when there is a "cache miss" and the cached data is being created. If string, value of show_spinner param will be used for spinner text.
persist ("disk", bool, or None)	Optional location to persist cached data to. Passing "disk" (or True) will persist the cached data to the local disk. None (or False) will disable persistence. The default is None.
experimental_allow_widgets (bool)	Deprecated: experimental_allow_widgets is deprecated and will be removed in a later version. Allow widgets to be used in the cached function. Defaults to False. Support for widgets in cached functions is currently experimental. Setting this parameter to True may lead to excessive memory use since the widget value is treated as an additional input parameter to the cache.
hash_funcs (dict or None)	Mapping of types or fully qualified names to hash functions. This is used to override the behavior of the hasher inside Streamlit's caching mechanism: when the hasher encounters an object, it will first check to see if its type matches a key in this dict and, if so, will use the provided function to generate a hash for it. See below for an example of how this can be used.
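The ttl semantics in the table can be illustrated with a small standard-library model. This is a sketch under stated assumptions, not Streamlit's code: TtlCache, ttl_seconds and get_or_compute are hypothetical names. It accepts only None, numbers and timedelta objects, because string values such as "1d" would need the Pandas Timedelta parser. An injected fake clock avoids real waiting.

```python
import time
from datetime import timedelta


def ttl_seconds(ttl):
    """Normalise a ttl given as None, a number, or a timedelta to seconds."""
    if ttl is None:
        return None  # entries never expire
    if isinstance(ttl, timedelta):
        return ttl.total_seconds()
    return float(ttl)


class TtlCache:
    """Toy cache whose entries expire `ttl` after they were stored."""

    def __init__(self, ttl=None, clock=time.monotonic):
        self.ttl = ttl_seconds(ttl)
        self.clock = clock
        self.entries = {}  # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and (self.ttl is None or now - hit[0] < self.ttl):
            return hit[1]  # entry is still fresh
        value = compute()  # missing or expired: recompute and store
        self.entries[key] = (now, value)
        return value


fake_now = [0.0]  # mutable fake clock so the example needs no real waiting
cache = TtlCache(ttl=timedelta(minutes=1), clock=lambda: fake_now[0])
runs = []


def compute():
    runs.append(fake_now[0])
    return len(runs)


a = cache.get_or_compute("k", compute)  # miss: computes
fake_now[0] = 30.0
b = cache.get_or_compute("k", compute)  # within 60 s: cached value reused
fake_now[0] = 61.0
c = cache.get_or_compute("k", compute)  # older than 60 s: recomputed
print(a, b, c)  # prints "1 1 2"
```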

Code

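import streamlit as st

@st.cache_data
def fetch_and_clean_data(url):
    # Fetch data from the URL here, then clean it up.
    # (`data` stands for the cleaned result; the fetching code is elided.)
    return data

# DATA_URL_1 and DATA_URL_2 are placeholder URLs defined elsewhere.
d1 = fetch_and_clean_data(DATA_URL_1)
# Actually executes the function, since this is the first time it is encountered.

d2 = fetch_and_clean_data(DATA_URL_1)
# Does not execute the function. Instead, returns the previously computed value.
# This means that the data in d1 is now the same as the data in d2.

d3 = fetch_and_clean_data(DATA_URL_2)
# This is a different URL, so the function executes.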

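This example uses the streamlit library to cache the result of a data-loading function. fetch_and_clean_data fetches data from the given URL and cleans it. The @st.cache_data decorator caches the return value, so later calls with the same argument return the previously computed value instead of running the function again.

The function is then called three times. The first call with DATA_URL_1 executes the function and stores the result in the cache. The second call with the same URL does not execute the function and returns the cached result. The third call passes a different URL, DATA_URL_2, so the function executes again to fetch the new data.

In short, repeated calls with the same argument reuse the stored result, while a new argument triggers a fresh execution.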

To set the persist parameter, use the decorator as follows:

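import streamlit as st

@st.cache_data(persist="disk")
def fetch_and_clean_data(url):
    # Fetch data from the URL here, then clean it up.
    # (`data` stands for the cleaned result; the fetching code is elided.)
    return data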

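This code defines fetch_and_clean_data with the @st.cache_data(persist="disk") decorator. The function's results are cached, and persist="disk" additionally persists the cached data to the local disk. Note that ttl is ignored when persist="disk" or persist=True.

The function fetches data from the given URL, cleans it, and returns the result. When it is called again with the same URL, it returns the cached result instead of fetching and cleaning the data again.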
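The idea behind disk persistence can be sketched with the standard library alone. This is a toy model and makes no claim about Streamlit's actual on-disk format: disk_cache and make_loader are hypothetical helpers, and the example.com URL is a placeholder that is never fetched.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def disk_cache(cache_dir):
    """Toy decorator: persist pickled results as files under `cache_dir`."""
    cache_dir = Path(cache_dir)

    def decorator(func):
        def wrapper(*args):
            # Derive a file name from the function name and its arguments.
            key = hashlib.sha256(pickle.dumps((func.__name__, args))).hexdigest()
            path = cache_dir / f"{key}.pickle"
            if path.exists():
                return pickle.loads(path.read_bytes())  # served from disk
            result = func(*args)
            path.write_bytes(pickle.dumps(result))
            return result
        return wrapper
    return decorator


calls = []
cache_dir = tempfile.mkdtemp()


def make_loader():
    # Each call simulates a fresh start with no in-memory cache state.
    @disk_cache(cache_dir)
    def fetch_and_clean_data(url):
        calls.append(url)  # record real executions
        return {"url": url, "rows": 3}
    return fetch_and_clean_data


before_restart = make_loader()("https://example.com/a.csv")
after_restart = make_loader()("https://example.com/a.csv")
print(len(calls))  # prints "1"
```

The second loader has no in-memory state, yet it finds the pickled file written by the first one. That is the property that lets a disk-persisted cache survive a restart.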

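By default, all parameters to a cached function must be hashable. Any parameter whose name begins with _ is not hashed. You can use this as an "escape hatch" for parameters that are not hashable: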

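import streamlit as st

@st.cache_data
def fetch_and_clean_data(_db_connection, num_rows):
    # Fetch data from _db_connection here, then clean it up.
    # (`data` stands for the cleaned result; the fetching code is elided.)
    return data

# make_database_connection is a placeholder for your own connection factory.
connection = make_database_connection()
d1 = fetch_and_clean_data(connection, num_rows=10)
# Actually executes the function, since this is the first time it is encountered.

another_connection = make_database_connection()
d2 = fetch_and_clean_data(another_connection, num_rows=10)
# Does not execute the function. Instead, returns the previously computed value,
# even though the _db_connection argument differs between the two calls.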

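This example shows how an underscore-prefixed parameter is excluded from hashing. The `@st.cache_data` decorator caches the result of `fetch_and_clean_data` so that later calls can reuse the computed value.

First, `make_database_connection` creates a database connection, `connection`, which is passed to `fetch_and_clean_data` together with `num_rows=10`. Because this is the first call, the function executes and returns the data `d1`.

Next, a second connection, `another_connection`, is created and passed to `fetch_and_clean_data`, again with `num_rows=10`. Because `_db_connection` starts with an underscore, it is not hashed, so only `num_rows` determines the cache lookup. The result for `num_rows=10` is already cached, so the function does not execute and the previously computed value is assigned to `d2`.

This avoids repeating the expensive fetch-and-clean step, but it also means that the cached result is returned even when the connection differs.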
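The underscore rule can be modelled in plain Python. This sketch assumes a simplified cache keyed on parameter names and values. It is not Streamlit's hashing code, and cache_skipping_underscored and FakeConnection are hypothetical names used only for illustration.

```python
import functools
import inspect


def cache_skipping_underscored(func):
    """Toy cache whose key ignores parameters named with a leading underscore."""
    signature = inspect.signature(func)
    store = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        # Only parameters without a leading underscore contribute to the key.
        key = tuple(
            (name, value)
            for name, value in bound.arguments.items()
            if not name.startswith("_")
        )
        if key not in store:
            store[key] = func(*args, **kwargs)
        return store[key]

    return wrapper


class FakeConnection:
    """Hypothetical stand-in for an unhashable database connection."""
    __hash__ = None  # make instances unhashable


executions = []


@cache_skipping_underscored
def fetch_and_clean_data(_db_connection, num_rows):
    executions.append(num_rows)  # record real executions
    return list(range(num_rows))


d1 = fetch_and_clean_data(FakeConnection(), num_rows=10)
d2 = fetch_and_clean_data(FakeConnection(), num_rows=10)  # cache hit
d3 = fetch_and_clean_data(FakeConnection(), num_rows=5)   # new key, executes
print(executions)  # prints "[10, 5]"
```

The two FakeConnection objects are different and unhashable, yet the second call is a cache hit because only num_rows enters the key.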

A cached function's cache can be cleared programmatically:

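import streamlit as st

@st.cache_data
def fetch_and_clean_data(_db_connection, num_rows):
    # Fetch data from _db_connection here, then clean it up.
    # (`data` stands for the cleaned result; the fetching code is elided.)
    return data

# `_db_connection` below stands for an existing connection object.
fetch_and_clean_data.clear(_db_connection, 50)
# Clears the cached entry for the provided arguments.

fetch_and_clean_data.clear()
# Clears all cached entries for this function.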

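This example shows how to clear cached data with Streamlit. First, the `@st.cache_data` decorator is applied to `fetch_and_clean_data`, a function that fetches data through a database connection, cleans it, and returns the result.

Next, `fetch_and_clean_data.clear(_db_connection, 50)` clears only the entry that was cached for a call with those arguments, that is, the given database connection and row count.

Finally, `fetch_and_clean_data.clear()` clears all cached entries for the function, regardless of the arguments used in earlier calls.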
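The two forms of clear() can be compared side by side in a small standard-library model. This is a sketch and not Streamlit's implementation: clearable_cache, cached_keys and square are hypothetical names, and the toy cache is keyed on positional arguments only.

```python
import functools


def clearable_cache(func):
    """Toy cache exposing clear() in the two forms described above."""
    store = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args not in store:
            store[args] = func(*args)
        return store[args]

    def clear(*args):
        if args:
            store.pop(args, None)  # drop only the entry for these arguments
        else:
            store.clear()          # drop every entry of this function

    wrapper.clear = clear
    wrapper.cached_keys = lambda: sorted(store)  # inspection helper for the demo
    return wrapper


@clearable_cache
def square(n):
    return n * n


square(2), square(3), square(4)
square.clear(3)
print(square.cached_keys())  # prints "[(2,), (4,)]"
square.clear()
print(square.cached_keys())  # prints "[]"
```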

This code shows how to use Strea