PandasAI - A Natural-Language Interactive Data Analysis Platform

富婆E

Category: AI Open-Source Projects. Tags: data analysis, data mining, natural language, PandasAI

Article link: https://blog.csdn.net/lovechris00/article/details/148290544

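I. About PandasAI

1. Project Overview

PandasAI is a Python platform that lets you ask questions about your data in natural language. It helps non-technical users interact with data more naturally, and it saves technical users time and effort when working with data.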

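2. Related Links

GitHub: https://github.com/sinaptik-ai/pandas-ai
Official documentation: https://pandas-ai.readthedocs.io/en/latest/
Colab: https://colab.research.google.com/drive/1ZnO-njhL7TBOYPZaqvMvGtsjckZKrv2E?usp=sharing
Demo video: https://github.com/sinaptik-ai/pandas-ai/raw/main/assets/demo.gif
Examples: https://github.com/sinaptik-ai/pandas-ai/tree/main/examples
Community support: Discord
License: MIT
Platform: https://app.pandabi.ai
Enterprise inquiries: https://getpanda.ai/pricing

3. Features

1. Natural-language queries
Query your data through simple conversational questions.

2. Chart generation
Generate data visualizations automatically.

3. Multi-table queries
Ask questions that join across multiple DataFrames.

4. Secure sandbox
Run generated code safely inside a Docker sandbox.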

II. Installation and Configuration

System Requirements

Python 3.8+ (<3.12)
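Before installing, it can help to confirm that the current interpreter falls in the supported range. The following is a minimal sketch, assuming the requirement means Python 3.8 or newer and older than 3.12; the helper name `is_supported` is illustrative and not part of PandasAI.

```python
import sys

# Assumed bounds: at least 3.8, strictly below 3.12
MIN_VERSION = (3, 8)
MAX_EXCLUSIVE = (3, 12)


def is_supported(version=None):
    """Return True if the (major, minor) version lies in the assumed range."""
    if version is None:
        version = sys.version_info
    return MIN_VERSION <= tuple(version[:2]) < MAX_EXCLUSIVE


# Report whether the running interpreter qualifies
print(is_supported())
```

If this prints `False`, creating a virtual environment with a supported interpreter before running the install commands avoids dependency resolution errors.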

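# Install with pip
pip install "pandasai>=3.0.0b2"

# Install with poetry
poetry add "pandasai>=3.0.0b2"

# Docker sandbox support package
pip install "pandasai-docker"

Enterprise Edition

The pandasai/ee directory is covered by a separate license; see the EE License for details.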

III. Usage Examples

1. Asking Questions

import pandasai as pai

# Sample DataFrame
df = pai.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "revenue": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})

# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key signing up at https://app.pandabi.ai (you can also configure it in your .env file)
pai.api_key.set("your-pai-api-key")

df.chat('Which are the top 5 countries by sales?')

China, United States, Japan, Germany, United Kingdom
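The comment in the example above notes that the API key can also be configured through a .env file instead of being hard-coded. A minimal sketch of reading the key from the environment follows. The variable name `PANDABI_API_KEY` and the helper `load_api_key` are assumptions for illustration; check the PandasAI documentation for the name your version expects.

```python
import os


def load_api_key(env_var="PANDABI_API_KEY"):
    """Read the API key from the environment (variable name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before calling PandasAI")
    return key


# Usage with the example above (requires pandasai and a real key):
#   pai.api_key.set(load_api_key())
```

Keeping the key out of source files means it does not end up in version control or shared notebooks.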

You can also ask more complex questions:

df.chat(
    "What is the total sales for the top 3 countries by sales?"
)

The total sales for the top 3 countries by sales is 16500
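Because the answers come from a language model, it is worth cross-checking them against the data now and then. The sketch below computes both questions with plain pandas on the same sample data. It is not the code PandasAI generates, only an independent check.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "revenue": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
})

# Top 5 countries by revenue, highest first
top5 = df.nlargest(5, "revenue")["country"].tolist()
print(top5)  # ['China', 'United States', 'Japan', 'Germany', 'United Kingdom']

# Combined revenue of the top 3 countries
top3_total = int(df.nlargest(3, "revenue")["revenue"].sum())
print(top3_total)  # 16500
```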

2. Charts

You can also ask PandasAI to generate charts for you:

df.chat(
    "Plot the histogram of countries showing for each one the revenue. Use different colors for each bar",
)

3. Working with Multiple DataFrames

You can also pass several DataFrames to PandasAI and ask questions about the relationships between them.

import pandasai as pai

employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}

salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}

employees_df = pai.DataFrame(employees_data)
salaries_df = pai.DataFrame(salaries_data)

# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key signing up at https://app.pandabi.ai (you can also configure it in your .env file)
pai.api_key.set("your-pai-api-key")

pai.chat("Who gets paid the most?", employees_df, salaries_df)

Olivia gets paid the most.
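To see what this question involves, the same result can be reproduced by hand with plain pandas: join the two tables on their shared `EmployeeID` column, then pick the row with the highest salary. This is a sketch for comparison only, not the code PandasAI produces internally.

```python
import pandas as pd

employees_df = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
})
salaries_df = pd.DataFrame({
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
})

# Join on the shared key, then locate the row with the largest salary
merged = employees_df.merge(salaries_df, on="EmployeeID")
top_name = merged.loc[merged["Salary"].idxmax(), "Name"]
print(top_name)  # Olivia
```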

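4. Docker Sandbox

You can run PandasAI in a Docker sandbox, an isolated environment that executes the generated code safely and reduces the risk of malicious attacks.

Python Requirements

pip install "pandasai-docker"

Usage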

import pandasai as pai
from pandasai_docker import DockerSandbox

# Initialize the sandbox
sandbox = DockerSandbox()
sandbox.start()

employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}

salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}

employees_df = pai.DataFrame(employees_data)
salaries_df = pai.DataFrame(salaries_data)

# By default, unless you choose a different LLM, it will use BambooLLM.
# You can get your free API key signing up at https://app.pandabi.ai (you can also configure it in your .env file)
pai.api_key.set("your-pai-api-key")

pai.chat("Who gets paid the most?", employees_df, salaries_df, sandbox=sandbox)

# Don't forget to stop the sandbox when done
sandbox.stop()
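In the example above, `sandbox.stop()` is skipped if the query raises an exception, which can leave a container running. One way to guard against that is a small context manager. The sketch below assumes only that the sandbox object exposes `start()` and `stop()` as used above; the helper `running` and the `FakeSandbox` stand-in are hypothetical, included so the sketch runs without Docker.

```python
from contextlib import contextmanager


@contextmanager
def running(sandbox):
    """Start the sandbox on entry and always stop it on exit."""
    sandbox.start()
    try:
        yield sandbox
    finally:
        sandbox.stop()


class FakeSandbox:
    """Stand-in for DockerSandbox that records calls instead of using Docker."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


fake = FakeSandbox()
try:
    with running(fake):
        raise ValueError("query failed")  # simulate a failing chat call
except ValueError:
    pass

print(fake.events)  # ['start', 'stop']
```

With the real class, the same pattern would read `with running(DockerSandbox()) as sandbox:` followed by `pai.chat(..., sandbox=sandbox)` inside the block.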