Common Ways to Save Data in Python

ballball~~

Last modified: 2024-07-18 14:30:03

First published: 2024-07-12 10:15:35

Article link: https://blog.csdn.net/m0_66890670/article/details/140359041

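Introduction: these are personal study notes. If you spot any mistakes, corrections are welcome.

Python offers many ways to save data, and the right choice depends on the data's format and intended use. Below are some common approaches, each with sample code: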

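1. Saving as a CSV file

Use the csv module from Python's standard library to create a CSV file and write rows to it:

import csv

# Create the CSV file and write to it
with open('example.csv', 'w', newline='', encoding='utf-8') as file: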
    writer = csv.writer(file)
    writer.writerow(["Name", "Age", "City"])
    writer.writerow(["Alice", 30, "New York"])
    writer.writerow(["Bob", 25, "Los Angeles"])

print("CSV file 'example.csv' created and written successfully.")
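
When each record is naturally a dictionary, csv.DictWriter may be more convenient than csv.writer. The sketch below is an illustration rather than part of the original example: the file name people.csv and the sample rows are assumptions. It writes a header plus rows, then reads them back with csv.DictReader to check the round trip.

```python
import csv

# Hypothetical sample records; the field names mirror the example above
rows = [
    {"Name": "Alice", "Age": 30, "City": "New York"},
    {"Name": "Bob", "Age": 25, "City": "Los Angeles"},
]

# Write the header and rows; newline='' prevents blank lines on Windows
with open("people.csv", "w", newline="", encoding="utf-8") as file:
    writer = csv.DictWriter(file, fieldnames=["Name", "Age", "City"])
    writer.writeheader()
    writer.writerows(rows)

# Read the file back; note that csv returns every value as a string
with open("people.csv", newline="", encoding="utf-8") as file:
    loaded = list(csv.DictReader(file))

print(loaded)
```

Because the csv module does not preserve types, numeric columns such as Age come back as strings and need to be converted by hand if they are used in calculations.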

Use the pandas library to write a DataFrame straight to a CSV file:

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# Save as a CSV file
df.to_csv('data.csv', index=False)

Create an array with numpy, then save it as a CSV file through pandas:

import numpy as np
import pandas as pd

# Create a sample array with numpy
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Convert the numpy array to a pandas DataFrame
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

# Save as a CSV file
df.to_csv('example.csv', index=False)

print("CSV file 'example.csv' created and saved successfully.")

2. Saving as an Excel file

Use the pandas library to save data as an Excel file:

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# Save as an Excel file (writing .xlsx requires an engine such as openpyxl)
df.to_excel('data.xlsx', index=False)

Create an array with numpy, then save it as an Excel file through pandas:

import numpy as np
import pandas as pd

# Create a sample array with numpy
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Convert the numpy array to a pandas DataFrame
df = pd.DataFrame(data, columns=['A', 'B', 'C'])

# Save as an Excel file
df.to_excel('example.xlsx', index=False)

print("Excel file 'example.xlsx' created and saved successfully.")

Use pandas to save different datasets to multiple worksheets in the same Excel file:

import pandas as pd

# Create two sample DataFrames
data1 = {'Name': ['Alice', 'Bob', 'Charlie'],
         'Age': [25, 30, 35],
         'City': ['New York', 'Los Angeles', 'Chicago']}
df1 = pd.DataFrame(data1)

data2 = {'Product': ['Product A', 'Product B', 'Product C'],
         'Price': [100, 150, 200],
         'Stock': [50, 30, 20]}
df2 = pd.DataFrame(data2)

# Save to an Excel file with one named worksheet per DataFrame
# (engine='xlsxwriter' requires the XlsxWriter package)
with pd.ExcelWriter('example_multiple_sheets.xlsx', engine='xlsxwriter') as writer:
    df1.to_excel(writer, sheet_name='People', index=False)
    df2.to_excel(writer, sheet_name='Products', index=False)

print("Data saved to 'example_multiple_sheets.xlsx' with multiple sheets successfully.")

Append new data to an existing Excel file and save it back to the same worksheet:

import pandas as pd

# Create a sample DataFrame first, so that there is a file to append to
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# Save as an Excel file
df.to_excel('example.xlsx', index=False)

# Read back the Excel file that was just created
file_path = './example.xlsx'
existing_df = pd.read_excel(file_path, sheet_name='Sheet1')

# Create a DataFrame with the new rows
new_data = {'Name': ['David', 'Eve', 'Frank'],
            'Age': [40, 22, 34],
            'City': ['San Francisco', 'Boston', 'Seattle']}
new_df = pd.DataFrame(new_data)

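# Append the new rows to the existing DataFrame
combined_df = pd.concat([existing_df, new_df], ignore_index=True)

# Write the combined DataFrame back to the same worksheet of the same file
# (the if_sheet_exists parameter requires pandas 1.3 or later)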
with pd.ExcelWriter(file_path, engine='openpyxl', mode='a', if_sheet_exists='replace') as writer:
    combined_df.to_excel(writer, sheet_name='Sheet1', index=False)

print(f"New data appended to '{file_path}' in the same sheet successfully.")

Add new data to an existing Excel file and save it to a new worksheet in that file:

import pandas as pd

# Create a sample DataFrame first, so that there is a file to append to
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# Save as an Excel file
df.to_excel('example.xlsx', index=False)

# Path to the existing Excel file
existing_file = './example.xlsx'

# Create a new DataFrame
new_data = {'Product': ['Product A', 'Product B', 'Product C'],
            'Price': [100, 150, 200],
            'Stock': [50, 30, 20]}
new_df = pd.DataFrame(new_data)

# Open the existing file with ExcelWriter in append mode and add a new worksheet
with pd.ExcelWriter(existing_file, engine='openpyxl', mode='a') as writer:
    new_df.to_excel(writer, sheet_name='NewSheet', index=False)

print(f"New data saved to '{existing_file}' in a new sheet named 'NewSheet'.")

3. Saving as an image file

Pillow is a fork of the Python Imaging Library (PIL) and is widely used for processing and saving images.

from PIL import Image

# Create a sample image (100x100 pixels, solid red)
image = Image.new('RGB', (100, 100), color='red')

# Save as a JPEG file
image.save('example.jpg', 'JPEG')

# Save as a PNG file
image.save('example.png', 'PNG')

print("Images saved as 'example.jpg' and 'example.png'")

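OpenCV is an open-source computer vision library that is widely used for image and video processing.

import cv2
import numpy as np

# Create a sample image (100x100 pixels, solid red)
image = np.zeros((100, 100, 3), dtype=np.uint8)
image[:] = (0, 0, 255)  # OpenCV uses BGR channel order, so this is red

# Save as a JPEG file
cv2.imwrite('example.jpg', image)

# Save as a PNG file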
cv2.imwrite('example.png', image)

print("Images saved as 'example.jpg' and 'example.png'")

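Matplotlib is mainly a plotting library, but it can also be used to save images.

import matplotlib.pyplot as plt
import numpy as np

# Create a sample image (a 100x100 array of random values)
image = np.random.rand(100, 100)

# Draw the image on the current axes
plt.imshow(image, cmap='gray')
plt.axis('off')  # Hide the axes

# Save as a JPEG file (the output size depends on the figure size and DPI, not the array shape)
plt.savefig('example.jpg', bbox_inches='tight', pad_inches=0)

# Save as a PNG file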
plt.savefig('example.png', bbox_inches='tight', pad_inches=0)

print("Images saved as 'example.jpg' and 'example.png'")

imageio is a Python library for reading and writing image files.

import imageio
import numpy as np

# Create a sample image (100x100 pixels, solid red)
image = np.zeros((100, 100, 3), dtype=np.uint8)
image[:] = (255, 0, 0)  # RGB channel order (red)

# Save as a JPEG file
imageio.imwrite('example.jpg', image)

# Save as a PNG file
imageio.imwrite('example.png', image)

print("Images saved as 'example.jpg' and 'example.png'")

scikit-image is a Python library for image processing.

from skimage import io
import numpy as np

# Create a sample image (100x100 pixels, solid red)
image = np.zeros((100, 100, 3), dtype=np.uint8)
image[:] = (255, 0, 0)  # RGB channel order (red)

# Save as a JPEG file
io.imsave('example.jpg', image)

# Save as a PNG file
io.imsave('example.png', image)

print("Images saved as 'example.jpg' and 'example.png'")

4. Saving as a text file

Use Python's built-in open() function to save data as a text file:

# Save as a text file
data = ["This is the first line.", "This is the second line.", "This is the third line."]

with open('data.txt', 'w', encoding='utf-8') as file:
    for line in data:
        file.write(line + '\n')
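
The example above opens the file in 'w' mode, which replaces any existing content. To add to a file instead, 'a' (append) mode can be used. The following is a minimal sketch under that assumption; the file name notes.txt is hypothetical and chosen so that data.txt is left alone.

```python
# Hypothetical file name, chosen so the example does not touch data.txt
path = "notes.txt"

# 'w' truncates the file; an explicit encoding avoids platform defaults
with open(path, "w", encoding="utf-8") as file:
    file.write("first line\n")

# 'a' appends to the end of the file instead of overwriting it
with open(path, "a", encoding="utf-8") as file:
    file.write("second line\n")

# Read everything back as a list of lines without trailing newlines
with open(path, encoding="utf-8") as file:
    lines = file.read().splitlines()

print(lines)
```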

numpy provides np.savetxt for saving an array as a text file.

import numpy as np

# Create a sample numpy array
data = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Save as a text file
np.savetxt('data_numpy.txt', data, delimiter=',', fmt='%d')

print("Numpy array saved to 'data_numpy.txt' successfully.")

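pandas provides DataFrame.to_csv for saving a DataFrame as a text file. Despite the name to_csv, parameters such as sep let it write other delimited text formats, for example tab-separated values.

import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 4, 7],
        'B': [2, 5, 8],
        'C': [3, 6, 9]}
df = pd.DataFrame(data)

# Save as a text file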
df.to_csv('data_pandas.txt', sep=',', index=False, header=True)

print("Pandas DataFrame saved to 'data_pandas.txt' successfully.")

5. Saving as a JSON file

Use the json module to create a JSON file:

import json

# Create a sample dictionary
data = {'Name': 'Alice', 'Age': 30, 'City': 'New York'}

# Create the JSON file and write to it
with open('example.json', 'w') as file:
    json.dump(data, file)

print("JSON file 'example.json' created and written successfully.")
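
By default json.dump escapes non-ASCII characters and writes everything on one line. The sketch below shows two commonly used options, ensure_ascii=False and indent, and loads the file back to confirm the round trip. The file name example_pretty.json and the sample values are assumptions made for illustration.

```python
import json

# Hypothetical record containing non-ASCII characters
data = {"Name": "Zoë", "Age": 30, "City": "Zürich"}

# ensure_ascii=False keeps non-ASCII characters readable in the file;
# indent=2 pretty-prints the output across several lines
with open("example_pretty.json", "w", encoding="utf-8") as file:
    json.dump(data, file, ensure_ascii=False, indent=2)

# Load the file back into a Python dictionary
with open("example_pretty.json", encoding="utf-8") as file:
    loaded = json.load(file)

print(loaded == data)
```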

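Use the pandas library to save data as a JSON file:

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# Save as JSON; with lines=True each row is written as one JSON object per line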
df.to_json('data.json', orient='records', lines=True)
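
With orient='records' and lines=True, pandas writes one JSON object per line (the JSON Lines layout) rather than a single JSON array, so the file has to be parsed line by line. The following standard-library sketch illustrates that layout; the file name records.jsonl and the sample records are assumptions, and it does not use pandas.

```python
import json

# Hypothetical records, shaped like the DataFrame rows above
records = [
    {"Name": "Alice", "Age": 25, "City": "New York"},
    {"Name": "Bob", "Age": 30, "City": "Los Angeles"},
]

# Write one JSON object per line (the JSON Lines layout)
with open("records.jsonl", "w", encoding="utf-8") as file:
    for record in records:
        file.write(json.dumps(record) + "\n")

# Parse the file line by line; json.load on the whole file would fail
# because the file is not a single JSON document
with open("records.jsonl", encoding="utf-8") as file:
    loaded = [json.loads(line) for line in file if line.strip()]

print(loaded == records)
```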

6. Saving to a SQL database

sqlite3 is a module in Python's standard library for working with SQLite databases.

import sqlite3
import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Connect to the SQLite database (the file is created if it does not exist)
conn = sqlite3.connect('example.db')

# Save the DataFrame to a table in the database
df.to_sql('people', conn, if_exists='replace', index=False)

# Close the database connection
conn.close()

print("Data saved to SQLite database 'example.db' successfully.")
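
sqlite3 can also be used on its own, without pandas. The sketch below is one possible way to do that: it creates a table, inserts rows with parameterized queries, and reads them back. The database file name people_plain.db, the table layout, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical rows and database file name
rows = [("Alice", 25, "New York"), ("Bob", 30, "Los Angeles")]

conn = sqlite3.connect("people_plain.db")
try:
    # Using the connection as a context manager commits on success
    # and rolls back if an exception is raised
    with conn:
        conn.execute("DROP TABLE IF EXISTS people")
        conn.execute("CREATE TABLE people (Name TEXT, Age INTEGER, City TEXT)")
        # '?' placeholders let sqlite3 handle quoting of the values
        conn.executemany("INSERT INTO people VALUES (?, ?, ?)", rows)

    # Read the rows back, ordered by age
    loaded = conn.execute(
        "SELECT Name, Age, City FROM people ORDER BY Age"
    ).fetchall()
finally:
    conn.close()

print(loaded)
```

Placeholders are generally preferable to building SQL strings by hand, since the values are passed separately from the statement.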

SQLAlchemy is a SQL toolkit and object-relational mapping (ORM) library that supports many databases.

from sqlalchemy import create_engine
import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Create a SQLAlchemy engine (connected to a SQLite database)
engine = create_engine('sqlite:///example.db')

# Save the DataFrame to a table in the database
df.to_sql('people', engine, if_exists='replace', index=False)

print("Data saved to SQLite database 'example.db' successfully using SQLAlchemy.")

Combining pandas with SQLAlchemy makes it easy to save data to many kinds of SQL databases.

To save data to a MySQL database, combine SQLAlchemy with a driver such as pymysql or mysql-connector-python.

import pandas as pd
from sqlalchemy import create_engine

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}
df = pd.DataFrame(data)

# Create a SQLAlchemy engine (connected to a MySQL database)
# Replace the username, password, and database name with your own
engine = create_engine('mysql+pymysql://username:password@localhost/database_name')

# Save the DataFrame to a table in the database
df.to_sql('people', engine, if_exists='replace', index=False)

print("Data saved to MySQL database successfully using SQLAlchemy.")

7. Saving as a binary file

Use Python's built-in open() function in binary mode to create a binary file:

# Create the binary file and write to it
data = bytes([1, 2, 3, 4, 5])

with open('example.bin', 'wb') as file:
    file.write(data)

print("Binary file 'example.bin' created and written successfully.")
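
Raw bytes are rarely the whole story; often the goal is to store typed values such as integers and floats. One way to do this with the standard library is the struct module. The sketch below packs a small record into a fixed binary layout and unpacks it again. The file name record.bin and the record layout are assumptions made for illustration.

```python
import struct

# Hypothetical record: an integer id, a float reading, and a 5-byte tag
record = (7, 3.5, b"hello")

# '<'  = little-endian with no padding
# 'i'  = 4-byte signed integer
# 'd'  = 8-byte float
# '5s' = 5-byte string
layout = "<id5s"

# Pack the values into bytes and write them to the file
with open("record.bin", "wb") as file:
    file.write(struct.pack(layout, *record))

# Read the bytes back and unpack them using the same layout
with open("record.bin", "rb") as file:
    raw = file.read()

loaded = struct.unpack(layout, raw)
print(len(raw), loaded)
```

The reader must know the layout string in advance, because the file itself stores only the bytes and no type information.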

Use the numpy library to save data as a binary file:

import numpy as np

# Create a sample array
data = np.array([1, 2, 3, 4, 5])

# Save in numpy's binary .npy format
np.save('data.npy', data)

# Read the binary file back
loaded_data = np.load('data.npy')
print(loaded_data)
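
np.save stores a single array per file. When several arrays belong together, np.savez can bundle them into one .npz archive. The following is a short sketch of that approach; the file name arrays.npz and the array names features and labels are assumptions made for illustration.

```python
import numpy as np

# Hypothetical arrays that should be stored together
features = np.arange(6).reshape(2, 3)
labels = np.array([0, 1])

# The keyword names become the keys used to look the arrays up later
np.savez("arrays.npz", features=features, labels=labels)

# np.load on an .npz file returns a dict-like archive object;
# the with statement closes the underlying file afterwards
with np.load("arrays.npz") as archive:
    loaded_features = archive["features"]
    loaded_labels = archive["labels"]

print(loaded_features.shape, loaded_labels)
```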

8. Saving as a Pickle file

Use the pickle module to serialize data and save it to a file:

import pickle

# Create a sample dictionary
data = {'Name': 'Alice', 'Age': 25, 'City': 'New York'}

# Save as a Pickle file
with open('data.pkl', 'wb') as file:
    pickle.dump(data, file)

# Read the Pickle file back
with open('data.pkl', 'rb') as file:
    loaded_data = pickle.load(file)
print(loaded_data)
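
Unlike JSON, pickle can store Python-specific types such as sets, tuples, and date objects without conversion. The sketch below illustrates this with a hypothetical record and file name (data_rich.pkl). Note that unpickling can execute arbitrary code, so pickle.load should only be used on files from a trusted source.

```python
import pickle
from datetime import date

# Hypothetical record with types that JSON cannot store directly
data = {
    "tags": {"python", "data"},     # set
    "shape": (3, 4),                # tuple
    "created": date(2024, 7, 12),   # date object
}

with open("data_rich.pkl", "wb") as file:
    # The highest protocol is usually the most compact, but files written
    # with it may not be readable by older Python versions
    pickle.dump(data, file, protocol=pickle.HIGHEST_PROTOCOL)

with open("data_rich.pkl", "rb") as file:
    loaded = pickle.load(file)

print(loaded == data)
```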

9. Saving as a Parquet file

Use the pandas library to save data as a Parquet file:

import pandas as pd

# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 35],
        'City': ['New York', 'Los Angeles', 'Chicago']}

df = pd.DataFrame(data)

# Save as a Parquet file (requires the pyarrow or fastparquet package)
df.to_parquet('data.parquet')

The end~~~