Python Pandas Usage Guide (Futures Market Data Analysis)
Installing the pandas module
pip3 install pandas
Importing the pandas module
Code:
import pandas as pd
Reading data from a CSV file
df = pd.read_csv(filename)
The 沪金2102.csv file
The first 21 lines of the file are shown below. Line 1 is the header row; lines 2-21 are data.
Date,Time,Open,Max,Min,Close,TradeNumber,HolderNumber,Average
2020/12/23,02:27,391.8600,391.8600,391.7000,391.7000,59.0000,50041.0000,391.8169
2020/12/23,02:28,391.7000,391.9200,391.6800,391.8400,75.0000,50000.0000,391.8222
2020/12/23,02:29,391.8800,391.9200,391.6000,391.6000,131.0000,49936.0000,391.7632
2020/12/23,09:00,391.3400,391.5800,391.2200,391.5200,689.0000,49902.0000,391.4023
2020/12/23,09:01,391.5200,391.8000,391.5000,391.6000,296.0000,49869.0000,391.5945
2020/12/23,09:02,391.5800,391.8600,391.5400,391.8200,430.0000,49827.0000,391.7182
2020/12/23,09:03,391.8000,391.9200,391.6600,391.7800,268.0000,49859.0000,391.7683
2020/12/23,09:04,391.7800,391.8600,391.6000,391.7000,211.0000,49833.0000,391.7079
2020/12/23,09:05,391.6800,391.7200,391.5400,391.6000,244.0000,49834.0000,391.6280
2020/12/23,09:06,391.6200,391.6200,391.5000,391.5800,188.0000,49842.0000,391.5675
2020/12/23,09:07,391.5600,391.6000,391.5000,391.5000,207.0000,49848.0000,391.5408
2020/12/23,09:08,391.5200,391.5200,391.4200,391.4600,169.0000,49851.0000,391.4600
2020/12/23,09:09,391.4400,391.4400,391.3400,391.4400,267.0000,49821.0000,391.3938
2020/12/23,09:10,391.4400,391.5600,391.3800,391.4800,181.0000,49825.0000,391.4754
2020/12/23,09:11,391.4800,391.6200,391.3400,391.6000,254.0000,49831.0000,391.4771
2020/12/23,09:12,391.6200,391.7400,391.5600,391.7400,134.0000,49826.0000,391.6423
2020/12/23,09:13,391.7200,391.7400,391.5600,391.5800,210.0000,49776.0000,391.6367
2020/12/23,09:14,391.5400,391.7000,391.5400,391.7000,86.0000,49766.0000,391.5933
2020/12/23,09:15,391.6800,391.9200,391.5800,391.7800,373.0000,49744.0000,391.7805
2020/12/23,09:16,391.7600,392.0200,391.7600,391.9800,513.0000,49802.0000,391.9118
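To experiment without the file on disk, the header and the first few data rows above can be pasted into a string and parsed through `io.StringIO`. This is only a minimal sketch: the names `sample` and `df_sample` are illustrative and do not appear in the original scripts.

```python
import io

import pandas as pd

# Header row plus the first three data rows from the listing above.
sample = """Date,Time,Open,Max,Min,Close,TradeNumber,HolderNumber,Average
2020/12/23,02:27,391.8600,391.8600,391.7000,391.7000,59.0000,50041.0000,391.8169
2020/12/23,02:28,391.7000,391.9200,391.6800,391.8400,75.0000,50000.0000,391.8222
2020/12/23,02:29,391.8800,391.9200,391.6000,391.6000,131.0000,49936.0000,391.7632
"""

# read_csv accepts a file-like object as well as a file path.
df_sample = pd.read_csv(io.StringIO(sample))

print(df_sample.shape)          # (number of rows, number of columns)
print(list(df_sample.columns))  # column names taken from the header row
print(df_sample.dtypes)         # Date and Time are read as text, the rest as floats
```

Checking `shape` and `dtypes` right after loading is a quick way to confirm that the header row was recognized and that the numeric columns were parsed as numbers.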
Futures1.py code
#!/usr/local/bin/python3
# -*- coding:utf-8 -*-
import os
import pandas as pd

def get_cwd():
    # Absolute directory of this script; abspath avoids an empty string
    # when the script is started from its own directory.
    return os.path.dirname(os.path.abspath(__file__))

def get_path():
    # Full path of this script, with symbolic links resolved.
    return os.path.realpath(__file__)

def read_data(filename, nrows=None):
    # nrows=None is the read_csv default and reads every row.
    try:
        return pd.read_csv(filename, nrows=nrows)
    except OSError:
        print("Cannot open file %s." % filename)
        return None

def print_data(df):
    print(df)
    print(df[['Date', 'Time', 'Open', 'Max', 'Min', 'Close', 'TradeNumber', 'HolderNumber']])

def main():
    filename = os.path.join(get_cwd(), '1分钟', '沪金2102.csv')
    df = read_data(filename, nrows=10)
    if df is not None:
        print_data(df)

if __name__ == '__main__':
    main()
Output:
Date Time Open ... TradeNumber HolderNumber Average
0 2020/12/23 02:27 391.86 ... 59.0 50041.0 391.8169
1 2020/12/23 02:28 391.70 ... 75.0 50000.0 391.8222
2 2020/12/23 02:29 391.88 ... 131.0 49936.0 391.7632
3 2020/12/23 09:00 391.34 ... 689.0 49902.0 391.4023
4 2020/12/23 09:01 391.52 ... 296.0 49869.0 391.5945
5 2020/12/23 09:02 391.58 ... 430.0 49827.0 391.7182
6 2020/12/23 09:03 391.80 ... 268.0 49859.0 391.7683
7 2020/12/23 09:04 391.78 ... 211.0 49833.0 391.7079
8 2020/12/23 09:05 391.68 ... 244.0 49834.0 391.6280
9 2020/12/23 09:06 391.62 ... 188.0 49842.0 391.5675
[10 rows x 9 columns]
Date Time Open Max Min Close TradeNumber HolderNumber
0 2020/12/23 02:27 391.86 391.86 391.70 391.70 59.0 50041.0
1 2020/12/23 02:28 391.70 391.92 391.68 391.84 75.0 50000.0
2 2020/12/23 02:29 391.88 391.92 391.60 391.60 131.0 49936.0
3 2020/12/23 09:00 391.34 391.58 391.22 391.52 689.0 49902.0
4 2020/12/23 09:01 391.52 391.80 391.50 391.60 296.0 49869.0
5 2020/12/23 09:02 391.58 391.86 391.54 391.82 430.0 49827.0
6 2020/12/23 09:03 391.80 391.92 391.66 391.78 268.0 49859.0
7 2020/12/23 09:04 391.78 391.86 391.60 391.70 211.0 49833.0
8 2020/12/23 09:05 391.68 391.72 391.54 391.60 244.0 49834.0
9 2020/12/23 09:06 391.62 391.62 391.50 391.58 188.0 49842.0
Reading all rows of the CSV file (from the file named by the filename variable)
df = pd.read_csv(filename)
Reading a specified number of rows from the CSV file
Read the first 10 data rows (rows 0 through 9):
df = pd.read_csv(filename, nrows = 10)
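The effect of `nrows` can be checked on a small in-memory sample. The sketch below uses a trimmed copy (three columns only) of five rows from the file; `csv_text`, `first_two`, and `everything` are illustrative names, not part of the original scripts.

```python
import io

import pandas as pd

# Trimmed copy of five data rows from the file (Date, Time, Close only).
csv_text = """Date,Time,Close
2020/12/23,09:00,391.5200
2020/12/23,09:01,391.6000
2020/12/23,09:02,391.8200
2020/12/23,09:03,391.7800
2020/12/23,09:04,391.7000
"""

first_two = pd.read_csv(io.StringIO(csv_text), nrows=2)  # data rows 0 and 1
everything = pd.read_csv(io.StringIO(csv_text))          # all five data rows

print(len(first_two), len(everything))
print(first_two.index.tolist())
```

The header row does not count toward `nrows`; the limit applies to data rows only.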
Printing all the data in df
print(df)
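In the Futures1.py output, the first `print(df)` replaces the middle columns with `...` because pandas shortens wide frames to fit the display. Below is a minimal sketch of two ways to see every column, assuming a one-row frame built from the first data row of the file; `df_demo` and `full_text` are illustrative names.

```python
import pandas as pd

# One-row frame with the same nine column names as the CSV file
# (values taken from its first data row).
row = {
    "Date": "2020/12/23", "Time": "02:27", "Open": 391.86, "Max": 391.86,
    "Min": 391.70, "Close": 391.70, "TradeNumber": 59.0,
    "HolderNumber": 50041.0, "Average": 391.8169,
}
df_demo = pd.DataFrame([row])

# Temporarily remove the column limit and widen the line,
# so that no column is replaced by "...".
with pd.option_context("display.max_columns", None, "display.width", 200):
    print(df_demo)

# to_string() renders every row and column without truncation.
full_text = df_demo.to_string()
print(full_text)
```

`option_context` restores the previous settings when the `with` block ends, so the rest of the script keeps the default display behavior.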
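Referencing columns
Referencing a single column
A DataFrame object uses square brackets [ ] to reference column data. To reference the 'Date' column: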
df['Date']
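A single-column reference returns a pandas `Series` rather than a `DataFrame`. The following sketch illustrates this on a two-row frame copied from the sample data; `df_demo` and `date_col` are illustrative names.

```python
import pandas as pd

# Two rows copied from the sample data, limited to three columns.
df_demo = pd.DataFrame({
    "Date": ["2020/12/23", "2020/12/23"],
    "Time": ["02:27", "02:28"],
    "Close": [391.70, 391.84],
})

date_col = df_demo["Date"]      # one column name in brackets gives a Series
print(type(date_col).__name__)  # Series
print(date_col.tolist())

# Series methods can be called directly on the selected column.
print(df_demo["Close"].max())
```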
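Referencing multiple columns
To reference the 'Date' and 'Time' columns, pass the column names as a list [ ]; this is why two pairs of square brackets appear in a row.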
df[['Date', 'Time']]
The following code prints the Date, Time, Open, Max, Min, Close, TradeNumber, and HolderNumber columns; the Average column is not printed.
print(df[['Date', 'Time', 'Open', 'Max', 'Min', 'Close', 'TradeNumber', 'HolderNumber']])
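Selecting with a list of names returns a new `DataFrame` containing only those columns. When the goal is to leave out a single column such as Average, `DataFrame.drop` is an alternative worth knowing. The sketch below compares the two on a one-row frame; `df_demo`, `selected`, and `dropped` are illustrative names.

```python
import pandas as pd

# One row from the sample data, limited to four columns.
df_demo = pd.DataFrame({
    "Date": ["2020/12/23"],
    "Time": ["02:27"],
    "Close": [391.70],
    "Average": [391.8169],
})

selected = df_demo[["Date", "Time", "Close"]]  # list of names gives a DataFrame
dropped = df_demo.drop(columns=["Average"])    # same result, stated by exclusion

print(type(selected).__name__)
print(list(selected.columns))
print(selected.equals(dropped))
```

Listing the columns explicitly also fixes their order, while `drop` keeps whatever order the original frame had.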
Futures2.py code
Only the print_data and main functions are listed below (the parts that differ from Futures1.py).
def print_data(df):
print(df[: