Stock Data Processing Simulation
1. Data Acquisition
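!pip install tushare
import tushare as ts
# Fetch the daily trading history of stock 600848
stockData = ts.get_hist_data('600848')
# Volume on the days when the close was below the open
stockData.volume[stockData.open > stockData.close]
help(ts.get_hist_data)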
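Help on function get_hist_data in module tushare.stock.trading:

get_hist_data(code=None, start=None, end=None, ktype='D', retry_count=3, pause=0.001)
    Get the historical trading records of a single stock
    Parameters
    ------
      code:string
          stock code, e.g. 600848
      start:string
          start date, format: YYYY-MM-DD; if empty, data starts from the earliest date the API provides
      end:string
          end date, format: YYYY-MM-DD; if empty, data ends at the most recent trading day
      ktype:string
          data type, D=daily K-line W=weekly M=monthly 5=5 minutes 15=15 minutes 30=30 minutes 60=60 minutes, default D
      retry_count : int, default 3
          number of retries when network or similar problems occur
      pause : int, default 0
          seconds to pause between repeated requests, to avoid problems caused by requests that are too close together
    return
    -------
      DataFrame
          Columns: date, open, high, close, low, volume, price change, percent change, 5-day average price, 10-day average price, 20-day average price, 5-day average volume, 10-day average volume, 20-day average volume, turnover rate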
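The parameters documented above can be combined to request a bounded date range at a chosen frequency. The following is only a sketch: `fetch_history` and its `fetch` argument are hypothetical names introduced here, not part of tushare, and the checks assume the `YYYY-MM-DD` format and the `ktype` values listed in the help text. Passing a stand-in for `fetch` lets the wrapper be tried without network access; when it is left as `None`, the call falls through to `ts.get_hist_data`.

```python
from datetime import datetime

# ktype values listed in the help text above
VALID_KTYPES = {'D', 'W', 'M', '5', '15', '30', '60'}


def fetch_history(code, start=None, end=None, ktype='D', fetch=None):
    """Hypothetical wrapper: validate the arguments, then request the data."""
    if ktype not in VALID_KTYPES:
        raise ValueError(f"unsupported ktype: {ktype!r}")
    parsed = {}
    for label, value in (('start', start), ('end', end)):
        if value is not None:
            # Raises ValueError if the string does not parse as year-month-day
            parsed[label] = datetime.strptime(value, '%Y-%m-%d')
    if len(parsed) == 2 and parsed['start'] > parsed['end']:
        raise ValueError("start must not be later than end")
    if fetch is None:
        # Imported lazily so the validation can be exercised offline
        import tushare as ts
        fetch = ts.get_hist_data
    return fetch(code, start=start, end=end, ktype=ktype)
```

With tushare installed, a call such as `fetch_history('600848', start='2020-01-01', end='2020-03-31', ktype='W')` would ask for weekly bars over the first quarter of 2020. Whether the remote service still answers that request is not checked here.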
import tushare as ts
import pandas as pd
stockList = ['600848','600547']
stockDataRaw = []
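for stockCode in stockList:
    # Download the full daily history for this stock
    tmpData = ts.get_hist_data(stockCode)
    # Tag every row with its stock code so the frames stay distinguishable
    tmpData['code'] = stockCode
    stockDataRaw.append(tmpData)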
stockDataRaw[1].head()
open high close low volume price_change p_change ma5 ma10 ma20 v_ma5 v_ma10 v_ma20 code
date
2020-04-03 34.30 35.10 34.32 34.13 450174.44 0.42 1.24 34.316 33.815 32.911 476105.19 593758.48 558680.44 600547
2020-04-02 34.17 34.18 33.90 33.42 406198.56 -0.86 -2.47 34.484 33.393 33.021 527955.09 591879.77 575459.22 600547
2020-04-01 33.16 34.90 34.76 32.84 562402.19 0.42 1.22 34.586 32.896 33.101 574545.03 595051.70 576480.42 600547
2020-03-31 33.95 34.92 34.34 33.55 379957.53 0.08 0.23 34.434 32.368 33.118 618449.48 580082.07 583015.25 600547
2020-03-30 34.43 35.40 34.26 33.73 581793.25 -0.90 -2.56 34.166 31.940 33.130 729601.73 593872.79 589913.88 600547
stockDataRaw[0].head()
open high close low volume price_change p_change ma5 ma10 ma20 v_ma5 v_ma10 v_ma20 code
date
2020-03-19 19.11 19.25 19.20 18.66 61516.41 0.00 0.00 19.556 20.263 20.915 59916.25 59714.33 70934.42 600848
2020-03-18 19.58 19.78 19.20 19.05 48093.72 -0.20 -1.03 19.848 20.507 21.046 57235.60 59874.07 71795.66 600848
2020-03-17 20.00 20.07 19.40 19.18 52294.24 -0.40 -2.02 20.214 20.734 21.166 60959.79 60316.61 72138.82 600848
2020-03-16 20.31 20.61 19.80 19.70 65306.74 -0.38 -1.88 20.512 20.924 21.284 64058.86 61356.19 72836.33 600848
2020-03-13 19.85 20.36 20.18 19.58 72370.12 -0.48 -2.32 20.714 21.052 21.378 65007.02 60015.78 74068.74 600848
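Because each frame in `stockDataRaw` carries a `code` column, the list can be stacked into one table and filtered in a single step. The following is a minimal sketch on made-up numbers (the two small frames and the codes `AAA` and `BBB` are placeholders, not real quotes), using `pd.concat` and the same kind of boolean mask as `stockData.open > stockData.close`.

```python
import pandas as pd

# Made-up stand-ins for two frames returned by ts.get_hist_data
dates = pd.Index(['2020-01-03', '2020-01-02'], name='date')
frame_a = pd.DataFrame(
    {'open': [10.0, 10.5], 'close': [10.2, 10.1], 'volume': [100.0, 150.0]},
    index=dates,
)
frame_a['code'] = 'AAA'
frame_b = pd.DataFrame(
    {'open': [20.0, 19.0], 'close': [19.5, 19.4], 'volume': [300.0, 250.0]},
    index=dates,
)
frame_b['code'] = 'BBB'

# Stack the per-stock frames; the 'code' column keeps the rows distinguishable
combined = pd.concat([frame_a, frame_b])

# Boolean mask: True on days when the close was below the open
fell = combined['open'] > combined['close']

# Total volume traded on falling days, per stock
falling_volume = combined[fell].groupby('code')['volume'].sum()
print(falling_volume)
```

The same pattern should carry over to the real data as `pd.concat(stockDataRaw)`, assuming every download succeeded and returned a DataFrame.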
2. Data Exploration
2.1 Descriptive Statistics
stockDataRaw[0].describe()
open high close low volume price_change p_change ma5 ma10 ma20 v_ma5 v_ma10 v_ma20
count 531.000000 531.000000 531.000000 531.000000 531.000000 531.000000 531.000000 531.000000 531.000000 531.000000 531.000000 531.000000 531.000000
mean 25.269529 25.862750 25.299454 24.809868 94063.363032 -0.014557 -0.015009 25.327944 25.362077 25.433814 94429.341921 94677.080151 95666.325009
std 4.470016 4.666589 4.480347 4.292618 72605.740257 0.828934 3.076116 4.399293 4.307300 4.137030 62339.064202 57344.750402 51714.109126
min 14.760000 16.300000 15.700000 14.760000 14082.000000 -3.520000 -10.010000 16.298000 16.684000 17.516000 17098.100000 24382.830000 35231.770000
25% 21.915000 22.240000 21.895000 21.630000 45430.220000 -0.365000 -1.490000 22.028000 22.108500 22.548500 49811.225000 54176.465000 56236.895000
50% 24.400000 24.800000 24.370000 24.040000 69768.170000 0.020000 0.110000 24.410000 24.441000 24.441000 73995.030000 77534.590000 77430.850000
75% 27.790000 28.215000 27.640000 27.335000 115424.025000 0.335000 1.400000 27.582000 27.472500 27.527500 118145.360000 119103.220000 120233.255000
max 37.800000 38.810000 37.490000 36.200000 501915.410000 3.080000 10.010000 36.126000 35.766000 34.274000 404443.540000 360028.160000 269280.790000
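To make the rows of the `describe()` table concrete, the sketch below recomputes them by hand on the five closing prices shown in the `head()` output for 600848. It relies on pandas defaults: `std` is the sample standard deviation (`ddof=1`), and the `25%`, `50%` and `75%` rows are quantiles with linear interpolation.

```python
import pandas as pd

# The five most recent closing prices of 600848 from the head() output
prices = pd.Series([19.20, 19.20, 19.40, 19.80, 20.18], name='close')
summary = prices.describe()

# Recompute each row of the summary with the corresponding Series method
manual = pd.Series({
    'count': prices.count(),
    'mean': prices.mean(),
    'std': prices.std(ddof=1),
    'min': prices.min(),
    '25%': prices.quantile(0.25),
    '50%': prices.quantile(0.50),
    '75%': prices.quantile(0.75),
    'max': prices.max(),
})

# Compare label by label; subtraction aligns the two Series on their index
matches = bool(((summary - manual).abs() < 1e-12).all())
print(summary)
print('manual values match describe():', matches)
```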
stockDataRaw[0].apply(lambda x: pd.Series((min(x), max(x)), index=['min','max']))
open high close low volume price_change p_change ma5 ma10 ma20 v_ma5 v_ma10 v_ma20 code
min 14.76 16.30 15.70 14.76 14082.00