The akshare library | Fetching A-share stock price and index data


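A-shares

| Function | Type | Description |
| --- | --- | --- |
| ak.stock_sse_summary() | Market overview | Overview of stock data on the Shanghai Stock Exchange for the current day |
| ak.stock_szse_summary() | Market overview | Overview of stock data on the Shenzhen Stock Exchange for the current day |
| ak.stock_zh_a_spot() | Real-time quotes | Returns real-time quotes for all listed A-share companies in a single call |
| ak.stock_zh_a_daily(symbol, start_date, end_date, adjust) | Historical quotes | Historical quotes for a given stock |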

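Market overview

Data sources

  • Shanghai Stock Exchange (SSE): http://www.sse.com.cn/market/stockdata/statistic/

  • Shenzhen Stock Exchange (SZSE): http://www.szse.cn/market/overview/index.html

Code

  • ak.stock_sse_summary(): overview of SSE stock data for the current trading day (on Saturdays and Sundays, Friday's closing data is used)

  • ak.stock_szse_summary(): overview of SZSE stock data for the current trading day (on Saturdays and Sundays, Friday's closing data is used)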
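The weekend rule above can be sketched in plain Python. The helper name `last_weekday` is hypothetical and not part of akshare, and the sketch assumes that only Saturdays and Sundays are non-trading days; exchange holidays are ignored.

```python
from datetime import date, timedelta


def last_weekday(day):
    """Return `day` itself on Monday-Friday, else the preceding Friday."""
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6
    while day.weekday() >= 5:
        day -= timedelta(days=1)
    return day


print(last_weekday(date(2020, 11, 14)))  # a Saturday
print(last_weekday(date(2020, 11, 16)))  # a Monday
```

A holiday-aware version would need a trading calendar, which this sketch deliberately leaves out.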

import akshare as ak

# Current SSE trading day
ak.stock_sse_summary()

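| | type | item | number |
| --- | --- | --- | --- |
| 0 | 总貌 (overall) | 上市公司/家 (listed companies) | 1774 |
| 1 | 总貌 (overall) | 总股本/亿股(份) (total shares, 100 million) | 42333.54 |
| 2 | 总貌 (overall) | 总市值/亿元 (total market cap, 100 million CNY) | 449338 |
| 3 | 总貌 (overall) | 平均市盈率/倍 (average P/E) | 16.53 |
| 0 | 总貌 (overall) | 上市股票/只 (listed stocks) | 1817 |
| 1 | 总貌 (overall) | 流通股本/亿股(份) (float shares, 100 million) | 37055.18 |
| 2 | 总貌 (overall) | 流通市值/亿元 (float market cap, 100 million CNY) | 372768.37 |
| 0 | 主板 (Main Board) | 上市公司/家 (listed companies) | 1575 |
| 1 | 主板 (Main Board) | 总股本/亿股 (total shares, 100 million) | 41719.30 |
| 2 | 主板 (Main Board) | 总市值/亿元 (total market cap, 100 million CNY) | 417542.76 |
| 3 | 主板 (Main Board) | 平均市盈率/倍 (average P/E) | 15.70 |
| 0 | 主板 (Main Board) | 上市股票/只 (listed stocks) | 1618 |
| 1 | 主板 (Main Board) | 流通股本/亿股 (float shares, 100 million) | 36894.08 |
| 2 | 主板 (Main Board) | 流通市值/亿元 (float market cap, 100 million CNY) | 363550.23 |
| 0 | 科创板 (STAR Market) | 上市公司/家 (listed companies) | 199 |
| 1 | 科创板 (STAR Market) | 总股本/亿股(份) (total shares, 100 million) | 614.24 |
| 2 | 科创板 (STAR Market) | 总市值/亿元 (total market cap, 100 million CNY) | 31795.24 |
| 3 | 科创板 (STAR Market) | 平均市盈率/倍 (average P/E) | 93.84 |
| 0 | 科创板 (STAR Market) | 上市股票/只 (listed stocks) | 199 |
| 1 | 科创板 (STAR Market) | 流通股本/亿股(份) (float shares, 100 million) | 161.10 |
| 2 | 科创板 (STAR Market) | 流通市值/亿元 (float market cap, 100 million CNY) | 9218.14 |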
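The SSE summary comes back in long format (one row per `type`/`item` pair), which is awkward to query. As a rough sketch, the rows can be regrouped into a nested dictionary and used for simple ratios. The values below are copied from the output above; the variable names are illustrative and not part of akshare.

```python
# (type, item, number) rows copied from the SSE summary output above
rows = [
    ('总貌', '总市值/亿元', 449338.0),
    ('总貌', '流通市值/亿元', 372768.37),
    ('主板', '总市值/亿元', 417542.76),
    ('主板', '流通市值/亿元', 363550.23),
    ('科创板', '总市值/亿元', 31795.24),
    ('科创板', '流通市值/亿元', 9218.14),
]

# Regroup into {board: {item: number}}
summary = {}
for board, item, number in rows:
    summary.setdefault(board, {})[item] = number

# Share of each board's market cap that is freely tradable
float_ratio = {
    board: items['流通市值/亿元'] / items['总市值/亿元']
    for board, items in summary.items()
}

for board, ratio in float_ratio.items():
    print(board, f'{ratio:.2%}')

# Main Board plus STAR Market should add up to the overall figure
parts = summary['主板']['总市值/亿元'] + summary['科创板']['总市值/亿元']
print(abs(parts - summary['总貌']['总市值/亿元']) < 0.01)
```

In this snapshot the STAR Market's float ratio is far lower than the Main Board's, which is what one would expect for a board dominated by recent listings.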
# Current SZSE trading day
ak.stock_szse_summary()

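| | 证券类别 (security type) | 数量(只) (count) | 成交金额(元) (turnover, CNY) | 成交量 (volume) | 总股本 (total shares) | 总市值 (total market cap) | 流通股本 (float shares) | 流通市值 (float market cap) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 股票 (stocks) | 2284 | 464774868848.30 | 36234796444 | 2221596655427 | 27065137543076.86 | 1815555842156 | 21045462688142.41 |
| 1 | 主板A股 (Main Board A-shares) | 460 | 97759499936.17 | 9928147626 | 815786337216 | 7864786582883.69 | 715360951728 | 6943989894131.95 |
| 2 | 主板B股 (Main Board B-shares) | 46 | 86268155.78 | 25628042 | 12626767818 | 47596578540.85 | 12496639543 | 47063848276.06 |
| 3 | 中小板 (SME Board) | 960 | 201352604926.08 | 16183241554 | 959049045001 | 11307409526445.99 | 751639655163 | 8669554702599.64 |
| 4 | 创业板A股 (ChiNext A-shares) | 818 | 165576495830.27 | 10097779222 | 434134505392 | 7845344855206.33 | 336058595722 | 5384854243134.76 |
| 5 | 基金 (funds) | 551 | 13625235438.85 | 8488174062 | 178495219217 | 241727724562.97 | 178495219217 | 241727724562.97 |
| 6 | ETF | 100 | 11654359536.03 | 6298875097 | 97160183202 | 162829410392.12 | 97160183202 | 162829410392.12 |
| 7 | LOF | 250 | 733576824.21 | 905627751 | 41299172130 | 40431564774.31 | 41299172130 | 40431564774.31 |
| 8 | 封闭式基金 (closed-end funds) | 1 | 552757.10 | 5900 | 811761500 | 762244048.50 | 811761500 | 762244048.50 |
| 9 | 分级基金 (structured funds) | 200 | 1236746321.50 | 1283665314 | 39224102385 | 37704505348.03 | 39224102385 | 37704505348.03 |
| 10 | 债券 (bonds) | 7175 | 137138889356.15 | 1267097444 | | | | |
| 11 | 债券现券 (cash bonds) | 6600 | 29113574087.95 | 184874494 | 0 | 36839310540657.21 | 1773035501300 | 1823573039491.96 |
| 12 | 债券回购 (bond repos) | 13 | 105459181000.00 | 1055429160 | | | | |
| 13 | ABS | 562 | 2566134268.20 | 26793790 | 488157240326 | 484463227241.83 | 488157240326 | 484463227241.83 |
| 14 | 期权 (options) | 108 | 244155964.00 | 374868 | | | | |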



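Real-time stock quotes

  • ak.stock_zh_a_spot() returns real-time quotes for all listed A-share companies in a single call

Note: calling this function repeatedly will get your IP temporarily blocked by Sina; leave a time interval between calls.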

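import os

new_df = ak.stock_zh_a_spot()

# Create the output directory if needed, then save a snapshot
os.makedirs('data', exist_ok=True)
new_df.to_csv('data/沪深实时行情数据.csv')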
new_df.head()
Please wait for a moment: 100%|██████████| 52/52 [00:29<00:00,  1.75it/s]

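| | symbol | code | name | trade | pricechange | changepercent | buy | sell | settlement | open | high | low | volume | amount | ticktime | per | pb | mktcap | nmc | turnoverratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | sh600000 | 600000 | 浦发银行 | 10.17 | -0.07 | -0.684 | 10.16 | 10.17 | 10.24 | 10.24 | 10.25 | 10.05 | 47336450.0 | 479183315.0 | 15:00:00 | 5.215 | 0.582 | 2.985112e+07 | 2.985112e+07 | 0.16127 |
| 1 | sh600004 | 600004 | 白云机场 | 14.99 | -0.12 | -0.794 | 14.99 | 15.00 | 15.11 | 15.15 | 15.28 | 14.95 | 8879917.0 | 133815978.0 | 15:00:18 | 31.229 | 1.935 | 3.547711e+06 | 3.101911e+06 | 0.42912 |
| 2 | sh600006 | 600006 | 东风汽车 | 6.35 | 0.01 | 0.158 | 6.34 | 6.35 | 6.34 | 6.25 | 6.66 | 6.14 | 77593723.0 | 501942922.0 | 15:00:00 | 28.707 | 1.636 | 1.270000e+06 | 1.270000e+06 | 3.87969 |
| 3 | sh600007 | 600007 | 中国国贸 | 12.92 | -0.10 | -0.768 | 12.92 | 12.93 | 13.02 | 12.99 | 13.05 | 12.88 | 1529840.0 | 19789398.0 | 15:00:00 | 13.320 | 1.668 | 1.301409e+06 | 1.301409e+06 | 0.15188 |
| 4 | sh600008 | 600008 | 首创股份 | 3.04 | -0.01 | -0.328 | 3.03 | 3.04 | 3.05 | 3.05 | 3.05 | 3.02 | 32314837.0 | 98099591.0 | 15:00:18 | 18.031 | 1.447 | 2.231540e+06 | 2.231540e+06 | 0.44022 |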
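Because repeated calls to `ak.stock_zh_a_spot()` can get an IP blocked, one option is to cache the last result and refuse to refetch within a minimum interval. The sketch below is a minimal illustration: `ThrottledFetcher`, `min_interval`, and the 60-second value are hypothetical names and numbers, not part of akshare, and the interval Sina actually tolerates is not stated here.

```python
import time


class ThrottledFetcher:
    """Cache the last result of `fetch` and reuse it within `min_interval` seconds."""

    def __init__(self, fetch, min_interval=60.0, clock=time.monotonic):
        self._fetch = fetch              # zero-argument callable, e.g. ak.stock_zh_a_spot
        self._min_interval = min_interval
        self._clock = clock              # injectable so the behaviour can be tested
        self._last_time = None
        self._last_result = None

    def __call__(self):
        now = self._clock()
        if self._last_time is None or now - self._last_time >= self._min_interval:
            self._last_result = self._fetch()
            self._last_time = now
        return self._last_result


# Demonstration with a stand-in fetch function (no network access needed)
calls = []


def fake_fetch():
    calls.append(1)
    return len(calls)


fake_now = [0.0]
spot = ThrottledFetcher(fake_fetch, min_interval=60.0, clock=lambda: fake_now[0])
first = spot()       # triggers a real fetch
second = spot()      # served from the cache, no new request
fake_now[0] = 61.0
third = spot()       # interval elapsed, fetches again
print(first, second, third)
```

With akshare installed, `spot = ThrottledFetcher(ak.stock_zh_a_spot)` would wrap the real call in the same way.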

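Historical stock quotes

  • ak.stock_zh_a_daily(symbol, start_date, end_date, adjust): historical quotes for a given stock (supports price adjustment)

  • ak.stock_zh_a_cdr_daily(symbol, start_date, end_date): historical quotes for a CDR (Chinese Depositary Receipt) listing such as sh689009 (no price adjustment)


  • symbol: stock code, e.g. symbol='sh600000'; codes can be looked up in the output of ak.stock_zh_a_spot()

  • start_date: first date of the query, e.g. start_date='20201103'

  • end_date: last date of the query, e.g. end_date='20201106'

  • adjust: unadjusted data is returned by default; qfq returns forward-adjusted (前复权) prices; hfq returns back-adjusted (后复权) prices; hfq-factor returns the back-adjustment factors; qfq-factor returns the forward-adjustment factors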
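The parameter formats above are easy to get wrong, so a small pre-flight check can help. The following is a sketch only: `to_symbol` and `check_date` are hypothetical helpers, not akshare functions, and the prefix rule (codes starting with 6 belong to Shanghai, codes starting with 0 or 3 to Shenzhen) is an assumed convention that matches the symbols shown in this article but does not cover every security type.

```python
from datetime import datetime


def to_symbol(code):
    """Prefix a 6-digit A-share code with its exchange ('sh' or 'sz').

    Assumed convention: codes starting with 6 trade in Shanghai,
    codes starting with 0 or 3 trade in Shenzhen.
    """
    code = str(code).zfill(6)
    if code.startswith('6'):
        return 'sh' + code
    if code.startswith(('0', '3')):
        return 'sz' + code
    raise ValueError(f'unrecognised A-share code: {code}')


def check_date(text):
    """Return `text` unchanged if it is a valid YYYYMMDD date, else raise ValueError."""
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f'expected YYYYMMDD, got: {text!r}')
    datetime.strptime(text, '%Y%m%d')  # raises ValueError for impossible dates
    return text


# Keyword arguments in the shape expected by ak.stock_zh_a_daily
params = dict(
    symbol=to_symbol('000002'),
    start_date=check_date('20201103'),
    end_date=check_date('20201116'),
    adjust='hfq',
)
print(params)
```

The resulting dictionary could then be passed as `ak.stock_zh_a_daily(**params)`.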

# Vanke A (万科A), back-adjusted (hfq)
sz000002 = ak.stock_zh_a_daily(symbol = 'sz000002', 
                               start_date = '20201103', 
                               end_date = '20201116', 
                               adjust = 'hfq')
sz000002

| date | open | high | low | close | volume | outstanding_share | turnover |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2020-11-03 | 4010.19 | 4031.67 | 3982.98 | 4014.48 | 61766600.0 | 9.714315e+09 | 0.006358 |
| 2020-11-04 | 4017.35 | 4045.99 | 3995.87 | 4033.10 | 45499180.0 | 9.714315e+09 | 0.004684 |
| 2020-11-05 | 4074.64 | 4204.97 | 4054.59 | 4189.21 | 120119594.0 | 9.714315e+09 | 0.012365 |
| 2020-11-06 | 4202.10 | 4235.04 | 4157.70 | 4219.29 | 85288066.0 | 9.714315e+09 | 0.008780 |
| 2020-11-09 | 4256.53 | 4276.58 | 4180.62 | 4235.04 | 81118542.0 | 9.714315e+09 | 0.008350 |
| 2020-11-10 | 4253.66 | 4312.38 | 4182.05 | 4200.67 | 61377060.0 | 9.714315e+09 | 0.006318 |
| 2020-11-11 | 4206.40 | 4332.43 | 4189.21 | 4263.69 | 88521186.0 | 9.714315e+09 | 0.009112 |
| 2020-11-12 | 4262.26 | 4269.42 | 4207.83 | 4262.26 | 45905719.0 | 9.714315e+09 | 0.004726 |
| 2020-11-13 | 4233.61 | 4252.23 | 4129.06 | 4160.57 | 66013466.0 | 9.714315e+09 | 0.006795 |
| 2020-11-16 | 4209.26 | 4225.02 | 4153.41 | 4182.05 | 51657638.0 | 9.714315e+09 | 0.005318 |
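The back-adjusted closes above can be turned into a holding-period return directly, and the `turnover` column appears to equal `volume / outstanding_share`. The snippet below is a small check on two rows copied from the output; the variable names are illustrative, and the turnover relationship is an observation from this sample rather than a documented definition.

```python
# (date, hfq close, volume, outstanding_share) copied from the first and last rows above
rows = [
    ('2020-11-03', 4014.48, 61766600.0, 9.714315e9),
    ('2020-11-16', 4182.05, 51657638.0, 9.714315e9),
]
first, last = rows[0], rows[-1]

# Return earned by holding from the first close to the last close
period_return = last[1] / first[1] - 1

# Turnover implied by volume and outstanding shares on each day
turnover_first = first[2] / first[3]
turnover_last = last[2] / last[3]

print(f'{first[0]} -> {last[0]} return: {period_return:.2%}')
print(f'implied turnover: {turnover_first:.6f}, {turnover_last:.6f}')
```

Both implied turnover values agree with the `turnover` column to the displayed precision.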
# Ninebot (九号公司), a CDR; unadjusted prices
sh689009 = ak.stock_zh_a_cdr_daily(symbol='sh689009', 
                                   start_date='20201103', 
                                   end_date='20201116')

sh689009

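| date | open | high | low | close | volume |
| --- | --- | --- | --- | --- | --- |
| 2020-11-03 | 56.50 | 59.55 | 53.36 | 57.39 | 25121445.0 |
| 2020-11-04 | 57.45 | 57.80 | 51.90 | 54.40 | 20846450.0 |
| 2020-11-05 | 55.95 | 65.28 | 54.60 | 61.00 | 28843507.0 |
| 2020-11-06 | 59.80 | 68.60 | 59.48 | 68.60 | 23162768.0 |
| 2020-11-09 | 70.50 | 71.68 | 63.50 | 68.04 | 22494134.0 |
| 2020-11-10 | 68.00 | 70.70 | 65.11 | 67.93 | 15952778.0 |
| 2020-11-11 | 65.80 | 65.91 | 55.70 | 56.00 | 23125126.0 |
| 2020-11-12 | 56.00 | 61.66 | 55.04 | 58.89 | 18607788.0 |
| 2020-11-13 | 58.08 | 63.88 | 55.50 | 61.18 | 14904776.0 |
| 2020-11-16 | 62.18 | 73.42 | 62.18 | 73.42 | 17134827.0 |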

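Price adjustment for stock data

Why adjust prices?

Events such as rights issues, stock splits, reverse splits, and dividend payments leave large gaps in a stock's price series. Processing data or computing indicators on unadjusted prices breaks their continuity, and returns calculated from unadjusted prices are wrong. To keep the series continuous, prices are usually corrected by forward adjustment or back adjustment.

Forward and back adjustment

Forward adjustment (前复权, qfq): the current price is kept unchanged and historical prices are raised or lowered so that the series is continuous. Forward-adjusted prices are convenient for reading charts: the historical trend is visible at a glance, technical indicators overlay cleanly, and it is the default adjustment mode in most market software. Common as it is, the method has two drawbacks.

  • To keep the current price unchanged, historical prices must be recomputed at every ex-rights or ex-dividend event, so they vary over time. The forward-adjusted price for the same historical date can differ depending on when it is viewed.

  • For companies that pay dividends continuously, forward-adjusted prices can become negative.

Back adjustment (后复权, hfq): historical prices are kept unchanged, and prices after each corporate action are adjusted instead. Back-adjusted prices can differ greatly from actual traded prices, so they are unsuitable for reading the market. Their advantage is that the series can be read as an investor's long-term wealth curve, reflecting the return actually earned.

Back-adjusted data is the usual choice in quantitative investment research.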
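A toy calculation makes the difference between the two adjustments concrete. The sketch below uses a ratio-based adjustment factor on a made-up price series with a single 2-for-1 split; the function names and numbers are illustrative, and the exact method the data source uses to compute its factors is not described in this article.

```python
def cumulative_factors(n_days, events):
    """Cumulative back-adjustment factor for each day.

    `events` maps a day index to the number of shares one pre-event share
    becomes on that day (2.0 for a 2-for-1 split).
    """
    factors, running = [], 1.0
    for day in range(n_days):
        running *= events.get(day, 1.0)
        factors.append(running)
    return factors


def back_adjust(prices, factors):
    """hfq: history stays fixed, prices after each event are scaled up."""
    return [p * f for p, f in zip(prices, factors)]


def forward_adjust(prices, factors):
    """qfq: the latest price stays fixed, history is scaled down."""
    latest = factors[-1]
    return [p * f / latest for p, f in zip(prices, factors)]


raw = [10.0, 11.0, 5.6, 5.8]   # toy closes; the 2-for-1 split takes effect on day 2
factors = cumulative_factors(len(raw), {2: 2.0})
hfq = back_adjust(raw, factors)
qfq = forward_adjust(raw, factors)

# Day 1 -> day 2 return under each series
raw_ret = raw[2] / raw[1] - 1
hfq_ret = hfq[2] / hfq[1] - 1
qfq_ret = qfq[2] / qfq[1] - 1
print(hfq, qfq)
print(f'raw {raw_ret:.2%}, hfq {hfq_ret:.2%}, qfq {qfq_ret:.2%}')
```

The raw series shows a spurious drop of almost 50% on the split day, while both adjusted series give the same small gain. If a second event were added later, every value in `qfq` would change but the existing `hfq` values would not, which is the time-varying drawback described above.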

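Sub-new stocks

What counts as a sub-new stock (次新股) shifts over time. Generally, a company that has been listed for less than a year and has not yet paid a dividend or issued bonus shares, or whose price has not been visibly bid up by major market players, falls into the sub-new category.

Towards year-end, sub-new stocks have been listed only briefly, so their results rarely show abnormal changes and annual-report earnings risk is largely absent. From the standpoint of avoiding annual-report surprises, sub-new stocks are relatively the safest segment during the reporting season.

ak.stock_zh_a_new() returns quotes for all sub-new stocks in a single call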

stock_zh_a_new = ak.stock_zh_a_new()
stock_zh_a_new

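| | symbol | code | name | open | high | low | volume | amount | mktcap | turnoverratio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | sh601187 | 601187 | 厦门银行 | 12.570 | 13.170 | 12.350 | 74632241 | 945261894 | 3.409753e+06 | 28.27913 |
| 1 | sh601568 | 601568 | 北元集团 | 10.680 | 10.690 | 10.600 | 9001838 | 95636552 | 3.831389e+06 | 2.49282 |
| 2 | sh601686 | 601686 | 友发集团 | 18.520 | 18.530 | 16.670 | 51503291 | 894388433 | 2.409527e+06 | 36.26992 |
| 3 | sh601995 | 601995 | 中金公司 | 63.120 | 64.500 | 63.120 | 20614330 | 1316269163 | 3.074480e+07 | 7.92010 |
| 4 | sh605007 | 605007 | 五洲特纸 | 26.500 | 27.740 | 26.250 | 8811473 | 236972185 | 1.094027e+06 | 22.02318 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 56 | sz300912 | 300912 | N凯龙 | 52.110 | 62.000 | 52.110 | 18688767 | 1057407739 | 6.723678e+05 | 70.35515 |
| 57 | sz300913 | 300913 | N兆龙 | 46.000 | 49.000 | 42.610 | 18983847 | 846349678 | 5.243000e+05 | 65.35439 |
| 58 | sz300915 | 300915 | C海融 | 114.500 | 119.900 | 112.020 | 3061254 | 354136685 | 6.882000e+05 | 20.40836 |
| 59 | sz300916 | 300916 | C朗特 | 131.310 | 132.450 | 128.000 | 1860798 | 241749029 | 5.488562e+05 | 17.47228 |
| 60 | sz300999 | 300999 | 金龙鱼 | 70.680 | 73.500 | 70.350 | 38273116 | 2759133209 | 3.891618e+07 | 10.72862 |

61 rows × 10 columns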

Real-time stock indices

Index data comes from Sina Finance; a single call returns real-time quotes for all indices.

ak.stock_zh_index_spot()

stock_zh_index_spot = ak.stock_zh_index_spot()
stock_zh_index_spot

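| | symbol | name | trade | pricechange | changepercent | buy | sell | settlement | open | high | low | volume | amount | code | ticktime |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | sh000001 | 上证指数 | 3418.1683 | -26.413 | -0.767 | 0 | 0 | 3444.5814 | 3446.6478 | 3449.5782 | 3414.5157 | 226637068 | 302027699336 | 000001 | 14:35:44 |
| 1 | sh000002 | A股指数 | 3582.8628 | -27.724 | -0.768 | 0 | 0 | 3610.5867 | 3612.7568 | 3615.8247 | 3579.0180 | 226381147 | 301317674983 | 000002 | 14:35:44 |
| 2 | sh000003 | B股指数 | 241.8948 | -0.266 | -0.110 | 0 | 0 | 242.1604 | 242.1342 | 243.1541 | 241.8584 | 163763 | 85839534 | 000003 | 14:35:44 |
| 3 | sh000004 | 工业指数 | 2947.2927 | -9.719 | -0.329 | 0 | 0 | 2957.0117 | 2961.3935 | 2968.1859 | 2944.6006 | 137180458 | 205555767576 | 000004 | 14:35:44 |
| 4 | sh000005 | 商业指数 | 3301.6900 | -29.175 | -0.876 | 0 | 0 | 3330.8647 | 3331.3814 | 3338.8347 | 3298.9413 | 19724411 | 25950366660 | 000005 | 14:35:44 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 574 | sz399998 | 中证煤炭 | 1321.328 | -12.350 | -0.926 | 0.000 | 0.000 | 1333.678 | 1337.886 | 1353.352 | 1320.545 | 958502075 | 6118036849 | 399998 | 14:35:48 |
| 575 | sz980001 | | 1651.641 | -17.535 | -1.051 | 0.000 | 0.000 | 1669.176 | 1672.031 | 1672.031 | 1645.795 | 2025251235 | 62688545758 | 980001 | 14:35:46 |
| 576 | sz980017 | | 9298.405 | 38.813 | 0.419 | 0.000 | 0.000 | 9259.592 | 9449.973 | 9503.928 | 9292.864 | 645687353 | 26628573265 | 980017 | 14:35:46 |
| 577 | sz980023 | | 2909.856 | -12.650 | -0.433 | 0.000 | 0.000 | 2922.506 | 2925.469 | 2929.047 | 2898.435 | 735892399 | 15645491947 | 980023 | 14:35:46 |
| 578 | sz980068 | | 2067.003 | -5.000 | -0.241 | 0.000 | 0.000 | 2072.003 | 2069.611 | 2075.733 | 2063.185 | 1085307561 | 8359514808 | 980068 | 14:35:46 |

579 rows × 15 columns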
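The quote columns are related to one another: in the sample above, `changepercent` appears to be `pricechange` divided by `settlement` (which looks like the previous close) times 100, and `trade` is roughly `settlement + pricechange`. The snippet below checks this on three rows copied from the output; it is an observation from this sample rather than a documented definition, and `implied_percent` is an illustrative name.

```python
# (symbol, trade, pricechange, changepercent, settlement) copied from the output above
rows = [
    ('sh000001', 3418.1683, -26.413, -0.767, 3444.5814),
    ('sh000003', 241.8948, -0.266, -0.110, 242.1604),
    ('sz399998', 1321.328, -12.350, -0.926, 1333.678),
]


def implied_percent(pricechange, settlement):
    """Percentage change relative to the settlement (previous close) level."""
    return pricechange / settlement * 100


checks = []
for symbol, trade, pricechange, changepercent, settlement in rows:
    pct = implied_percent(pricechange, settlement)
    level = settlement + pricechange
    checks.append((symbol, pct, changepercent, level, trade))
    print(symbol, round(pct, 3), changepercent, round(level, 3), trade)
```

Each implied percentage matches the reported `changepercent` to three decimals, and the reconstructed level is within rounding distance of `trade`.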


Code link: https://pan.baidu.com/s/12peR3mrLAG5wHwdk-vcWSA  Password: 41d4

Below is a complete example that fetches stock data with akshare and trains a reinforcement-learning agent on it:

```python
import random

import akshare as ak
import numpy as np
import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

# Fetch back-adjusted daily data and build the features
stock_zh_a_daily = ak.stock_zh_a_daily(symbol='sh600000', adjust='hfq')
df = stock_zh_a_daily[['open', 'high', 'low', 'close', 'volume']].copy()
df['H-L'] = df['high'] - df['low']
df['O-C'] = df['open'] - df['close']
df = df[['close', 'H-L', 'O-C', 'volume']]
df = df[-1000:]  # keep the most recent 1000 trading days
print(df.head())


# Reinforcement-learning environment
class Environment:
    """Toy single-stock environment: buy, sell, or hold one share per step."""

    def __init__(self, data, initial_investment=20000):
        self.actions = ["buy", "sell", "hold"]
        self.data = data
        self.n_step = len(self.data) - 1
        self.initial_investment = initial_investment
        self.current_step = None
        self.stock_owned = None
        self.cash_in_hand = None
        self.action_history = []
        self.state_history = []
        self.reward_history = []
        self.value_history = []

    def reset(self):
        self.current_step = 0
        self.stock_owned = 0
        self.cash_in_hand = self.initial_investment
        self.action_history = []
        self.state_history = []
        self.reward_history = []
        self.value_history = []
        return self._get_observation()

    def step(self, action):
        assert action in self.actions
        prev_val = self._get_val()
        self.action_history.append(action)

        # Trade one share at the current close; ignore orders that cannot be filled
        price = self.data.iloc[self.current_step]['close']
        if action == "sell" and self.stock_owned > 0:
            self.stock_owned -= 1
            self.cash_in_hand += price
        elif action == "buy" and self.cash_in_hand >= price:
            self.stock_owned += 1
            self.cash_in_hand -= price

        # Move to the next day; the reward is the change in portfolio value
        self.current_step += 1
        self.state_history.append(self._get_observation())
        reward = self._get_val() - prev_val
        self.reward_history.append(reward)
        self.value_history.append(self._get_val())

        done = (self.current_step == self.n_step)
        info = {"stock_owned": self.stock_owned, "cash_in_hand": self.cash_in_hand}
        return self._get_observation(), reward, done, info

    def _get_observation(self):
        row = self.data.iloc[self.current_step]
        return np.array([row['close'], row['H-L'], row['O-C'], row['volume']],
                        dtype=np.float32)

    def _get_val(self):
        price = self.data.iloc[self.current_step]['close']
        return self.stock_owned * price + self.cash_in_hand


# Experience replay buffer
class ReplayBuffer:
    def __init__(self, obs_dim, size=int(1e6)):
        self.obs_buf = np.zeros([size, obs_dim], dtype=np.float32)
        self.act_buf = np.zeros(size, dtype=np.int32)  # action indices
        self.rew_buf = np.zeros(size, dtype=np.float32)
        self.next_obs_buf = np.zeros([size, obs_dim], dtype=np.float32)
        self.done_buf = np.zeros(size, dtype=np.float32)
        self.ptr, self.size, self.max_size = 0, 0, size

    def store(self, obs, act, rew, next_obs, done):
        self.obs_buf[self.ptr] = obs
        self.act_buf[self.ptr] = act
        self.rew_buf[self.ptr] = rew
        self.next_obs_buf[self.ptr] = next_obs
        self.done_buf[self.ptr] = done
        self.ptr = (self.ptr + 1) % self.max_size
        self.size = min(self.size + 1, self.max_size)

    def sample_batch(self, batch_size=32):
        idxs = np.random.randint(0, self.size, size=batch_size)
        return dict(obs=self.obs_buf[idxs],
                    act=self.act_buf[idxs],
                    rew=self.rew_buf[idxs],
                    next_obs=self.next_obs_buf[idxs],
                    done=self.done_buf[idxs])


# Q-network
def get_model(obs_dim, act_dim):
    return Sequential([
        Input(shape=(obs_dim,)),
        Dense(256, activation='relu'),
        Dense(256, activation='relu'),
        Dense(256, activation='relu'),
        Dense(act_dim),
    ])


# Training loop
def train():
    lr = 1e-3               # assumed value; tune as needed
    gamma = 0.99
    epsilon = 1.0
    epsilon_min = 0.05      # assumed exploration floor
    epsilon_decay = 0.9999  # assumed per-step decay after the warm-up phase
    batch_size = 32
    updates_per_step = 10
    update_target_every = 2000
    replay_start_size = 10000
    total_timesteps = 200000
    start_steps = 10000

    env = Environment(df)
    obs_dim = env.reset().shape[0]
    act_dim = len(env.actions)
    model = get_model(obs_dim, act_dim)
    target_model = get_model(obs_dim, act_dim)
    target_model.set_weights(model.get_weights())
    replay_buffer = ReplayBuffer(obs_dim=obs_dim)
    optimizer = Adam(learning_rate=lr)

    def get_action(state, epsilon):
        # Epsilon-greedy action selection
        if random.random() < epsilon:
            return random.choice(env.actions)
        q_values = model(np.expand_dims(state, axis=0)).numpy()
        return env.actions[int(np.argmax(q_values))]

    def train_step(batch):
        rew = tf.convert_to_tensor(batch['rew'])
        done = tf.convert_to_tensor(batch['done'])
        # Bootstrapped target from the target network
        next_q = tf.reduce_max(target_model(batch['next_obs']), axis=1)
        target_q = rew + (1.0 - done) * gamma * next_q
        with tf.GradientTape() as tape:
            q = model(batch['obs'])
            q = tf.reduce_sum(q * tf.one_hot(batch['act'], act_dim), axis=1)
            loss = tf.reduce_mean((q - target_q) ** 2)
        grads = tape.gradient(loss, model.trainable_variables)
        optimizer.apply_gradients(zip(grads, model.trainable_variables))
        return loss

    episode_reward, episode_timesteps = 0.0, 0
    state = env.reset()
    for i in range(total_timesteps):
        if i < start_steps:
            action = random.choice(env.actions)
        else:
            action = get_action(state, epsilon)
            epsilon = max(epsilon_min, epsilon * epsilon_decay)

        next_state, reward, done, info = env.step(action)
        replay_buffer.store(state, env.actions.index(action), reward, next_state, done)
        state = next_state
        episode_reward += reward
        episode_timesteps += 1

        # Reset as soon as the episode ends so the next step stays in range
        if done:
            print('Step: {}, episode reward: {}, episode timesteps: {}'.format(
                i, episode_reward, episode_timesteps))
            state = env.reset()
            episode_reward, episode_timesteps = 0.0, 0

        step_count = i + 1
        if replay_buffer.size >= replay_start_size and step_count % updates_per_step == 0:
            for _ in range(updates_per_step):
                batch = replay_buffer.sample_batch(batch_size=batch_size)
                train_step(batch)
        if step_count % update_target_every == 0:
            target_model.set_weights(model.get_weights())


train()
```

The code first uses akshare to fetch back-adjusted daily data for sh600000 (浦发银行) and derives a few features from it. It then defines an `Environment` class in which the agent chooses to buy, sell, or hold based on the current market state. A neural network approximates the Q-value of each action, and the `ReplayBuffer` stores past transitions so that training can sample them in random mini-batches. Finally, `train()` runs the training loop and prints the reward and step count of each episode.