pandas (DataFrame, Series), NumPy (array), SciPy, and groupby()

Notes on learning the Python modules pandas, numpy, and matplotlib.

1. pandas

Reading a file with pandas

import pandas as pd
# encoding is a separate keyword argument, not part of the file name
file = pd.read_csv("f1.csv", encoding="gbk")

Writing the modified file back to CSV

import pandas as pd
# encoding is a separate keyword argument, not part of the file name
file.to_csv("f2.csv", encoding="gbk")
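As a minimal sketch of the read/write round trip described above: the sample data, the temporary directory, and the use of index=False are assumptions made for illustration, not part of the original notes. Writing and reading with the same encoding is what keeps the Chinese text intact.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data standing in for the contents of f1.csv.
df = pd.DataFrame({"name": ["张三", "李四"], "score": [90, 85]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "f2.csv")
    # index=False keeps the row index out of the file, so reading it
    # back does not produce an extra unnamed column.
    df.to_csv(path, index=False, encoding="gbk")
    # Read the file back with the same encoding it was written in.
    restored = pd.read_csv(path, encoding="gbk")

print(restored)
```

Without index=False, to_csv also writes the row index as the first column, which then shows up as an "Unnamed: 0" column when the file is read back.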

DataFrame is a data structure in Python's pandas library. Like an Excel sheet, it is a two-dimensional table with labeled rows and columns.

import numpy as np
import pandas as pd
import scipy as sp

df1 = pd.DataFrame([[1, 2, 3, 4], [11, 22, 33, 44],
                    [33, 44, 55, 66], [44, 55, 66, 77]],
                   index=list('我爱中国'), columns=list('ABCD'))

The output is:
    A   B   C   D
我   1   2   3   4
爱  11  22  33  44
中  33  44  55  66
国  44  55  66  77

  • df1['A'].values  # get the values of a column, by column name

The output is:

array([ 1, 11, 33, 44], dtype=int64)

  • df1.loc['我']  # get a row, by row label
    The output is:
A    1
B    2
C    3
D    4
Name: 我, dtype: int64
  • df1.iloc[0]  # get a row, by integer position
    The output is:
A    1
B    2
C    3
D    4
Name: 我, dtype: int64
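The three lookups above can be put side by side in a short sketch. It rebuilds the same df1 so that it runs on its own; the variable names and the extra cell and block selections are illustrative additions, not from the original notes.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, 2, 3, 4], [11, 22, 33, 44], [33, 44, 55, 66], [44, 55, 66, 77]],
    index=list("我爱中国"),
    columns=list("ABCD"),
)

col_a = df1["A"]                 # a Series that keeps the row labels
col_a_values = col_a.to_numpy()  # a plain NumPy array, labels dropped

row_by_label = df1.loc["我"]     # select a row by its label
row_by_position = df1.iloc[0]    # select a row by its integer position

cell = df1.loc["爱", "C"]        # one cell: row label, then column name
block = df1.iloc[0:2, 0:2]       # top-left 2x2 block, by position

print(col_a_values)
print(cell)
print(block)
```

loc always works with labels and iloc always works with positions, so the two only return the same row here because the label '我' happens to sit at position 0.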

You can also transpose df1: df1.T

Sum each column: df1.sum()

Add columns to df1, for example:
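A small sketch of the transpose and the sums, using the same df1 values. The axis=1 row sum is an addition for comparison and is not mentioned in the original notes.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, 2, 3, 4], [11, 22, 33, 44], [33, 44, 55, 66], [44, 55, 66, 77]],
    index=list("我爱中国"),
    columns=list("ABCD"),
)

transposed = df1.T          # rows become columns and vice versa
col_sums = df1.sum()        # default axis=0: one total per column
row_sums = df1.sum(axis=1)  # axis=1: one total per row

print(transposed)
print(col_sums)
print(row_sums)
```

df1.sum() returns a Series indexed by the column names, while df1.sum(axis=1) returns one indexed by the row labels.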
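A minimal sketch of adding columns, assuming the df1 defined earlier. The column names E, A_plus_B, and total, and the values placed in E, are made up for illustration.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, 2, 3, 4], [11, 22, 33, 44], [33, 44, 55, 66], [44, 55, 66, 77]],
    index=list("我爱中国"),
    columns=list("ABCD"),
)

df2 = df1.copy()                       # work on a copy so df1 stays unchanged
df2["E"] = [5, 55, 77, 88]             # new column from a list (hypothetical values)
df2["A_plus_B"] = df2["A"] + df2["B"]  # new column derived from existing ones

# assign() returns a new DataFrame instead of modifying df2 in place.
df3 = df2.assign(total=df2[list("ABCD")].sum(axis=1))

print(df3)
```

Assigning to df2["E"] modifies df2 in place, whereas assign leaves its input alone, which is convenient when chaining several steps.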

# A longer example: rolling helpers and a groupby-based volatility factor (DASTD)
import math

import numpy as np
import pandas as pd


def TS_SUM(series, number):
    # Rolling sum over the last `number` rows, built from repeated shifts
    number = int(number)
    shift = series
    for k in range(number - 1):
        shift = shift.shift(1)
        series = series + shift
    return series


def ExpoDecay(array, halflife, number):
    # Weighted average with weights d**0, d**1, ..., d**(number-1)
    halflife = int(halflife)
    d = math.pow(0.5, 1 / halflife)
    DecayWGT = np.logspace(0, number - 1, number, base=d)
    return sum(array * DecayWGT) / sum(DecayWGT)


def TS_AVERAGE(series, number):
    # Rolling mean over the last `number` rows (NaN for the first number-1 rows)
    number = int(number)
    shift = pd.Series(series)
    for k in range(number - 1):
        shift = shift.shift(1)
        series = series + shift
    return series / number


def TS_wgdStd(series, number, halflife):
    # Rolling weighted standard deviation with exponentially decaying weights
    halflife = int(halflife)
    d = math.pow(0.5, 1 / halflife)
    # Weights are applied in window order, so the oldest row in each window
    # gets weight d**0; reverse DecayWGT if the newest row should weigh most.
    DecayWGT = np.logspace(0, number - 1, number, base=d)
    avg = TS_AVERAGE(series, number)
    square = (series - avg) * (series - avg)
    print('Computing DASTD')
    l = len(series)
    result = [np.nan] * l
    for k in range(number - 1, l):
        sub_square = square.iloc[k - number + 1:k + 1]
        result[k] = math.sqrt(np.average(sub_square, weights=DecayWGT))
    return result


# Compute the volatility factor (DASTD) separately for each stock code
def DASTD(data):
    data = pd.DataFrame(data)
    data['DASTD'] = data.groupby('code')['ret_td'].transform(
        lambda x: TS_wgdStd(x, 250, halflife=40))
    print(data['DASTD'])
    print('done')
    return data['DASTD']


total = pd.read_csv(r"C:\Users\lenovo\Desktop\实习\python\所有数据.csv")
# .copy() avoids assigning a new column to a slice of `total`
pingan = total[total['code'] == '000001.SZ'].copy()
pingan['DASTD'] = TS_wgdStd(pingan['ret_td'], 250, halflife=40)
print(pingan)
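The groupby('code')[...].transform(...) pattern used in DASTD can be tried on a tiny table. This is a sketch under stated assumptions: the data below is invented, the column name ret_mean_2 is hypothetical, and the built-in rolling(...).mean() stands in for the hand-written TS_AVERAGE helper.

```python
import pandas as pd

# Invented returns for two hypothetical stock codes.
data = pd.DataFrame({
    "code": ["A", "A", "A", "B", "B", "B"],
    "ret_td": [0.01, 0.02, 0.03, 0.10, 0.20, 0.30],
})

# transform() runs the function once per group and returns a result
# aligned with the original rows, so it can be stored as a new column.
data["ret_mean_2"] = (
    data.groupby("code")["ret_td"]
    .transform(lambda s: s.rolling(window=2).mean())
)

print(data)
```

Because the rolling window is applied inside each group, the first row of code B is NaN rather than an average that mixes in the last row of code A.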