Python in Finance

These notes do not cover DataFrame slicing, selection, or assignment; they cover methods only.

Science solutions

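| purpose | func | return | note |
|---|---|---|---|
| solution | fsolve(func,x0,args) | unpack [0] | equations in one unknown, two unknowns, … |
| solution | bisect(func,a,b) | | requires func(a)*func(b)<0 |
| optimize | minimize(func,x0,bounds,constraints) | res.fun = minimum value, res.x = x result | constraints: type ineq $\to \geq 0$; bounds: infinity = None |
| math | derivative(func,x0,dx,n,args) | | n = nth order |
| math | quad(func,a,b,args) | unpack [0] | a = lower bound |
| math | dblquad(func,a,b,gfun,hfun) | unpack [0] | the first argument of func is integrated first (inner integral); gfun is the inner lower bound given as a function, e.g. gfun=lambda x:3 |
| reg | interp1d(x,y,kind) | | passes through every point; kind='quadratic', 'cubic' |
| reg | np.polyfit(x,y,deg), polyval(paramt,xnew) | | paramt is the coefficient array; order: highest $\to$ lowest order, then intercept |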

find solution

scipy.optimize.fsolve(func,x0,args)

>>> return: array([2.,])

scipy.optimize.bisect(func,a,b)

func(a)*func(b)<0
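
A minimal sketch of both root finders on the same problem. The function `func` and the constant 8.0 are made up for illustration; the root is x = 2.

```python
from scipy.optimize import bisect, fsolve

# Hypothetical test function: x**3 - c has its real root at c**(1/3)
def func(x, c):
    return x**3 - c

# fsolve returns an ndarray, so unpack with [0]
root_fsolve = fsolve(func, x0=1.0, args=(8.0,))

# bisect returns a float; func(0) and func(5) have opposite signs
root_bisect = bisect(lambda x: func(x, 8.0), 0.0, 5.0)

print(root_fsolve[0], root_bisect)
```

fsolve needs only a starting guess, while bisect needs a bracketing interval but is guaranteed to converge for a continuous function.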

find best

scipy.optimize.minimize(func,x0,bounds,constraints)

constraints = (
    {'type':'ineq','fun':lambda x:x[0]+x[1]},
    {'type':'eq','fun':lambda x:x[0]+x[1]}
)
bounds = (
    (0,None),
    (None,None)
)

type = 'ineq' means fun(x) $\geq 0$; type = 'eq' means fun(x) $= 0$
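
A small runnable sketch of a constrained minimisation. The objective and the constraint are invented for illustration; the constraint x0 + x1 <= 2 is rewritten as 2 - x0 - x1 >= 0 to fit the 'ineq' convention.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical objective: squared distance to the point (1, 2)
def objective(x):
    return (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2

# 'ineq' means fun(x) >= 0, so x0 + x1 <= 2 becomes 2 - x0 - x1 >= 0
constraints = ({'type': 'ineq', 'fun': lambda x: 2.0 - x[0] - x[1]},)
# x0 >= 0, x1 unbounded (None stands for infinity)
bounds = ((0, None), (None, None))

res = minimize(objective, x0=np.array([0.0, 0.0]),
               bounds=bounds, constraints=constraints)
print(res.x, res.fun)
```

The unconstrained minimum (1, 2) violates the constraint, so the solution sits on the line x0 + x1 = 2 at roughly (0.5, 1.5).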

Differentiation

scipy.misc.derivative(func,x0,dx=1.0,n=1,args)

n=nth derivative order
dx=spacing (step size of the finite difference); a smaller dx is not automatically more accurate

DF.apply(
    lambda x:derivative(
        lambda s: BS(*(x[:4]),s,x[5]),
        x[4],
        dx=0.01
        ),
    axis=1
)
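
Recent SciPy releases no longer ship scipy.misc.derivative, so the sketch below uses a hand-written central difference as a stand-in. The helper name `central_diff` is hypothetical; it mirrors only the first-derivative case (n=1).

```python
import numpy as np

def central_diff(func, x0, dx=1e-3, args=()):
    """First derivative by a central difference: (f(x0+dx) - f(x0-dx)) / (2*dx)."""
    return (func(x0 + dx, *args) - func(x0 - dx, *args)) / (2.0 * dx)

# d/dx sin(x) at 0 is cos(0) = 1
slope = central_diff(np.sin, 0.0)

# extra positional arguments are forwarded through args, as in the SciPy call
power_slope = central_diff(lambda x, k: x**k, 2.0, args=(3,))   # d/dx x^3 at 2 is 12
```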

scipy.integrate.quad(func,a,b,args)

a=lower bound
b=upper bound

quad(fun,0,np.inf)
# returns a tuple of (result, error estimate)
>>> return (21.33333,2.34e-13) 

scipy.integrate.dblquad(func,a,b,gfun,hfun)

gfun=lower bound function of the inner integral; it must be a function even when the bound is a constant
hfun=upper bound function of the inner integral
For the integral below, with integrand fun(y,x) = $xy^2$:
$$\int_0^2 \int_0^1 x y^2 \, dy \, dx$$
The integration order is first y, then x, so put y ahead of x in the lambda function (fun).

# put y ahead of x for real order
fun = lambda y,x: y**2*x
# always first lower bound then higher bound
dblquad(fun,0,2,lambda x:0,lambda x:1)
# returns a tuple of (result, error estimate)
>>> return (0.66667,7.401e-15) 
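
A self-contained sketch of both integrators. The first integrand, exp(-x), is chosen only because its integral over [0, inf) is known to be 1; the second is the double integral from the text.

```python
import numpy as np
from scipy.integrate import dblquad, quad

# quad returns (value, estimated absolute error)
value, err = quad(lambda x: np.exp(-x), 0, np.inf)

# dblquad integrates func(y, x): y is the inner variable, x the outer one.
# Outer limits a, b are numbers; inner limits gfun, hfun are functions of x.
value2, err2 = dblquad(lambda y, x: x * y**2, 0, 2, lambda x: 0, lambda x: 1)

print(value, value2)
```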

Interpolation

scipy.interpolate.interp1d(x,y,kind='linear') CHECK 1Cp65

It returns a function, just call the function

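import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
x=np.arange(0,10)
y=np.exp(-x/3.0)
# f is a callable that interpolates linearly between the sample points
f=interp1d(x,y)
xnew=np.arange(0,9,0.1)
ynew=f(xnew)
plt.plot(x,y,'o',xnew,ynew,'-')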

numpy.polyfit(x,y,deg), polyval(parameters,xnew)

polyfit returns the coefficients of the fitted polynomial; pass them to polyval to evaluate it.
polyfit result = [$x^k$-coef, $x^{k-1}$-coef, $\dots$, $x$-coef, intercept]

p2 = np.polyfit(x,y,2)
f2 = interp1d(x,y,kind='quadratic')
p3 = np.polyfit(x,y,3)
f3 = interp1d(x,y,kind='cubic')
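
To make the difference concrete, here is a short sketch on the same sample data as above: polyfit gives a least-squares fit that need not touch the points, while interp1d passes through every point.

```python
import numpy as np
from scipy.interpolate import interp1d

x = np.arange(0, 10)
y = np.exp(-x / 3.0)

coefs = np.polyfit(x, y, 2)                       # [x^2 coef, x coef, intercept]
y_fit = np.polyval(coefs, x)                      # least-squares values at the sample points
y_interp = interp1d(x, y, kind='quadratic')(x)    # interpolant evaluated at the sample points

# polyval is just the polynomial written out with the coefficients in that order
manual = coefs[0] * 2.0**2 + coefs[1] * 2.0 + coefs[2]
```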

table operations

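| Purpose | Method | args | note |
|---|---|---|---|
| generate | pd.read_csv | header, names, index_col, parse_dates (is a date column still parsed when it is not the index?), sep=',' | Use double backslashes in Windows paths. Duplicate column names become a, a.1. A clash with the index name also counts as a duplicate. |
| generate | np.linspace | start, stop, num | splits the range into n points, i.e. n-1 segments |
| generate | np.ones/zeros | np.ones(3) / np.ones((3,3)) | a 1-D shape can be a plain int; a 2-D shape must be a tuple |
| gen rand | np.random.rand/randn | rand(3,3), randn(3,3,3) | |
| gen rand | np.random.random/standard_normal | random((3,3)) / standard_normal((3,3)) | |
| gen rand | np.random.seed | | |
| transform | np.arange | start, stop (excluded), step | |
| transform | np.append, df.append | | np.append flattens before appending by default; df.append can only add rows (it was removed in pandas 2.0; use pd.concat) |
| transform | np.concatenate, pd.concat | list of tables, axis | |
| transform | array.astype, df.astype | | change element type |
| label | Series.rename(name/dict/func), df.rename(columns/index) | | Series: a name changes Series.name; a func/dict changes the labels (index). DF: a func/dict changes columns/index; DF.rename(index=str) converts the labels to str |
| del | df.drop | labels (index-like things??), axis=0, inplace | del cannot delete rows, because it cannot be used with .loc |
| del | df.dropna | how, thresh, subset, inplace | thresh=2: keep if the number of non-NaN values >= 2. how='any': drop if any value is NaN; how='all': drop only if all values are NaN |
| op | np.reshape | | array.reshape(4,-1) |
| op | df.rolling | window, min_periods, axis | df['x'].rolling(10).mean() |
| op | df.diff | | |
| cal | np.any, np.all | axis=None | np.any(array, axis=0) |
| cal | np.round | | np.round(array,3) |
| cal | np.isin vs df.isin vs Series.isin | | np.isin/Series.isin ignore labels; df.isin takes labels into account |
| cal np | exp/sqrt/mean/std/sum/cumsum/prod/cumprod | | |
| cal np | maximum | | element-wise comparison |
| fancy | df.apply | func, axis, result_type=None, 'broadcast' (output has the same shape as the input), 'expand' (returns a DataFrame), args (passed on to the applied function) | can only be applied by row or by column |
| fancy | df.applymap | | applied to every element |
| fancy | np.tile | | tile(array,(2,2)): (2,2) is the repetition count, doubling along both the x and y dimensions |
| fancy | df.where, df.mask | | where: entries that do not satisfy the condition are replaced by NaN or by the given table. mask: entries that satisfy the condition are replaced by NaN or by the given table |
| sort | df.sort_values, np.sort, np.argsort | df: by, axis, ascending | np.sort: ascending by default, returns a sorted copy (array.sort() sorts in place); np.argsort: ascending by default |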
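
A few of the rows above are easy to mix up, so here is a small sketch on made-up data showing dropna (how versus thresh), where versus mask, and the flattening behaviour of np.append.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1.0, np.nan, 3.0], 'b': [np.nan, np.nan, 6.0]})

kept_any = df.dropna(how='any')      # drop rows containing at least one NaN
kept_all = df.dropna(how='all')      # drop rows only if every value is NaN
kept_thresh = df.dropna(thresh=1)    # keep rows with at least 1 non-NaN value

small = pd.DataFrame({'a': [1, 5], 'b': [7, 2]})
w = small.where(small > 3)           # entries failing the condition become NaN
m = small.mask(small > 3, 0)         # entries meeting the condition become 0

# without an axis argument, np.append flattens its inputs first
flat = np.append(np.ones((2, 2)), [0, 0])
```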

DF.apply high level skills

import pandas as pd
import numpy as np

data = pd.DataFrame(np.arange(20).reshape(5,-1),columns=list('abcd'),index=list('abcde'))

# expand: specify name with dict
cal_df=data.apply(lambda x:{'sum':sum(x),'std':np.std(x)}, 
                    axis=1, 
                    result_type='expand')
# expand: a list of functions also works; each column is named after its function (a lambda is named <lambda>)
data.apply([np.sum,np.std], axis=1, result_type='expand')

# to attach the results to the original table, just use pd.concat
rst = pd.concat([data,data.apply(lambda x:{'sum':sum(x),'std':np.std(x)}, 
                    axis=1, 
                    result_type='expand')],
                axis=1)

# The following code returns a DataFrame!
# Each row's result is a Series, and stacking those rows yields a DataFrame
data.apply(lambda x: pd.Series([x['a']**2, x['c']**2+np.sum(x)],index=['cc','dd']),axis=1)
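
A compact, checkable version of the two patterns above. It uses sum and max rather than std so the expected values are easy to verify by hand; the column names are arbitrary.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(20).reshape(5, -1),
                    columns=list('abcd'), index=list('abcde'))

# Returning a dict with result_type='expand': the dict keys become column names
stats = data.apply(lambda row: {'sum': row.sum(), 'max': row.max()},
                   axis=1, result_type='expand')

# Returning a Series per row: the Series index becomes the column names
squares = data.apply(
    lambda row: pd.Series([row['a'] ** 2, row['c'] ** 2 + row.sum()],
                          index=['cc', 'dd']),
    axis=1)

print(stats.head(2))
print(squares.head(2))
```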

Table Elements

| obj | args | eg |
|---|---|---|
| pd.Series | index, name (not shown by default) | pd.Series([1,2,3],index=list('abc'),name='e') |

Other

nested lambda function

# a function that returns a function
fun = lambda a,b: lambda x: a*x+b
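
A short usage sketch of the same pattern; the names `make_line` and `double_plus_one` are illustrative only.

```python
# The outer lambda fixes a and b; the inner lambda is the function that gets returned
make_line = lambda a, b: lambda x: a * x + b

double_plus_one = make_line(2, 1)                 # behaves like f(x) = 2*x + 1
values = [double_plus_one(x) for x in range(3)]   # [1, 3, 5]
```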

sqrt for negative numbers

np.sqrt returns nan (with a RuntimeWarning) for negative inputs, so define a helper function to handle them.
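
One possible helper, as a sketch: `safe_sqrt` is a hypothetical name, and what to return for negative inputs (here a fill value) is an assumption, since the notes do not say. NumPy's np.emath.sqrt is an existing alternative that switches to complex output.

```python
import numpy as np

def safe_sqrt(values, fill=np.nan):
    """Hypothetical helper: sqrt where values >= 0, `fill` elsewhere, without warnings."""
    values = np.asarray(values, dtype=float)
    out = np.full(values.shape, fill, dtype=float)
    # only positions where the condition holds are computed; the rest keep `fill`
    np.sqrt(values, out=out, where=values >= 0)
    return out

roots = safe_sqrt([4.0, -9.0, 0.0], fill=0.0)
complex_roots = np.emath.sqrt([4.0, -9.0])   # complex result for negative inputs
```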

assigning values simultaneously

# Python evaluates the right-hand side first and keeps it in memory, then assigns to the left-hand side
data['a'], data['b'] = data['b'].copy(), data['a'].copy()
# this also works
data['a'], data['b'] = data['b'], data['a'].copy()

# It is surprising that the following succeed: one would expect labels to be matched, i.e. always b to b and a to a
# Assigning through fancy indexing works because labels are not matched
data[['a','b']]=data[['b','a']] 
data[['a','b']]=data.loc[:,['b','a']] 

# loc and iloc match labels, so the swap cannot be done directly
data.loc[:,['a','b']]=data.loc[:,['b','a']] # no swap
data.loc[:,['a','b']]=data[['b','a']] # no swap
data.loc[:,['a','b']]=data[['b','a']].values # swaps: works once the labels are stripped
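
The three behaviours can be checked on a tiny made-up frame. The helper `fresh` is hypothetical and only rebuilds the same starting data each time.

```python
import pandas as pd

def fresh():
    # hypothetical helper: a new copy of the same starting data
    return pd.DataFrame({'a': [1, 2], 'b': [3, 4]})

d1 = fresh()
d1[['a', 'b']] = d1[['b', 'a']]                     # columns assigned by position: swaps

d2 = fresh()
d2.loc[:, ['a', 'b']] = d2[['b', 'a']]              # .loc aligns on column labels: no swap

d3 = fresh()
d3.loc[:, ['a', 'b']] = d3[['b', 'a']].to_numpy()   # a plain array carries no labels: swaps
```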

Cases that accept only integers

np.random.randn(3/1)# error: 3/1 returns a float; use 3//1 instead

Sample Codes

Binomial Tree (ndarray operations)

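# 2
# N+1 rows and columns: rolling back from time T to time 0 covers N+1 time steps
fc = np.zeros((N+1,N+1)) 
fp = np.zeros((N+1,N+1))
# N+1 terminal nodes in total
j=np.arange(0,N+1,1)
Ss=S*(u**j)*(d**(N-j))
# payoffs at the final step; row N is step N, from which we roll back to step 0
fc[N,:]=np.maximum(0,Ss-K)
fp[N,:]=np.maximum(0,K-Ss)
# 3
p1=1-p
ert=np.exp(-r*dt)
# range accepts any start/stop, including -1; here i runs N-1, ..., 0
for i in range(N-1,0-1,-1):
    # value at step i = discounted expectation of the up and down values at step i+1
    # Why select a slice instead of updating every element of the row?
    # To avoid 0 and the last value? (row i has only i+1 nodes, columns 0..i)
    fc[i,0:i+1]=ert*(p*fc[i+1,0+1:i+1+1]+p1*fc[i+1,0:i+1])
    fp[i,0:i+1]=ert*(p*fp[i+1,0+1:i+1+1]+p1*fp[i+1,0:i+1])
# 4
c=fc[0,0]
p=fp[0,0]  # note: this reuses the name p, overwriting the up-probability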
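
The snippet above assumes S, K, r, dt, u, d, p and N already exist. Below is a self-contained sketch with illustrative inputs and a Cox-Ross-Rubinstein parameterisation, both of which are assumptions rather than part of the notes. It also rolls back a 1-D vector instead of filling the 2-D arrays, which gives the same time-0 prices.

```python
import numpy as np

# Illustrative inputs (assumed, not from the notes)
S, K, r, sigma, T, N = 100.0, 100.0, 0.05, 0.2, 1.0, 200

dt = T / N
u = np.exp(sigma * np.sqrt(dt))      # CRR up factor (assumed parameterisation)
d = 1.0 / u
p = (np.exp(r * dt) - d) / (u - d)   # risk-neutral up probability
disc = np.exp(-r * dt)

j = np.arange(N + 1)                 # number of up moves at maturity
ST = S * u**j * d**(N - j)
call = np.maximum(ST - K, 0.0)       # European call payoff
put = np.maximum(K - ST, 0.0)        # European put payoff

for i in range(N - 1, -1, -1):       # roll back from step N-1 to step 0
    call = disc * (p * call[1:i + 2] + (1 - p) * call[0:i + 1])
    put = disc * (p * put[1:i + 2] + (1 - p) * put[0:i + 1])

call_price, put_price = call[0], put[0]
print(call_price, put_price)
```

A quick sanity check is put-call parity: call_price - put_price should equal S - K*exp(-r*T) up to rounding.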

Self-designed

Compare two tables of the same shape and keep the larger value element-wise, replacing NaN with 0. Both tables contain NaN, and their labels differ.
$$\max(data, data1, 0) \quad \text{where 0 replaces NaN}$$
np.where/mask

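# Setup
data = pd.DataFrame(np.arange(9).reshape(3,-1),columns=list('abc'))
data1 = pd.DataFrame(np.arange(9).reshape(3,-1),columns=list('efg'))
# data: odd values become NaN
data=data.mask(data%2==1)
# data1: multiples of 3 become NaN
data1=data1.mask(data1%3==0)

# Solution: fill each table's NaN from the other, take the element-wise max, set leftover NaN to 0
# .values on the second operand keeps np.maximum from aligning on the differing column labels
np.maximum(data.mask(data.isna(),data1.values),data1.mask(data1.isna(),data.values).values).fillna(0)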
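
An alternative sketch of the same task that works on plain arrays, so labels never come into play. It swaps in np.fmax, which ignores a NaN when the other operand is a number, and np.nan_to_num for the final fill; this is a different technique from the mask-based one-liner, not a restatement of it.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(9).reshape(3, -1), columns=list('abc'))
data1 = pd.DataFrame(np.arange(9).reshape(3, -1), columns=list('efg'))
data = data.mask(data % 2 == 1)      # odd values -> NaN
data1 = data1.mask(data1 % 3 == 0)   # multiples of 3 -> NaN

# element-wise max that skips NaN; positions where both are NaN stay NaN
larger = np.fmax(data.to_numpy(), data1.to_numpy())

# replace the remaining NaN with 0 and reuse data's labels for the result
result = pd.DataFrame(np.nan_to_num(larger, nan=0.0),
                      index=data.index, columns=data.columns)
print(result)
```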

Matplotlib

| type | plt | ax | fig |
|---|---|---|---|
| Title | plt.title(name) | ax.set_title(name) | |
| x/y-label | plt.ylabel(name) | ax.set_ylabel(name) | |
| limit | plt.ylim((0,1.3)) | | |
| save | | | fig.savefig(path) |
| config | plt.rcParams['figure.figsize']=[12,8] | | |

line styles

'-'
'--'
'-.'
':'
'o' dot

plot

# plt.plot
plt.plot(x,y,'k-',x1,y1,'go')
# DF.plot
DF.plot(ax=ax,style=['bo'])

scatter

# plt
plt.plot(x,y,'ko')

# DF scatter
DF.plot.scatter(
    ax=ax,x='xname',y='yname',
    style=['go'], label='y-axisname')
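
A minimal end-to-end sketch using the ax-style calls from the table above. The data, the title text and the commented-out file name are placeholders; the Agg backend is selected only so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")                 # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1, 11)
fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(x, x**2, 'k-', x, x, 'go')    # black solid line, then green dots
ax.set_title('title')
ax.set_ylabel('y')
ax.set_ylim((0, 1.3))
# fig.savefig('figure.png')           # hypothetical output path
```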

check

# 1. DF can be assigned to Series?
df1['D']=df1[['D']].rolling(3).mean()

# 2. 2D array slicing
array[[1,2,3],[1,1,1]]# picks the values at positions (1,1), (2,1), (3,1)
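
A quick check of the second item on a made-up 4x4 array, with np.ix_ shown for contrast when the full row-by-column block is wanted instead.

```python
import numpy as np

arr = np.arange(16).reshape(4, 4)

# paired index lists pick individual elements: (1,1), (2,1), (3,1)
picked = arr[[1, 2, 3], [1, 1, 1]]

# np.ix_ builds the cross product of the two lists, giving a 3x3 block
block = arr[np.ix_([1, 2, 3], [1, 1, 1])]
```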