Stage 1: Getting Started with Quantitative Investing Through a Simple Strategy
1-3 Moving Average Crossover Strategy 2
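At the end of the previous article, 1-2 Moving Average Crossover Strategy 1, we noted:
If we buy the stock on the first day, hold it the whole time, and sell it on the last day, the profit is $124.02 per share, a return of 412%.
If we trade according to our strategy, we complete 21 trades in total, for a profit of $82.35 per share, a return of 273%.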
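After a closer look, I found two things that need to be improved:
1. The backtest should use adjusted prices.
2. The strategy's return is calculated incorrectly. The gains or losses of earlier trades must be fully taken into account, because they affect the returns of later trades.
The code is modified to address these two points.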
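The first point can be illustrated with a small sketch. The author's `stockdata_preProcess.ohlc_adjust` is not shown in this post, so the helper below, `ohlc_adjust_sketch`, is a hypothetical stand-in. It assumes the adjustment simply scales Open, High, Low and Close by the ratio of 'Adj Close' to 'Close', as the comments in the full code suggest. The toy prices are made up.

```python
import pandas as pd


def ohlc_adjust_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: scale OHLC columns by Adj Close / Close."""
    ratio = df["Adj Close"] / df["Close"]
    adjusted = df.copy()
    for col in ["Open", "High", "Low", "Close"]:
        adjusted[col] = df[col] * ratio
    return adjusted


# Made-up data: on the first day the adjusted close is half the raw close
toy = pd.DataFrame({
    "Open": [10.0, 20.0],
    "High": [12.0, 22.0],
    "Low": [9.0, 18.0],
    "Close": [10.0, 20.0],
    "Adj Close": [5.0, 20.0],
})
adjusted_toy = ohlc_adjust_sketch(toy)
print(adjusted_toy)
```

After the adjustment, 'Close' equals 'Adj Close' on every row, and the other three price columns are scaled by the same factor, so the shape of each candle is preserved.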
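Assuming an initial capital of 1 million, the resulting portfolio value curve is as follows:
After the fix, our strategy ends the period with total capital of 2.98 million, which is a total return of 198%.
Although the error in the return calculation is now fixed, the result is discouraging: the return we get from the moving average strategy is indeed much lower than simply holding the stock.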
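To see why the second point matters, here is a minimal sketch with two made-up trades. It assumes the whole balance is reinvested in every trade and that fractional shares are allowed, which is simpler than the batch-based sizing in the full code. It compares the sum of per-share profits with the compounded return, and also reproduces the 198% figure from the 1 million and 2.98 million quoted above.

```python
# Made-up (buy price, sell price) pairs: +20% on the first trade, -10% on the second
trades = [(10.0, 12.0), (20.0, 18.0)]

# Old approach: add up the per-share profits, ignoring position size
per_share_sum = sum(sell - buy for buy, sell in trades)

# Compounded approach: every trade reinvests whatever the previous one left
initial_cash = 1_000_000.0
cash = initial_cash
for buy, sell in trades:
    shares = cash / buy   # assumption: fractional shares are allowed
    cash = shares * sell
total_return = cash / initial_cash - 1

print(f"sum of per-share profits: {per_share_sum:.2f}")
print(f"compounded total return:  {total_return:.2%}")

# The figure quoted in the text: 1 million growing to 2.98 million
quoted_return = 2_980_000 / 1_000_000 - 1
print(f"quoted total return:      {quoted_return:.2%}")
```

The per-share sum is zero here, yet the account has grown by about 8%, because the gain on the first trade enlarged the position taken in the second.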
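Note: I accidentally overwrote the first version of this post, so what you are reading is a rewrite with much less detail. I really don't feel like writing it all out again, so please bear with me and go straight to the code.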
Full code
import numpy as np
import pandas as pd
import pandas_datareader.data as web
import matplotlib.pyplot as plt
import datetime
import time
import draw_candle
import stockdata_preProcess as preProcess
##### In stock_movingAverageCross_Strategy.py
# we gave a simple stock-trading strategy based on moving averages.
# It is still imperfect, and we found it necessary to use adjusted prices,
# so stock_movingAverageCross_Strategy2.py tries to improve the strategy.
##### Read the data from CSV
apple = pd.read_csv(filepath_or_buffer='data_AAPL.csv')
# Note that some data types change when the data is read back from the .csv file.
# For example, 'Date' should be the index of the DataFrame, and its type has changed from datetime to string.
# These changes would break our methods in draw_candle.py,
# so we need to make the following fixes.
# Parse the 'Date' strings into datetimes and use them as the index
apple['DateTime'] = pd.to_datetime(apple['Date'], format="%Y-%m-%d")
del apple['Date']
apple = apple.set_index('DateTime')
##### We need to use adjusted prices
# Yahoo only provides the adjusted price for 'Close',
# but the other prices are easy to adjust using the ratio of 'Adj Close' to 'Close'.
# From here on, the code uses the adjusted data apple_adj.
apple_adj = preProcess.ohlc_adjust(apple)
##### Compute the trade information as before, using adjusted prices
apple_adj["20d"] = np.round(apple_adj["Close"].rolling(window = 20, center = False).mean(), 2)
apple_adj["50d"] = np.round(apple_adj["Close"].rolling(window = 50, center = False).mean(), 2)
apple_adj["200d"] = np.round(apple_adj["Close"].rolling(window = 200, center = False).mean(), 2)
apple_adj['20d-50d'] = apple_adj['20d'] - apple_adj['50d']
apple_adj["Regime"] = np.where(apple_adj['20d-50d'] > 0, 1, 0)
apple_adj["Regime"] = np.where(apple_adj['20d-50d'] < 0, -1, apple_adj["Regime"])
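# Temporarily set the last day's regime to 0 so that a position still open at the end gets a closing signal.
# .ix was removed from pandas, so use positional indexing with .iloc instead.
regime_col = apple_adj.columns.get_loc("Regime")
regime_orig = apple_adj.iloc[-1, regime_col]
apple_adj.iloc[-1, regime_col] = 0
apple_adj["Signal"] = np.sign(apple_adj["Regime"] - apple_adj["Regime"].shift(1))
apple_adj.iloc[-1, regime_col] = regime_orig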
apple_adj_signals = pd.concat([
    pd.DataFrame({"Price": apple_adj.loc[apple_adj["Signal"] == 1, "Close"],
                  "Regime": apple_adj.loc[apple_adj["Signal"] == 1, "Regime"],
                  "Signal": "Buy"}),
    pd.DataFrame({"Price": apple_adj.loc[apple_adj["Signal"] == -1, "Close"],
                  "Regime": apple_adj.loc[apple_adj["Signal"] == -1, "Regime"],
                  "Signal": "Sell"}),
])
apple_adj_signals.sort_index(inplace = True)
# Each long trade starts at a "Buy" signal in regime 1 and ends at the next signal
apple_adj_long_profits = pd.DataFrame({
    # The comparison needs its own parentheses: "&" binds tighter than "=="
    "Price": apple_adj_signals.loc[(apple_adj_signals["Signal"] == "Buy") &
                                   (apple_adj_signals["Regime"] == 1), "Price"],
    "Profit": pd.Series(apple_adj_signals["Price"] - apple_adj_signals["Price"].shift(1)).loc[
        apple_adj_signals.loc[(apple_adj_signals["Signal"].shift(1) == "Buy") & (apple_adj_signals["Regime"].shift(1) == 1)].index
    ].tolist(),
    "End Date": apple_adj_signals["Price"].loc[
        apple_adj_signals.loc[(apple_adj_signals["Signal"].shift(1) == "Buy") & (apple_adj_signals["Regime"].shift(1) == 1)].index
    ].index
})
#draw_candle.pandas_candlestick_ohlc(apple_adj, stick = 45, otherseries = ["20d", "50d", "200d"])
##### Run a simple analysis again
# Compute a rough profit (ignoring transaction fees)
rough_profit = apple_adj_long_profits['Profit'].sum()
print(rough_profit)
# Compute the profit if we do not trade at all
# (take a long position on the first day and sell it on the last day of the data)
no_operation_profit = apple['Adj Close'].iloc[-1] - apple['Adj Close'].iloc[0]
print(no_operation_profit)
tradeperiods = pd.DataFrame({"Start": apple_adj_long_profits.index,"End": apple_adj_long_profits["End Date"]})
apple_adj_long_profits["Low"] = tradeperiods.apply(lambda x: min(apple_adj.loc[x["Start"]:x["End"], "Low"]), axis = 1)
#print(apple_adj_long_profits)
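cash = 1000000  # initial capital
apple_backtest = pd.DataFrame({"Start Port. Value": [],
                               "End Port. Value": [],
                               "End Date": [],
                               "Shares": [],
                               "Share Price": [],
                               "Trade Value": [],
                               "Profit per Share": [],
                               "Total Profit": [],
                               "Stop-Loss Triggered": []})
port_value = 1   # fraction of the available cash committed to each trade
batch = 100      # number of shares per batch
stoploss = .2    # stop-loss threshold as a fraction of the entry price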
backtest_rows = []  # one dict per completed trade
for index, row in apple_adj_long_profits.iterrows():
    # Maximum number of batches of stock we can buy.
    # "//" is floor division: it returns the largest integer not greater than the quotient.
    batches = np.floor(cash * port_value) // np.ceil(batch * row["Price"])
    trade_val = batches * batch * row["Price"]
    if row["Low"] < (1 - stoploss) * row["Price"]:  # Account for the stop-loss
        #share_profit = np.round((1 - stoploss) * row["Price"], 2)
        # I think the commented-out line above should be changed to the line below:
        # when the stop-loss triggers, the profit per share is -stoploss * price.
        share_profit = np.round(-stoploss * row["Price"], 2)
        stop_trig = True
    else:
        share_profit = row["Profit"]
        stop_trig = False
    profit = share_profit * batches * batch
    backtest_rows.append({
        "Start Port. Value": cash,
        "End Port. Value": cash + profit,
        "End Date": row["End Date"],
        "Shares": batch * batches,
        "Share Price": row["Price"],
        "Trade Value": trade_val,
        "Profit per Share": share_profit,
        "Total Profit": profit,
        "Stop-Loss Triggered": stop_trig
    })
    # Carry the result of this trade into the next one
    cash = max(0, cash + profit)
# DataFrame.append was removed in pandas 2.0, so build the frame once from the collected rows,
# reusing the column layout declared above
apple_backtest = pd.DataFrame(backtest_rows, index=apple_adj_long_profits.index,
                              columns=apple_backtest.columns)
print(apple_backtest)
apple_backtest["End Port. Value"].plot()
plt.show()
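The regime and signal steps in the listing above can be checked on a small synthetic series. This is only a sketch: the prices are made up, and the window lengths 2 and 3 are placeholders chosen so that the toy series produces crossovers, not the 20 and 50 days used in the script.

```python
import numpy as np
import pandas as pd

# Made-up prices: a dip, a rally, then a decline
prices = pd.Series([10, 9, 8, 9, 11, 13, 12, 10, 8, 7], dtype=float)

fast = prices.rolling(window=2).mean()  # stands in for the 20-day average
slow = prices.rolling(window=3).mean()  # stands in for the 50-day average
diff = fast - slow

# Regime is 1 when the fast average is above the slow one, -1 when below, 0 otherwise
# (rows where the averages are still NaN fall into the 0 case)
regime = pd.Series(np.where(diff > 0, 1, np.where(diff < 0, -1, 0)), index=prices.index)

# A signal fires on the day the regime changes: +1 for an upward change, -1 for a downward one
signal = np.sign(regime - regime.shift(1))

print(pd.DataFrame({"price": prices, "regime": regime, "signal": signal}))
```

In this toy series the only buy signal appears at position 4, where the fast average crosses above the slow one. Sell signals appear at positions 2 and 7. The one at position 2 is a warm-up effect: the regime moves from 0 to -1 as soon as both averages become available, with no position to close.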