Python Machine Learning

Latest recommended article published on 2022-04-24 17:35:43

Py小白白白白白


Category: Python Machine Learning. Tags: python, machine learning, data science, personal study notes


Brief Overview

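Machine learning combines three disciplines: statistics, artificial intelligence, and computer science.
Put simply, machine learning feeds a large amount of data (split by some ratio into a training set and a test set) into a black box (an algorithm). Training on one part and testing on the other produces useful information. That is: data -> algorithm (black box) -> information
Machine learning is commonly divided into supervised learning and unsupervised learning.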
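
The data -> algorithm -> information flow described above can be sketched in a few lines of plain Python. This is only an illustration on made-up data: the helper names `split_train_test` and `fit_line` are hypothetical, and a one-variable least-squares line stands in for the "black box".

```python
import random

def split_train_test(rows, test_size=0.2, seed=0):
    """Shuffle the rows and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_size))
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(points):
    """Ordinary least squares for y = a*x + b on (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Made-up data that follows y = 2x + 1 exactly
data = [(float(x), 2.0 * x + 1.0) for x in range(20)]

train, test = split_train_test(data)   # data
a, b = fit_line(train)                 # algorithm (the "black box")

# Information: how far off is the fitted line on data it never saw?
worst_error = max(abs((a * x + b) - y) for x, y in test)
print(len(train), len(test))
print(a, b, worst_error)
```

The test set is held back during fitting, so the error measured on it says something about how the model behaves on unseen data.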

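Example: predict how Ping An Bank's stock price will change over the next 7 days and evaluate how accurate the prediction is (code implementation)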

import pandas as pd
import tushare as ts  # if not installed: pip install tushare / conda install tushare
import math
import numpy as np
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

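# Download historical daily data for Ping An Bank (stock code 000001)
df = ts.get_hist_data('000001')
# Sort by date, oldest first, so that shift(-n) below looks forward in time
df = df.sort_index()
df = df[['open', 'high', 'close', 'low', 'volume']]
# Intraday range and open-to-close change, both in percent
df['High_Low_Pct'] = (df['high'] - df['low']) / df['low'] * 100
df['Change_Pct'] = (df['close'] - df['open']) / df['open'] * 100

df = df[['close', 'High_Low_Pct', 'Change_Pct', 'volume']]

pd.set_option('display.max_rows', 1000)
pd.set_option('display.max_columns', 1000)

# print(df.head())
future_value = 'close'
df.fillna(value=-99999, inplace=True)
# Forecast horizon: 1% of the number of rows
how_far_I_want_to_forecast = int(math.ceil(0.01 * len(df)))
# print(how_far_I_want_to_forecast)
# Label = closing price how_far_I_want_to_forecast rows later
df['label'] = df[future_value].shift(-how_far_I_want_to_forecast)

# Scale the features before dropping rows, so that the most recent rows
# (which have no label yet) can still be used for the forecast
x = np.array(df.drop(columns=['label']))
x = preprocessing.scale(x)
x_recent_real_data = x[-how_far_I_want_to_forecast:]
x = x[:-how_far_I_want_to_forecast]

# Drop the rows without a label; the remaining rows line up with x
df.dropna(inplace=True)
# print(df.head())
y = np.array(df['label'])

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

black_box = LinearRegression()
black_box.fit(x_train, y_train)

forecast_set = black_box.predict(x_recent_real_data)
print(forecast_set)

# R^2 score on the held-out test set
accuracy = black_box.score(x_test, y_test)
print(accuracy)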
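
The feature and label construction in the listing above can be checked by hand on a tiny series. The sketch below uses made-up prices and a hypothetical horizon of 2 rows (the listing derives its horizon from 1% of the row count); plain lists stand in for the DataFrame columns, and slicing plays the role of `shift(-horizon)`.

```python
# Made-up daily bars, oldest first: (open, high, low, close)
bars = [
    (10.0, 10.5, 9.8, 10.2),
    (10.2, 10.4, 10.0, 10.1),
    (10.1, 10.6, 10.0, 10.5),
    (10.5, 10.8, 10.4, 10.6),
    (10.6, 10.7, 10.2, 10.3),
]

def features(bar):
    """Return (High_Low_Pct, Change_Pct) for one bar, as in the listing."""
    open_, high, low, close = bar
    high_low_pct = (high - low) / low * 100
    change_pct = (close - open_) / open_ * 100
    return high_low_pct, change_pct

horizon = 2  # hypothetical value, chosen only for this illustration
closes = [bar[3] for bar in bars]

# Equivalent of shift(-horizon): each row is paired with the close
# `horizon` rows later; the last `horizon` rows have no label yet.
labels = closes[horizon:] + [None] * horizon

for bar, label in zip(bars, labels):
    hl, ch = features(bar)
    print(f"High_Low_Pct={hl:.2f} Change_Pct={ch:.2f} close={bar[3]} label={label}")
```

The rows whose label is `None` are the most recent ones. They cannot be used for training, but their features are exactly what the model needs in order to forecast prices that have not happened yet.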
----------------------------------
# Output (the exact values change from run to run, since train_test_split shuffles the data randomly)
[9.52053828 9.86518547 9.39170026 9.41294818 9.34341797 9.34482912
 9.29581423]
0.9106378525453049
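
The last number in the output comes from `score`, which for scikit-learn regressors such as `LinearRegression` returns the coefficient of determination R², not a classification-style accuracy. A minimal sketch of that formula in plain Python, on made-up values (the function name `r2_score` here is a local helper written for illustration):

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

# Made-up actual and predicted closing prices
y_true = [9.0, 9.5, 10.0, 10.5]
y_pred = [9.1, 9.4, 10.1, 10.4]

r2 = r2_score(y_true, y_pred)
print(round(r2, 3))  # 0.968

# Reference points: a perfect prediction scores 1.0,
# always predicting the mean scores 0.0.
perfect = r2_score(y_true, y_true)
baseline = r2_score(y_true, [sum(y_true) / len(y_true)] * len(y_true))
```

A value near 0.91 therefore means the model explains most of the variance of the test labels. It does not mean that 91% of the predictions were "correct".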