Machine Learning in Action
进击的小杨人
The harder you work, the easier it gets
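1-Logistic Regression in Practice
Logistic Regression
The data
We will build a logistic regression model to predict whether a student is admitted to a university. Suppose you are the administrator of a university department and want to determine each applicant's chance of admission from the results of two exams. You have historical data from previous applicants, which you can use as the training set for logistic regression. For each training example, you have the applicant's scores on the two exams and the admission decision. To do this, we will build a classification model that estimates the probability of admission from the exam scores.
import nump...
Original · 2019-03-13 00:13:09 · 324 views · 1 comment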
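The setup described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the article's code: the synthetic exam scores, the admission rule used to label them, and the helper names `fit_logistic` and `predict_proba` are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def add_bias(X):
    """Prepend a column of ones so theta[0] acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit logistic regression with plain batch gradient descent."""
    Xb = add_bias(X)
    theta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        # Gradient of the mean cross-entropy loss
        grad = Xb.T @ (sigmoid(Xb @ theta) - y) / len(y)
        theta -= lr * grad
    return theta

def predict_proba(X, theta):
    """Estimated admission probability for each row of X."""
    return sigmoid(add_bias(X) @ theta)

# Hypothetical data: two exam scores per applicant, admitted when the
# total exceeds 130 (an assumed rule, used only to create labels).
rng = np.random.default_rng(0)
scores = rng.uniform(30, 100, size=(200, 2))
admitted = (scores.sum(axis=1) > 130).astype(float)

# Standardize the features so gradient descent converges quickly.
mu, sigma = scores.mean(axis=0), scores.std(axis=0)
theta = fit_logistic((scores - mu) / sigma, admitted)

probs = predict_proba((scores - mu) / sigma, theta)
accuracy = np.mean((probs >= 0.5) == admitted)

# Compare a strong and a weak applicant.
strong, weak = predict_proba((np.array([[95.0, 95.0], [35.0, 35.0]]) - mu) / sigma, theta)
```

A threshold of 0.5 on the estimated probability turns the model into a classifier; in practice a library implementation such as scikit-learn's `LogisticRegression` would usually replace the hand-written loop.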
Machine Learning in Action 2: Analyzing Kobe Bryant's Career Scoring Data
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Load the data
filename = './data/kobe.csv'
raw = pd.read_csv(filename)
print(raw.shape)
print(raw.head())
(30697, 25)
action_ty...
Original · 2019-03-29 17:39:12 · 3623 views · 2 comments
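Machine Learning in Action 3.1: Time Series with pandas
import numpy as np
import pandas as pd
Time series
Timestamp
Fixed period
Time interval
date_range
You can specify a start time and a number of periods, and the frequency can also be combined with a constant
H: hour
D: day (the default)
M: month
# TIMES #2019 March 1 3/1/2019 1/3/2019 2019-03-01 2019/0...
Original · 2019-03-29 17:40:16 · 195 views · 0 comments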
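As a small illustration of the `date_range` notes above, the sketch below builds a daily index and an every-third-day index. The dates and variable names are chosen for the example, and reading "combined with a constant" as an integer multiple in the frequency string (such as `'3D'`) is an assumption.

```python
import pandas as pd

# Start date plus a number of periods; 'D' (calendar day) is the default frequency.
daily = pd.date_range('2019-03-01', periods=5, freq='D')

# A frequency string can carry an integer multiple, here every 3 days.
every_third_day = pd.date_range('2019-03-01', periods=4, freq='3D')

# Different spellings of the same date parse to the same Timestamp
# (month-first is the default for slash-separated dates).
same_day = [pd.Timestamp(s) for s in ('2019-03-01', '3/1/2019')]

print(daily)
print(every_third_day)
```

Hourly and monthly aliases work the same way, although their exact spelling has changed between pandas versions, so they are left out here.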
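Machine Learning in Action 3.2: Resampling and Rolling Windows
import numpy as np
import pandas as pd
Resampling
Converting time series data from one frequency to another
Downsampling
Upsampling
rng = pd.date_range('2019-03-29', periods=30, freq='D')
ts = pd.Series(np.random.randn(len(rng)), index=rng)
print(ts.head())...
Original · 2019-03-29 17:41:38 · 2558 views · 0 comments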
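The three ideas in this entry (downsampling, upsampling, and a rolling window) can be sketched as follows. The series reuses the `date_range` call from the excerpt, but fills it with `0..29` instead of random numbers so the results are easy to check by hand; the chosen frequencies and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

rng = pd.date_range('2019-03-29', periods=30, freq='D')
ts = pd.Series(np.arange(30, dtype=float), index=rng)

# Downsampling: daily values are grouped into 3-day bins and summed.
three_day = ts.resample('3D').sum()

# Upsampling: the 3-day series goes back to daily frequency; the new
# slots have no data, so they are filled forward from the last known value.
daily_again = three_day.resample('D').ffill()

# Rolling window: a 3-observation moving average over the daily series.
rolling_mean = ts.rolling(window=3).mean()

print(three_day.head())
print(daily_again.head())
print(rolling_mean.head())
```

Downsampling always needs an aggregation (sum, mean, and so on), while upsampling needs a fill or interpolation strategy because it creates timestamps that had no observations.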
Machine Learning in Action 3.3: Differencing and the ARIMA Model
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.formula.api as smf
import statsmodels.tsa.api as smt
Some visual...
Original · 2019-03-29 17:44:15 · 1442 views · 0 comments
Machine Learning in Action 3.4: Choosing ARIMA Model Parameters
import pandas as pd
import numpy as np
# TSA from Statsmodels
import statsmodels.api as sm
import statsmodels.formula.api as smf
import statsmodels.tsa.api as smt
# Display and Plotting
import matplot...
Original · 2019-03-29 17:46:19 · 5864 views · 2 comments
Machine Learning in Action 3.5: Forecasting Stock Trends
import pandas as pd
import pandas_datareader
import datetime
import matplotlib.pylab as plt
import seaborn as sns
from matplotlib.pylab import style
from statsmodels.tsa.arima_model import ARIMA
from ...
Original · 2019-03-29 17:48:40 · 1163 views · 0 comments
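Machine Learning in Action 1.1: Linear Regression for Car Fuel Efficiency Analysis
Import the libraries and the dataset
import pandas as pd
import matplotlib.pyplot as plt
# The columns in the dataset have no labels, so they are added manually; fields are separated by whitespace
columns = [ 'mpg', 'cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'model ...
Original · 2019-03-25 23:06:32 · 1460 views · 0 comments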
Machine Learning in Action 1.2: Logistic Regression for Predicting University Admission
Import the libraries and the dataset
import pandas as pd
import matplotlib.pyplot as plt
admissions = pd.read_csv('./data/admissions.csv')
print(admissions.head())
   admit       gpa         gre
0      0  3.177277  594.102992
1...
Original · 2019-03-26 19:56:02 · 276 views · 0 comments
Machine Learning in Action 1.3: Model Evaluation Metrics Explained
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
admissions = pd.read_csv('./data/admissions.csv')
model = LogisticRegression...
Original · 2019-03-26 22:09:00 · 311 views · 0 comments
10-Diabetes Classification with XGBoost in Practice
import xgboost
from numpy import loadtxt
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
Load the dataset (Pima Indians diabetes patients)
dataset = load...
Original · 2019-03-25 21:59:40 · 2945 views · 3 comments
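9-Neural Networks
Here is a recommended neural network visualization site that makes it easier to understand how neural networks work.
URL: https://cs.stanford.edu/people/karpathy/convnetjs/demo/classify2d.html
Content and code to be added.
Original · 2019-03-25 19:25:35 · 146 views · 0 comments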
8-PCA Dimensionality Reduction in Practice
import numpy as np
import pandas as pd
df = pd.read_csv('./data/iris.data')
print(df.head())
   5.1  3.5  1.4  0.2  Iris-setosa
0  4.9  3.0  1.4  0.2  Iris-setosa
1  4.7  3.2  1.3  0.2  Iris-setosa
...
Original · 2019-03-18 17:46:47 · 263 views · 3 comments
2-Anomaly Detection in Transaction Data
Anomaly detection in transaction data
Reference: https://blog.csdn.net/livan1234/article/details/81048085
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
data = pd.read_csv('./data/creditcard.csv')
data.head()...
Original · 2019-03-13 00:15:56 · 406 views · 0 comments
3-Decision Trees in Practice
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()
print(housing.DESCR)
.. _california_...
Original · 2019-03-13 00:18:00 · 696 views · 0 comments
4-Titanic Passenger Survival in Practice
Data preprocessing
import pandas as pd
titanic = pd.read_csv('./data/titanic_train.csv')
titanic.head()
PassengerId  Survived  Pclass  Name  Sex  Age  SibSp  ...
Original · 2019-03-18 17:24:53 · 200 views · 0 comments
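5-1 Implementing a Bayesian Spell Checker
A real-time spell checker based on Bayes' formula
import re, collections
# Convert the corpus text to lowercase and match every run of letters
def words(text):
    return re.findall('[a-z]+', text.lower())
# Give new words a default frequency and count word frequencies
def train(features):
    # When a new word is encountered, its count defaults to 1 (representing a very small probability)...
Original · 2019-03-18 17:26:48 · 325 views · 0 comments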
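The excerpt cuts off inside `train`, so the sketch below is a guess at the idea the comments describe, not the article's actual code: a word counter whose unseen entries default to 1, so that unknown words get a small but non-zero frequency. The `defaultdict` approach and the sample `corpus` string are assumptions.

```python
import re
import collections

def words(text):
    """Lowercase the text and return every run of letters as a word."""
    return re.findall('[a-z]+', text.lower())

def train(features):
    """Count word frequencies, with unseen words defaulting to a count of 1."""
    # Assumed smoothing: every word starts at 1 instead of 0, so a word
    # never seen in the corpus still has a small non-zero frequency.
    model = collections.defaultdict(lambda: 1)
    for word in features:
        model[word] += 1
    return model

# Hypothetical miniature corpus, for illustration only.
corpus = "The quick brown fox. The lazy dog; the end."
counts = train(words(corpus))

print(counts['the'])    # seen three times, plus the default of 1
print(counts['zebra'])  # never seen, so only the default
```

A spell checker built on this model would rank candidate corrections by these counts, preferring the candidate that occurs most often in the corpus.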
5-2 News Classification with a Bayesian Algorithm in Practice
import pandas as pd
import jieba
Data source: http://www.sogou.com/labs/resource/ca.php
df_news = pd.read_table('./data/val.txt', names=['category','theme','URL','content'],encoding='utf-8')
df_news = df_news...
Original · 2019-03-18 17:28:39 · 1067 views · 1 comment
6-1 Solving SVMs with sklearn
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
import seaborn as sns;sns.set()
%%html
<img src='./images/3.png', width='900'>...
Original · 2019-03-18 17:33:20 · 612 views · 0 comments
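6-2 Face Recognition with SVM in Practice
Face recognition
As a practical application of SVMs, let's look at the face recognition problem. We will use the labeled face images from the sklearn datasets.
from sklearn.datasets import fetch_lfw_people
faces = fetch_lfw_people(min_faces_per_person=60)
print(faces.target_names)
print(faces.images.sha...
Original · 2019-03-18 17:36:38 · 2073 views · 0 comments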
7-1 Simple Image Compression with KMeans
# -*- coding: utf-8 -*-
import numpy as np
from skimage import io
from sklearn.cluster import KMeans
image = io.imread('./images/smile.jpg')
io.imshow(image)
io.show()
rows = image.shape[0]
cols =...
Original · 2019-03-18 17:39:04 · 267 views · 0 comments
7-2 Beer Composition Analysis with KMeans and DBSCAN in Practice
Clustering analysis of the beer dataset
import pandas as pd
beer = pd.read_csv('./data/data.txt', sep=' ')
print(beer)
   name  calories  sodium  alcohol  cost
0  Budweiser  144  15  ...
Original · 2019-03-18 17:43:42 · 2197 views · 0 comments
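Machine Learning in Action 1.4: Logistic Regression and Approaches to Multiclass Classification
import pandas as pd
import matplotlib.pyplot as plt
# The columns in the dataset have no labels, so they are added manually; fields are separated by whitespace
columns = [ 'mpg', 'cylinders', 'displacement', 'horsepower', 'weight', 'acceleration', 'model year', 'or...
Original · 2019-03-26 22:47:32 · 934 views · 0 comments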