Python Data Analysis: Hands-On Practice
Gunther17
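Python in Practice: Generalized Linear Models
I originally wrote this by following someone else's source code, but it contained errors, so this time I work through it hands-on.
    # -*- coding: utf-8 -*-
    """
        Generalized linear models
        ~~~~~~~~~~~~~~~~~~~~~~~~~~
        LinearRegression
        :copyright: (c) 2016 by the huaxz1986.
        :license: lgpl-3.0, see LICE...
Original · 2017-10-21 00:35:09 · 3111 views · 0 comments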
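Understanding the Code: Notes on the k-Nearest Neighbors (kNN) Algorithm from Machine Learning in Action (Python 3)
Today I studied the first machine learning algorithm introduced in the book Machine Learning in Action: the k-nearest neighbors algorithm (kNN). The goal is to understand every step of the code. Code and analysis:
    '''
    Created on Sep 16, 2010
    kNN: k Nearest Neighbors
    Input: inX: vector to compare to existing dataset (1xN)
           dat...
Original · 2018-08-16 00:29:29 · 187 views · 0 comments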
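As a rough illustration of the kNN idea described above (this is not the book's code or the post's code), the following minimal sketch classifies a query vector by majority vote among its k closest training points. The function name `knn_classify` and the toy data are assumptions made for this example.

```python
import math
from collections import Counter

def knn_classify(in_x, dataset, labels, k):
    """Return the majority label among the k training points nearest to in_x."""
    # Euclidean distance from the query vector to every training point
    distances = [math.dist(in_x, point) for point in dataset]
    # Indices of the k training points with the smallest distances
    nearest = sorted(range(len(dataset)), key=lambda i: distances[i])[:k]
    # Majority vote over the labels of those neighbors
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two points near (1, 1) and two near (0, 0)
group = [[1.0, 1.1], [1.0, 1.0], [0.0, 0.0], [0.0, 0.1]]
labels = ["A", "A", "B", "B"]
print(knn_classify([0.1, 0.2], group, labels, k=3))
```

With k=3 the query point (0.1, 0.2) has two "B" neighbors and one "A" neighbor, so the vote goes to "B". `math.dist` requires Python 3.8 or later.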
MovieLens 1M: Python Data Analysis Exercise
Dataset source: https://grouplens.org/datasets/movielens/1m/
Code:
    import pandas as pd
    uname=['user_id','gender','age','occupation','zip']
    users=pd.read_table(r'D:\demo1\ml-1m\users.dat',sep='::',header=N...
Original · 2018-03-01 23:06:41 · 2456 views · 0 comments
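To make the `sep='::'` layout concrete, here is a standard-library sketch that splits each record on `::` and pairs the fields with the column names from the excerpt. The helper `parse_users` and the two sample rows are made up for illustration; they are not taken from the real `users.dat`.

```python
import io

# Column names as listed in the excerpt above
uname = ['user_id', 'gender', 'age', 'occupation', 'zip']

def parse_users(lines, names=uname, sep='::'):
    """Split each non-empty line on sep and pair the fields with names."""
    rows = []
    for line in lines:
        line = line.strip()
        if line:
            rows.append(dict(zip(names, line.split(sep))))
    return rows

# Made-up rows in the same five-field layout, standing in for the file
sample = io.StringIO("1::F::1::10::00000\n2::M::56::16::11111\n")
users = parse_users(sample)
print(users[0]['gender'], len(users))
```

Every value stays a string in this sketch; converting `age` or `occupation` to integers would be a separate step.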
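Exercise with the 1.usa.gov Data from bit.ly
Data source: http://1usagov.measuredvoice.com/2013/
Code:
    # Raw string so the backslashes in the Windows path are not read as escapes
    path=r"D:\demo1\usagov_bitly_data2013-05-17-1368832207/usagov_bitly_data2013-05-17-1368832207.txt"
    print(open(path).readline())
Result:
    { "a...
Original · 2018-03-01 11:33:44 · 1902 views · 0 comments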
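Each line of the data file is a JSON object, so a natural next step is to decode the lines and count one field. The sketch below does this with the standard library on made-up records; the `a` key appears in the excerpt's output, while the `tz` key and all values are assumptions for illustration.

```python
import json
from collections import Counter

# Made-up records, one JSON object per line ("tz" is an assumed field name)
lines = [
    '{"a": "Mozilla/5.0", "tz": "America/New_York"}',
    '{"a": "Opera/9.80", "tz": ""}',
    '{"a": "Mozilla/5.0", "tz": "America/New_York"}',
]

# Decode every line into a dict
records = [json.loads(line) for line in lines]

# Count time zones, treating missing or empty values as "Unknown"
tz_counts = Counter(rec.get("tz") or "Unknown" for rec in records)
print(tz_counts.most_common(1))
```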
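Preprocessing Data: Data Preprocessing with sklearn
Data standardization
preprocessing.scale(X, axis=0, with_mean=True, with_std=True, copy=True): standardizes the data to zero mean and unit variance. This rescales each feature; it does not by itself make the data normally distributed.
preprocessing.minmax_scale(X, feature_range=(0, 1), axis=0, copy=...
Original · 2017-12-16 20:48:27 · 488 views · 0 comments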
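The arithmetic behind these two transforms can be sketched for a single feature column using only the standard library. The helper names `standardize` and `minmax` are hypothetical and are not sklearn functions; the sketch assumes the column is not constant, so neither denominator is zero.

```python
import statistics

def standardize(column):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(column)
    std = statistics.pstdev(column)  # assumes std > 0
    return [(x - mean) / std for x in column]

def minmax(column, feature_range=(0, 1)):
    """Linearly map the column so its min and max hit feature_range."""
    lo, hi = feature_range
    cmin, cmax = min(column), max(column)  # assumes cmax > cmin
    return [lo + (x - cmin) * (hi - lo) / (cmax - cmin) for x in column]

col = [1.0, 2.0, 3.0]
z = standardize(col)
print(z)
print(minmax(col))
```

After standardizing, the column has mean 0 and variance 1, but its shape is unchanged: three evenly spaced values stay evenly spaced.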
Evaluation Metrics Based on the Confusion Matrix: Code Practice with F1 and AUC
    from sklearn.metrics import classification_report
    # y_pred holds the predicted labels
    y_pred, y_true =[1,0,1,0], [0,0,1,0]
    print(classification_report(y_true=y_true, y_pred=y_pred))
Confusion matrix
    from sklearn.metrics import...
Original · 2017-12-17 11:20:29 · 723 views · 0 comments
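To show where such report figures come from, the sketch below counts the confusion-matrix cells by hand for the same `y_true` and `y_pred` and derives precision, recall, and F1 for the positive class. The helper names are hypothetical, and the code assumes the denominators are non-zero.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn

def precision_recall_f1(y_true, y_pred, positive=1):
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp)  # assumes at least one predicted positive
    recall = tp / (tp + fn)     # assumes at least one actual positive
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

y_pred, y_true = [1, 0, 1, 0], [0, 0, 1, 0]
print(confusion_counts(y_true, y_pred))
print(precision_recall_f1(y_true, y_pred))
```

For this data there is one true positive, one false positive, no false negatives, and two true negatives, giving precision 0.5, recall 1.0, and F1 of 2/3 for class 1.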
A First Hands-On Attempt at XGBoost
I start with a small warm-up example:
    import numpy as np
    dataset=np.array([[1.,-1.,2.],[2.,0.,0.],[0.,1.,-1.]])
    X=dataset[0]    # [ 1. -1. 2.]    first row
    X=dataset[:1]   # [[ 1. -1. 2.]]  first row (kept 2-D), rarely used
    X=dataset[:,1]  # [-1. 0. 1.]     second column
    X=dataset[:...
Original · 2017-12-22 20:58:48 · 1920 views · 0 comments
Python in Practice: PCA, SVD, and KernelPCA (Nonlinear PCA) on iris
PCA
    #-*- coding=utf-8 -*-
    import numpy as np
    from sklearn import datasets,decomposition,manifold
    import matplotlib.pyplot as plt
    def load_data():
        iris=datasets.load_iris()
        return iris.data...
Original · 2017-11-11 21:39:47 · 1070 views · 0 comments
Python in Practice: Naive Bayes on Handwritten Digit Bitmaps
First load the data, print it to take a look, and then plot it.
    #-*- coding=utf-8 -*-
    import numpy as np
    from sklearn import datasets,naive_bayes
    from sklearn.model_selection import train_test_split
    import matplotlib.pyplot as plt
    def show_dig...
Original · 2017-11-07 16:25:56 · 1216 views · 0 comments
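Python in Practice: Decision Trees (ID3, C4.5, CART) on sin(x) plus Random Noise
ID3 uses information gain as its feature-selection criterion (larger is better), C4.5 uses the information gain ratio (larger is better), and CART classification trees choose the best feature by the Gini index (smaller is better).
Original · 2017-10-23 14:17:09 · 1004 views · 0 comments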
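The three criteria can be compared on a tiny example. The sketch below is a simplified illustration with hypothetical helper names, not the post's code: it groups samples by every value of a categorical feature, whereas CART itself makes binary splits (for the two-valued features used here the grouping is the same). It also assumes the feature takes at least two distinct values, so the gain-ratio denominator is non-zero.

```python
import math
from collections import Counter

def entropy(values):
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())

def gini(values):
    n = len(values)
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())

def split_by(feature, labels):
    """Group the labels by the corresponding feature value."""
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    return list(groups.values())

def information_gain(feature, labels):  # ID3: larger is better
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in split_by(feature, labels))
    return entropy(labels) - conditional

def gain_ratio(feature, labels):  # C4.5: larger is better
    # Divide by the entropy of the feature's own value distribution
    return information_gain(feature, labels) / entropy(feature)

def gini_index(feature, labels):  # CART: smaller is better
    n = len(labels)
    return sum(len(g) / n * gini(g) for g in split_by(feature, labels))

labels = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]    # separates the classes perfectly
uninformative = ["a", "b", "a", "b"]  # tells us nothing about the class
for name, feat in [("informative", informative), ("uninformative", uninformative)]:
    print(name, information_gain(feat, labels), gain_ratio(feat, labels), gini_index(feat, labels))
```

All three criteria agree here: the informative feature gets the larger gain and gain ratio and the smaller Gini index.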
Python in Practice: Linear Discriminant Analysis (LDA) on iris
LDA prediction
    # -*- coding: utf-8 -*-
    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn import datasets, discriminant_analysis
    def la...
Original · 2017-10-21 21:40:03 · 5389 views · 0 comments
Python in Practice: Logistic Regression on iris
    # -*- coding: utf-8 -*-
    import matplotlib.pyplot as plt
    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn import datasets, linear_model
    def laod_data():
        i...
Original · 2017-10-21 20:25:53 · 2812 views · 0 comments
Notes on the Decision Tree Algorithm from Machine Learning in Action (Python 3)
Computing the Shannon entropy
    from math import log
    def calcShannonEnt(dataset):
        num=len(dataset)
        labelCounts={}
        for featVec in dataset:
            currentlabel=featVec[-1]
            #print(currentlabel)
            if cu...
Original · 2018-09-05 21:40:52 · 203 views · 0 comments
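The excerpt is cut off partway through `calcShannonEnt`. The following is a separate, minimal sketch of the same calculation, not a reconstruction of the author's function: it treats the last element of each row as the class label, as the excerpt does, and computes the entropy in bits. The name `shannon_entropy` and the toy dataset are assumptions.

```python
from collections import Counter
from math import log2

def shannon_entropy(dataset):
    """Entropy (in bits) of the class labels stored in each row's last column."""
    counts = Counter(row[-1] for row in dataset)
    total = len(dataset)
    return -sum((c / total) * log2(c / total) for c in counts.values())

# Toy dataset (hypothetical): two binary features followed by a class label
data = [[1, 1, 'yes'], [1, 1, 'yes'], [1, 0, 'no'], [0, 1, 'no'], [0, 1, 'no']]
print(shannon_entropy(data))
```

With two "yes" rows and three "no" rows the entropy is roughly 0.971 bits; a dataset with a single class would give 0.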