Machine Learning in Python: Lab Notes

Machine Learning in Python

3.19

print("Zeroth Value: %d" % mylist[0])

prints the same output (for an integer value) as

print("Zeroth Value:" , mylist[0])
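A quick check of the claim above, as a minimal sketch: the contents of `mylist` are made up for illustration, and the f-string line is an extra variant not in the book's listing. The output is captured in a buffer so the three lines can be compared.

```python
import io

mylist = [1, 2, 3]  # hypothetical contents, only for illustration

buffer = io.StringIO()
# %-formatting builds one string before printing
print("Zeroth Value: %d" % mylist[0], file=buffer)
# print() with two arguments joins them with a single space
print("Zeroth Value:", mylist[0], file=buffer)
# f-string variant (an addition, not from the book)
print(f"Zeroth Value: {mylist[0]}", file=buffer)

lines = buffer.getvalue().splitlines()
```

All three lines come out identical here because the value is an integer; with a float, `%d` would truncate it while the other two forms would not.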

3.31 Line Plot

(Comparing the effect of different point sizes; just idle experimenting.)

plt.plot([1, 2, 3])


plt.plot(numpy.array([1, 1, 4]))
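Both calls above pass a single sequence, which matplotlib treats as the y values and plots against the indices 0, 1, 2, ... A minimal sketch that checks this without opening a window (the `Agg` backend is my choice, made only so the snippet runs headless):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt
import numpy

fig, ax = plt.subplots()
# A single argument is interpreted as y; x defaults to 0..N-1
(line,) = ax.plot(numpy.array([1, 1, 4]))

x_values = list(line.get_xdata())
y_values = list(line.get_ydata())
```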

3.32 Scatter Plot

plt.scatter(x,y)


plt.scatter(x,y,x)


plt.scatter(x,y,y)


plt.scatter(x,y,z)
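The third positional argument of `scatter` is `s`, the marker size (in points squared), which is why passing `x`, `y` or `z` there changes how large each point is drawn. A minimal sketch with made-up data; `sizes` is a hypothetical name and the `Agg` backend is assumed so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for a headless run
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3])
y = np.array([2, 4, 6])
sizes = np.array([20, 80, 200])  # hypothetical per-point marker sizes

fig, ax = plt.subplots()
# Equivalent to ax.scatter(x, y, sizes): the third positional argument is s
points = ax.scatter(x, y, s=sizes)

stored_sizes = list(points.get_sizes())
```

Writing `s=` explicitly makes it clearer than `plt.scatter(x, y, x)` that the third array controls size rather than a third coordinate.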

3.33 Pandas Series

print(myseries[0])
print(myseries['a'])
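The two prints above read the same element once by position and once by label. A small sketch, assuming `myseries` was built with the labels 'a', 'b', 'c' (the values are made up): `.iloc` and `.loc` make the intent explicit, and recent pandas versions warn about integer keys like `myseries[0]` on a label-indexed Series.

```python
import pandas as pd

# Hypothetical series; values and labels are assumptions for illustration
myseries = pd.Series([1, 2, 3], index=["a", "b", "c"])

first_by_position = myseries.iloc[0]  # explicit positional access
first_by_label = myseries.loc["a"]    # explicit label access
```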

————————————



Chapter 3 summary

————————————————————————————————————————————————————————
4.3 Load CSV Files with NumPy

ValueError: could not convert string to float
  • To load a dataset that contains non-float values,
    pass dtype
data = loadtxt(raw_data, delimiter=",", dtype=str)  # numpy.str was an alias of str and was removed in NumPy 1.24
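A self-contained sketch of the fix, using an in-memory file in place of the real dataset (the two rows and the text column are invented for illustration):

```python
from io import StringIO
import numpy as np

# Hypothetical CSV content with a non-numeric third column
raw_data = StringIO("6,148,tested_positive\n1,85,tested_negative\n")

# With the default dtype=float this raises
# "ValueError: could not convert string ... to float"
data = np.loadtxt(raw_data, delimiter=",", dtype=str)
```

Every field comes back as a string, so numeric columns still need an explicit conversion afterwards, for example `data[:, :2].astype(float)`.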

4.5 loading a CSV URL using NumPy

ValueError: Wrong number of columns at line 161

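Line 161 probably has a different number of columns from the lines before it. The check that raises the error is in read_data() in npyio.py (line 1058 in the installed version):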

            if len(vals) != N:
                line_num = i + skiprows + 1
                raise ValueError("Wrong number of columns at line %d" % line_num)

For example, this input triggers the error on its second line:

1 2
3 4 5
6 7

I do not know how to fix this yet.
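One way to at least locate the offending lines before calling loadtxt is to count the fields per line. This is only a diagnostic sketch, not a fix; `find_ragged_lines` is a hypothetical helper, and the sample input is the three-line example above.

```python
from io import StringIO

def find_ragged_lines(lines, delimiter=None):
    """Return (line_number, field_count) for lines whose field count
    differs from that of the first non-blank line."""
    expected = None
    ragged = []
    for line_num, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.strip().split(delimiter)
        if expected is None:
            expected = len(fields)
        elif len(fields) != expected:
            ragged.append((line_num, len(fields)))
    return ragged

sample = StringIO("1 2\n3 4 5\n6 7\n")
ragged = find_ragged_lines(sample)
```

For a comma-separated file, pass `delimiter=","`. Once the bad lines are known, it is easier to decide whether the data itself is malformed or whether something other than the CSV was downloaded.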

NumPy tutorial

4.7 loading a CSV file using Pandas

  • To view row 0 with Pandas, use data[0:1]; the result is a one-row DataFrame.
  • With NumPy arrays and Python lists, use data[0].
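The difference between the two bullets can be checked directly. A minimal sketch with a made-up two-row CSV held in memory:

```python
from io import StringIO
import pandas as pd

csv_text = "6,148,72\n1,85,66\n"  # hypothetical data
data = pd.read_csv(StringIO(csv_text), header=None)

first_row_frame = data[0:1]      # slicing rows returns a DataFrame
first_row_series = data.iloc[0]  # positional row access returns a Series

array = data.to_numpy()
first_row_array = array[0]       # NumPy: plain indexing returns the first row
```

Note that on a DataFrame, `data[0]` selects the column labelled 0 rather than the first row, which is why the slice form is needed.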

4.9 loading a CSV URL using Pandas

names holds the column labels, so a column can be selected with data['preg']

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']

To view line 161: print(data[161:162])
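A small sketch of how `names` and the row slice fit together, using two invented rows in memory instead of the URL (so row 1 stands in for row 161):

```python
from io import StringIO
import pandas as pd

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']

# Hypothetical rows in the same 9-column layout
csv_text = "6,148,72,35,0,33.6,0.627,50,1\n1,85,66,29,0,26.6,0.351,31,0\n"
data = pd.read_csv(StringIO(csv_text), names=names)

preg_column = data['preg']  # select a column by its label
second_row = data[1:2]      # select a single row as a one-row DataFrame
```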


Chapter 4 summary

————————————————————————————————————————————————————————

5.13 Skew of Univariate Distributions


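  • What skew means

Skew refers to a distribution that is assumed Gaussian (normal or bell curve) that is shifted or squashed in one direction or another. Many machine learning algorithms assume a Gaussian distribution. Knowing that an attribute has a skew may allow you to perform data preparation to correct the skew and later improve the accuracy of your models.


  • The central tendency of a skewed distribution is usually described by the median.

Peak shifted left, long tail on the right: right-skewed, positive skew
Peak shifted right, long tail on the left: left-skewed, negative skew

Compared with the normal distribution, a skewed distribution has two characteristics:

First, it is asymmetric (this is the skew itself);

Second, as the sample size grows, the distribution of its sample mean tends toward a normal distribution.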
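The sign convention and the median remark can be checked with pandas' `skew()`. A minimal sketch on made-up numbers; the column names are hypothetical:

```python
import pandas as pd

data = pd.DataFrame({
    "right_skewed": [1, 1, 2, 2, 3, 10],  # long tail on the right
    "symmetric": [1, 2, 3, 4, 5, 6],
})

skew = data.skew()  # one skew coefficient per column

mean_value = data["right_skewed"].mean()
median_value = data["right_skewed"].median()
```

For the right-skewed column the coefficient is positive and the mean is pulled above the median by the single large value, which is why the median is the more representative summary there.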


Chapter 5 summary

————————————————————————————————————————————————————————

Understand Your Data With Visualization

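  • Reference book: Python数据可视化之matplotlib实践 (Python Data Visualization: matplotlib in Practice)
  • Browse the matplotlib gallery to see the kinds of charts it can produce; clicking a chart in the gallery shows the code used to generate it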
  • colorbar()

6.3 Box and Whisker Plots
6.4

# Correlation Matrix Plot
from matplotlib import pyplot
from pandas import read_csv
import numpy
filename = 'pima-indians-diabetes.data.csv'
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
data = read_csv(filename, names=names)
correlations = data.corr()
# plot correlation matrix
fig = pyplot.figure()  # create a new figure explicitly (plotting commands would otherwise create one automatically)
ax = fig.add_subplot(111)  # "mnp": split the canvas into an m x n grid and use cell p
cax = ax.matshow(correlations, vmin=-1, vmax=1)  # plot a matrix or an array as an image
fig.colorbar(cax)  # add a colorbar to the plot
ticks = numpy.arange(0, 9, 1)
ax.set_xticks(ticks)
ax.set_yticks(ticks)
ax.set_xticklabels(names)
ax.set_yticklabels(names)
pyplot.show()  # open the matplotlib viewer; after the window is closed, later plotting commands start a new figure

6.6

  • from pandas.tools.plotting import scatter_matrix

Error: ModuleNotFoundError: No module named 'pandas.tools'

  • from pandas.plotting import scatter_matrix


This import works.
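A minimal sketch using the working import, on random data rather than the book's dataset (the column names and the `Agg` backend are assumptions made so the snippet runs on its own):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed for a headless run
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(0)
# Hypothetical data: 50 rows, 3 numeric columns
data = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])

# Returns a 2-D array of Axes, one per pair of columns
axes = scatter_matrix(data)
```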

——————————————————————————————————————————————————————

7 pre-processing


My rescale result is wrong.
I do not understand why yet.
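For reference, a sketch of what rescaling usually computes, assuming "rescale" here means min-max scaling of each column to the range [0, 1] (which is also what scikit-learn's `MinMaxScaler` does with its default `feature_range`). The function name and the data are made up; comparing a result against this formula may help locate where it goes wrong.

```python
import numpy as np

def rescale_min_max(X):
    """Scale each column to [0, 1] via (x - min) / (max - min)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X - col_min) / col_range

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 40.0]])
rescaled = rescale_min_max(X)
```

A common source of surprising values is scaling the class column together with the inputs, or scaling across rows instead of down each column.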


Data processing
——————————————————————————————————————————————————————

8 Feature Selection

PCA
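Compute the covariance matrix of the data matrix, then find the eigenvalues and eigenvectors of that covariance matrix, and keep the eigenvectors that belong to the k largest eigenvalues (that is, the directions of largest variance) as the columns of a projection matrix. Multiplying the data by this matrix maps it into the new space and reduces the number of features.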
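The steps above can be sketched with NumPy alone. This is an illustration of the covariance-eigenvector procedure on random data, not the book's listing; `pca_project` is a hypothetical helper name.

```python
import numpy as np

def pca_project(X, k):
    """Project X onto the k eigenvectors of its covariance matrix
    that have the largest eigenvalues."""
    X_centered = X - X.mean(axis=0)          # PCA works on centered data
    cov = np.cov(X_centered, rowvar=False)   # columns are features
    eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh: symmetric matrix
    order = np.argsort(eigenvalues)[::-1][:k]        # indices of the k largest
    components = eigenvectors[:, order]
    return X_centered @ components, eigenvalues[order]

rng = np.random.default_rng(0)
# Hypothetical data: 3 features with very different variances
X = rng.normal(size=(100, 3)) * np.array([5.0, 1.0, 0.1])
projected, top_eigenvalues = pca_project(X, k=2)
```

The variance of each projected column equals the corresponding eigenvalue, which is the sense in which the largest eigenvalues mark the directions of largest variance.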

——————————————————————————————————————————————————————

10 Machine Learning Algorithm Performance Metrics



————————————————————————————————————————————————————————

data leakage

————————————————————————————————————————————————————————

BaggingClassifier

————————————————————————————————————————————————————————

Binomial test
