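A Practical Guide to Getting Started with Machine Learning: Hands-On Machine Learning with Scikit-Learn & TensorFlow (Chapter 2)

I have annotated the book's code in detail and put it on my GitHub. Give me a star (✪ω✪)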

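Q: End to end

In an end-to-end approach, the raw data goes in at one end and the result I want comes out at the other. Only the input and the output matter; you do not concern yourself with the steps in between. Quoted from TopCoderの陳澤澤.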

Q: Scaling and categorizing the median income

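You may wonder why the book turns the median income into categories at all.
The book divides every value by 1.5, rounds up, and merges everything above 5 into category 5. This turns the median income into a category, which is then used for stratified sampling so that each category makes up the same proportion of the test set as of the full dataset. So why divide by 1.5 instead of categorizing the raw values directly? Let's first plot the income distribution:
(Figure: income distribution)
As the plot shows, most of the data falls between 2 and 6. Categorizing the raw values would give the categories (2, 3, 4, 5, 6), because every category needs enough instances and sparsely populated values get merged into another category. The author wants the categories to start from 1, so he divides the data by 1.5, which gives the categories (1, 2, 3, 4, 5). (The book's own stated reason is that dividing by 1.5 limits the number of income categories.) All of these operations exist only to split the data into a training set and a test set. Once the split is done, the category column is dropped and is not used for training alongside the original data, so it has no effect on the training result.
(Figure: category breakdown)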
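
The categorize-then-split procedure described above can be sketched as follows. This is a minimal sketch, not the book's exact code: the `housing` DataFrame here is synthetic (a made-up lognormal `median_income` column stands in for the real dataset), and the names `income_cat`, `strat_train_set` and `strat_test_set` are chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedShuffleSplit

# Synthetic stand-in for the housing data (assumption): median income in
# tens of thousands of dollars. The real dataset is not loaded here.
rng = np.random.default_rng(42)
housing = pd.DataFrame({
    "median_income": rng.lognormal(mean=1.25, sigma=0.45, size=5000),
})

# Divide by 1.5 and round up to get discrete categories 1, 2, 3, ...
housing["income_cat"] = np.ceil(housing["median_income"] / 1.5)
# Merge every category above 5 into category 5.
housing["income_cat"] = housing["income_cat"].where(housing["income_cat"] < 5, 5.0)

# Stratified split: each income category keeps roughly the same share
# in the test set as in the full dataset.
split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_idx, test_idx = next(split.split(housing, housing["income_cat"]))
strat_train_set = housing.iloc[train_idx]
strat_test_set = housing.iloc[test_idx]

# Compare the category proportions in the full data and in the test set.
overall = housing["income_cat"].value_counts(normalize=True).sort_index()
in_test = strat_test_set["income_cat"].value_counts(normalize=True).sort_index()
print(pd.DataFrame({"overall": overall, "test": in_test}))

# The helper column is only needed for the split, so drop it afterwards.
strat_train_set = strat_train_set.drop(columns="income_cat")
strat_test_set = strat_test_set.drop(columns="income_cat")
```

The two printed columns should be nearly identical, which is the point of stratifying on the income category instead of sampling purely at random.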

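Q: Drawing the scatter plot
import matplotlib.pyplot as plt

# housing is the DataFrame loaded from the book's housing dataset
housing.plot(kind="scatter", x="longitude", y="latitude", alpha=0.4,
    s=housing["population"]/100, label="population", figsize=(10,7),
    c="median_house_value", cmap=plt.get_cmap("jet"), colorbar=True,
    sharex=False)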

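About DataFrame.plot(): pandas.DataFrame.plot wraps several of matplotlib's plotting functions. In this example, kind="scatter" corresponds to matplotlib's scatter function, and the extra arguments passed in are scatter's own arguments.
x, y: the x and y coordinates, given here as column names.
alpha: the transparency.
s: short for size. s sets the size of each point in the scatter plot and is a sequence of point sizes. x and y determine the order of the points, which matches the order in s, so the i-th (x, y) point has size s[i]. When there are more points than entries in s, older matplotlib versions start over from s[0]; recent versions may instead require s to be a scalar or to have the same length as x and y. In this example s is the population, so the size of each point reflects the population at that location. Plotting the raw values would make the points far too large, so they are divided by 100 to keep the sizes in check. The blog post by weixin_39462002 on this is worth a read.
c: short for color. Like s, c is a sequence of point colors, and the i-th (x, y) point gets color c[i]. You can pass a custom color sequence such as ['b', 'r'] (b: blue, r: red). This example does not do that. Instead, c is a sequence of numbers, and one of matplotlib's built-in colormaps maps each number to a color according to its magnitude (see the next parameter).
cmap: the colormap. As noted above, a matplotlib colormap can supply the colors for the plot. This example uses 'jet'; the full list of colormaps is in matplotlib's colormap reference.
(Figure: the 'jet' colormap)
The figure shows the colormap named 'jet'. In this example c="median_house_value", so the median house value is the number sequence: the larger the number, the redder the color. In the resulting plot, the redder a point, the higher the house price.
(Figure: house prices and population distribution)
colorbar: whether to show the color bar, that is, the 'jet' bar on the right of the figure above.
sharex=False: works around a display bug (the x-axis values and legend are not shown).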
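
The parameters above can also be tried out by calling matplotlib's scatter directly, which is what kind="scatter" maps to. The following is a minimal sketch under stated assumptions: the `housing` DataFrame is filled with made-up random values, not the real dataset, and the Agg backend is selected only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the housing DataFrame (all values are made up).
rng = np.random.default_rng(0)
n = 200
housing = pd.DataFrame({
    "longitude": rng.uniform(-124, -114, n),
    "latitude": rng.uniform(32, 42, n),
    "population": rng.integers(100, 5000, n),
    "median_house_value": rng.uniform(50_000, 500_000, n),
})

fig, ax = plt.subplots(figsize=(10, 7))
points = ax.scatter(
    housing["longitude"], housing["latitude"],
    s=housing["population"] / 100,      # s[i] sizes the i-th point
    c=housing["median_house_value"],    # c[i] is mapped to a color via cmap
    cmap="jet", alpha=0.4, label="population",
)
# Equivalent of colorbar=True: draw the color bar for the mapped values.
fig.colorbar(points, ax=ax, label="median_house_value")
ax.set_xlabel("longitude")
ax.set_ylabel("latitude")
ax.legend()
fig.canvas.draw()  # render; use fig.savefig(...) or plt.show() to view it
```

Passing s and c as sequences of the same length as the coordinates keeps the i-th size and the i-th color attached to the i-th point, which is the correspondence described for both parameters.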
