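Original: How did you get started writing Python web crawlers?
Good question. It takes me back to how I learned web crawling. At the end of 2014 I joined a project at school that used data to predict where new stores should open, which meant we needed data on shopping malls. None of my classmates knew how to collect data with a crawler, so some of them started copying and pasting by hand. I did not want to do that kind of mechanical work, so I searched online and came across the concept of web crawlers. Knowing only VBA and C at the time, I started teaching myself Python, installing libraries, and hunting down tutorials. After a week of trial and error, I finally scraped the data we needed.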
2017-11-16 05:13:13 1031
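Original: How do you learn web crawling from scratch?
Only two years passed between when I started teaching myself Python web crawling and this month's publication of my book 《Python 网络爬虫:从入门到实践》 (published by 机械工业出版社). Over those two years of self-study I fell into countless pitfalls, and I owe a great deal to the experts who generously answered my questions, so I think I am qualified to help you learn web crawling from scratch. As a complete beginner, you probably want to solve a real problem at work, or you simply want to pick up crawling as an extra skill. I was in the same position when I set out to learn Python web crawling: my boss had assigned me a…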
2017-11-16 05:08:17 5908
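Original: How do you learn Python web crawling systematically?
In recent years, big data has become one of the hottest topics in both industry and academia, and data has become an extremely important asset for every company. The vast amount of public data on the internet gives individuals and companies access to data on a scale that was previously unimaginable, and web crawling skills help you collect these useful public datasets. I moved from a business background into data science through self-study, so I picked up both my programming and my data mining skills online. Along the way I came to appreciate how much more efficiently you learn from teaching that makes hard ideas accessible than from teaching that leaves you lost. So the two most important points in learning something are…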
2017-11-16 04:57:30 896
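Original: Web crawling in 5 minutes - it really can be this simple
Web crawlers play an important role in the big data era, and a large amount of public data online is easy to collect. Getting started with crawling is actually very simple: even if you are new to programming, you can easily scrape some websites. Below, I use my personal blog (大数据分析@唐松) as an example to teach you a simple crawler. On the one hand, the design and structure of this site will not change, so the crawler code in this book will keep working; on the other hand, I own the site, which avoids some legal risk.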
2017-11-16 04:55:01 1490 3
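Original: Relearning Statistics, Cha16 General Linear Model
Curvilinear Relationship: When we look at a scatter diagram, we find that the relationship between x and y is not quite a straight line. Plotting the residuals against y also shows an arc. So we use a second-order (quadratic) model, which fits better and gives a higher r-square. Interaction: How do we detect an interaction between x1 and x2?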
2016-08-01 00:09:42 472
Original: Relearning Statistics, Cha15 Multiple Regression
How do you compute the correlation coefficient between two columns of numbers? 15.1 Multiple Regression Model. 15.3 Coefficient of Determination. Why Adjusted? Avoid overestimating the impact of adding an independent variable on the amount of variab
2016-07-31 23:34:07 709
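The question in the excerpt above, how to compute the correlation coefficient between two columns of numbers, can be sketched in plain Python. This is a minimal illustration of the sample (Pearson) correlation coefficient, not necessarily the method used in the post. The function name `correlation_coefficient` and the sample data are assumptions made for this example.

```python
import math

def correlation_coefficient(xs, ys):
    """Sample (Pearson) correlation coefficient of two equal-length columns."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two columns of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either column is constant (zero variance).
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y is an exact linear function of x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r = correlation_coefficient(x, y)
print(r)
```

Because the hypothetical `y` is an exact increasing linear function of `x`, the coefficient comes out at the upper end of its range of -1 to 1.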
Original: Relearning Statistics, Cha14 Simple Linear Regression
14.1 Simple Linear Regression Model. Simple Linear Regression Model: y = β0 + β1 x + ε. β0 and β1 are referred to as the parameters of the model. ε is a random variable referred to as the error term, which is the
2016-07-28 21:35:13 1186
Original: Relearning Statistics, Cha13 Experimental Design and Analysis of Variance
This chapter covers: 1. An introduction to Experimental Design and ANOVA (Analysis of Variance). 13.1 An Introduction to Experimental Design and Analysis of Variance. μ1 = mean number of units produced per week using method A; μ2 = mean number
2016-07-26 17:02:54 1300
Original: Relearning Statistics, Cha12 Tests of Goodness of Fit and Independence
Goodness of Fit Test: A Multinomial Population. Multinomial population: each element of a population is assigned to one and only one of several classes or categories. Binomial Distribution: to one and on
2016-07-25 17:01:47 683
Original: Relearning Statistics, Cha11 Inferences About Population Variances
11.1 Inferences About a Population Variance. For the chi-square distribution, 百度百科 (Baidu Baike) describes it as follows: Interval Estimation. We will use the notation $\chi^2_\alpha$ to denote the value for the chi-square distribution that provides an
2016-07-24 18:08:15 567
Original: Relearning Statistics, Cha10 Inference About Means and Proportions with Two Populations
Cha10 covers: 1. Situations involving two populations when the difference between the two population means or the two population proportions is of prime importance. 2. For example, we will learn to build an interval estimate to look at the average salary of men versus the average salary of women…
2016-07-24 17:34:25 669
Original: Relearning Statistics, Cha9 Hypothesis Tests
9.1 Developing Null and Alternative Hypotheses. Null Hypothesis H0: a tentative assumption about a population parameter such as a population mean or a population proportion. Alternative Hypothesis Ha: t
2016-07-24 15:41:20 2501
Original: Relearning Statistics, Cha8 Interval Estimation
This chapter covers how to use an interval estimate to find a range for the population mean and the population proportion. The general form of an interval estimate of a population mean is x̅ ± margin of error. The general form of an interval estimate of a population proportion is p̄
2016-07-23 16:37:41 1585
Original: Relearning Statistics, Cha7 Sampling and Sampling Distribution
7.1 Selecting a Sample. 1. Simple Random Sampling (Finite): A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same pr
2016-07-20 22:20:15 3395
Original: Relearning Statistics, Cha6 Continuous Probability Distributions
Uniform Probability Distribution: $E(x) = (a+b)/2$, $\mathrm{Var}(x) = (b-a)^2/12$. Normal Probability Distribution: Bell-Shaped Normal Curve. 1. The normal distribution is symmetric: the shape of
2015-10-31 15:22:08 761
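The two uniform-distribution formulas in the excerpt above, E(x) = (a+b)/2 and Var(x) = (b-a)^2/12, can be checked with a short Python sketch. The function names and the interval endpoints 120 and 140 are hypothetical, chosen only for illustration.

```python
def uniform_mean(a, b):
    """Expected value of a uniform distribution on [a, b]: (a + b) / 2."""
    return (a + b) / 2

def uniform_variance(a, b):
    """Variance of a uniform distribution on [a, b]: (b - a)^2 / 12."""
    return (b - a) ** 2 / 12

# Hypothetical interval endpoints, for illustration only.
a, b = 120, 140
print(uniform_mean(a, b))      # midpoint of the interval
print(uniform_variance(a, b))  # (140 - 120)^2 / 12 = 400 / 12
```

Note that the variance depends only on the width b - a of the interval, not on where the interval sits.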
Original: Relearning Statistics, Cha5 Discrete Probability Distributions
A random variable is a numerical description of the outcome of an experiment. A random variable that may assume either a finite number of values or an infinite sequence of values such as 0, 1, 2,
2015-10-24 15:20:42 1128
Original: Relearning Statistics, Cha4 Introduction to Probability
The sample space for an experiment is the set of all experimental outcomes. Combination: $C_n^N = \frac{N!}{n!\,(N-n)!}$. Permutation: $P_n^N = \frac{N!}{(N-n)!}$. Event: An event is a collection of sample points. Pro
2015-10-14 21:15:48 820
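The combination and permutation counting rules in the excerpt above can be illustrated in Python. This is a minimal sketch assuming Python 3.8 or later, where the standard library provides `math.comb` and `math.perm`. The helper names `combinations_count` and `permutations_count` and the values N = 5, n = 2 are hypothetical.

```python
import math

def combinations_count(N, n):
    """Number of ways to choose n objects from N, order ignored: N! / (n! (N-n)!)."""
    return math.factorial(N) // (math.factorial(n) * math.factorial(N - n))

def permutations_count(N, n):
    """Number of ways to choose n objects from N, order matters: N! / (N-n)!."""
    return math.factorial(N) // math.factorial(N - n)

# Hypothetical example: choosing 2 objects out of 5.
N, n = 5, 2
print(combinations_count(N, n), math.comb(N, n))  # both give 10
print(permutations_count(N, n), math.perm(N, n))  # both give 20
```

The permutation count is always the combination count multiplied by n!, since each unordered selection of n objects can be arranged in n! orders.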
Original: Relearning Statistics, Cha3 Descriptive Statistics: Numerical Measures
If the measures are computed for data from a sample, they are called sample statistics. If the measures are computed for data from a population, they are called population parameters. Sample Me
2015-10-12 14:50:06 1191
Original: Relearning Statistics, Cha2 Descriptive Statistics (Categorical and Quantitative Data)
Summarizing Categorical Data. Frequency Distribution: A frequency distribution is a tabular summary of data showing the number (frequency) of items in each of several nonoverlapping classes. Relativ
2015-10-11 16:02:37 942 1
Original: Relearning Statistics, Cha1 Data and Statistics
Elements are the entities on which data are collected. A variable is a characteristic of interest for the elements. The set of measurements obtained for a particular element is called an observation
2015-10-08 17:29:10 1103
Original: echarts study notes (4) ---- How to use formatter and grid, two powerful tools
Before discussing formatter, let's talk about grid first. In the official documentation, grid
2014-09-07 00:22:32 34112