1. One explanation I came across of what machine learning essentially is:
--Machine learning is essentially a form of applied statistics with increased emphasis on the use of computers to statistically estimate complicated functions and a decreased emphasis on proving confidence intervals around these functions.
2. The two statistical approaches used in machine learning:
--present the two central approaches to statistics: frequentist estimators and Bayesian inference.
3. How to design a training dataset (image datasets excepted):
--One common way of describing a dataset is with a design matrix. A design matrix is a matrix containing a different example in each row. Each column of the matrix corresponds to a different feature.
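A minimal sketch of the design-matrix layout, using plain Python lists. The feature names and values are made up for illustration and are not from the quoted text.

```python
# Hypothetical toy dataset: 3 examples with 4 features each.
# Feature names and values are illustrative assumptions.
feature_names = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

design_matrix = [
    [5.1, 3.5, 1.4, 0.2],  # example 0
    [4.9, 3.0, 1.4, 0.2],  # example 1
    [6.2, 3.4, 5.4, 2.3],  # example 2
]

num_examples = len(design_matrix)     # number of rows
num_features = len(design_matrix[0])  # number of columns

# A row is one example; a column is one feature across all examples.
second_example = design_matrix[1]
petal_length_column = [row[2] for row in design_matrix]

print(num_examples, num_features)
print(second_example)
print(petal_length_column)
```

Every row must have the same length for this layout to work, which is presumably why the heading sets image datasets aside.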
4. Distinguishing linear mappings from affine functions:
--the mapping from parameters to predictions is still a linear function but the mapping from features to predictions is now an affine function.
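The distinction can be checked numerically. The sketch below assumes the model in question is linear regression with an intercept, y_hat = w . x + b. The names `predict`, `w1`, `b1` and the numbers are illustrative assumptions.

```python
def predict(w, b, x):
    """Linear regression with an intercept: y_hat = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

x = [2.0, -1.0]                 # one fixed feature vector
w1, b1 = [0.5, 1.5], 3.0        # first parameter setting
w2, b2 = [-1.0, 2.0], 0.5       # second parameter setting

# Linear in the parameters (w, b): adding two parameter settings
# adds their predictions for the same x.
w_sum = [a + c for a, c in zip(w1, w2)]
lhs = predict(w_sum, b1 + b2, x)
rhs = predict(w1, b1, x) + predict(w2, b2, x)
linear_in_parameters = abs(lhs - rhs) < 1e-12

# Only affine in the features x: a linear map must send the zero
# vector to 0, but here it is sent to the intercept b.
output_at_zero = predict(w1, b1, [0.0, 0.0])
linear_in_features = (output_at_zero == 0.0)

print(linear_in_parameters, output_at_zero, linear_in_features)
```

As a function of x, the line no longer passes through the origin once b is nonzero, so it is affine rather than linear. The prediction is still a plain weighted sum of the parameters.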
5. How data sampling relates to training-set and test-set error:
--Suppose we have a probability distribution p(x, y) and we sample from it repeatedly to generate the train set and the test set. For some fixed value w, the expected training set error is exactly the same as the expected test set error, because both expectations are formed using the same dataset sampling process. The only difference between the two conditions is the name we assign to the dataset we sample.
--Of course, when we use a machine learning algorithm, we do not fix the parameters ahead of time, then sample both datasets. We sample the training set, then use it to choose the parameters to reduce training set error, then sample the test set.
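A small simulation can illustrate both statements. It is only a sketch under assumptions that are not in the quoted text: p(x, y) is taken to be y = 2x plus Gaussian noise, the model is y_hat = w * x, and error is mean squared error.

```python
import random

def sample_dataset(rng, n):
    """Draw n (x, y) pairs from an assumed p(x, y): y = 2x + Gaussian noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 2.0 * x + rng.gauss(0.0, 0.5)))
    return data

def mse(w, data):
    """Mean squared error of the model y_hat = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def fit_w(data):
    """Least-squares slope for y_hat = w * x (no intercept)."""
    return sum(x * y for x, y in data) / sum(x * x for x, _ in data)

rng = random.Random(0)
trials, n = 2000, 10
fixed_w = 1.0  # chosen before seeing any data

fixed_train = fixed_test = fitted_train = fitted_test = 0.0
for _ in range(trials):
    train = sample_dataset(rng, n)
    test = sample_dataset(rng, n)
    # Case 1: w fixed ahead of time, so "train" and "test" are just names.
    fixed_train += mse(fixed_w, train) / trials
    fixed_test += mse(fixed_w, test) / trials
    # Case 2: w chosen to reduce the error on the sampled training set.
    w_hat = fit_w(train)
    fitted_train += mse(w_hat, train) / trials
    fitted_test += mse(w_hat, test) / trials

print("fixed w : train", fixed_train, "test", fixed_test)
print("fitted w: train", fitted_train, "test", fitted_test)
```

With a fixed w, the two averages should come out close to each other. Once w is fitted to the training set, the average training error should fall below the average test error.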