Machine Learning
crella___
Don't go to sleep until you can write a program you are satisfied with.
normal equation
Normal equation: a method to solve for $\theta$ analytically, reaching the optimal value in one step. $\theta=(X^TX)^{-1}X^Ty$ Octave: pinv(X'*X)*X'*y
Gradient Descent - Need to choose $\alpha$
Original · 2017-03-02 11:03:53 · 342 views · 0 comments
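A minimal sketch of the normal equation in Octave, assuming a tiny made-up data set that lies exactly on y = 1 + 2x (the data and variable names are illustrative, not from the post). It shows that the closed form needs neither a learning rate nor iterations.

```octave
% Hypothetical toy data lying exactly on y = 1 + 2*x
x = [1; 2; 3; 4];
y = [3; 5; 7; 9];
m = length(y);

X = [ones(m, 1), x];             % prepend the intercept feature x0 = 1
theta = pinv(X' * X) * X' * y;   % closed-form solution in one step

disp(theta);                     % intercept first, then slope
```

Using pinv rather than inv keeps the expression usable even when X'*X is singular, for example with redundant features.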
Principal Component Analysis(PCA)
function [U, S] = pca(X)
%PCA Run principal component analysis on the dataset X
%   [U, S] = pca(X) computes eigenvectors of the covariance matrix of X
%   Returns the eigenvectors U, the eigenvalues
Original · 2017-03-11 10:31:19 · 436 views · 0 comments
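One common way to fill in a function like this is to form the covariance matrix and take its singular value decomposition. The sketch below assumes the data is already mean-normalized and uses a made-up four-point data set; it is an illustration, not the post's actual function body.

```octave
% Hypothetical mean-normalized data: all points lie on the line x2 = x1
X = [1 1; -1 -1; 2 2; -2 -2];
m = size(X, 1);

Sigma = (X' * X) / m;       % covariance matrix (valid because X has zero mean)
[U, S, V] = svd(Sigma);     % columns of U: principal directions; diag(S): eigenvalues

disp(U(:, 1));              % first principal direction, along [1; 1] up to sign
disp(diag(S));              % variance captured along each direction
```

Here all the variance falls on the first component, so projecting onto U(:, 1) loses nothing.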
Machine Learning Diagnostic
Once we have done some troubleshooting for errors in our predictions by:
- Getting more training examples
- Trying smaller sets of features
- Trying additional features
- Trying polynomial features
- Incr
Original · 2017-03-05 15:33:50 · 417 views · 0 comments
Code for Machine Learning Diagnostic
Learning curves
function [error_train, error_val] = ...
    learningCurve(X, y, Xval, yval, lambda)
%LEARNINGCURVE Generates the train and cross validation set errors needed
%to plot a learning curve
%
Original · 2017-03-11 12:32:21 · 237 views · 0 comments
Recommender System
Recommender system problem formulation: content-based recommendations
Notations
Algorithm (with linear regression)
Feature learning: Collaborative Filtering (low-rank matrix factorization)
iterati
Original · 2017-03-11 16:47:17 · 313 views · 0 comments
Large Scale Machine Learning
Before learning with a large dataset, plot a learning curve on a much smaller dataset to see whether the model appears to have high bias.
Stochastic Gradient Descent (taking linear regression as an example)
Original · 2017-03-11 19:02:06 · 355 views · 0 comments
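A rough sketch of stochastic gradient descent for linear regression, assuming noise-free toy data on y = 1 + 2x, a learning rate of 0.1, and 500 passes over the data (all three are illustrative choices, not values from the post). Each update uses a single training example instead of the whole set.

```octave
% Hypothetical toy data lying exactly on y = 1 + 2*x
x = (1:10)' / 10;
y = 1 + 2 * x;
m = length(y);
X = [ones(m, 1), x];          % add the intercept feature x0 = 1

theta = zeros(2, 1);
alpha = 0.1;                  % assumed learning rate

for epoch = 1:500
    order = randperm(m);      % shuffle the examples on every pass
    for i = order
        xi = X(i, :)';
        err = xi' * theta - y(i);          % prediction error on one example
        theta = theta - alpha * err * xi;  % update from that example only
    end
end

disp(theta);                  % approaches [1; 2] on this noise-free data
```

On noisy data the parameters would wander around the minimum rather than settle exactly, which is why the learning rate is often decreased over time.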
Anomaly Detection
Build a model for the probability of x: $p(x)$
A small threshold: $\epsilon$
Gaussian (Normal) Distribution: $x \sim N(\mu,\sigma^2)$, $p(x; \mu,\sigma^2)$
Parameter estimation: if $x^{(i)} \sim N(\mu,\sigma^2)$
Original · 2017-03-12 18:48:55 · 306 views · 0 comments
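The idea can be sketched for a single feature as follows, assuming the maximum-likelihood estimates (variance divided by m), a made-up sample, and a threshold of 0.01 chosen only for illustration.

```octave
% Hypothetical one-dimensional training sample of normal behaviour
x = [4.8; 5.1; 5.0; 4.9; 5.2];

mu = mean(x);                      % estimated mean
sigma2 = mean((x - mu) .^ 2);      % estimated variance (divides by m, not m-1)

% Gaussian density p(v; mu, sigma2)
p = @(v) exp(-(v - mu) .^ 2 ./ (2 * sigma2)) ./ sqrt(2 * pi * sigma2);

epsilon = 0.01;                    % assumed threshold
is_anomaly = p(9.0) < epsilon;     % a point far from the sample is flagged
is_normal = p(5.0) >= epsilon;     % a point at the mean is not

disp([mu, sigma2]);
disp([is_anomaly, is_normal]);
```

In practice epsilon would be picked on a labeled cross-validation set rather than fixed by hand.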
Code for Anomaly Detection
Estimating parameters for a Gaussian
function [mu sigma2] = estimateGaussian(X)
%ESTIMATEGAUSSIAN This function estimates the parameters of a
%Gaussian distribution using the data in X
%   [mu sigma2] =
Original · 2017-03-13 01:22:55 · 418 views · 0 comments
Code for Recommender Systems
Two ways to exclude the extra zero entries:
1. idx = find(R(i, :)==1)
   Thetatemp = Theta(idx,:)
   Ytemp = Y(i,idx)
   Xgrad(i,:) = (X(i,:)*Thetatemp' - Ytemp)*Thetatemp
2. R .* M
   sum(sum(R.*M))
function [J, grad] = c
Original · 2017-03-13 02:52:55 · 361 views · 0 comments
Neural Networks
Neural Networks: a much better way to learn complex hypotheses, even when n is large, compared to the algorithms above.
Neuron model: logistic unit
$x_0$ is called the bias unit
Sigmoid (logistic) activation funct
Original · 2017-03-03 06:17:38 · 452 views · 0 comments
Machine Learning Application
Photo OCR
Sliding Windows
Two things are recommended:
1. Plot learning curves.
2. Get a 10x larger data set, which actually takes much less time than we expect.
Ceiling Analysis
When there's a pipeline with seve
Original · 2017-03-12 16:06:12 · 602 views · 0 comments
Linear regression
m = number of training examples
h (hypothesis) = output function
Linear regression with one variable (univariate linear regression).
Idea: choose $\theta_0,\theta_1$ so that $h_\theta(x)$ is close to
Original · 2017-03-04 10:55:05 · 260 views · 0 comments
Code for K-Means
K-means
% Initialize centroids
centroids = kMeansInitCentroids(X, K);
for iter = 1:iterations
    % Cluster assignment step: Assign each data point to the
    % closest centroid. idx(i) corresponds to c^(i),
Original · 2017-03-11 09:30:14 · 376 views · 0 comments
Octave
% -> comment
~= -> not equal
PS1('>> ');
; -> suppress the print output
a -> a=?
disp(a) -> value of a
disp(sprintf('2 decimals: %0.2f',a)) -> 2 decimals: 3.14
format long, a -> a=??
matrix or vector:
Original · 2017-03-02 16:14:49 · 383 views · 0 comments
Machine Learning
ML is the field of study that gives computers the ability to learn without being explicitly programmed.
ML: E=T+P
A computer program is said to learn from experience E with respect to some class of tasks T and pe
Original · 2017-03-04 09:22:48 · 324 views · 0 comments
Support Vector Machines(SVM)
(or large margin classifiers)
$C=\frac{1}{\lambda}$ (C is very large)
Kernels
Using kernels with SVMs to define extra features from landmarks and similarity functions, in order to learn more complex nonlinear
Original · 2017-03-08 04:33:27 · 203 views · 0 comments
SVM Code
Most SVM software packages (including svmTrain.m) automatically add the extra feature $x_0 = 1$ for you and automatically take care of learning the intercept term $\theta_0$. So when passing your training data to
Original · 2017-03-08 06:16:17 · 431 views · 0 comments
Linear Regression Code
Gradient descent
X = [ones(m, 1), data(:,1)]; % Add a column of ones to x
theta = zeros(2, 1); % initialize fitting parameters
% Some gradient descent settings
iterations = 1500;
alpha = 0.01;
1. Computin
Original · 2017-03-05 02:18:14 · 362 views · 0 comments
the problem of overfitting
Underfitting, or high bias: the hypothesis function h maps poorly to the trend of the data. It is usually caused by a function that is too simple or uses too few features.
Overfitting, or high variance: fits the ava
Original · 2017-03-01 15:17:09 · 344 views · 0 comments
Logistic Regression Code
Visualizing the data
% Find Indices of Positive and Negative Examples
pos = find(y==1); neg = find(y == 0);
% Plot Examples
plot(X(pos, 1), X(pos, 2), 'k+','LineWidth', 2, ...
    'MarkerSize', 7);
plot(X(
Original · 2017-03-05 04:07:49 · 289 views · 0 comments
Classification problems: logistic regression
Classification problems: where the variable y that you want to predict is discrete-valued.
negative class / positive class: "-" / "+"
$y^{(i)}$ -> the label for the training example
Way 1: linear regression thres
Original · 2017-02-27 05:29:06 · 478 views · 0 comments
Skewed Data
Skewed classes: the ratio of positive to negative examples is very close to one of the two extremes. In that case, classification accuracy alone becomes a much less reliable measure.
It's not clear that the decreas
Original · 2017-03-05 16:44:44 · 824 views · 0 comments
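A small illustration of why accuracy alone misleads on skewed classes, assuming a made-up label vector with 1% positives and a trivial classifier that always predicts the negative class.

```octave
% Hypothetical skewed labels: 1 positive example out of 100
y = [ones(1, 1); zeros(99, 1)];

pred = zeros(size(y));                   % trivial classifier: always predict 0

accuracy = mean(double(pred == y));      % fraction of correct predictions
true_pos = sum((pred == 1) & (y == 1));  % positives actually found

disp(accuracy);    % high accuracy despite learning nothing
disp(true_pos);    % yet no positive example is ever detected
```

The classifier scores 99% accuracy while finding none of the positive examples, so a change in accuracy says little about whether a model has actually improved.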
To solve machine learning problems
To improve the accuracy of a given algorithm:
- Collect lots of data
- Develop sophisticated features
- Develop algorithms to process your input in different ways
The recommended approach to solving m
Original · 2017-03-05 16:25:36 · 223 views · 0 comments
K-means
cluster centroid
distortion cost function: used to make sure K-means is converging and working properly
cluster assignment step
move centroids step
random initialization (also for global optima)
Original · 2017-03-11 03:55:48 · 439 views · 0 comments
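The distortion cost can be sketched as the mean squared distance between each point and its assigned centroid. The points, centroids, and assignments below are made up for illustration.

```octave
% Hypothetical data: two points near (1, 1.5) and two near (5.5, 5)
X = [1 1; 1 2; 5 5; 6 5];
centroids = [1 1.5; 5.5 5];
idx = [1; 1; 2; 2];                 % idx(i) is the cluster assigned to point i

diffs = X - centroids(idx, :);      % offset of each point from its own centroid
J = mean(sum(diffs .^ 2, 2));       % distortion: mean squared distance

disp(J);
```

Evaluating J after every iteration should give a non-increasing sequence; if it ever goes up, the implementation likely has a bug. When running several random initializations, the run with the lowest final J is the one to keep.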
Dimensionality Reduction—PCA
Dimensionality reduction:
- allows us to compress the data so that it uses less computer memory or disk space, and it also allows us to speed up our learning algorithms.
- to visuali
Original · 2017-03-11 04:27:14 · 523 views · 0 comments
Polynomial Regression
Choosing your features: by defining new features, you might actually get a better model.
1. Combine multiple features into one. To predict the price of a house, there are two features: the fron
Original · 2017-02-27 03:40:16 · 720 views · 0 comments