MACHINE LEARNING ---- BY HUNG-YI LEE (love u)

A NEW BEGINNING

notice: in reinforcement learning the competitor of the pc is another pc, not a human, caz a human is too slow to follow the pc's actions.
notice: no matter which machine learning method u use, u are actually finding a better function which suits the problem u want to solve. so what u need to do is choose a good❤️ function!!!❤️
notice: what we really care about is the error on new data (which is also called "testing data"), not on the training data.
notice: what is deep learning: deep learning uses a deep structure (e.g. a deep neural network) as the machine learning method.
notice: if a neural network has more than 2 hidden layers, it can be called a deep neural network.

notice:
[images]
look at the two pictures above: u can find that the more complex the function, the lower the training error, but the testing error will not always be lower (maybe it is lower at the beginning, but it gets higher in the end). so when u choose the function u should consider the error on both the training set and the testing set (caz there may be overfitting).

recommended website: https://www.zybuluo.com/hanbingtao/note/433855 ❤️❤️❤️
[image]

BIAS AND VARIANCE:
[image]
the use of bias and variance is as below:
[images]
the more complex the model, the lower the bias; the more complex the model, the higher the variance.
in conclusion:
[image]
so u should find a point on the blue line (which considers both bias and variance)
[image]

[images]

why feature scaling:
[image]
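as a concrete example, below is a minimal sketch of one common way to do feature scaling: standardization, where each feature gets zero mean and unit variance (the toy numbers are made up):

```python
import numpy as np

# toy dataset: 4 samples, 2 features on very different scales
X = np.array([[1.0, 1000.0],
              [2.0, 2000.0],
              [3.0, 3000.0],
              [4.0, 4000.0]])

# standardize each feature: subtract its mean, divide by its std,
# so every feature has mean 0 and variance 1
mean = X.mean(axis=0)
std = X.std(axis=0)
X_scaled = (X - mean) / std

print(X_scaled.mean(axis=0))  # ~[0, 0]
print(X_scaled.std(axis=0))   # ~[1, 1]
```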

below is something interesting:
in logistic classification u may encounter this kind of situation: logistic regression can only draw a line to classify two kinds of samples, like the picture below, and it is difficult to find a line which can classify a data set like the one below well
[image]
so at this time we may need some feature transformation:
[image]
but for now we choose the feature transformation function by ourselves, not automatically by the pc, so we should build a function to let the pc do it by itself.
[image]
here we use cascading logistic regression models (don't u think the structure of this model looks just like a neural network?).
so notice: we can use this kind of structure to transform the data set into what we like. (it is a kind of data processing)
so we can now go to deep learning
we can use the structure above to build a neuron:
[image]
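for example, here is a minimal sketch of such a neuron (a logistic regression unit: weighted sum + bias, then sigmoid) and of cascading a few of them as described above; the weights are hand-picked, made-up numbers that happen to compute XOR, something a single line can never separate:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # one logistic-regression unit: weighted sum + bias, then sigmoid
    return sigmoid(np.dot(w, x) + b)

# cascade: the first layer transforms the features,
# the second layer classifies the transformed features
x = np.array([1.0, 0.0])
h1 = neuron(x, np.array([20.0, 20.0]), -10.0)   # roughly OR(x1, x2)
h2 = neuron(x, np.array([-20.0, -20.0]), 30.0)  # roughly NAND(x1, x2)
y = neuron(np.array([h1, h2]), np.array([20.0, 20.0]), -30.0)  # AND -> XOR
print(y)  # close to 1: (1, 0) is in the XOR-positive class
```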

a neuron is also called a perceptron:

[image]
now I will show u a beautiful structure:
[image]
how beautiful, isn't it?

backpropagation:

[image]
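a minimal sketch of the backpropagation idea, assuming a tiny 1-hidden-layer sigmoid network with squared error (not the lecture's exact setup, just an illustration of applying the chain rule layer by layer; all numbers are made up):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# tiny network: 2 inputs -> 2 hidden units -> 1 output, squared-error loss
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)
W2, b2 = rng.normal(size=(1, 2)), np.zeros(1)
x, t = np.array([1.0, 0.5]), np.array([1.0])
lr = 0.1

for step in range(100):
    # forward pass
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)
    # backward pass: chain rule, from the output back toward the input
    dy = (y - t) * y * (1 - y)        # dL/d(pre-activation of output)
    dW2 = np.outer(dy, h)
    dh = W2.T @ dy * h * (1 - h)      # error propagated to the hidden layer
    dW1 = np.outer(dh, x)
    # gradient descent update
    W2 -= lr * dW2; b2 -= lr * dy
    W1 -= lr * dW1; b1 -= lr * dh

print(sigmoid(W2 @ sigmoid(W1 @ x + b1) + b2))  # moves toward the target 1.0
```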
why we use a neural network: because we do not want our model to be a simple linear function model, we want it to be more complex.

ReLU:
when u use ReLU some features will disappear, and from some angle a single ReLU can be regarded as linear, but if the neural network uses ReLU everywhere, the whole network is not a linear model. it is a non-linear model
[image]
different kinds of ReLU (some extensions of ReLU):
[images]
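a minimal sketch of plain ReLU and two common extensions (leaky ReLU and parametric ReLU); the alpha values are just example numbers:

```python
import numpy as np

def relu(z):
    # plain ReLU: negative inputs are zeroed out ("disappear")
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    # leaky ReLU: negative inputs keep a small fixed slope
    return np.where(z > 0, z, alpha * z)

def parametric_relu(z, alpha):
    # parametric ReLU: the negative slope alpha is a learned parameter
    return np.where(z > 0, z, alpha * z)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))        # [0.    0.     0.  0.5  2.]
print(leaky_relu(z))  # [-0.02 -0.005 0.  0.5  2.]
```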

notice: how to deal with the "local minimum" problem ---- use "momentum"
[image]
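a minimal sketch of the momentum idea on a made-up 1-D loss with one shallow and one deeper minimum; plain gradient descent gets stuck in the shallow one, while the accumulated velocity can carry the parameter past it (the function and all constants are invented for illustration):

```python
import numpy as np

def grad(w):
    # gradient of a made-up loss L(w) = w^4 - 3w^2 + w,
    # which has a shallow local minimum near w = 1.1
    # and a deeper global minimum near w = -1.3
    return 4 * w**3 - 6 * w + 1

lr = 0.01

# plain gradient descent: settles in the shallow local minimum
w = 2.0
for _ in range(500):
    w -= lr * grad(w)
print(w)  # ~1.1, the local minimum

# gradient descent with momentum: the velocity remembers past
# gradients and rolls w over the bump into the deeper minimum
w, v = 2.0, 0.0
for _ in range(500):
    v = 0.9 * v - lr * grad(w)  # decayed past movement minus gradient
    w += v                      # move by the velocity, not the raw gradient
print(w)  # ~-1.3, the global minimum
```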

now if we have good performance on the training data but poor performance on the testing data, we can use early stopping, regularization, or dropout to make it better.
now I will talk about "dropout":
[images]
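a minimal sketch of dropout; note this is the "inverted" variant that rescales the surviving units at training time, while the lecture's version instead multiplies the weights by (1 - p%) at testing time; both have the same effect:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(h, p_drop, training):
    # training: randomly zero each unit with probability p_drop,
    # and rescale the survivors so the expected activation is unchanged
    if training:
        mask = rng.random(h.shape) >= p_drop
        return h * mask / (1.0 - p_drop)
    # testing: use the full network, no dropping
    return h

h = np.ones(10)                # some hidden-layer activations
print(dropout(h, 0.5, True))   # about half the units zeroed, survivors scaled to 2.0
print(dropout(h, 0.5, False))  # unchanged at test time
```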

CNN (a convolutional neural network also works like a neural network)
[image]
but a CNN is not a fully connected NN; it uses fewer parameters.
the picture above works just like a neuron:
the numbers in the filter can be regarded as the weights in a neural network.
in CNN we have shared weights, caz then even fewer weights are needed.
[image]
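a minimal sketch of the convolution step: one small kernel (the shared weights) slides over the image, and each output value is just a neuron's weighted sum over one patch; the image and filter values are made up:

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    # slide one filter over the image; the same kernel weights are
    # reused ("shared") at every position
    H, W = image.shape
    kH, kW = kernel.shape
    out = np.zeros(((H - kH) // stride + 1, (W - kW) // stride + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i*stride:i*stride+kH, j*stride:j*stride+kW]
            out[i, j] = np.sum(patch * kernel)  # one neuron's weighted sum
    return out

image = np.arange(36, dtype=float).reshape(6, 6)
kernel = np.array([[1.0, 0.0], [0.0, -1.0]])  # made-up filter weights
print(conv2d(image, kernel))  # a 5x5 feature map from only 4 shared weights
```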
then comes max pooling:
[images]
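a minimal sketch of max pooling: each block of the feature map is reduced to its maximum, so the map shrinks (subsampling):

```python
import numpy as np

def maxpool2d(fmap, size=2):
    # keep only the maximum of each size x size block,
    # shrinking the feature map
    H, W = fmap.shape
    out = np.zeros((H // size, W // size))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = fmap[i*size:(i+1)*size, j*size:(j+1)*size].max()
    return out

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(maxpool2d(fmap))  # 4x4 -> 2x2, each value is a block maximum
```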
then we will flatten the image and send it to a fully connected neural network, and then we will get the result we want.

so the whole structure of a CNN is just like below:
[image]
the above chart means: u use convolution and max pooling to process the parameters, to make the parameters fewer and fewer, then put the parameters into the fully connected network, and in the end get the result u want. (it can be simplified as: first process the parameters, then use the NN to get the result).
why we can do this to an image: caz even subsampling an image does not influence how a human identifies the content of the image. for the pc, it only needs a set of data to train on, no matter whether the data is from a whole image, a part of an image, or data that has been processed, because the data we put into the NN has all been processed in the same way, so for the pc, the data are all the same.

the most important:

find an image which can maximize the filter's activation
[images]
some funny applications:
process a piece of voice into an image, then use a CNN to identify what the voice content is.
[images]
the advantage is that it can train on fewer samples (caz it can share something)
[images]

unsupervised learning:

[image]

dimension reduction method: PCA

[image]
what is PCA:
to put it simply: it just uses projection to decrease the dimension.
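a minimal sketch of PCA, assuming the usual recipe: center the data, eigen-decompose the covariance matrix, and project onto the direction with the largest variance; the toy data is random:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy 2-D data stretched along one direction
X = rng.normal(size=(100, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

# 1. center the data
Xc = X - X.mean(axis=0)
# 2. eigen-decompose the covariance matrix
cov = Xc.T @ Xc / len(Xc)
eigvals, eigvecs = np.linalg.eigh(cov)
# 3. project onto the direction with the largest variance
w = eigvecs[:, -1]      # eigh sorts eigenvalues ascending
X_reduced = Xc @ w      # 2-D points projected down to 1-D
print(X_reduced.shape)  # (100,)
```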

GAN:

regard the generator and discriminator as a whole (as one network); we can adjust the parameters of the generator's layers, but the parameters of the discriminator's layers should be fixed.
[image]
how to change the parameters in the layers of the generator: the goal is to let the output of the whole GAN structure be 1.0 (which means that after we adjust the parameters of the generator layers, the discriminator will regard the image produced by the generator as real.)
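a minimal sketch of this generator update, assuming a toy linear generator and a fixed logistic-unit discriminator (all parameters are made up); note how only the generator's parameters are changed while the discriminator's stay frozen:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# frozen discriminator: a fixed logistic unit, output ~ P(input looks real)
wd, bd = np.array([1.0, -1.0]), 0.0

# generator: a linear layer mapping noise to a fake sample (trainable)
Wg = rng.normal(size=(2, 2))
bg = np.zeros(2)
lr = 0.1

for step in range(200):
    z = rng.normal(size=2)      # random noise input
    x = Wg @ z + bg             # fake sample from the generator
    d = sigmoid(wd @ x + bd)    # frozen discriminator's verdict
    # generator loss: -log(d), minimized when the discriminator outputs 1.0
    dx = -(1.0 - d) * wd        # gradient of the loss w.r.t. the fake sample
    Wg -= lr * np.outer(dx, z)  # update ONLY the generator's parameters;
    bg -= lr * dx               # wd and bd stay fixed
# verdict on the generator's average output: now close to 1.0
print(sigmoid(wd @ (Wg @ np.zeros(2) + bg) + bd))
```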

transfer learning:

why transfer learning (when to use model fine-tuning): in the real world, we have a lot of data unrelated to the task, but only a little data related to the task. we want to use the unrelated data (also called source data) to find and build a structure, then use the related data (also called target data) to fine-tune the structure.
[image]
how to fine-tune the structure built with the source data: maybe we can copy some layers of the whole structure and use the target data to train the rest of the layers. but for different applications, we should copy a different number of layers to make the application perform better:
[images]
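a minimal sketch of this copy-and-freeze idea, with made-up layer sizes and a made-up choice of how many layers to copy; in a real setting, gradients would only be applied to the non-copied layers:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# pretend these layers were already trained on the large source dataset
source_layers = [rng.normal(size=(4, 4)) for _ in range(3)]

# transfer: copy the first n_copy layers and freeze them;
# only the remaining layer(s) will be trained on the small target dataset
n_copy = 2
copied = [W.copy() for W in source_layers[:n_copy]]  # frozen
trainable = [rng.normal(size=(4, 4))]                # re-initialized

def forward(x):
    for W in copied:     # frozen feature extractor from the source task
        x = sigmoid(W @ x)
    for W in trainable:  # layers fine-tuned with the target data
        x = sigmoid(W @ x)
    return x

# during training, gradient updates would touch `trainable` only
print(forward(np.ones(4)))
```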

SVM:

[image]
in SVM we always use the loss function called "hinge loss". with hinge loss, when u classify a sample correctly (with enough margin) u get a loss of zero. in the whole structure this means the correctly classified data have no effect on the performance of the classifier; only the support vectors (where the hinge loss is non-zero) will affect the performance of the whole SVM.
[image]
so the advantage of SVM is: not all data influence the learning algorithm; only the special data (the support vectors) do.
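a minimal sketch of hinge loss on three made-up samples, showing that a sample classified correctly with enough margin contributes zero loss:

```python
import numpy as np

def hinge_loss(score, label):
    # label is +1 or -1; the loss is zero once the sample is classified
    # correctly with a margin of at least 1
    return np.maximum(0.0, 1.0 - label * score)

scores = np.array([2.5, 0.3, -1.2])  # f(x) for three samples
labels = np.array([1, 1, 1])
print(hinge_loss(scores, labels))
# [0.  0.7 2.2]: the first sample is beyond the margin, so it contributes
# nothing; only the other two (support-vector-like points) push the
# classifier to change
```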

[image]

RNN:

we have different kinds of RNN: u can use the hidden layer's output as the returned result, or u can use the output layer's output as the returned result.
[image]
the advantage of RNN is: u can regard the whole application as a related sequence, which means the earlier elements will affect the later ones. a different order of the sequence will cause different meanings.

[image]
in RNN we always use a memory cell to store the memory. the common memory cell we often use now has the structure below (with this memory cell u can control when u want to input to store the memory, when u want to output to use the stored memory, and also decide when to forget the memory u stored.)
[images]
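a minimal sketch of one step of such a memory cell (an LSTM-style cell with input, forget, and output gates); the parameters are random made-up numbers, and a real LSTM also has bias terms:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
H = 3  # hidden/memory size

# made-up parameters for the four parts of an LSTM-style cell
Wi, Wf, Wo, Wc = (rng.normal(size=(H, H + 1)) for _ in range(4))

def lstm_step(x, h, c):
    v = np.concatenate(([x], h))     # input together with the previous output
    i = sigmoid(Wi @ v)              # input gate: store the new memory?
    f = sigmoid(Wf @ v)              # forget gate: keep the old memory?
    o = sigmoid(Wo @ v)              # output gate: expose the memory?
    c = f * c + i * np.tanh(Wc @ v)  # update the memory cell
    h = o * np.tanh(c)               # output read from the memory
    return h, c

h, c = np.zeros(H), np.zeros(H)
for x in [1.0, 0.5, -0.3]:           # earlier inputs affect later outputs
    h, c = lstm_step(x, h, c)
print(h)
```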

ENSEMBLE:

bagging:
[images]
boosting:
[images]
change the weights in f1's dataset: let the wrongly predicted data's weights be higher and the correctly predicted data's weights be lower, then use the re-weighted dataset to train f2.
[image]
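a minimal sketch of this re-weighting in the AdaBoost style: after f1 is trained, wrongly predicted samples get their weights multiplied up and correctly predicted ones divided down, so the two groups end up equally important for f2 (the labels and predictions are made up):

```python
import numpy as np

# f1's predictions vs. the true labels on a toy dataset
labels = np.array([1, 1, -1, -1, 1])
f1_pred = np.array([1, -1, -1, 1, 1])  # f1 is wrong on samples 1 and 3

weights = np.ones(len(labels)) / len(labels)
correct = f1_pred == labels

# AdaBoost-style update: raise the weight of the wrongly predicted
# samples, lower the weight of the correctly predicted ones
err = weights[~correct].sum()
d = np.sqrt((1 - err) / err)
weights[~correct] *= d
weights[correct] /= d
weights /= weights.sum()
print(weights)  # f2 is then trained on this re-weighted dataset
```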

something fun:

[images]

ANOMALY DETECTOR:

[image]
at first we train a model; then we can put different inputs into this model, and the model will output the probability of each label the image could belong to. if the model puts very different probabilities on the different labels (i.e. it is confident about one label), the image is probably normal; but if the model has difficulty identifying which label the image should have, which means the probabilities of the different labels differ little, we may say this image is an anomaly.
[image]
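a minimal sketch of this confidence-based check, with made-up probabilities and a made-up threshold:

```python
import numpy as np

def is_anomaly(probs, threshold=0.5):
    # probs: the classifier's probability for each label.
    # if the highest probability stands out, the input looks normal;
    # if all labels get similar probabilities, the classifier is
    # "confused" and we flag the input as an anomaly
    return probs.max() < threshold

normal = np.array([0.90, 0.05, 0.05])  # confident -> normal
weird = np.array([0.36, 0.33, 0.31])   # nearly uniform -> anomaly
print(is_anomaly(normal))  # False
print(is_anomaly(weird))   # True
```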
