机器学习,监督学习,非监督学习,强化学习

What is Machine learning?
What is Learning for a machine?
A machine is said to be learning from past Experiences(data feed in) with respect to some class of Tasks, if it’s Performance in a given Task improves with the Experience.
在这里插入图片描述
Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it learn for themselves.

The process of learning begins with observations or data, such as examples, direct experience, or instruction, in order to look for patterns in data and make better decisions in the future based on the examples that we provide. The primary aim is to allow the computers learn automatically without human intervention or assistance and adjust actions accordingly.

Training the system:
While training the model, data is usually split in the ratio of 80:20 i.e. 80% as training data and rest as testing data. In training data, we feed input as well as output for 80% data. The model learns from training data only. We use different machine learning algorithms(which we will discuss in detail in next articles) to build our model. By learning, it means that the model will build some logic of its own.
Once the model is ready then it is good to be tested. At the time of testing, input is fed from remaining 20% data which the model has never seen before, the model will predict some value and we will compare it with actual output and calculate the accuracy.

Some machine learning methods
Machine learning algorithms are often categorized as supervised or unsupervised.
在这里插入图片描述

Supervised&Unsupervised machine learning

Supervised machine learning algorithms can apply what has been learned in the past to new data using labeled examples to predict future events. Starting from the analysis of a known training dataset, the learning algorithm produces an inferred function to make predictions about the output values. The system is able to provide targets for any new input after sufficient training. The learning algorithm can also compare its output with the correct, intended output and find errors in order to modify the model accordingly.
根据已有的数据集,知道输入和输出结果之间的关系。根据这种已知的关系,训练得到一个最优的模型。也就是说,在监督学习中训练数据既有特征(feature)又有标签(label),通过训练,让机器可以自己找到特征和标签之间的联系,在面对只有特征没有标签的数据时,可以判断出标签。

  1. Classification : It is a Supervised Learning task where output is having defined labels(discrete value). The goal here is to predict discrete values belonging to a particular class and evaluate on the basis of accuracy.
    It can be either binary or multi class classification. In binary classification, model predicts either 0 or 1 ; yes or no but in case of multi class classification, model predicts more than one class.
    和回归最大的区别在于,分类是针对离散型的,输出的结果是有限的。分类就是,要通过分析输入的特征向量,对于一个新的向量得到其标签。
    Example: Gmail classifies mails in more than one classes like social, promotions, updates, forum.
  2. Regression : It is a Supervised Learning task where output is having continuous value.The goal here is to predict a value as much closer to actual output value as our model can and then evaluation is done by calculating error value. The smaller the error the greater the accuracy of our regression model.
    回归问题针对连续型变量,通俗一点就是,对已经存在的点(训练数据)进行分析,拟合出适当的函数模型y=f(x),这里y就是数据的标签,而对于一个新的自变量x,通过这个函数模型得到标签y。
  3. 监督学习最常见的四种算法

In contrast, unsupervised machine learning algorithms are used when the information used to train is neither classified nor labeled. Unsupervised learning studies how systems can infer a function to describe a hidden structure from unlabeled data. The system doesn’t figure out the right output, but it explores the data and can draw inferences from datasets to describe hidden structures from unlabeled data.
我们不知道数据集中数据、特征之间的关系,而是要根据聚类或一定的模型得到数据之间的关系。可以这么说,比起监督学习,无监督学习更像是自学,让机器学会自己做事情,是没有标签(label)的。

  1. Clustering: A clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.
    聚类,就是根据数据的“相似性”将数据分为多类的过程。估算两个不同样本之间的相似性,通 常使用的方法就是计算两个样本之间的“距离”,最常用的就是欧式距离,此外还有马氏距离,曼哈顿距离,余弦距离等。
  2. Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such as people that buy X also tend to buy Y.
  3. 吴恩达机器学习笔记(十三)-无监督学习

Semi-supervised machine learning

Semi-supervised machine learning algorithms fall somewhere in between supervised and unsupervised learning, since they use both labeled and unlabeled data for training – typically a small amount of labeled data and a large amount of unlabeled data. The systems that use this method are able to considerably improve learning accuracy. Usually, semi-supervised learning is chosen when the acquired labeled data requires skilled and relevant resources in order to train it / learn from it. Otherwise, acquiring unlabeled data generally doesn’t require additional resources.

Reinforcement machine learning

Reinforcement machine learning algorithms is a learning method that interacts with its environment by producing actions and discovers errors or rewards. Trial and error search and delayed reward are the most relevant characteristics of reinforcement learning. This method allows machines and software agents to automatically determine the ideal behavior within a specific context in order to maximize its performance. Simple reward feedback is required for the agent to learn which action is best; this is known as the reinforcement signal.

在这里插入图片描述
在这里插入图片描述
强化学习是一种试错方法,其目标是让软件智能体在特定环境中能够采取回报最大化的行为。强化学习在马尔可夫决策过程环境中主要使用的技术是动态规划(Dynamic Programming)。流行的强化学习方法包括自适应动态规划(ADP)、时间差分(TD)学习、状态-动作-回报-状态-动作(SARSA)算法、Q 学习、深度强化学习(DQN);其应用包括下棋类游戏、机器人控制和工作调度等。

  • 智能体(Agent):可以采取行动的智能个体
  • 行动(Action):A是智能体可以采取的行动的集合。一个行动(action)几乎是一目了然的,但是应该注意的是智能体是在从可能的行动列表中进行选择。
  • 环境(Environment):指的就是智能体行走于其中的世界。这个环境将智能体当前的状态和行动作为输入,输出是智能体的奖励和下一步的状态。如果你是一个智能体,那么你所处的环境就是能够处理行动和决定你一系列行动的结果的物理规律和社会规则。
  • 状态(State,S):一个状态就是智能体所处的具体即时状态;也就是说,一个具体的地方和时刻,这是一个具体的即时配置,它能够将智能体和其他重要的失事物关联起来
  • 奖励(Reward,R):奖励是我们衡量某个智能体的行动成败的反馈。面对任何既定的状态,智能体要以行动的形式向环境输出,然后环境会返回这个智能体的一个新状态(这个新状态会受到基于之前状态的行动的影响)和奖励(如果有任何奖励的话)。奖励可能是即时的,也可能是迟滞的。它们可以有效地评估该智能体的行动。

环境就是能够将当前状态下采取的动作转换成下一个状态和奖励的函数;智能体是将新的状态和奖励转换成下一个行动的函数。我们可以知悉智能体的函数,但是我们无法知悉环境的函数。环境是一个我们只能看到输入输出的黑盒子。强化学习相当于智能体在尝试逼近这个环境的函数,这样我们就能够向黑盒子环境发送最大化奖励的行动了。

当然,奖励并不是强化学习的专利。在马尔科夫决策过程(MDP)中最优策略的定义也涉及到奖励。 最佳政策是最大化预期总回报的政策。 强化学习的任务是使用观察到的奖励来学习再当前环境中的最优(或接近最优)策略。

Machine learning enables analysis of massive quantities of data. While it generally delivers faster, more accurate results in order to identify profitable opportunities or dangerous risks, it may also require additional time and resources to train it properly. Combining machine learning with AI and cognitive technologies can make it even more effective in processing large volumes of information.

相关文章:
这可能是最简单易懂的机器学习入门
一文横扫机器学习要点:监督学习、无监督学习、深度学习
强化学习

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

TonyHsuM

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值