Telling apart big data, ML, and AI, and how they relate

Original post, 2016-06-01 11:46:15

How are big data and machine learning related? Several answers follow.

Big data and machine learning are not directly related, but used together they can work wonders.

Machine Learning & Big Data: In most cases, learning comes from extensive computation over existing datasets to build a model. An ordinary single machine cannot handle computation over very large datasets, and because data volume grows day by day, the resulting model has to be updated accordingly. To achieve this, we use distributed computing with big data technologies such as Apache Mahout, Spark, or RHadoop, or we do the initial analytics processing in projects like Hive or Pig and feed the output to machine learning algorithms to generate the model.

You can apply machine learning algorithms to big data, and you can apply big data processing techniques to machine learning. The two techniques feed into each other.

An example of the first case would be training a neural network or a logistic regression model on a large dataset using online gradient descent.
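As a rough illustration of the first case, here is a minimal sketch of logistic regression trained with online (one example at a time) gradient descent. The data stream, the learning rate, and all function names are assumptions made up for this example; they do not come from the original answer. The point is only that the model never needs the whole dataset in memory.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def online_logistic_regression(stream, n_features, lr=0.1):
    """Update the weights after every single (x, y) example in the stream."""
    w = [0.0] * n_features
    b = 0.0
    for x, y in stream:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        err = p - y  # gradient of the log loss with respect to the logit
        for i in range(n_features):
            w[i] -= lr * err * x[i]
        b -= lr * err
    return w, b


def synthetic_stream(n, seed=0):
    """Hypothetical toy data: the label is 1 when x0 + x1 > 0."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, (1 if x[0] + x[1] > 0 else 0)


# The generator yields one example at a time, so memory use stays constant
# no matter how long the stream is.
w, b = online_logistic_regression(synthetic_stream(20000), n_features=2)
print("learned weights:", w, "bias:", b)
```

Because each update touches only one example, the same loop could in principle read from a file or a message queue instead of the toy generator.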

An example of the second case would be parallelizing gradient descent to run in a MapReduce environment.
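The second case can be sketched in the same spirit. The example below imitates the MapReduce pattern on a single machine with Python's built-in `map` and `functools.reduce`: the map step computes a partial gradient per data shard, and the reduce step sums the partial results. It uses simple linear regression to keep the arithmetic short. The shard layout, the toy data, and the function names are assumptions for illustration; on a real cluster the map calls would run on separate workers.

```python
from functools import reduce


def partial_gradient(shard, w, b):
    """Map step: sum the squared-error gradients over one shard."""
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += err * x
        gb += err
    return gw, gb, len(shard)


def add_gradients(left, right):
    """Reduce step: element-wise sum of two partial results."""
    return left[0] + right[0], left[1] + right[1], left[2] + right[2]


def mapreduce_gradient_descent(shards, lr=0.1, steps=500):
    w = b = 0.0
    for _ in range(steps):
        # Each shard is processed independently; only three numbers per
        # shard travel to the reducer.
        partials = map(lambda shard: partial_gradient(shard, w, b), shards)
        gw, gb, n = reduce(add_gradients, partials)
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


# Hypothetical toy data on the line y = 2x + 1, split into three shards.
data = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]
shards = [data[i::3] for i in range(3)]

w, b = mapreduce_gradient_descent(shards)
print("slope:", w, "intercept:", b)
```

The design choice worth noting is that the gradient of a sum is the sum of the gradients, which is what makes the computation fit the map and reduce shape.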

In machine learning, large datasets usually mean you need to use simpler algorithms, and those algorithms tend to perform much better than they do on smaller datasets.

There are two types of insight anyone can get from a dataset:
1. Direct (group by, join, sum, max, average)
2. Inductive (if something is ..., then something else is ..., otherwise ...)

Note that the first type of insight is always exact, so you compute it with tools such as Excel for small data and Hadoop for big data.
Inductive insights, on the other hand, are approximations drawn from looking at the data. With a small amount of data, a human can try to infer things from charts and graphs. When the data is huge, however, inferring rules from it is beyond human capacity. This is exactly where machine learning comes in.
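The difference between the two kinds of insight can be shown on a tiny, made-up table. In the sketch below, the direct insight is an exact group-by average, while the inductive insight is a rule of the form "if amount > t then 1, else 0" whose threshold is learned from the data. The records, field meanings, and function names are all hypothetical and exist only to illustrate the distinction.

```python
from collections import defaultdict

# Hypothetical toy records: (region, amount, label).
records = [
    ("north", 10.0, 0),
    ("north", 20.0, 0),
    ("south", 30.0, 0),
    ("south", 60.0, 1),
    ("north", 70.0, 1),
    ("south", 80.0, 1),
]


def average_by_region(rows):
    """Direct insight: an exact group-by average."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for region, amount, _ in rows:
        totals[region] += amount
        counts[region] += 1
    return {region: totals[region] / counts[region] for region in totals}


def learn_threshold(rows):
    """Inductive insight: pick the cut-off that misclassifies the fewest rows."""
    amounts = sorted({amount for _, amount, _ in rows})
    # Candidate thresholds are the midpoints between neighbouring amounts.
    candidates = [(lo + hi) / 2 for lo, hi in zip(amounts, amounts[1:])]

    def errors(t):
        return sum((amount > t) != bool(label) for _, amount, label in rows)

    return min(candidates, key=errors)


print(average_by_region(records))
print(learn_threshold(records))
```

The average is the same no matter who computes it. The learned threshold is only a guess that fits these six rows and could change as more data arrives, which is the approximation the paragraph above describes.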

One of the biggest reasons we use big data is to extract meaning from it so that we can make better decisions. And that's what machine learning does! It is the science of training systems to learn from data and output an appropriate response without being explicitly programmed for it. On the flip side, without big data machine learning would be far less useful, because to learn anything from data you need a large number of "training examples", so that as many scenarios as possible are covered and a few erroneous records do not skew the training.
So, they are deeply interconnected. In a word, a large dataset keeps the learned model from being biased.

I have often found these terms used interchangeably, which is wrong.
Big data has more to do with high-performance computing, while machine learning is a part of data science. In big data, large volumes of data that could not otherwise be processed in a reasonable amount of time are processed quickly using various techniques and tools. In machine learning, a system learns from past experience and builds a model that is likely to handle future instances correctly.
One of the main reasons big data and machine learning are used together is that big data processing often serves as a preprocessing step for machine learning.

Machine learning is the science of studying patterns in data. These patterns explain how the data is correlated, and those correlations are used to make predictions.

Big data is the art of working with large amounts of data. Machine learning can be done on a smaller dataset, but the larger the data, the better the predictions tend to be.

So, if I were to give a short answer: when you have a lot of structured or unstructured data that you want to study for patterns, you use big data tools to handle it and run your machine learning algorithms on it to find patterns that support a business use case.

Machine Learning - Builds models. When people hear the term "machine learning", they picture robots that walk, climb, or clean houses. In reality, machine learning starts a lot closer to home. When you open your email, spam has been filtered out from your important messages by an algorithm that has learnt to classify "spam" and "not spam". Your Facebook news feed features posts from your closest friends because an algorithm has examined your likes, tags, and photos to work out who you connect with most. When you upload a photo and the website identifies your face, that is a facial recognition algorithm at work. When you use a search engine, you see the most relevant content first because of a sophisticated search ranking algorithm. In short, machine learning permeates our lives: it builds models through self-learning algorithms.
Data Mining - An analytic process designed to explore data and find patterns in it. It is the practice of applying algorithms (mostly machine learning algorithms) to find patterns in data.
Artificial Intelligence - Behaves and reasons. The science of developing a system or software that mimics how a human responds and behaves in a given circumstance. As a field with an extremely broad scope, AI has divided its goal into multiple chunks, and each chunk has since become a separate field of study with its own problems to solve.
Major AI goals:
Knowledge Representation
Computer Vision
Machine Learning
Natural Language Processing
General intelligence, or strong AI

Machine learning is a field that emerged from one of the AI goals: helping machines learn on their own to solve the problems they come across.

Natural language processing is another field that emerged from an AI goal: helping machines communicate with real humans.

Computer vision is a field that emerged from the AI goal of identifying and distinguishing the objects a machine can see.

Robotics is a field that emerged from the AI goal of giving a machine a physical body so that it can perform physical actions.


