Machine Learning, Big Data, Deep Learning, Data Mining, Statistics, Decision & Risk Analysis, Probability, Fuzzy Logic FAQ

Last Updated 3 December 2012, 7:24 AM EST.

What’s the difference between machine learning, deep learning, big data, statistics, decision & risk analysis, probability, fuzzy logic, and all the rest?

  • None, except for terminology, specific goals, and culture. They are all branches of probability, which is to say the understanding and sometimes the quantification of uncertainty. Probability itself is an extension of logic.

So what’s the difference between probability and logic?

  • Not much, except that probability deals with uncertainty and logic with certainty. Machine learning, statistics, and all the rest are matters of uncertainty. A statement of logic is a list of premises and a conclusion: either the conclusion follows validly from the premises, or it does not. A statement of probability is also a list of premises and a conclusion, though usually the conclusion does not follow with certainty.

In mathematics there are many “logic” theories that have more than one truth value, and not just one universal “logic.” What’s up with that?

  • The study of “logics” is just one more branch of math. Plus, these special many-valued “truth” logics are all evaluated with the standard, Aristotelian two-valued logic, sometimes called “meta-logic”, in which there is only truth and falsity, right and wrong, yes and no. There is only one logic at base.

Is probability a branch of philosophy, specifically epistemology?

  • Of course probability is part of epistemology, as evidenced by the enormous number of books and papers written by philosophers on the subject over a period of centuries, most or all of which remain hidden from mathematical practitioners. See inter alia Howson & Urbach, or Adams, or Swinburne, Carnap, Hempel, Stove, that guy who just wrote a book on objective Bayes whose name escapes me, and on and on for a long stretch. Look to this space for a bibliography.

    Probability can also be pure mathematical manipulation: theorems, proofs, lemmas, papers, tenure, grants. Equations galore! But the very instant you apply that math to propositions (e.g. “More get better with drug A”) you have entered the realm of philosophy, from which there is no escape. The same applies to applied math: it’s pure mathematics until it’s applied to some external proposition (“How much weight will this bridge hold?”).

Isn’t fuzzy logic different than probability?

  • No. It sometimes has, like mathematics, many-valued “truths” (but so can probability models), but the theory itself is also evaluated with standard logic, just like probability. Fuzzy logic in practical applications makes statements about things which are not certain, and that makes it probability. Fuzzy logic is one of the many rediscoveries of probability, but the best in the sense of possessing a cuddly slogan. Doesn’t fuzzy logic sound cute? Meow.

What is a model?

  • A list of premises said to support some conclusion. Premises are usually propositions like “I observed x1 = 12” or “My uncertainty in the outcome is quantified by this probability distribution”, but they can be as simple as “I have a six-sided object, just one side of which is labeled 6, which when tossed will show only one side.” The conclusions (like the premises) are always up to us to plug in: the conclusion arises from our desires and wants. Thus I might choose, with that last proposition in mind, “A 6 shows.” We now have a complete probability model, from which we can deduce the conclusion has probability 1/6. Working probability models, such as those described below, are brocaded with more and fancier premises and complex conclusions, but the philosophy is identical.
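The die model can be run as a literal deduction. A minimal Python sketch, in which the list of sides simply restates the premise:

```python
from fractions import Fraction

# Premise: a six-sided object, exactly one side labeled "6",
# which when tossed shows exactly one side.
sides = ["1", "2", "3", "4", "5", "6"]

# Conclusion we chose to plug in: "A 6 shows."
favorable = [s for s in sides if s == "6"]

# The probability is deduced from the premises, not measured.
prob = Fraction(len(favorable), len(sides))
print(prob)  # 1/6
```

Change the premise (say, two sides labeled 6) and a different probability is deduced; nothing is estimated from data.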

    Physical models, that is, models of physical systems, are squarely within this definition. There is nothing in the framework of a model which insists outcomes must be uncertain, so even so simple a (deterministic) equation as y = a + b*x (where a and b are known with certainty) is a model. If the parameters a and b are not known with certainty, the model switches from deterministic to probabilistic.
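The switch can be sketched in a few lines of Python; the particular values of a and b, and the normal uncertainties placed on them, are assumptions for illustration only:

```python
import random

random.seed(0)

def y_deterministic(x, a=1.0, b=2.0):
    # a and b known with certainty: one x yields exactly one y
    return a + b * x

def y_probabilistic(x, n=10_000):
    # The same equation, but uncertainty about a and b (here, hypothetical
    # normal uncertainties) turns the single answer into a distribution.
    draws = []
    for _ in range(n):
        a = random.gauss(1.0, 0.1)
        b = random.gauss(2.0, 0.1)
        draws.append(a + b * x)
    return draws

print(y_deterministic(3.0))  # 7.0, every time
ys = y_probabilistic(3.0)
print(min(ys), max(ys))      # a spread of possible y's around 7
```

Nothing about the equation changed; only the premises about a and b did, and with them the character of the model.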

Surely exploratory data analysis (EDA) isn’t a model?

  • Yes it is, and don’t call me Shirley. Once a picture, plot, figure, table, or summary is printed and then acted upon in the sense of explaining the uncertainty of some proposition, you have premises (the pictures, assumptions) probative toward some conclusion. The model is not a formal mathematical one, but a model it still is.

What is reification?

  • This is when the ugliness of reality is eschewed in favor of a beautiful model. The model, created by great credentialed brains, is a jewel, an object of adoration so lovely that flaws noted by outsiders are seen as gratuitous insults. The model is such an intellectual achievement that reality, which comes free, is felt to be an intrusion; the third wheel in the torrid love affair between modeler and model. See, e.g., climate models or econometrics.

What’s the difference between probability and decision analysis?

  • A bet, which, if made on an uncertain outcome, becomes a decision. The probability, given standard evidence, of throwing a 6 with a die is 1/6, but if you bet that a six will show you have made a decision. The amount wagered depends on a host of factors: your total fortune, the level of your sanity, whether it is your money or a taxpayer’s, and so forth. Decision analysis is thus the marriage of psychology with probability.

    Probability models (in all their varied forms) sometimes become decisions when, instead of telling us the uncertainty of some outcome, the model insists (based on non-deducible evidence) that the outcome will be some thing or that it will take a specific value or state. See machine learning below.
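A sketch of the marriage, with hypothetical stakes and payouts: the probability of a six is fixed by the premises of the die model, but whether to bet is not:

```python
from fractions import Fraction

# Fixed by the premises of the die model:
p_six = Fraction(1, 6)

def expected_gain(stake, payout):
    # Win `payout` with probability 1/6; lose `stake` otherwise.
    return p_six * payout - (1 - p_six) * stake

# Same probability, different decisions, depending on the odds offered.
print(expected_gain(stake=1, payout=4))  # -1/6: decline
print(expected_gain(stake=1, payout=7))  # 1/3: accept, if you can afford the loss
```

Even a positive expected gain does not settle the matter: total fortune and utility enter too, which is the psychology part.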

Is all probability quantifiable?

  • We had a saying in the Air Force which began, “Not only no…” This answer applies here with full force. The mad rush to quantify that which is unquantifiable is the primary cause of the fell plague of over-certainty which afflicts mankind.

    Example? Premise: “Some X are F & Y is X”. Conclusion: “Y is F”. Only an academic could quantify that conclusion with respect to that (and no other) premise.
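The point can be made by enumeration. With a hypothetical ten X’s (the premise names no number), every count of F’s from one up to all of them is consistent with “Some X are F”, and each consistent possibility gives a different answer:

```python
from fractions import Fraction

n_x = 10  # hypothetical number of X's; the premise does not say

# "Some X are F" admits any count of F's from 1 to n_x.
# Each consistent possibility assigns a different chance that Y (an X) is F.
consistent = [Fraction(k, n_x) for k in range(1, n_x + 1)]
print(len(set(consistent)))           # 10 distinct answers
print(consistent[0], consistent[-1])  # from 1/10 up to 1
```

Since the premise fixes no single number, any precise quantification of the conclusion is gratuitous.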

What is a statistical model?

  • Same as a regular model, but with the goal of telling us not about the conclusion or outcome, but about the premises. In a statistical model, some premises will say something like, “I quantify the uncertainty in the outcome with this distribution, which itself has parameters a, b, c, …” The conclusion(s) ignore the outcome per se and instead say things like, “The parameter a will take these values…” This is well and good when done in a Bayesian fashion (see Bayesianism and frequentism below), but becomes a spectacular failure when the user forgets he was talking about the parameters and assumes the results speak of the actual outcome.

    This all-too-common blunder is the second great cause of over-certainty. It occurs nearly always when using statistical models, but only rarely when using machine learning or deep learning models, whose practitioners usually have the outcomes fixed firmly in mind.
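The blunder can be seen numerically. A minimal sketch with made-up normal data (the true mean and spread are assumptions): the approximate 95% interval for the parameter (the mean) shrinks like one over the square root of the sample size, while the interval for the next outcome stays about as wide as the data itself:

```python
import math
import random

random.seed(1)
# Hypothetical data: 400 draws from a normal with mean 10, sd 2.
data = [random.gauss(10.0, 2.0) for _ in range(400)]

n = len(data)
mean = sum(data) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Approximate 95% interval half-width for the PARAMETER (the mean).
param_halfwidth = 1.96 * sd / math.sqrt(n)

# Approximate 95% interval half-width for the NEXT OUTCOME.
predict_halfwidth = 1.96 * sd

# With n = 400 the outcome interval is sqrt(400) = 20 times wider:
print(param_halfwidth < predict_halfwidth / 10)
```

Reporting the narrow parameter interval as if it described the next outcome is exactly the over-certainty described above.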

What is a neural network?

  • In statistics these are called non-linear regressions: models which take inputs or “x” values, have multitudinous parameters associated with those x values, all supplied as functions expressing the uncertainty of some outcome or “y” values. Just like any other statistical model. But neural nets sound slick and mysterious. One doesn’t “fit” the parameters of a neural network, as one does in a non-linear regression; one lets the network “learn”, a process which, when contemplated, puts one in mind of Skynet.
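To see that “learning” is fitting, here is a one-neuron “network” trained by gradient descent on toy data; it is exactly non-linear least squares with extra vocabulary. The curve, noise level, and learning rate are all made up for illustration:

```python
import math
import random

random.seed(2)

# Toy data from a smooth curve plus noise.
xs = [i / 10 for i in range(-20, 21)]
ys = [math.tanh(1.5 * x) + random.gauss(0, 0.05) for x in xs]

# A one-neuron "network": prediction = w2 * tanh(w1*x + b1) + b2.
w1, b1, w2, b2 = 0.5, 0.0, 0.5, 0.0
lr = 0.05  # learning rate, i.e. the step size of the fit

def loss():
    # Mean squared error: the same criterion as non-linear regression.
    return sum((w2 * math.tanh(w1 * x + b1) + b2 - y) ** 2
               for x, y in zip(xs, ys)) / len(xs)

before = loss()
for _ in range(2000):
    gw1 = gb1 = gw2 = gb2 = 0.0
    for x, y in zip(xs, ys):
        h = math.tanh(w1 * x + b1)
        err = w2 * h + b2 - y
        gw2 += 2 * err * h
        gb2 += 2 * err
        gw1 += 2 * err * w2 * (1 - h * h) * x
        gb1 += 2 * err * w2 * (1 - h * h)
    m = len(xs)
    w1 -= lr * gw1 / m; b1 -= lr * gb1 / m
    w2 -= lr * gw2 / m; b2 -= lr * gb2 / m

print(before > loss())  # the "learning" lowered the squared error
```

Every step above is parameter estimation; only the vocabulary (“learn”, “network”) differs from the regression textbook.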

What is machine learning?

  • Statistical modeling, albeit with some “hard code” written into the models more blatantly. A hard code is a rule such as “If x17 < 32 then y = ‘artichoke’.” Notice there is no uncertainty in that rule: it’s strictly if-then. These hard codes are married to typical uncertainty apparatuses, with the exception that the goal is to make direct statements about the outcome. Machine learning is therefore modeling with uncertainty with a direct view to making decisions.

    This is the right approach for many applications, except when the user’s tolerance for uncertainty does not match the modeler’s.
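A toy sketch of the combination; the feature names, the threshold, and the stand-in probabilities below are all hypothetical:

```python
def classify(features):
    # Hard code: no uncertainty, strictly if-then.
    if features["x17"] < 32:
        return "artichoke", 1.0

    # Otherwise, a (stand-in) probability model reports uncertainty.
    p_artichoke = 0.30 if features["x3"] > 0 else 0.80

    # The decision step: the model stops reporting uncertainty
    # and insists on a single outcome.
    label = "artichoke" if p_artichoke >= 0.5 else "not artichoke"
    return label, p_artichoke

print(classify({"x17": 10, "x3": 1}))  # rule fires: ('artichoke', 1.0)
print(classify({"x17": 50, "x3": 1}))  # ('not artichoke', 0.3)
```

Whether forcing the 0.3 down to a flat “not artichoke” is acceptable depends on that tolerance for uncertainty, which is the modeler’s choice, not the model’s.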

What is big data?

  • Whatever the labeler wants it to be; data that is not small; a faddish buzzword; a recognition that it’s difficult to store and access massive databases; a false (but with occasional, and temporary, bright truths) hope that if characteristics down to the microsecond are known and stored we can predict everything about that most unpredictable species, human beings. See this Guardian article. See also false hope (itself contained in the hubris entry in any encyclopedia).

    Big data is a legitimate computer science topic, where timely access to tidbits buried under mountains of facts is a major concern. It is also of interest to programmers who must take and use these data in the models spoken of above, all in finite time. But more data rather than less does not imply a new or different philosophy of modeling or uncertainty.

What is data mining?

  • Another name for modeling, but with attempts at automating the modeling process, such that fooling yourself happens faster and more reliably than when it was done by hand. Data mining can be useful, however, as the first step in a machine learning process, because when the user has big data, going through it by hand is not possible.

What is “deep learning”?

  • The opposite of shallow learning? It is nothing more than the fitting or estimating of the parameters of complex models, which are (to repeat) long lists of human-chosen premises married to human-chosen conclusions. It is also a brilliant marketing term, one of many which flow from the fervid, and very practically minded, field of computer science.

    The models are usually a mixture of neural networks and hard codes, and the concentration is on the outcomes, so these practices are sound in nature. The dangers arise when practitioners either engage in reification (man is a loving creature) or start believing their own press, as in “If the New York Times thinks I’m a genius, I must be.”

    The latter is all too apt to happen (and has, many times in the past) because “deep learning” applications are also simple in the sense that (e.g.) when a human being mouths the sounds flee it’s a 50-50 bet the model predicts free (perhaps, too, the locutor is saying free with an accent; as in: what do you call a Japanese lady with one leg shorter than the other? Irene). Accomplishments in this field are thus over-celebrated. In contrast, no model, “deep learning” or otherwise, is going to predict skillfully where and when the next wars will occur for the next fifty years. See artificial intelligence, or AI.

What is artificial intelligence?

  • Another name for probability models (but with much hard coding and few statements of uncertainty). Also see neural nets or entries under New and Improved!.

What is Bayesianism?

  • Another name for probability theory, with a hat tip to the God-fearing Reverend Thomas Bayes, who earned naming rights with his eighteenth-century mathematical work.

What is frequentism?

  • A walking-dead philosophy of probability which, via self-inflicted wounds, handed in its dinner pail about eighty years ago; but the inertia of its followers, who have not yet all fallen, ensures it will be with us for a short while longer.

What is a p-value?

  • Something best discussed with your urologist? Also an unfortunate frequentist measure, which contributes mightily to the great blunder mentioned above.

Where can I learn more about these fascinating subjects?

This is only a draft, folks, with the intention of being a permanently linked resource. I’m bound to have forgotten much. There is even the distinct possibility of a typo. Remind me of my blindnesses below. Still to come: within-page HTML anchors.




Source: http://wmbriggs.com/blog/?p=6465
