Statistics, correlation, causation
You might remember this simple mantra from your statistics class:
"Correlation does not imply causation."
So maybe you think you know what this phrase means.
Say you studied really hard in statistics, got a good grade, and then got into college. It's tempting to conclude that you got into college because you aced your statistics class.
While that grade, along with the skills you learned, probably helped, you can't ignore the other factors at play - and likely can't argue that your Stats grade was the cause of your acceptance into college.
First things first - why do we mistake correlation for causation?
It's easy to think that just because two things seem related, one must be the cause of the other. But that can be a foolish and sometimes dangerous assumption.
For example, suppose you're trying to figure out what makes people less grumpy. You perform a study which finds that, when people get at least x hours of sleep a night, they're less grumpy.
But have you taken all factors into account here? Perhaps they also started working out more as a consequence of being well-rested, and this is what altered their moods.
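To make the sleep example concrete, here is a minimal simulation sketch in Python. Everything in it is invented for illustration: the variable names, the coefficients, and above all the assumed causal structure, in which sleep influences grumpiness only by way of exercise. It is not data from any real study. The point is only that a clear correlation between sleep and grumpiness shows up even though, in this made-up world, sleep has no direct effect on mood.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares line on xs is subtracted."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


def simulate(n=5000, seed=0):
    """Hypothetical world: sleep -> exercise -> grumpiness, nothing else."""
    rng = random.Random(seed)
    # Hours of sleep per night (made-up distribution).
    sleep = [rng.gauss(7, 1) for _ in range(n)]
    # Assumption: well-rested people exercise more.
    exercise = [0.8 * s + rng.gauss(0, 1) for s in sleep]
    # Assumption: only exercise lowers grumpiness; sleep does not appear here.
    grumpiness = [10 - e + rng.gauss(0, 1) for e in exercise]
    return sleep, exercise, grumpiness


sleep, exercise, grumpiness = simulate()

# Raw correlation: more sleep goes along with less grumpiness.
raw = pearson(sleep, grumpiness)

# Correlation once the part explained by exercise is removed from both.
adjusted = pearson(residuals(sleep, exercise), residuals(grumpiness, exercise))

print(f"sleep vs grumpiness:                 {raw:.2f}")
print(f"same, after accounting for exercise: {adjusted:.2f}")
```

In this sketch the first number should come out clearly negative and the second close to zero. A study that measured only sleep and mood would see the first number and could not tell which story produced it.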
Not all examples are quite so benign - and some are downright nonsensical.
To illustrate how misleading it can be to assume that correlation implies causation, have a look at the following graph from Tyle