The Data Science Overlap with Chess Mental Modeling

weixin_26730921

Original article: https://towardsdatascience.com/the-data-science-overlap-with-chess-mental-modeling-9238103843e9

Chess and data science have a lot in common. Some seemingly surface-level parallels include imposter syndrome and a feeling of powerlessness in the face of overwhelming complexity and indecision, all on top of a time crunch.

If we look closer, though, these upfront similarities belie truly deeper parallels between the fields. These types of experience don’t often come from light-hearted hobbies: frustration, complexity, despair, ecstasy, confusion, pride, and understanding are the hallmarks of a pursuit which wholeheartedly consumes the practitioner for the duration of the activity and beyond, and the lifetimes and books devoted to the study of chess or data science demonstrate the intensity of experience which belongs to both.

Of course, much of what we call ‘data science’ has historically been plain old statistics, though admittedly without the furor of machine learning or deep learning, and experienced data scientists know all too well that often A/B testing, or two-sample hypothesis testing, can provide far more value and intelligible insight than neural networks.
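
To make the A/B testing point concrete, here is a minimal sketch of a two-sample test on conversion rates, using only the Python standard library. The function name and the counts are hypothetical, made up purely for illustration, and a two-proportion z-test is just one common way to run such a comparison.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical experiment: 120/1000 conversions for A, 150/1000 for B.
z, p = two_proportion_z_test(conv_a=120, n_a=1000, conv_b=150, n_b=1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The whole analysis fits in a dozen lines and every intermediate quantity has a plain-language meaning, which is much of why a simple test can be easier to act on than a neural network.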

This is a good entry point into the fundamental parallel between chess and data science: in either field, experience, pattern recognition, and intuition are driving factors of success. It is fairly well-established that chess grandmasters, in particular, succeed by virtue of deep pattern recognition attained through long and intensive experience. While the exact explanatory role of pattern recognition is unclear (since young grandmasters with only a couple of decades of experience can beat older grandmasters with lifetimes of experience), it is clear that chess is strangely more Bayesian than it may initially appear, especially given the falsely algorithmic attitudes surrounding chess skill.

The Primary Skill Behind Chess

A fascinating discussion of chess grandmaster thinking can be found here. While pattern recognition is a general theme, the primary focus of the discussion concerns the thinking process of grandmasters itself. What are the typical thoughts and observations that drive a peak-operating grandmaster to burn up to 6,000 calories a day during a tournament?

Robert Sapolsky, who studies stress in primates at Stanford University, says a chess player can burn up to 6,000 calories a day while playing in a tournament, three times what an average person consumes in a day. Based on breathing rates (which triple during competition), blood pressure (which elevates) and muscle contractions before, during and after major tournaments, Sapolsky suggests that grandmasters’ stress responses to chess are on par with what elite athletes experience. -Aishwarya Kumar

The answer is not deep algorithmic thinking. There is a myth that chess grandmasters are human computers, somehow capable of multi-thread processing and handling superhuman mental permutations at a scale far higher than other humans.

Even supercomputers haven’t solved chess, given that a lower bound on the number of possible chess games is about 10¹²⁰, the game-tree complexity of chess, also known as the Shannon number. Even looking 6 or 7 moves ‘into the future’ is just as difficult for a chess grandmaster as it is for a novice in terms of permutation processing.
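
For a sense of where 10¹²⁰ comes from, the usual back-of-the-envelope version of Shannon's estimate can be written out in a few lines. The inputs (roughly 10³ possibilities for each pair of White and Black moves, and a typical game of about 40 such pairs) are Shannon's rough averages rather than exact counts, and the variable names are just for illustration.

```python
# Shannon's rough inputs (approximations, not exact counts).
options_per_move_pair = 10 ** 3   # about 30 choices for White times about 30 for Black
move_pairs_per_game = 40          # a typical game length in full moves

shannon_estimate = options_per_move_pair ** move_pairs_per_game

# Express the result as a power of ten by counting digits.
exponent = len(str(shannon_estimate)) - 1
print(f"Estimated number of games: 10^{exponent}")
```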

The consensus, rather, is that grandmasters are far quicker at seeing threats, summarizing positions and holding them in long-term memory, and recognizing the inevitable lines from those states.

While this seems to imply that grandmasters are seeing ‘more into the future’, a novice can equally see checkmate in 4 or 5 moves once the pieces on the board are reduced to two kings and a pawn approaching promotion. The difference is that grandmasters can handle more complexity, understand the most important information amid the noise, and recognize the relatively few (and often forced) options before them, effectively reducing the possible permutations to a surprisingly small decision tree. The important point is that they aren’t better permutation processors: they are just more capable of simplifying a complex state to a less complex one, and then making their decisions based on prior experience with game states and pattern recognition, along with a supreme mental representation of the state.
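
A rough way to see how much that simplification buys is to compare the size of a uniform search tree before and after pruning. The numbers below are assumptions chosen for illustration (about 30 legal moves per position versus 3 serious candidate moves, looking 8 half-moves ahead); real positions vary widely.

```python
def tree_size(branching_factor, depth):
    """Number of leaf positions in a uniform tree of the given depth."""
    return branching_factor ** depth

depth = 8  # half-moves (plies) of lookahead, an illustrative choice

full_tree = tree_size(30, depth)    # considering every legal move
pruned_tree = tree_size(3, depth)   # considering only a few candidate moves

print(f"All legal moves:      {full_tree:,} lines")
print(f"Candidate moves only: {pruned_tree:,} lines")
print(f"Reduction factor:     {full_tree // pruned_tree:,}x")
```

The raw calculating effort per line is the same in both cases; what changes is how many lines are worth calculating at all.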

According to an article by Bill Wall, Garry Kasparov could supposedly remember ‘the moves of all the games he had played in the past 6 months’, while Magnus Carlsen supposedly memorized 10,000 chess games, which seems to confirm the explanatory significance of long-term memory in chess.

How does any of this relate to data science?

Because in data science, much of this process would be called modeling! The vocabulary of STEM can often be as confining as it is useful, and while modeling has specific connotations with statistical models, the process of model-building is nearly identical.

We often think of modeling as a linear regression or random-forest model in Python, built with sklearn and fit on a set of numpy arrays. But the actual process of feature engineering, log-transforming, cleaning and imputing null values based on context, and visualizing data for deeper understanding all draws on the same skills, for the same purpose, as a grandmaster staring at the board for an hour, blazing away calories: to improve prediction and to gain a larger-scope understanding of the data on an intuitive level. This larger scope is often referred to as ‘domain knowledge’ or ‘business understanding’, or theoretical understanding in a scientific context, and fits into the cross-industry standard process for data mining, or CRISP-DM:

Image: CRISP-DM process diagram, by Kenneth Jensen (Creative Commons)

Except that in chess, we might describe this first step as ‘game state understanding’: a deep and intuitive understanding of each color’s proximity to checkmate, or advantage, which may be roughly measured by average centipawn loss, or computer-determined strength of play. The ‘deployment’ step in CRISP-DM would be analogous to an actual chess move.
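
Average centipawn loss is simple enough to sketch directly. The version below assumes the common definition: for each move, take how much worse the engine rates the move played than its own best move (in hundredths of a pawn, never below zero), then average over the game. The evaluation numbers are hypothetical, and chess sites differ in the details of how they compute this.

```python
from statistics import mean

def average_centipawn_loss(best_evals, played_evals):
    """Mean evaluation drop, in centipawns, between the engine's best move and the move played."""
    losses = [max(0, best - played) for best, played in zip(best_evals, played_evals)]
    return mean(losses)

# Hypothetical engine evaluations from the mover's point of view, in centipawns.
best_evals = [30, 45, 20, 60]      # evaluation after the engine's preferred move
played_evals = [30, 10, 20, -40]   # evaluation after the move actually played

acpl = average_centipawn_loss(best_evals, played_evals)
print(f"Average centipawn loss: {acpl}")
```

A lower number means play closer to the engine's choices, which is why it works as a rough, after-the-fact proxy for how well a player understood the game state.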

While statistical modeling lacks the connotations of a chess grandmaster’s thinking process, the intent and flow of steps are shockingly similar. In particular, the modeling ‘fit’ and ‘predict’ steps are equivalent to the ‘looking 20 steps ahead’ type of algorithmic processing that grandmasters are so stereotypically known for, when in fact much of the real skill comes from longer-term memory, pattern recognition, contextual understanding of the problem, and a supreme agility and precision of mental modeling.

I believe chess can be a worthwhile use of time for an aspiring data scientist, in that it’s both personally rewarding and enriching for the skillset of a data scientist.

Also, it’s fun. Thanks for reading!
