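Recently I was asked a question: what does Data Science actually do?
Although I talk about Data Science all the time, I had never seriously looked into its origins in any depth.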
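The usual explanation of Data Science goes like this:
Data science is only a concept. It combines statistics, data analysis, machine learning, and their related methods, with the aim of using data to "understand and analyze" real-world phenomena.
Put simply: data science is the discipline of making data useful.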
Here is IBM's description: https://www.ibm.com/analytics/data-science
What is data science, and why does it matter?
Data science is the process of using algorithms, methods, and systems to extract knowledge and insights from structured and unstructured data. It uses analytics and machine learning to help users make predictions, enhance optimization, and improve operations and decision making.
Today’s data science teams are expected to answer many questions. Business demands better prediction and optimization based on real-time insights backed by tools for ModelOps and cloud data science.
The data science lifecycle starts with gathering data from relevant sources, cleaning it and putting it in formats that machines can understand. In the next phase, statistical methods and other algorithms are used to find patterns and trends. Then models are programmed and built to predict and forecast; finally, results are interpreted.
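The lifecycle described above (gather, clean, find patterns, build a model, interpret) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the records, the `pearson` and `fit_line` helpers, and the choice of a least-squares line as the "model" are all hypothetical and are not taken from IBM's page.

```python
from statistics import mean

# 1. Gather: hypothetical raw records (ad_spend, sales); None marks a missing value.
raw = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, None), (5.0, 9.8)]

# 2. Clean: drop records with missing fields and split into two columns.
clean = [(x, y) for x, y in raw if x is not None and y is not None]
xs = [x for x, _ in clean]
ys = [y for _, y in clean]


def pearson(xs, ys):
    """3. Find patterns: Pearson correlation between the two columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """4. Build a model: ordinary least-squares line y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    slope = cov / var_x
    return slope, my - slope * mx


r = pearson(xs, ys)
slope, intercept = fit_line(xs, ys)
forecast = slope * 6.0 + intercept  # predict sales for an unseen spend of 6.0

# 5. Interpret: report the findings in plain terms.
print(f"kept {len(clean)} of {len(raw)} records")
print(f"correlation r = {r:.3f}, model: y = {slope:.3f} * x + {intercept:.3f}")
print(f"forecast for x = 6.0: {forecast:.2f}")
```

In real projects each step is usually handled by dedicated tooling rather than hand-written helpers, but the order of the steps is the same.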
Advances in AI, machine learning and automation have raised the standards of data science tools for business. The result is the formation of data science teams — expert data scientists, citizen data scientists, programmers, engineers and business analysts — that extend across business units.
The opportunity here is massive. The automation of tedious data science tasks such as data preparation, and the empowerment of analysts without coding experience to build models, keeps business agile and innovative. Automating the data science lifecycle frees expert data scientists to address the more interesting and innovative aspects of the field. Human intelligence, combined with data science technology and automation, helps a business extract greater value from data.
What is data science, and why does it matter?