Python Data Science: Introduction

Python - Data Science Introduction

Data science is the process of deriving knowledge and insights from large and diverse data sets by organizing, processing and analysing the data. It draws on many disciplines, such as mathematical and statistical modelling, extracting data from its source and applying data visualization techniques. It often also involves using big data technologies to gather both structured and unstructured data. Below are some example scenarios where data science is used.

Recommendation systems

As online shopping becomes more prevalent, e-commerce platforms are able to capture users' shopping preferences as well as the performance of various products in the market. This has led to recommendation systems, which build models that predict shoppers' needs and show the products each shopper is most likely to buy.
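
As a rough illustration of the idea, the sketch below recommends products that were most often bought together with a given product. It is a minimal co-occurrence count, not a description of how any real platform works; the basket data and the `recommend` helper are hypothetical and made up for this example.

```python
from collections import Counter
from itertools import permutations

# Hypothetical purchase baskets; the data is invented for illustration.
baskets = [
    {"laptop", "mouse", "laptop bag"},
    {"laptop", "mouse"},
    {"phone", "phone case"},
    {"laptop", "laptop bag"},
    {"phone", "phone case", "charger"},
]

# Count how often each ordered pair of products appears in the same basket.
co_counts = Counter()
for basket in baskets:
    for a, b in permutations(basket, 2):
        co_counts[(a, b)] += 1


def recommend(product, top_n=2):
    """Return up to top_n products most often bought together with `product`."""
    scores = {b: count for (a, b), count in co_counts.items() if a == product}
    # Sort by descending count, then by name so ties are resolved predictably.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:top_n]]


print(recommend("laptop"))
print(recommend("phone"))
```

Production systems typically use far richer signals (ratings, browsing history, learned models), but the same principle applies: past behaviour is summarized into a model that ranks candidate products.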

Financial risk management

The financial risk involved in loans and credit is better analysed by using customers' past spending habits, past defaults, other financial commitments and many socio-economic indicators. This data is gathered from various sources in different formats. Organizing it and gaining insight into each customer's profile requires the help of data science. The outcome is reduced losses for the financial organization through the avoidance of bad debt.

Improvement in health care services

The health care industry deals with a variety of data, which can be classified into technical data, financial data, patient information, drug information and legal rules. All of this data needs to be analysed in a coordinated manner to produce insights that save costs for both the health care provider and the care receiver while remaining legally compliant.

Computer vision

Advances in image recognition by computers involve processing large sets of image data from multiple objects of the same category, for example in face recognition. These data sets are modelled, and algorithms are created to apply the model to new images and obtain a satisfactory result. Processing these huge data sets and creating the models requires various tools used in data science.

Efficient management of energy

As the demand for energy soars, energy producers need to manage the various phases of energy production and distribution more efficiently. This involves optimizing production methods and storage and distribution mechanisms, as well as studying customers' consumption patterns. Linking the data from all these sources and deriving insight from it can seem a daunting task. The tools of data science make it easier.

Python in data science

The programming requirements of data science demand a versatile and flexible language in which code is simple to write but which can handle highly complex mathematical processing. Python is well suited to these requirements because it has already established itself as a language for both general-purpose and scientific computing. Moreover, it is continuously being extended through new additions to its plethora of libraries aimed at different programming requirements. Below we discuss the features of Python that make it the preferred language for data science.

  • It is simple and easy to learn, and often achieves a result in fewer lines of code than similar languages such as R. This simplicity also makes it well suited to handling complex scenarios with minimal code and little confusion about the general flow of the program.
  • It is cross-platform, so the same code generally works in multiple environments without needing any change. That makes it easy to use in a multi-environment setup.
  • It typically executes faster than other languages used for data analysis, such as R and MATLAB, although actual performance depends on the workload and the libraries used.
  • Its memory management, in particular automatic garbage collection, helps it gracefully handle very large volumes of data transformation, slicing, dicing and visualization.
  • Most importantly, Python has a very large collection of libraries that serve as special-purpose analysis tools. For example, the NumPy package supports scientific computing, and its array needs much less memory than a conventional Python list for managing numeric data. The number of such packages is continuously growing.
  • Python has packages that can directly use code written in other languages such as Java or C. This helps optimize performance by reusing existing code in those languages whenever it gives a better result.
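
The NumPy memory claim in the list above can be checked with a rough sketch like the following. The element count `n` and the variable names are illustrative assumptions, and the list estimate simply adds the list's own pointer storage to the size of each integer object it references, so treat the figures as approximate and specific to CPython.

```python
import sys

import numpy as np

n = 100_000  # illustrative size, not taken from the original article

py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)

# A Python list stores references to separate int objects,
# so count the list container plus every object it points to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# A NumPy array stores the raw 8-byte values in one contiguous buffer.
array_bytes = np_array.nbytes

print(f"list : {list_bytes:,} bytes")
print(f"array: {array_bytes:,} bytes")
print(f"ratio: {list_bytes / array_bytes:.1f}x")
```

The exact ratio varies with the Python version and the chosen `dtype`; a smaller type such as `np.int32` would halve the array's footprint again.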

In the subsequent chapters we will see how to leverage these features of Python to accomplish the tasks needed in the different areas of data science.

Translated from: https://www.tutorialspoint.com/python_data_science/python_data_science_introduction.htm
