Climbing the Information Ladder

How to turn data into information

Many organisations hold a lot of data without much idea of what to do with it. Data Science (used loosely here to cover Business Intelligence, Data Visualisation, Machine Learning, etc.) is a relatively new field concerned with extracting value from this data.

This article shows that extracting more and more information from the many sources of data is a progressive learning process.

Converting this data into useful information is a journey.

The Information Ladder shows that journey, starting from the simple and going all the way to complex machine learning.

The difference between data and information

In the workplace I often hear the words “data” and “information” misused. They are two distinct things that are often treated as the same.

In a world where many organisations are striving to become more data-driven, it is important to understand the distinction.

Data is the raw material used to produce information.

Information is a product of data; without data, there won’t be information.

Data comes in many shapes and sizes. Some data changes frequently, some is static.

Frequently changing data includes website data; some websites receive many clicks per second. Currency exchange rates are another good example; they fluctuate throughout the day.

Static data includes the capital city of a country or the tenor of a bond.

Examples of data include:

  • Click data on a website
  • The written content of your emails
  • The static attributes of a book
  • A list of countries with their country codes
  • Names of event attendees
  • Whether an email recipient engaged with the email
  • The holdings of an investment account
  • Static attributes of a fixed income product
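
To make the distinction concrete, here is a minimal Python sketch that turns raw click data into a small piece of information: clicks per page. The click records and the `clicks_per_page` helper are invented for illustration and do not come from any real system.

```python
from collections import Counter

# Hypothetical raw click data: one record per click, as (page, visitor).
clicks = [
    ("/home", "alice"),
    ("/pricing", "bob"),
    ("/home", "carol"),
    ("/pricing", "alice"),
    ("/home", "bob"),
]


def clicks_per_page(records):
    """Summarise raw click records into a count of clicks per page."""
    return Counter(page for page, _visitor in records)


summary = clicks_per_page(clicks)
for page, count in summary.most_common():
    print(f"{page}: {count} clicks")
```

The list of tuples is the data; the per-page summary is the information someone can act on.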

The explosion of the internet has led to a significant increase in data. I won’t attempt to quantify how much (others have), but everything that happens online creates another data point. Every email, web click, like, video view, tweet, etc. is stored somewhere.

Gaining value from the data is a significant challenge. It takes time for organisations to become data-driven. It takes time to understand the data and it takes time to work out what to do with the data.

Therefore, when starting on the journey to make an organisation more data-driven, it can be difficult to manage expectations.

The Information Ladder represents the conversion of data into useful information.

Climbing the Information Ladder is a journey.

The lessons learned on each rung enable progression to the next.

Each rung leads to a deeper understanding of the data. At the bottom rungs of the ladder, the data work is often simpler. Moving up the ladder often leads to more advanced data work and a larger number of data sources. At the upper rungs, the data work becomes more complex.

What are the rungs of the Data to Information Ladder?

Static reports

[Image: TAR Solutions]

Static reports are often the starting point for any business information. They are usually table-style reports: someone gets the data, puts it in Excel or PowerPoint and sends it around. Usually this involves one or two data sources combined, staying within the limits of basic Excel. Sometimes data is distributed in big Excel files, and finding the information takes some work by the recipient, such as scanning through a table or building pivot tables.
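
The pivot-table step can be sketched in a few lines of Python. This is only an illustration of the idea, using the standard library rather than Excel; the sales rows and the `pivot_sales` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, product, amount).
sales = [
    ("North", "Widget", 120.0),
    ("North", "Gadget", 80.0),
    ("South", "Widget", 200.0),
    ("North", "Widget", 30.0),
]


def pivot_sales(rows):
    """Total the amount for each (region, product) pair, like a pivot table."""
    table = defaultdict(lambda: defaultdict(float))
    for region, product, amount in rows:
        table[region][product] += amount
    # Convert to plain dicts so the result is easy to print or compare.
    return {region: dict(products) for region, products in table.items()}


report = pivot_sales(sales)
for region in sorted(report):
    for product in sorted(report[region]):
        print(f"{region:<6} {product:<7} {report[region][product]:>8.2f}")
```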

Ad-hoc reports

Ad-hoc reports are like static reports, only this time users can drag and drop a limited set of data sources to find information themselves. Or perhaps they ask their “data team” to provide information on an ad-hoc basis, generally in the form of Excel. Often there are limitations to the data available. The approach is also prone to error, as not all users know the correct filters to apply or the quirks of the data. The positive for the user is that, as long as the data is available, getting answers should be quite quick.

Interactive Dashboards

A well-designed dashboard should answer many business questions. Dashboards should provide the means for the user to answer their initial standard questions (e.g. how are sales vs budget?) and also the follow-on questions (e.g. which products are the top sellers? Which salespeople are performing well or poorly?). Usually, dashboards are automated and web-based, putting the information at the fingertips of those who need it.

Dashboards should answer the majority of standard business questions in a mature business intelligence environment.

Alerts

This is where users are proactively notified if something requires their attention. Business rules drive alerts. Some alerts require multiple sources of data, some only one. An alert tells someone that they need to be aware of some information and perhaps take some action. Examples include:

  • some dodgy data is in a CRM system and requires cleaning
  • a trader has breached a trading limit; the trader and their management need to know
  • an item in a fund portfolio has moved more than usual and the fund/portfolio manager should be aware
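
As a rough sketch of a business rule driving an alert, the Python below checks positions against per-trader limits, in the spirit of the trading-limit example. The `Position` class, the `TRADING_LIMITS` table and the figures are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Position:
    trader: str
    exposure: float  # current exposure, in an assumed common currency


# Hypothetical business rule: each trader has a maximum allowed exposure.
TRADING_LIMITS = {"dana": 1_000_000.0, "eli": 500_000.0}


def limit_breaches(positions, limits):
    """Return an alert message for every position over its trader's limit."""
    alerts = []
    for pos in positions:
        limit = limits.get(pos.trader)
        if limit is not None and pos.exposure > limit:
            alerts.append(
                f"ALERT: {pos.trader} exposure {pos.exposure:,.0f} "
                f"exceeds limit {limit:,.0f}"
            )
    return alerts


positions = [Position("dana", 750_000.0), Position("eli", 620_000.0)]
alerts = limit_breaches(positions, TRADING_LIMITS)
for message in alerts:
    print(message)
```

In practice the messages would be routed to the trader and their management rather than printed; how that is done depends on the organisation's tooling.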

Statistical Analysis

This is where the data complexity starts to increase. For statistical analysis to be meaningful, the underlying data has to be good. The previous rungs of the ladder should help ensure the data is of good quality: if people are using the information from previous rungs, any existing data issues should have been found and corrected.

This is where data is used to try to better understand something. For example, in a subscription business, what factors lead to a subscription renewal? With PPI (payment protection insurance) claims, what characteristics of the loan lead to compensation? Within asset management, what causes fund outflows?
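
A very simple first pass at the subscription question is to compare renewal rates across groups. The sketch below does that in Python; the subscriber records, the two factors (plan and weekly usage) and the `renewal_rate_by` helper are hypothetical, and a real analysis would need far more data and a proper statistical test.

```python
from collections import defaultdict

# Hypothetical subscriber records: (plan, used_product_weekly, renewed).
subscribers = [
    ("monthly", True, True),
    ("monthly", False, False),
    ("annual", True, True),
    ("annual", False, True),
    ("monthly", True, True),
    ("monthly", False, False),
]


def renewal_rate_by(records, key_index):
    """Renewal rate for each value of the factor at position key_index."""
    totals = defaultdict(int)
    renewed = defaultdict(int)
    for record in records:
        key = record[key_index]
        totals[key] += 1
        renewed[key] += int(record[2])
    return {key: renewed[key] / totals[key] for key in totals}


by_plan = renewal_rate_by(subscribers, 0)
by_usage = renewal_rate_by(subscribers, 1)
print("Renewal rate by plan:", by_plan)
print("Renewal rate by weekly usage:", by_usage)
```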

Forecasting

Based on what we have learned during the statistical analysis, what is the likely aggregate outcome? For example, what do we think the subscription renewal rate will be? How many PPI claims will require manual investigation? How much AUM (assets under management) is likely to be lost over the coming 12 months?
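
One simple way to get an aggregate forecast is to apply historical rates to the upcoming population. The Python sketch below does this for subscription renewals; the rates, the counts and the `forecast_renewals` helper are assumed values for illustration, not results from any real business.

```python
# Assumed historical renewal rates per plan, e.g. from an earlier analysis.
HISTORICAL_RATES = {"monthly": 0.5, "annual": 0.9}

# Hypothetical number of subscriptions coming up for renewal, per plan.
up_for_renewal = {"monthly": 400, "annual": 100}


def forecast_renewals(rates, counts):
    """Return (expected renewals, expected overall renewal rate)."""
    expected = sum(rates[plan] * n for plan, n in counts.items())
    total = sum(counts.values())
    return expected, expected / total


expected, rate = forecast_renewals(HISTORICAL_RATES, up_for_renewal)
total = sum(up_for_renewal.values())
print(f"Expected renewals: {expected:.0f} of {total} ({rate:.0%})")
```

The forecast is about the group as a whole; it says nothing about which individual subscribers will renew.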

Predictive Analytics

This is where we attempt to predict what will happen in the future at an individual level.

Predictive analytics has been in widespread use for a long time in financial services. For example:

  • Credit scores: your credit score is used to calculate the risk of you defaulting on a loan, which is then used to decide a) whether to offer a loan and b) at what price (interest rate) to offer it
  • Car insurance: your past claim history, age, type of car, etc. contribute to your insurance cost; insurers analyse the probability of an incident based on the incident history of those in a similar cohort and then price your insurance based on the claim likelihood
  • Life insurance: data about your age, health and lifestyle is gathered before pricing your life insurance; using past data, the insurer works out the probability that you will die while insured. So an overweight smoker in their 50s will pay far more than an athletic, clean-living twenty-something
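
The credit example can be sketched as a tiny scoring model in Python. This is a toy logistic model with made-up coefficients, feature names and thresholds; it is not how any real lender scores applicants, only an illustration of predicting at the individual level and pricing on the result.

```python
import math

# Made-up coefficients for illustration only; a real model would be fitted
# to historical loan outcomes.
INTERCEPT = -1.0
WEIGHTS = {"missed_payments": 0.8, "debt_to_income": 2.0}


def default_probability(applicant):
    """Logistic model: map an applicant's features to a probability in (0, 1)."""
    score = INTERCEPT + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


def loan_decision(applicant, max_risk=0.5, base_rate=0.04, risk_premium=0.10):
    """Decide whether to offer a loan and, if so, at what interest rate."""
    p = default_probability(applicant)
    if p > max_risk:
        return {"offer": False, "default_probability": p}
    return {
        "offer": True,
        "default_probability": p,
        "interest_rate": base_rate + risk_premium * p,
    }


low_risk = loan_decision({"missed_payments": 0, "debt_to_income": 0.2})
high_risk = loan_decision({"missed_payments": 3, "debt_to_income": 0.6})
print(low_risk)
print(high_risk)
```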

Machine Learning

This could easily be a separate topic. Machine Learning is a subset of Artificial Intelligence, which the article linked in the original post explains well. It should help organisations understand their clients better and subsequently tailor the client experience to their needs. It can be further broken down:

  • Scenario modelling: forecasts and predictions under what-if scenarios
  • Decision support: extends scenario modelling, the goal here being to optimise decision making
  • Bots and recommender systems: use machine learning to influence customer behaviour. Some leading-edge organisations excel in this area, such as Amazon with their recommendation engine. For example, what does this customer/client want? Based on the behaviour of similar customers, can we affect the decision-making process? Can we win more business or maintain their business?
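
The "behaviour of similar customers" idea can be illustrated with a very small recommender in Python. This sketch scores products by how much purchase history the other buyers share with the target customer; the purchase data and the `recommend` helper are hypothetical, and production recommendation engines are far more sophisticated.

```python
from collections import Counter

# Hypothetical purchase histories: customer -> set of products bought.
purchases = {
    "ana": {"laptop", "mouse", "dock"},
    "ben": {"laptop", "mouse"},
    "cy": {"laptop", "dock"},
    "di": {"mouse", "keyboard"},
}


def recommend(customer, histories, top_n=2):
    """Suggest products bought by customers who share purchases with `customer`."""
    own = histories[customer]
    scores = Counter()
    for other, items in histories.items():
        if other == customer:
            continue
        overlap = len(own & items)  # similarity = number of shared products
        for item in items - own:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, score in ranked[:top_n] if score > 0]


print(recommend("ben", purchases))
```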

The reality of the Information Ladder

In the real world, there can be significant cross-over between the rungs. Also, it’s not necessary to step on every rung of the ladder. For example, many data projects start on rung 3, “interactive dashboards”. Rung 4, “alerts”, is often a by-product of rung 3.

Likewise, “predictive analytics” and “machine learning” can overlap considerably.

What does hold is that understanding of the data and the number of data sources both increase as one progresses up the ladder.

Starting at the top of the ladder simply would not be possible without climbing the rungs below.

Climbing the Information Ladder

Data-to-information projects often start small, with simpler data sets. All organisations are different: some are data-driven, others less so.

However, once these projects start to gain traction and prove their value, they can lead to significant cultural change.

Interest in data tends to spread rapidly throughout organisations of all sizes.

There is a natural progression to expand to more data and to use existing data in different ways.

Simply answering one question is likely to lead to more questions.

For example:

  • Which traders are most profitable?
  • Who are they trading with?
  • Which products are they trading?
  • Are there external events causing these trades, for example rising inflation?
  • Are we able to predict what events would create increased activity for certain products?
  • Which clients would benefit from these products?
  • etc.

Once an organisation moves to a more data-driven path, the business questions become endless.

Translated from: https://towardsdatascience.com/climb-the-information-ladder-960da82f62b9
