

Fringe is the future of data

Alternative data has been a buzzword among investors for several years now. By leveraging non-traditional datasets, hedge funds can reap massive profits from insights that aren’t easily available to the average investor or to anyone looking only at traditional markers. Hedge funds value alternative data so highly that, according to JP Morgan, asset managers were already spending $2–3 billion on it in 2017, and those investments have increased 10–20% a year since.
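
To get a feel for what those figures imply, the growth range can be compounded forward. The sketch below is only back-of-the-envelope arithmetic on the numbers quoted above; the function name `project_spend` and the three-year horizon are illustrative assumptions, not figures from JP Morgan.

```python
def project_spend(base_billion, annual_growth, years):
    """Compound a starting spend (in billions of USD) forward at a fixed yearly growth rate."""
    return base_billion * (1 + annual_growth) ** years

# Figures quoted in the text: $2-3 billion in 2017, growing 10-20% a year.
# The three-year horizon is an arbitrary choice for illustration.
years = 3
low = project_spend(2.0, 0.10, years)   # low estimate at the slowest growth rate
high = project_spend(3.0, 0.20, years)  # high estimate at the fastest growth rate
print(f"Implied spend after {years} years: ${low:.1f}-{high:.1f} billion")
```

Under these assumptions the implied range is roughly $2.7 to $5.2 billion, which shows how quickly even the conservative end of the estimate grows.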

But hedge funds are not the only ones who profit from alternative data. All kinds of businesses can use it to increase returns, and the most innovative companies have realized that it holds the key to maximizing their competitive advantage. In fact, data science analyses that leverage alternative data perform about 13% better against benchmarks than traditional approaches to analytics.

What exactly is alternative data?

Put simply, alternative data is any data used in an analysis that would not traditionally be used to inform that kind of decision. Alternative data includes proxy metrics that can stand in for factors that are usually difficult to measure. It can also include information from unofficial sources that individuals can use to gain additional insight. Alternative data provides both new types of business intelligence and new ways of understanding intelligence that could be gained from traditional data.

“Alternative data draws from non-traditional data sources so that when you apply analytics to the data, they yield additional insights that complement the information you receive from traditional sources.” — Krishna Nathan, CIO of S&P Global

Alternative data sources vary widely depending on the industry and the type of analysis being done. Investors have used everything from credit card transactions to location data from cell phones and data scraped from the web. Even tracking companies’ private jets has been used to assess whether to invest in a particular company.

Alternative data can be leveraged just as effectively outside the investment world. Retailers have used data taken from satellite images to decide where to open new locations. Fintech companies have used cash-flow markers and academic history to assess the creditworthiness of people without a credit history. Even travel companies have used alternative data scraped from the internet to decide which amenities to offer and where.

At Evo, the supply chain and pricing AI company where I work, alternative data is a significant part of why our supply chain tools increase inventory efficiency by at least 10%. We use everything from scraped web data to weather to improve our analyses. One of my favourite examples of alternative data that brought us great success was store manager opinions on trends. Essentially, we allowed store managers to request particular items they believed were most likely to be popular in the upcoming sales period. It was a way to get a more local, granular measurement of trends and popularity, using the managers as a proxy, and it worked. This increased the accuracy of our forecast by an additional 5 percentage points (pp), on top of the 20pp improvement we had already achieved over the original replenishment system.

How alternative data fuels digital transformation

Data-driven decision-making is vital for businesses that want to compete in today’s economy. That’s why a majority of companies use big data analytics to collect business intelligence. Few of these, however, have discovered how to leverage data effectively enough to succeed in company-wide digital transformation.

Why? Because they are still looking at the same data. True digital transformation is about more than integrating AI and machine learning into your current decision-making processes. It’s about rethinking your entire approach to the problem by leveraging new technologies. Business intelligence may be collected more efficiently when using new tools to analyse traditional data, but overall gains will be limited unless alternative data is also incorporated.

Including alternative data in your analysis allows you to consider new strategies that traditional approaches would never have surfaced. You can fill in the gaps in your analysis for a more granular, more real-time, and more accurate recommendation. Only this can deliver the dramatic improvements that digital transformation promises.

“When digital transformation is done right, it’s like a caterpillar turning into a butterfly, but when done wrong, all you have is a really fast caterpillar.”- George Westerman, Research Scientist with the MIT Sloan Initiative on the Digital Economy


Finding the right alternative data

As a data scientist, simply knowing that alternative data in general helps improve analysis isn’t very useful. You have to understand which data will help you achieve your business goals and deliver useful business intelligence. So much data is available, yet most of that data is just noise — and ultimately useless.

“Every day, three times per second, we produce the equivalent of the amount of data that the Library of Congress has in its entire print collection, right? But most of it is like cat videos on YouTube or 13-year-olds exchanging text messages about the next Twilight movie.” — Nate Silver, Statistician and Founder and Editor-in-Chief of FiveThirtyEight

Part of finding the right alternative data is simply trial and error. You choose a source of data that is likely to apply to the analysis you are making, assess the risks of choosing that data, and make your best guess. After you run some tests, the results will allow you to home in on the most appropriate choice.
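
One way to make that trial and error concrete is to check whether a candidate signal reduces forecast error on data the model has not seen. The following is a minimal sketch under stated assumptions, not a description of any production method: the data is synthetic, the names `holdout_errors`, `temperature`, and `unrelated` are hypothetical, and a one-feature least-squares line stands in for whatever forecasting model you actually use.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x for a single feature."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

def mean_abs_error(actual, predicted):
    return mean(abs(a - p) for a, p in zip(actual, predicted))

def holdout_errors(signal, target, train_fraction=0.7):
    """Return (baseline_error, candidate_error) measured on held-out rows."""
    cut = int(len(target) * train_fraction)
    train_x, test_x = signal[:cut], signal[cut:]
    train_y, test_y = target[:cut], target[cut:]
    # Baseline: ignore the candidate signal and predict the training average.
    baseline = [mean(train_y)] * len(test_y)
    # Candidate: let the forecast depend on the signal under test.
    intercept, slope = fit_line(train_x, train_y)
    candidate = [intercept + slope * x for x in test_x]
    return mean_abs_error(test_y, baseline), mean_abs_error(test_y, candidate)

# Synthetic data (an assumption for illustration): sales depend on
# temperature, while the second signal has no relationship to sales.
rng = random.Random(0)
temperature = [rng.uniform(0, 30) for _ in range(200)]
unrelated = [rng.uniform(0, 30) for _ in range(200)]
sales = [100 + 3 * t + rng.gauss(0, 5) for t in temperature]

for name, signal in [("temperature", temperature), ("unrelated", unrelated)]:
    base, cand = holdout_errors(signal, sales)
    print(f"{name}: baseline MAE {base:.1f}, with signal MAE {cand:.1f}")
```

A signal that earns its place should cut the held-out error well below the baseline, while a noisy one leaves it roughly unchanged. Measuring on held-out rows matters because almost any extra input appears to help on the data it was fitted to.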

Ultimately, however, you choose the right data by asking the right questions. When you always prioritize the true business goal and not just KPIs, you can much more easily work back to the data you need to fill in the gaps. I tend to use a common-sense approach. It is all about context. When I’m working with clients, I make sure to find out their motivation. Why are they looking to make their supply chain more efficient? Those answers help direct me to a reasonable source of alternative data.

Maximizing returns from digital transformation

Big data has flipped the challenge for data scientists. We no longer struggle to find enough sources of data; we struggle to find coherent and useful patterns in the massive amounts of data available. Considering alternative data only amplifies that problem.

Yet we must meet the challenge. The best business intelligence today often comes from non-traditional sources, so it is a data scientist’s responsibility to pinpoint and analyse that data. Alternative data is the future of digital transformation, and failing to embrace it as a critical part of your analysis will leave your company behind. If you want to maximize your ROI, you must first invest in the right information. Today, that means alternative data.

