NLP Meets Sustainable Investing

Sustainable investing is a growing investment strategy that seeks strong financial returns while also making the world a better place. It is, however, a challenging strategy for many investors. Natural Language Processing (NLP) can help, and this article explains how.

What is Sustainable Investing?

Sustainable, responsible, and impact investing (SRI) is an investment discipline that considers environmental, social, and corporate governance (ESG) criteria to generate long-term competitive financial returns and positive societal impact.

There are several motivations for sustainable investing, including personal values and goals, institutional mission, and the demands of clients, constituents, or plan participants.

Sustainable investors aim for strong financial performance, but also believe that these investments should be used to contribute to advancements in social, environmental, and governance practices. [6]

They may actively seek out investments — such as community development loan funds or clean tech portfolios — that are likely to provide important societal or environmental benefits.

Some investors embrace sustainable investing strategies to manage risk and fulfill fiduciary duties; they review ESG criteria to assess the quality of management and the likely resilience of their portfolio companies in dealing with future challenges. Some are seeking financial outperformance over the long term; a growing body of academic research shows a strong link between ESG and financial performance. [1]

Figure: Global growth in sustainable investments (USD trillion). Image by author.

Investments marketed as sustainable — meaning they focus on companies that incorporate environmental and social corporate-governance practices into long-term corporate strategies — are experiencing explosive growth.

Although sustainable investing emerged in the 1970s, the movement has gained impressive traction in the last few years.

Since 2012, total assets in sustainable investing have more than doubled. [2]

As sustainable investing goes mainstream, it won’t simply act as a niche in a broader strategy — instead, it’ll be naturally integrated throughout a portfolio.

“With the impact of sustainability on investment returns increasing, we believe that sustainable investing is the strongest foundation for client portfolios going forward.”

— Larry Fink, BlackRock Chairman, and CEO

Sustainability is a global force that will continue to factor into everyday decisions.

Sustainable Investing: Challenges

The current pool of data around sustainability relies too much on voluntary corporate disclosures, such as annual sustainability reports and company questionnaires put together by institutional investors — many of which ask different questions. [3]

“Individual investors are quite challenged to obtain this type of information in a way that is easily available and informs investment decisions”

— Jean Rogers, CEO, and Founder of the nonprofit Sustainability Accounting Standards Board

It is challenging for investors to make sustainable investment choices while relying solely on documents such as annual sustainability reports. These reports are often hundreds of pages long and take substantial human effort to analyze, a problem that compounds as the number of sustainable assets grows. They are also not fully transparent: companies may choose to leave certain things out.

Annual sustainability reports are also static. They do not reflect changes in a company as they happen; they only reflect the accumulated changes over a fixed reporting period. This approach misses developments that surface in news articles in real time.

A more dynamic approach to sustainable investing would take real-time changes into account while also reducing the complexity of analyzing annual sustainability reports. This would make sustainable investing more scalable, increasing efficiency and reducing human error.

How NLP Can Help

Natural Language Processing can be used to analyze sustainability reports and news articles and extract important ESG-centric insights. This reduces the complexity of analyzing reports manually, and it makes the approach more dynamic by also tracking real-time changes reported in news articles.

Let’s take a look at an example:

We are proud to have reached 100 percent renewable electricity for Apple facilities, and carbon neutrality for Apple’s corporate emissions, including business travel and employee commute. We are embarking on a new goal to become carbon neutral for our entire carbon footprint by 2030.

—An excerpt from the 2020 Apple Sustainability Report [4]

Rather than reading and analyzing the report manually, an investor could have an NLP model perform downstream tasks such as text classification and sentiment analysis on it, which reduces the complexity of the analysis and makes the whole process more time- and resource-efficient. In this case, the NLP model would classify the excerpt as relating to “Climate Change” with a sentiment value of “positive”.
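
To make the shape of that output concrete, here is a minimal sketch of tagging an excerpt with a topic and a sentiment. It is only a keyword-lookup stand-in, not a trained NLP model and not ESG-BERT: the keyword lists, the "Labor Practices" topic, and the function names are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical keyword lists for illustration only; a real system would use a
# trained model (for example a fine-tuned BERT classifier), not hand-written words.
TOPIC_KEYWORDS = {
    "Climate Change": {"carbon", "renewable", "emissions", "footprint"},
    "Labor Practices": {"wages", "union", "workplace", "overtime"},
}
POSITIVE_WORDS = {"proud", "reached", "neutral", "neutrality", "renewable"}
NEGATIVE_WORDS = {"violation", "spill", "lawsuit", "penalty"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tag_excerpt(text):
    """Return a (topic, sentiment) pair based on simple keyword counts."""
    tokens = tokenize(text)
    # Score each topic by how many tokens appear in its keyword set.
    topic_scores = {
        topic: sum(token in words for token in tokens)
        for topic, words in TOPIC_KEYWORDS.items()
    }
    topic = max(topic_scores, key=topic_scores.get)
    # Sentiment is the sign of (positive hits - negative hits).
    balance = sum(t in POSITIVE_WORDS for t in tokens) - sum(
        t in NEGATIVE_WORDS for t in tokens
    )
    if balance > 0:
        sentiment = "positive"
    elif balance < 0:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    return topic, sentiment


excerpt = (
    "We are proud to have reached 100 percent renewable electricity for Apple "
    "facilities, and carbon neutrality for Apple's corporate emissions."
)
print(tag_excerpt(excerpt))  # ('Climate Change', 'positive')
```

A learned model replaces the hand-written keyword sets with representations learned from data, which is what lets it cope with vocabulary the rule author never anticipated.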

NLP empowers the investor to make a better and more efficient analysis of reports and articles, leading to a much more informed Sustainable Investment decision.

During my internship at Parabole.ai, I developed ESG-BERT by further pre-training Google’s “BERT” language model on large unstructured sustainability text corpora.

I first approached this problem with scikit-learn models and count vectorizers. Given the nature of this domain and its unique vocabulary, these traditional ML models did not yield satisfactory results. Deep learning models, on the other hand, require large amounts of structured text data, which we lacked in this case: unstructured text was abundant, but structured data was scarce.
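
For readers unfamiliar with count vectorizers, the idea can be sketched in a few lines of plain Python. This is a simplified re-implementation of the bag-of-words idea, not scikit-learn's `CountVectorizer` and not the code used in the project; the function names and example sentences are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fit_vocabulary(documents):
    """Map every distinct training token to a column index (sorted for stability)."""
    vocab = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(vocab)}


def transform(documents, vocabulary):
    """Turn each document into a vector of token counts over the fitted vocabulary."""
    rows = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        # Tokens that were never seen during fitting have no column and are dropped.
        rows.append([counts.get(tok, 0) for tok in vocabulary])
    return rows


train_docs = ["carbon emissions fell", "board diversity improved"]
vocabulary = fit_vocabulary(train_docs)
print(vocabulary)
# {'board': 0, 'carbon': 1, 'diversity': 2, 'emissions': 3, 'fell': 4, 'improved': 5}
print(transform(["scope 3 carbon emissions"], vocabulary))
# [[0, 1, 0, 1, 0, 0]]
```

Note how "scope" and "3" disappear from the second vector because they were not in the training vocabulary. Word order and context are lost as well, which is one plausible reason such features struggle in a domain with specialized terminology.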

Having tried these approaches, I turned to Google’s BERT, which is pre-trained on large unstructured text corpora and hence requires much less structured data for downstream NLP tasks such as text classification. This fit our case well. [5]

BERT (Bidirectional Encoder Representations from Transformers) is a technique developed by Google for pre-training Natural Language Processing models. The official BERT repo contains several pre-trained models that can be fine-tuned on downstream NLP tasks by adding an output layer. These models, however, are pre-trained on general English text corpora, so they handle domain-specific vocabulary poorly. [5]
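
Pre-training works on raw, unlabeled text because the training targets are generated from the text itself. The sketch below illustrates the masking step of BERT's masked language modelling objective as described in the original BERT paper: about 15% of tokens are selected, and of those 80% become `[MASK]`, 10% become a random token, and 10% are left unchanged. It is a simplification that works on whitespace tokens rather than WordPiece tokens and ignores special tokens; the function name and the tiny vocabulary are assumptions for illustration.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, vocabulary, mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for masked language modelling.

    labels[i] holds the original token at positions the model must predict,
    and None at positions that do not contribute to the training loss.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mask_prob:
            continue  # position not selected for prediction
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            masked[i] = MASK_TOKEN              # 80%: replace with [MASK]
        elif roll < 0.9:
            masked[i] = rng.choice(vocabulary)  # 10%: replace with a random token
        # remaining 10%: keep the original token in place
    return masked, labels


sentence = "we aim to become carbon neutral across our entire footprint by 2030".split()
toy_vocabulary = ["carbon", "neutral", "emissions", "renewable", "governance"]
masked, labels = mask_tokens(sentence, toy_vocabulary, seed=0)
print(masked)
print(labels)
```

Running the same procedure over sustainability text instead of general English text is, in essence, what "further pre-training" means here: the model is asked to recover domain terms from their context.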

Sustainable Investing as a domain has a unique vocabulary, which ESG-BERT is designed to capture. ESG-BERT was further pre-trained on unstructured text data, reaching accuracies of 100% and 98% on the Next Sentence Prediction and Masked Language Modelling tasks, respectively. Fine-tuning ESG-BERT for text classification yielded an F1 score of 0.90. For comparison, the general BERT (BERT-base) model scored 0.79 after fine-tuning, and the scikit-learn approach scored 0.67.
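
As a reminder of what the F1 score measures, the sketch below computes it from scratch as the harmonic mean of precision and recall, and then averages it over classes (macro averaging). The article does not say which averaging scheme produced the reported numbers, so macro averaging is an assumption here, and the labels and predictions are made-up toy data rather than results from ESG-BERT.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)  # true positives
    fp = sum(t != label and p == label for t, p in pairs)  # false positives
    fn = sum(t == label and p != label for t, p in pairs)  # false negatives
    if tp == 0:
        return 0.0  # no correct predictions for this class
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Toy data: four excerpts, two classes, one misclassified excerpt.
y_true = ["climate", "climate", "labor", "labor"]
y_pred = ["climate", "labor", "labor", "labor"]
print(round(f1_for_label(y_true, y_pred, "climate"), 3))  # 0.667
print(round(f1_for_label(y_true, y_pred, "labor"), 3))    # 0.8
print(round(macro_f1(y_true, y_pred), 3))                 # 0.733
```

Because F1 penalizes both missed excerpts and false alarms, it is a more informative comparison than plain accuracy when some ESG topics are much rarer than others.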

The applications of ESG-BERT extend well beyond text classification. It can be fine-tuned to perform various other downstream NLP tasks in the domain of Sustainable Investing.

How to Use ESG-BERT

The pre-trained, domain-specific ESG-BERT model can be downloaded from the project's GitHub repository. It can be fine-tuned to perform downstream NLP tasks such as sentiment analysis.

ESG-BERT was also fine-tuned to perform text classification on Sustainable Investing text data. The fine-tuned model can be downloaded and served as explained in the README of the GitHub repo.

Conclusion

This is a substantial step towards text mining in Sustainable Investing.

ESG-BERT can be used to make Sustainable Investing more accessible to investors. It makes Sustainability as a goal more attainable by bridging the gap between complex Sustainability data and investors. Its impact, however, goes beyond text mining. Sustainability reports are often hundreds of pages long and filled with ESG jargon that most people would not understand. This tool makes these reports more readable and accessible to everyone, thereby increasing the impact of Sustainable Investing. This moves us one step closer to a greener, safer, and more sustainable future.

In the near future, I will be publishing tutorials on how I further pre-trained BERT to create ESG-BERT and how I fine-tuned BERT using PyTorch, and I will discuss the other NLP approaches that use “count-vectorizers” and “bag-of-words” models.

Feel free to connect with me on LinkedIn and send me a message.

Translated from: https://towardsdatascience.com/nlp-meets-sustainable-investing-d0542b3c264b
