Python: sentiment analysis of financial reports

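This post shows how to run sentiment analysis on financial reports with Python. A sentiment lexicon is built by combining a stock market lexicon with the Loughran-McDonald positive and negative word lists. However, the model handles the context of numbers poorly: even when the figures change, the sentiment score stays negative. The author is looking for ways to improve the model so that it better understands what the numbers mean.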

The code that adds the words is as follows:

import csv

import pandas as pd
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Assumption: `sia` is never defined in the original snippet. NLTK's VADER
# analyzer is used here because it exposes `.lexicon` and `.polarity_scores`.
sia = SentimentIntensityAnalyzer()

# stock market lexicon
stock_lex = pd.read_csv('C:/Users/ddutta070819/Downloads/EWS/StockSentimentTrading-master/lexicon_data/stock_lex.csv')
stock_lex['sentiment'] = (stock_lex['Aff_Score'] + stock_lex['Neg_Score']) / 2
stock_lex = dict(zip(stock_lex.Item, stock_lex.sentiment))
# keep single-word entries only
stock_lex = {k: v for k, v in stock_lex.items() if len(k.split(' ')) == 1}

# rescale the scores to the range [-4, 4]
max_score = max(stock_lex.values())
min_score = min(stock_lex.values())
stock_lex_scaled = {}
for k, v in stock_lex.items():
    if v > 0:
        stock_lex_scaled[k] = v / max_score * 4
    else:
        stock_lex_scaled[k] = v / min_score * -4

# Loughran and McDonald
positive = []
with open('C:/Users/ddutta070819/Downloads/EWS/StockSentimentTrading-master/lexicon_data//lm_positive.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        positive.append(row[0].strip())

negative = []
with open('C:/Users/ddutta070819/Downloads/EWS/StockSentimentTrading-master/lexicon_data//lm_negative.csv', 'r') as f:
    reader = csv.reader(f)
    for row in reader:
        # multi-word entries are split and each word is added on its own
        entry = row[0].strip().split(" ")
        if len(entry) > 1:
            negative.extend(entry)
        else:
            negative.append(entry[0])

# later updates override earlier ones, so VADER's own entries take precedence
final_lex = {}
final_lex.update({word: 2.0 for word in positive})
final_lex.update({word: -2.0 for word in negative})
final_lex.update(stock_lex_scaled)
final_lex.update(sia.lexicon)
sia.lexicon = final_lex

Although the overall results improved, the model does not seem to understand the numbers. For example:

sia.polarity_scores(
    'Royal Dutch Shell plc announced earnings results for the second quarter ended June 30, 2019. '
    'For the second quarter, the company announced total revenue was USD 91,838 million compared to USD 99,268 million a year '
    'ago. Net income was USD 2,998 million compared to USD 6,024 million a year ago. Basic earnings per share was USD 0.37 '
    'compared to USD 0.72 a year ago. For the half year, total revenue was USD 177,499 million compared to USD 190,382 million '
    'a year ago. Net income was USD 8,999 million compared to USD 11,923 million a year ago. Basic earnings per share was '
    'USD 1.11 compared to USD 1.44 a year ago. Diluted earnings per share was USD 1.1 compared to USD 1.42 a year ago.'
)

-0.81

That is absolutely correct. However, even when I change the numbers:

sia.polarity_scores(
    'Royal Dutch Shell plc announced earnings results for the second quarter ended June 30, 2019. '
    'For the second quarter, the company announced total revenue was USD 91,838 million compared to USD 69,268 million a year '
    'ago. Net income was USD 2,998 million compared to USD 1,024 million a year ago. Basic earnings per share was USD 0.37 '
    'compared to USD 0.17 a year ago. For the half year, total revenue was USD 177,499 million compared to USD 150,382 million '
    'a year ago. Net income was USD 8,999 million compared to USD 6,923 million a year ago. Basic earnings per share was '
    'USD 1.11 compared to USD 1.04 a year ago. Diluted earnings per share was USD 1.1 compared to USD 1.02 a year ago.'
)

-0.81

The sentiment score returned is still negative.
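A plausible explanation for the identical scores, sketched with a toy scorer rather than the real analyzer: lexicon-based methods look each token up in a word-to-valence dictionary, and tokens that are not in the dictionary (figures such as 2,998 or 6,024) contribute nothing. The names `toy_lexicon` and `toy_score` are hypothetical and the valences are made up for illustration; this is not VADER's actual scoring rule.

```python
# Toy lexicon scorer (hypothetical; not VADER's real algorithm or valences).
toy_lexicon = {"income": 0.5, "compared": -0.3}


def toy_score(text):
    """Sum the valence of every token found in the lexicon; unknown tokens count as 0."""
    return sum(toy_lexicon.get(tok.strip(".,").lower(), 0.0) for tok in text.split())


lower = "Net income was USD 2,998 million compared to USD 6,024 million a year ago."
higher = "Net income was USD 2,998 million compared to USD 1,024 million a year ago."

# Only the figures differ, and figures are not in the lexicon, so the scores match.
print(toy_score(lower) == toy_score(higher))  # True
```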

Is there a way to help the model understand these numbers based on the context of the text?
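One possible direction, offered only as a rough sketch: pull the "X was USD A compared to USD B" pairs out with a regular expression and derive a separate numeric signal from the direction of change, which could then be weighed against the lexicon score. The `METRICS` list, the pattern, and the helper names `numeric_changes` and `direction_score` are all assumptions made for illustration; they only cover the sentence shape used in the example above and are not part of any library.

```python
import re

# Hypothetical list of metrics to look for; extend it for other report wordings.
METRICS = ("total revenue", "net income", "basic earnings per share",
           "diluted earnings per share")

NUMBER = r"\d[\d,]*(?:\.\d+)?"

# Matches "<metric> was USD <current> [million] compared to USD <previous>".
PATTERN = re.compile(
    r"(?P<metric>" + "|".join(METRICS) + r")\s+was\s+USD\s+(?P<current>" + NUMBER + r")"
    r"(?:\s+million)?\s+compared\s+to\s+USD\s+(?P<previous>" + NUMBER + r")",
    re.IGNORECASE,
)


def to_number(figure):
    """Convert a figure such as '91,838' or '0.37' to a float."""
    return float(figure.replace(",", ""))


def numeric_changes(text):
    """Return (metric, relative change) pairs; a positive change means the figure rose."""
    changes = []
    for match in PATTERN.finditer(text):
        current = to_number(match.group("current"))
        previous = to_number(match.group("previous"))
        if previous != 0:  # avoid dividing by zero
            changes.append((match.group("metric"), (current - previous) / previous))
    return changes


def direction_score(text):
    """Average sign of the changes, in [-1, 1]; 0.0 if nothing matched."""
    changes = numeric_changes(text)
    if not changes:
        return 0.0
    signs = [(change > 0) - (change < 0) for _, change in changes]
    return sum(signs) / len(signs)


declining = "Net income was USD 2,998 million compared to USD 6,024 million a year ago."
improving = "Net income was USD 2,998 million compared to USD 1,024 million a year ago."
print(direction_score(declining), direction_score(improving))  # -1.0 1.0
```

How to combine this signal with the output of `sia.polarity_scores` (for example a weighted average) is a design choice left open here. Note also that a rising figure is not always good news (costs or losses, for instance), so each metric would need its own polarity in a real system.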
