BERT-Based Sentiment Analysis and Visualization of Plato's Republic, Books I-II

Introduction

Plato’s Republic, which raises questions that still dominate Western political philosophy, is fundamentally a dialogue. Plato tries to conceptualize the ideal society through philosophical discussion, and this taste for spirited debate is especially visible in Books I-II. Early in the Republic, Socrates refutes the definitions of justice proposed by Cephalus, Polemarchus, and Thrasymachus. Since negative emotions often accompany these arguments, I thought sentiment analysis could help put the main ideas of the Republic in context. Using the BERT-based sentiment classification model provided by Huggingface’s Transformers package, I tried to extract the sentences with negative sentiment and visualize their word frequencies with the Scattertext package.

Data Preprocessing

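import nltk
import pandas as pd
from transformers import pipeline

# nltk.word_tokenize needs the NLTK tokenizer data, e.g. nltk.download('punkt')

# Default sentiment-analysis pipeline; returns a label and a score per input text
nlp_sentiment = pipeline("sentiment-analysis")

# Read Glaucon's dialogue and close the file afterwards
with open("/mnt/d/nemo/philosophy_analysis/Plato/Plato's Republic/Plato's Republic - Book I - II/data/Dialogue with Glaucon.txt", "r", encoding='UTF8') as f:
    full_text = f.read()

# Treat '?', '!' and '.' as sentence boundaries, drop fragments of 20 characters
# or fewer, and normalize spacing by re-joining the NLTK word tokens
fragments = full_text.replace('?', '.').replace('!', '.').split('.')
full_text = [' '.join(nltk.word_tokenize(s)) for s in fragments if len(s) > 20]

# Label a sentence NEGATIVE only when the model is confident (score > 0.9);
# everything else, including POSITIVE predictions, becomes NEUTRAL
results = nlp_sentiment(full_text)
sentiment = []
for s in results:
    if s['label'] == 'NEGATIVE' and s['score'] > 0.9:
        sentiment.append('NEGATIVE')
    else:
        sentiment.append('NEUTRAL')
speaker = ['Glaucon'] * len(sentiment)
glaucon_df = pd.DataFrame({'text': full_text, 'sentiment': sentiment, 'speaker': speaker})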

Books I-II of the Republic, a series of arguments about the essence of justice, are a natural subject for sentiment analysis. First, I prepared a dataset of texts from the Republic by splitting each dialogue into sentences. Then I ran every sentence through the BERT sentiment classifier and labeled it NEGATIVE only when the model predicted NEGATIVE with a score above 0.9; every other sentence was labeled NEUTRAL. The result is a pandas dataframe with text, sentiment, and speaker columns, where the sentiment column contains either NEGATIVE or NEUTRAL.
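The splitting and labeling rules can be tried out without downloading a model. The block below is a minimal sketch, not the original script: the helper names `split_sentences` and `label_sentiment`, the parameters `min_chars` and `threshold`, and the `mock_results` list are all made up for illustration, and the splitter is a simplified regex variant that skips the NLTK re-tokenization step.

```python
import re


def split_sentences(text, min_chars=20):
    """Split on '.', '?' or '!' and keep fragments longer than min_chars.

    Simplified stand-in for the replace/split chain in the listing above.
    """
    fragments = re.split(r"[.?!]", text)
    return [f.strip() for f in fragments if len(f) > min_chars]


def label_sentiment(results, threshold=0.9):
    """Map classifier outputs to NEGATIVE or NEUTRAL.

    `results` is assumed to be a list of dicts with 'label' and 'score' keys,
    the shape the sentiment-analysis pipeline returns for a list of texts.
    """
    return [
        "NEGATIVE" if r["label"] == "NEGATIVE" and r["score"] > threshold else "NEUTRAL"
        for r in results
    ]


# Hypothetical classifier outputs, used here instead of a real model call
mock_results = [
    {"label": "NEGATIVE", "score": 0.98},  # confident negative -> NEGATIVE
    {"label": "NEGATIVE", "score": 0.62},  # weak negative -> NEUTRAL
    {"label": "POSITIVE", "score": 0.99},  # positive -> NEUTRAL
]
labels = label_sentiment(mock_results)
```

Note that NEUTRAL here is a catch-all: it covers both low-confidence negatives and every positive prediction, so the resulting split is "confidently negative" versus "everything else".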

Sentiment Analysis

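import spacy
import scattertext as st
import pandas as pd

# Load the combined dialogue data (text, sentiment, and speaker columns)
dialogue_df = pd.read_csv('/mnt/d/nemo/philosophy_analysis/Plato/Plato\'s Republic/Plato\'s Republic - Book I - II/data/dialouge_data.csv', index_col=0)

# The 'en' shortcut only works in spaCy 2.x; the full model name works in 2.x and 3.x
# (install it with: python -m spacy download en_core_web_sm)
nlp = spacy.load('en_core_web_sm')

# Build a Scattertext corpus that uses the sentiment label as the category
corpus = st.CorpusFromPandas(dialogue_df, category_col='sentiment', text_col='text', nlp=nlp).build()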
html = st.produce_scattertext_explorer(corpus, category='NEGATI