Introduction
![Image for post](https://i-blog.csdnimg.cn/blog_migrate/ee004321a2c14b1069e20b0a1fb3a52c.png)
Plato’s Republic, which raises questions that still dominate Western political philosophy, is fundamentally a dialogue. Plato tries to conceptualize the ideal society through philosophical discussion, and this taste for spirited debate is especially evident in Books I and II. Early in the Republic, Socrates refutes the definitions of justice proposed by Cephalus, Polemarchus, and Thrasymachus. Since these arguments are often accompanied by negative emotions, I thought sentiment analysis could help contextualize the main ideas of the Republic. Using the BERT-based sentiment classification model provided by Huggingface’s Transformers package, I tried to extract the sentences with negative sentiment and visualize their word frequencies with the Scattertext package.
Data Preprocessing
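import nltk
import pandas as pd
from transformers import pipeline
# Default sentiment-analysis pipeline: returns a label and a confidence score for each input
nlp_sentiment = pipeline("sentiment-analysis")
# Read Glaucon's part of the dialogue
with open("/mnt/d/nemo/philosophy_analysis/Plato/Plato's Republic/Plato's Republic - Book I - II/data/Dialogue with Glaucon.txt", "r", encoding='UTF8') as f:
    full_text = f.read()
# Treat '?', '!' and '.' as sentence boundaries, drop fragments of 20 characters or fewer,
# and re-join each sentence's tokens with single spaces
full_text = [' '.join(nltk.word_tokenize(s)) for s in full_text.replace('?', '.').replace('!', '.').split('.') if len(s) > 20]
results = nlp_sentiment(full_text)
# Keep NEGATIVE only when the classifier is confident; everything else is labelled NEUTRAL
sentiment = []
for s in results:
    if s['label'] == 'NEGATIVE' and s['score'] > 0.9:
        sentiment.append('NEGATIVE')
    else:
        sentiment.append('NEUTRAL')
speaker = ['Glaucon'] * len(sentiment)
glaucon_df = pd.DataFrame({'text': full_text, 'sentiment': sentiment, 'speaker': speaker})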
Books I and II of the Republic, a series of arguments about the essence of justice, make an excellent subject for sentiment analysis. First, I prepared a dataset of texts from the Republic. Then, using the BERT sentiment classifier, I determined whether each sentence carried negative sentiment: a sentence is labelled NEGATIVE only when the classifier predicts NEGATIVE with a score above 0.9, and NEUTRAL otherwise. The result is a pandas dataframe whose sentiment column contains either NEGATIVE or NEUTRAL.
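To make the labelling rule easier to inspect on its own, here is a minimal sketch that separates it from the model call. The function name `label_sentences`, its `threshold` parameter, and the sample scores are my own illustrations and are not part of the original script; the only thing taken from the code above is the rule itself (a NEGATIVE prediction with a score above 0.9).

```python
from typing import Dict, List


def label_sentences(results: List[Dict], threshold: float = 0.9) -> List[str]:
    """Map raw classifier outputs to NEGATIVE / NEUTRAL labels.

    Each item in `results` is expected to be a dict with a 'label' and a
    'score' key, which is the shape the sentiment pipeline returns.
    """
    labels = []
    for r in results:
        confident_negative = r["label"] == "NEGATIVE" and r["score"] > threshold
        labels.append("NEGATIVE" if confident_negative else "NEUTRAL")
    return labels


# Hand-written stand-ins for pipeline output (illustrative values, not real model scores)
sample_results = [
    {"label": "NEGATIVE", "score": 0.98},  # confident negative -> NEGATIVE
    {"label": "NEGATIVE", "score": 0.62},  # weak negative      -> NEUTRAL
    {"label": "POSITIVE", "score": 0.99},  # positive           -> NEUTRAL
]
print(label_sentences(sample_results))  # ['NEGATIVE', 'NEUTRAL', 'NEUTRAL']
```

Note that under this rule NEUTRAL really means "not confidently negative": positive sentences and low-confidence negative ones end up in the same bucket.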
![Image for post](https://i-blog.csdnimg.cn/blog_migrate/75dc8442019e37348aec137aeac168c2.png)
Sentiment Analysis
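Before handing the dataframe to Scattertext, it may help to see what kind of comparison is being visualized: how often each word occurs in the NEGATIVE sentences versus the NEUTRAL ones. The sketch below counts words per category with the standard library only. It is not how Scattertext works internally (Scattertext tokenizes with spaCy and does its own scaling and plotting); the function `term_counts_by_category`, the simple regex tokenizer, and the toy sentences are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Tuple


def term_counts_by_category(rows: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Count lowercase word occurrences separately for each category."""
    counts: Dict[str, Counter] = {}
    for text, category in rows:
        # Crude tokenizer: runs of letters and apostrophes
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.setdefault(category, Counter()).update(tokens)
    return counts


# Made-up toy rows in the form (text, sentiment); not taken from the real dataset
toy_rows = [
    ("Justice is nothing but the advantage of the stronger", "NEGATIVE"),
    ("Injustice is never more profitable than justice", "NEGATIVE"),
    ("Justice is the excellence of the soul", "NEUTRAL"),
]
counts = term_counts_by_category(toy_rows)
print(counts["NEGATIVE"]["justice"], counts["NEUTRAL"]["justice"])  # 2 1
```

Words that are frequent in one category and rare in the other are the ones that stand out in this kind of comparison.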
import spacy
import scattertext as st
import pandas as pd
dialogue_df = pd.read_csv('/mnt/d/nemo/philosophy_analysis/Plato/Plato\'s Republic/Plato\'s Republic - Book I - II/data/dialouge_data.csv', index_col=0)
nlp = spacy.load('en')  # 'en' is a spaCy 2.x shortcut; on spaCy 3+ load a full model name such as 'en_core_web_sm'
corpus = st.CorpusFromPandas(dialogue_df, category_col='sentiment', text_col='text', nlp=nlp).build()
html = st.produce_scattertext_explorer(corpus, category='NEGATI