You need to install nltk first. After that, the stopwords corpus also has to be downloaded; it goes under the corpora folder of nltk_data.
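The download step can be sketched roughly as below. The helper name ensure_resource is made up for this note; nltk.data.find and nltk.download are the regular NLTK calls. It assumes word_tokenize needs the punkt tokenizer data in addition to the stopwords corpus. Some newer NLTK releases may ask for punkt_tab instead; the LookupError message names the resource that is missing.

```python
import nltk


def ensure_resource(resource_path, package):
    """Download an NLTK data package only if it is not installed yet.

    resource_path is the location inside nltk_data (e.g. "corpora/stopwords"),
    package is the name passed to nltk.download (e.g. "stopwords").
    Hypothetical helper, not part of NLTK itself.
    """
    try:
        # Raises LookupError when the resource is not in any nltk_data folder
        nltk.data.find(resource_path)
        return True
    except LookupError:
        # Fetches the package into nltk_data (needs a network connection)
        return nltk.download(package)
```

Calling ensure_resource("corpora/stopwords", "stopwords") and ensure_resource("tokenizers/punkt", "punkt") once before running the script should be enough in most setups.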
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
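set(stopwords.words('english'))  # no visible effect in a script; it only displays the set in an interactive session
text = """Removal of amoxicillin from aqueous solution using sludge-based activated carbon modified ."""  # insert the text that needs stop-word filtering here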
stop_words=set(stopwords.words('english'))
word_tokens=word_tokenize(text)
filtered_sentence = []
for w in word_tokens:
    if w not in stop_words:
        filtered_sentence.append(w)
print("\n\nFiltered Sentence \n\n")
print(" ".join(filtered_sentence))
The output is:
Filtered Sentence
Removal amoxicillin aqueous solution using sludge-based activated carbon modified walnut shell nano-titanium dioxide . Dewatered municipal sludge used raw material prepare activated carbon ( SAC ) , SAC modified walnut shell nano-titanium dioxide ( MSAC ) . The results showed MSAC higher specific surface area ( S-BET ) ( 279.147 ( 2 ) /g ) total pore volume ( V-T ) ( 0.324 cm ( 3 ) /g ) SAC .
Process finished with exit code 0
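In the output above, "The" and the punctuation marks are still there. The NLTK English stop-word list is all lowercase and the check "w not in stop_words" is case-sensitive, so "The" does not match "the". Below is a minimal sketch of one way to handle this, assuming you want a case-insensitive comparison and want pure-punctuation tokens dropped. The function name filter_tokens and the small demo stop-word set are made up for this example so that it runs without any NLTK data.

```python
import string


def filter_tokens(tokens, stop_words, drop_punctuation=True):
    """Keep tokens that are not stop words, comparing in lowercase."""
    kept = []
    for token in tokens:
        # Lowercase only for the comparison; the original token is kept as-is
        if token.lower() in stop_words:
            continue
        # Skip tokens made up entirely of punctuation, such as "(" or "."
        if drop_punctuation and all(ch in string.punctuation for ch in token):
            continue
        kept.append(token)
    return kept


# Tiny hand-made stop-word set, for illustration only
demo_stop_words = {"the", "of", "from", "was", "as", "to", "and", "with"}
demo_tokens = ["The", "results", "showed", "MSAC", "(", "SAC", ")", "."]
print(" ".join(filter_tokens(demo_tokens, demo_stop_words)))
```

In the script above you would pass word_tokens and set(stopwords.words('english')) instead of the demo values. Hyphenated words such as "sludge-based" are kept, because only tokens that consist of punctuation alone are dropped.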
I'm a complete beginner myself, a humanities master's student...
I'm just keeping notes on my own processing workflow.