Research training, week 5: reproducing "Learn from Syntax: Improving Pair-wise Aspect and Opinion Terms Extraction with Ric"

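Finished the software course project defense on Tuesday and the student union meeting on Wednesday; next up are the computer organization lab and the algorithms course project deadline on Sunday.
Junior year is a little too full 😄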

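The main errors so far: transformers has no xBertTokenizer, and sklearn would not install.
For now I have switched to BertTokenizer as a substitute; I have not yet checked the official docs for the differences.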

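Today's goal is to get preprocess.py running first; I'll go through the other files one by one afterwards. Setting up this environment has really worn me out.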

--------------------------------------------------------- Afternoon, laptop almost out of battery ------------------------------------------

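I did not change the JDK configuration; I followed a blog post, the CoreNLP installation tutorial.
Chinese still throws an error, though.
Never mind, I do not need Chinese for now.
Here is the English test code and its output: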

from stanfordcorenlp import StanfordCoreNLP

# Path to the local CoreNLP distribution
nlp = StanfordCoreNLP(r'E:\SCI\SynFue_Demo\core_nlp\stanford-corenlp-latest\stanford-corenlp-4.3.1')

sentence = 'Guangdong University of Foreign Studies is located in Guangzhou.'
print('Tokenize:', nlp.word_tokenize(sentence))
print('Part of Speech:', nlp.pos_tag(sentence))
print('Named Entities:', nlp.ner(sentence))
print('Constituency Parsing:', nlp.parse(sentence))  # constituency (syntax) tree
print('Dependency Parsing:', nlp.dependency_parse(sentence))  # dependency parse
nlp.close()  # Do not forget to close! The backend server consumes a lot of memory

Output (I had earlier edited the code bundled in the package, so some odd lines are mixed in):

sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Tokenize: ['Guangdong', 'University', 'of', 'Foreign', 'Studies', 'is', 'located', 'in', 'Guangzhou', '.']
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27pos%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Part of Speech: [('Guangdong', 'NNP'), ('University', 'NNP'), ('of', 'IN'), ('Foreign', 'NNP'), ('Studies', 'NNPS'), ('is', 'VBZ'), ('located', 'VBN'), ('in', 'IN'), ('Guangzhou', 'NNP'), ('.', '.')]
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27ner%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Named Entities: [('Guangdong', 'ORGANIZATION'), ('University', 'ORGANIZATION'), ('of', 'ORGANIZATION'), ('Foreign', 'ORGANIZATION'), ('Studies', 'ORGANIZATION'), ('is', 'O'), ('located', 'O'), ('in', 'O'), ('Guangzhou', 'CITY'), ('.', 'O')]
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27pos%2Cparse%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Constituency Parsing: (ROOT
  (S
    (NP
      (NP (NNP Guangdong) (NNP University))
      (PP (IN of)
        (NP (NNP Foreign) (NNPS Studies))))
    (VP (VBZ is)
      (VP (VBN located)
        (PP (IN in)
          (NP (NNP Guangzhou)))))
    (. .)))
sa= ('::1', 9001, 0, 0)
/?properties=%7B%27annotators%27%3A+%27depparse%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=en
Dependency Parsing: [('ROOT', 0, 7), ('compound', 2, 1), ('nsubj:pass', 7, 2), ('case', 5, 3), ('compound', 5, 4), ('nmod', 2, 5), ('aux:pass', 7, 6), ('case', 9, 8), ('obl', 7, 9), ('punct', 7, 10)]
sa= ('::1', 9002, 0, 0)
/?properties=%7B%27annotators%27%3A+%27ssplit%2Ctokenize%27%2C+%27outputFormat%27%3A+%27json%27%7D&pipelineLanguage=zh
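
A small sketch to make the dependency triples easier to read. It assumes each triple is (relation, head index, dependent index), with 1-based token positions and 0 standing for ROOT, which is consistent with the output above. The helper name readable_edges is made up for illustration.

```python
# Tokens and dependency triples copied from the output above.
tokens = ['Guangdong', 'University', 'of', 'Foreign', 'Studies',
          'is', 'located', 'in', 'Guangzhou', '.']
deps = [('ROOT', 0, 7), ('compound', 2, 1), ('nsubj:pass', 7, 2),
        ('case', 5, 3), ('compound', 5, 4), ('nmod', 2, 5),
        ('aux:pass', 7, 6), ('case', 9, 8), ('obl', 7, 9), ('punct', 7, 10)]


def readable_edges(tokens, deps):
    """Replace token indices with words (assumes index 0 means ROOT)."""
    words = ['ROOT'] + tokens  # shift so that 1-based indices line up
    return [(rel, words[head], words[dep]) for rel, head, dep in deps]


edges = readable_edges(tokens, deps)
for rel, head, dep in edges:
    print(f"{rel:12s} {head} -> {dep}")
```

Read this way, the edge list shows that "located" is the root of the sentence and "University" is its passive subject.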

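CoreNLP usage documentation

23:00: ~~Well, the dependency edges won't come out, it throws an error:~~ Never mind, it seems to have been just a network connection problem 😂
I'll work on it tomorrow, planning to use this as a reference.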

------------ Divider: October 16 ------------

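How to run the sample

Read the README carefully!! Read the README carefully!! Read the README carefully!!

If the original project says to type the command on the command line, do not just hit Run in the IDE.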

python Synfue.py train --config configs/16res_train.conf
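
For context, a minimal sketch of how a script with a `train --config ...` interface might read its arguments with argparse. This is an assumption for illustration, not the actual code in Synfue.py; build_parser and the "eval" mode are hypothetical. With an interface like this, pressing Run without any arguments stops right away with a usage error, which is why the command has to be typed in full.

```python
import argparse


def build_parser():
    """Hypothetical CLI: a positional mode plus a required --config path."""
    parser = argparse.ArgumentParser(description="illustrative entry point")
    parser.add_argument("mode", choices=["train", "eval"], help="what to run")
    parser.add_argument("--config", required=True, help="path to a config file")
    return parser


# Simulate the command line from the post instead of reading sys.argv.
args = build_parser().parse_args(["train", "--config", "configs/16res_train.conf"])
print(args.mode, args.config)
```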

Here is the current conda list:

Name Version Build Channel
blas 1.0 mkl
boto3 1.18.21 pyhd3eb1b0_0
botocore 1.21.41 pyhd3eb1b0_1
brotlipy 0.7.0 py36h2bbff1b_1003
ca-certificates 2021.9.30 haa95532_1
certifi 2021.5.30 py36haa95532_0
cffi 1.14.6 py36h2bbff1b_0
charset-normalizer 2.0.4 pyhd3eb1b0_0
click 8.0.1 pyhd3eb1b0_0
colorama 0.4.4 pypi_0 pypi
cpuonly 1.0 0 pytorch
cryptography 3.4.7 py36h71e12ea_0
cython 0.29.24 pypi_0 pypi
freetype 2.10.4 hd328e21_0
icc_rt 2019.0.0 h0cc432a_1
idna 3.2 pyhd3eb1b0_0
importlib-metadata 4.8.1 py36haa95532_0
intel-openmp 2021.3.0 haa95532_3372
jmespath 0.10.0 pyhd3eb1b0_0
joblib 1.0.1 pyhd3eb1b0_0
jpeg 9d h2bbff1b_0
libpng 1.6.37 h2a8f88b_0
libtiff 4.2.0 hd0e1b90_0
lz4-c 1.9.3 h2bbff1b_1
mkl 2019.4 245
mkl-service 2.3.0 py36h196d8e1_0
mkl_fft 1.3.0 py36h46781fe_0
mkl_random 1.0.4 py36h343c172_0
ninja 1.10.2 h6d14046_1
nltk 3.6.3 pyhd3eb1b0_0
numpy 1.17.0 py36h19fb1c0_0
numpy-base 1.17.0 py36hc3f5095_0
olefile 0.46 py36_0
openssl 1.1.1l h2bbff1b_0
pillow 8.3.1 py36h4fa10fc_0
pip 21.2.2 py36haa95532_0
psutil 5.8.0 pypi_0 pypi
pycparser 2.20 py_2
pyopenssl 20.0.1 pyhd3eb1b0_1
pysocks 1.7.1 py36haa95532_0
python 3.6.13 h3758d61_0
python-dateutil 2.8.2 pyhd3eb1b0_0
pytorch 1.4.0 py3.6_cpu_0 [cpuonly] pytorch
pyyaml 5.4.1 py36h2bbff1b_1
regex 2021.8.3 py36h2bbff1b_0
requests 2.26.0 pyhd3eb1b0_0
s3transfer 0.5.0 pyhd3eb1b0_0
sacremoses 0.0.43 pyhd3eb1b0_0
scikit-learn 0.24.2 pypi_0 pypi
scipy 1.5.2 py36h9439919_0
sentencepiece 0.1.96 pypi_0 pypi
setuptools 58.0.4 py36haa95532_0
six 1.16.0 pyhd3eb1b0_0
sqlite 3.36.0 h2bbff1b_0
stanfordcorenlp 3.9.1.1 pypi_0 pypi
threadpoolctl 3.0.0 pypi_0 pypi
tk 8.6.11 h2bbff1b_0
torchvision 0.5.0 py36_cpu [cpuonly] pytorch
tqdm 4.62.2 pyhd3eb1b0_1
transformers 2.1.1 pyhd3eb1b0_0
typing_extensions 3.10.0.2 pyh06a4308_0
urllib3 1.26.7 pyhd3eb1b0_0
vc 14.2 h21ff451_1
vs2015_runtime 14.27.29016 h5e58377_2
wheel 0.37.0 pyhd3eb1b0_1
win_inet_pton 1.1.0 py36haa95532_0
wincertstore 0.2 py36h7fe50ca_0
xz 5.2.5 h62dcd97_0
yaml 0.2.5 he774522_0
zipp 3.6.0 pyhd3eb1b0_0
zlib 1.2.11 h62dcd97_4
zstd 1.4.9 h19a0ad4_0

To be safe, here is the pip list as well:

Package Version
boto3 1.18.21
botocore 1.21.41
brotlipy 0.7.0
certifi 2021.5.30
cffi 1.14.6
charset-normalizer 2.0.4
click 8.0.1
colorama 0.4.4
cryptography 3.4.7
Cython 0.29.24
idna 3.2
importlib-metadata 4.8.1
jmespath 0.10.0
joblib 1.0.1
mkl-fft 1.3.0
mkl-random 1.0.4
mkl-service 2.3.0
nltk 3.6.3
numpy 1.17.0
olefile 0.46
Pillow 8.3.1
pip 21.2.2
psutil 5.8.0
pycparser 2.20
pyOpenSSL 20.0.1
PySocks 1.7.1
python-dateutil 2.8.2
PyYAML 5.4.1
regex 2021.8.3
requests 2.26.0
s3transfer 0.5.0
sacremoses 0.0.43
scikit-learn 0.24.2
scipy 1.5.2
sentencepiece 0.1.96
setuptools 58.0.4
six 1.16.0
stanfordcorenlp 3.9.1.1
threadpoolctl 3.0.0
torch 1.4.0
torchvision 0.5.0
tqdm 4.62.2
transformers 2.1.1
typing-extensions 3.10.0.2
urllib3 1.26.7
wheel 0.37.0
win-inet-pton 1.1.0
wincertstore 0.2
zipp 3.6.0
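
To rebuild this environment later, the list can be turned into pinned requirements. A minimal sketch, assuming the two-column "name version" layout shown above; only a few of the packages are copied in here, and to_pins is a made-up helper name.

```python
# A few lines copied from the pip list above (same "name version" layout).
pip_list_text = """\
Package Version
numpy 1.17.0
scikit-learn 0.24.2
stanfordcorenlp 3.9.1.1
torch 1.4.0
transformers 2.1.1
"""


def to_pins(text):
    """Convert 'name version' lines into 'name==version' pins."""
    pins = []
    for line in text.splitlines():
        parts = line.split()
        # Skip the header row and anything that is not exactly two columns.
        if len(parts) != 2 or parts[0] == "Package":
            continue
        name, version = parts
        pins.append(f"{name}=={version}")
    return pins


pins = to_pins(pip_list_text)
print("\n".join(pins))
```

Saving the printed lines to a requirements file gives pip something to install from on another machine.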

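Bugs encountered:

1. module 'transformers' has no attribute 'get_linear_schedule_with_warmup'
Testing showed that downgrading to version 2.1.1 did not help at all.
The fix that finally worked was to swap one function name for the other, as in the snippet that follows. My guess is that this is left over from an API change between versions (I am not sure why the function was renamed 😣).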

        # Original call, which fails with the installed transformers 2.1.1:
        # scheduler = transformers.get_linear_schedule_with_warmup(optimizer,
        #                                                          num_warmup_steps=args.lr_warmup * updates_total,
        #                                                          num_training_steps=updates_total)
        # Replacement that works here:
        scheduler = transformers.WarmupLinearSchedule(optimizer,
                                                      warmup_steps=args.lr_warmup * updates_total,
                                                      t_total=updates_total)
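
Instead of editing the call by hand for each installed version, a small wrapper could pick whichever name is available. This is only a sketch: make_linear_warmup_scheduler is a hypothetical helper, and the two call signatures are taken from the snippet above. So that it runs without transformers installed, the demo uses stand-in objects in place of the real module.

```python
from types import SimpleNamespace


def make_linear_warmup_scheduler(tf_module, optimizer, warmup_steps, total_steps):
    """Call the function-style API if present, else the older class-style one."""
    if hasattr(tf_module, "get_linear_schedule_with_warmup"):
        return tf_module.get_linear_schedule_with_warmup(
            optimizer,
            num_warmup_steps=warmup_steps,
            num_training_steps=total_steps)
    return tf_module.WarmupLinearSchedule(
        optimizer, warmup_steps=warmup_steps, t_total=total_steps)


# Stand-ins for the two API variants (NOT the real transformers module).
old_api = SimpleNamespace(
    WarmupLinearSchedule=lambda opt, warmup_steps, t_total:
        ("old", warmup_steps, t_total))
new_api = SimpleNamespace(
    get_linear_schedule_with_warmup=lambda opt, num_warmup_steps, num_training_steps:
        ("new", num_warmup_steps, num_training_steps))

result_old = make_linear_warmup_scheduler(old_api, None, 10, 100)
result_new = make_linear_warmup_scheduler(new_api, None, 10, 100)
print(result_old, result_new)
```

In the training script the first argument would be the imported transformers module itself.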

2. The command-line Terminal would not open inside PyCharm
See: how to fix the Terminal

3. Illegal file name
The fix is replace(':', '_')
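
A sketch of where such a name could come from. I am assuming here (the notes above do not say) that the output path is built from a timestamp; Windows does not allow ':' in file names, so the colons have to go. safe_run_label is a hypothetical helper name.

```python
import datetime


def safe_run_label(moment):
    """Turn a datetime into a label without spaces or colons."""
    return str(moment).replace(" ", "_").replace(":", "_")


# Fixed example time so the result is reproducible.
label = safe_run_label(datetime.datetime(2021, 10, 16, 23, 0, 5))
print(label)  # 2021-10-16_23_00_05
```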

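After that, training finally runs~~ Thanks yt, you are the GOAT!!!
What does it feel like when a problem I could not solve in a week of setup gets fixed by a teammate in two sentences 🐸, I bow down.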

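4. The final bottleneck: not enough memory on my computer 🙄
I have not learned to use a server yet; I'll see whether I can solve this next time (tonight I still have to wrap up the algorithms course project).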
