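Fixing StanfordCoreNLP's "cannot find java" error when running under PyCharm remote debugging
I have been learning NLP recently and had set up remote debugging and running in PyCharm. Using stanfordcorenlp then failed with FileNotFoundError: [Errno 2] No such file or directory: 'java': 'java'. I assumed that, as in my previous post "Fixing pyhanlp's FileNotFoundError: [Errno 2] No such file or directory: '/usr/lib/jvm'", adding an environment variable would be enough, but it made no difference. I could not find a similar error online either, so perhaps my JDK installation is just unusual.
Error details: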
ssh://yl@IP:PORT/home/USER/anaconda3/envs/tensorflow/bin/python -u /home/yl/python/nlp/learing/test01.py
Traceback (most recent call last):
  File "/home/yl/python/nlp/learing/test01.py", line 7, in <module>
    snlp = StanfordCoreNLP(os.sep + 'opt' + os.sep + "nlp" + os.sep + 'stanford-corenlp', lang='zh')
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/site-packages/stanfordcorenlp/corenlp.py", line 46, in __init__
    if not subprocess.call(['java', '-version'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) == 0:
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py", line 323, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py", line 774, in __init__
    restore_signals, start_new_session)
  File "/home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py", line 1522, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: 'java': 'java'
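Root cause analysis:
The usual approach: with no answer to be found online, read the source and work it out. Open /home/yl/anaconda3/envs/tensorflow/lib/python3.7/subprocess.py and find the _execute_child() function; around line 1522 is the following code: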
if issubclass(child_exception_type, OSError) and hex_errno:
    errno_num = int(hex_errno, 16)
    child_exec_never_called = (err_msg == "noexec")
    if child_exec_never_called:
        err_msg = ""
        # The error must be from chdir(cwd).
        err_filename = cwd
    else:
        err_filename = orig_executable
    if errno_num != 0:
        err_msg = os.strerror(errno_num)
        if errno_num == errno.ENOENT:
            err_msg += ': ' + repr(err_filename)
    raise child_exception_type(errno_num, err_msg, err_filename)
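As an aside, the excerpt also explains why 'java' shows up twice in the message. The following is a minimal sketch that rebuilds the exception by hand, assuming the exec failed with ENOENT; build_child_error is a hypothetical helper written for illustration, not part of subprocess.

```python
import errno
import os


def build_child_error(orig_executable):
    """Rebuild the exception the way the excerpt does for a failed exec (ENOENT)."""
    errno_num = errno.ENOENT
    err_msg = os.strerror(errno_num)
    # ENOENT branch: the executable name is appended to the message text ...
    err_msg += ': ' + repr(orig_executable)
    # ... and is passed again as the filename, which OSError also prints.
    # OSError with errno ENOENT is created as its subclass FileNotFoundError.
    return OSError(errno_num, err_msg, orig_executable)


exc = build_child_error('java')
print(type(exc).__name__)
print(exc)
```

The name appears once because the ENOENT branch appends it to err_msg, and once more because it is also passed as the filename argument.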
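The _execute_child() excerpt builds the error message from errno_num when it is non-zero. Tracing that value back gives errno_num -> hex_errno -> errpipe_data -> errpipe_read, which is filled in when self.pid = _posixsubprocess.fork_exec() runs (around line 1452). Judging by that function's parameter names, executable_list and env_list are what decide whether java can be located. So I added print calls around the place where executable_list is built (lines 1436 to 1442) to watch how it changes: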
executable = os.fsencode(executable)
print('executable: ', executable)  # added: the initial value passed in from the caller
if os.path.dirname(executable):
    executable_list = (executable,)
else:
    # This matches the behavior of os._execvpe().
    print('env: ', env)  # added: the env argument
    print('get_exec_path of env: ', os.get_exec_path(env))  # added: search directories derived from env, presumably the PATH variable
    executable_list = tuple(
        os.path.join(os.fsencode(dir), executable)
        for dir in os.get_exec_path(env))
print('executable_list: ', executable_list)  # added: the final list of candidate paths
Output when importing stanfordcorenlp and constructing StanfordCoreNLP from PyCharm:
executable: b'java'
env: None
get_exec_path of env: ['/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/usr/games', '/usr/local/games']
executable_list: (b'/usr/local/sbin/java', b'/usr/local/bin/java', b'/usr/sbin/java', b'/usr/bin/java', b'/sbin/java', b'/bin/java', b'/usr/games/java', b'/usr/local/games/java')
Traceback (most recent call last):
Output when run from a Linux terminal:
yl@ylhome [20:24:07] ~$ /home/yl/anaconda3/envs/tensorflow/bin/python
Python 3.7.3 (default, Mar 27 2019, 22:11:17)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> from stanfordcorenlp import StanfordCoreNLP
>>> stanford_nlp = StanfordCoreNLP('/opt/nlp/stanford-corenlp', lang='zh')
executable: b'java'
env: None
get_exec_path of env: ['/home/yl/.local/bin', '/home/yl/bin', '/usr/local/cuda/bin', '/home/yl/anaconda3/bin', '/usr/local/java/latest/bin', '/usr/local/cuda/bin', '/usr/local/sbin', '/usr/local/bin', '/usr/sbin', '/usr/bin', '/sbin', '/bin', '/usr/games', '/usr/local/games', '/snap/bin']
executable_list: (b'/home/yl/.local/bin/java', b'/home/yl/bin/java', b'/usr/local/cuda/bin/java', b'/home/yl/anaconda3/bin/java', b'/usr/local/java/latest/bin/java', b'/usr/local/cuda/bin/java', b'/usr/local/sbin/java', b'/usr/local/bin/java', b'/usr/sbin/java', b'/usr/bin/java', b'/sbin/java', b'/bin/java', b'/usr/games/java', b'/usr/local/games/java', b'/snap/bin/java')
executable: b'/bin/sh'
>>>
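The PATH seen from the command line is correct, whereas a PyCharm remote run does not pick up the PATH entries the user added; in the PyCharm output, /usr/local/java/latest/bin is missing from the search list. A convenient fix is therefore to link java into a location that the PyCharm run can see, such as /usr/local/bin.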
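The same comparison can be made without editing subprocess.py. Below is a minimal sketch that mirrors the lookup logic quoted earlier; candidate_paths is a hypothetical helper of my own, and it returns str paths rather than the bytes that subprocess uses internally. Running it once from PyCharm and once from a terminal should show the same difference in search directories.

```python
import os
import shutil


def candidate_paths(executable, env=None):
    """List the paths that would be tried for a command name, in search order."""
    if os.path.dirname(executable):
        # A name that already contains a directory is used as-is.
        return [executable]
    # env=None makes os.get_exec_path() fall back to os.environ.
    return [os.path.join(d, executable) for d in os.get_exec_path(env)]


for path in candidate_paths('java'):
    print(path, os.path.exists(path))
# shutil.which() performs a comparable PATH search and returns None on failure.
print('shutil.which:', shutil.which('java'))
```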
I was short on time, so I have not dug into the underlying cause; the problem is solved for now and I will revisit it later.
Solution:
Link the java command into one of the system's default executable directories, such as /usr/bin or /usr/local/bin. My setup:
sudo ln -sf /usr/local/java/latest/bin/java /usr/local/bin/java
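If creating a symlink with sudo is not an option, a possible alternative is to extend PATH inside the script before StanfordCoreNLP is constructed, since the debug output shows env is None and the search directories then come from os.environ. This is only a sketch that I have not verified in the PyCharm setup described here: prepend_to_path is a hypothetical helper, and JAVA_BIN assumes the JDK directory from my terminal PATH.

```python
import os


def prepend_to_path(directory, environ=os.environ):
    """Put directory at the front of PATH unless it is already listed."""
    entries = [p for p in environ.get('PATH', '').split(os.pathsep) if p]
    if directory not in entries:
        entries.insert(0, directory)
    environ['PATH'] = os.pathsep.join(entries)
    return environ['PATH']


# Assumed JDK location, taken from the terminal PATH shown earlier.
JAVA_BIN = '/usr/local/java/latest/bin'
prepend_to_path(JAVA_BIN)
# Construct StanfordCoreNLP only after this point, so that its
# subprocess call for 'java' searches the updated PATH.
```

Compared with the symlink, this keeps the change local to one script, at the cost of hard-coding the JDK path in the code.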
In practice:
Running the stanfordcorenlp package from PyCharm now works:
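from stanfordcorenlp import StanfordCoreNLP

snlp = StanfordCoreNLP('/opt/nlp/stanford-corenlp', lang='zh')
text = '今天晚上吃火锅啊!'  # sample sentence: "Let's have hot pot tonight!"
print(snlp.ner(text))
snlp.close()  # shut down the background Java server when done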
Output: