Recently, while uploading files with the Python hdfs module, I ran into an error.
Python 3 code:
from hdfs.client import Client

class Hdfs(object):
    """Thin wrapper around the hdfs WebHDFS client."""

    def __init__(self, username):
        super(Hdfs, self).__init__()
        # Note: the base Client takes no user argument, so username is not
        # actually passed on (hdfs.InsecureClient accepts user=... instead).
        self.client = Client("http://192.168.125.130:50070")

    # Upload a local file to HDFS
    def uploadFile(self, hdfsPath, localPath):
        self.client.upload(hdfsPath, localPath, cleanup=True)

    # List the files under an HDFS directory
    def listDir(self, hdfsPath):
        return self.client.list(hdfsPath, status=True)

if __name__ == "__main__":
    hdfsClient = Hdfs(username='root')
    hdfsClient.uploadFile('/yunpan/Ricky', 'settings.py')
    print(hdfsClient.listDir('/yunpan/'))

Running this on Windows fails with the following traceback:
Traceback (most recent call last):
File "D:\Python\PythonCode\Flask\yunpan\hadoop\hdfsUntil.py", line 35, in <module>
hdfsClient.uploadFile('/yunpan/Ricky', 'settings.py')
File "D:\Python\PythonCode\Flask\yunpan\hadoop\hdfsUntil.py", line 22, in uploadFile
self.client.upload(hdfsPath, localPath, cleanup=True)
File "D:\Python\lib\site-packages\hdfs\client.py", line 611, in upload
raise err
File "D:\Python\lib\site-packages\hdfs\client.py", line 600, in upload
_upload(path_tuple)
File "D:\Python\lib\site-packages\hdfs\client.py", line 530, in _upload
self.write(_temp_path, wrap(reader, chunk_size, progress), **kwargs)
File "D:\Python\lib\site-packages\hdfs\client.py", line 476, in write
consumer(data)
File "D:\Python\lib\site-packages\hdfs\client.py", line 468, in consumer
data=(c.encode(encoding) for c in _data) if encoding else _data,
File "D:\Python\lib\site-packages\hdfs\client.py", line 214, in _request
**kwargs
File "D:\Python\lib\site-packages\requests\sessions.py", line 512, in request
resp = self.send(prep, **send_kwargs)
File "D:\Python\lib\site-packages\requests\sessions.py", line 622, in send
r = adapter.send(request, **kwargs)
File "D:\Python\lib\site-packages\requests\adapters.py", line 463, in send
low_conn.endheaders()
File "D:\Python\lib\http\client.py", line 1234, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "D:\Python\lib\http\client.py", line 1026, in _send_output
self.send(msg)
File "D:\Python\lib\http\client.py", line 964, in send
self.connect()
File "D:\Python\lib\site-packages\urllib3\connection.py", line 196, in connect
conn = self._new_conn()
File "D:\Python\lib\site-packages\urllib3\connection.py", line 180, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x000001F42B28C550>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed
The upload failed. The key line is the last one: [Errno 11001] getaddrinfo failed is Windows' way of saying that a hostname could not be resolved to an IP address.
After a long, fruitless search on Baidu, I finally found the answer on Google. Google wins again; Baidu turned up nothing useful.
The cause is that Windows cannot resolve the Hadoop node's hostname: the NameNode hands the client a redirect that names a DataNode by hostname (a probe demonstrating this follows below), and the connection to that hostname is what fails. The fix is to edit the Windows hosts file at C:\Windows\System32\drivers\etc\hosts and add the following line:
192.168.125.130 master
Here 192.168.125.130 is the Hadoop node's IP and master is its hostname. Because mine is a single-node setup, only the NameNode's hostname and IP are needed; on a multi-node cluster, every node's IP and hostname must be added to the hosts file, as sketched below.
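On a multi-node cluster the hosts file carries one line per node; the extra hostnames below are placeholders, not taken from my setup:

192.168.125.130 master
192.168.125.131 slave1
192.168.125.132 slave2

As for why the hostname matters at all: WebHDFS answers an upload request with a 307 redirect whose Location header names a DataNode, typically by hostname, and that redirect target is what the client then fails to resolve. A minimal probe with requests, assuming the same NameNode address as above (/tmp/probe is just a scratch path, not from the original code):

import requests

# Ask the NameNode where an upload would be written, without following
# the redirect. The Location header names a DataNode by hostname.
resp = requests.put(
    "http://192.168.125.130:50070/webhdfs/v1/tmp/probe?op=CREATE",
    allow_redirects=False,
)
print(resp.status_code)              # 307 (temporary redirect)
print(resp.headers.get("Location"))  # e.g. http://master:50075/webhdfs/v1/...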
Finally, the code above runs fine on Linux, because configuring Hadoop on a Linux node already means the hosts file is set up correctly.
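To confirm the hosts entry took effect, a standard-library one-liner is enough (assuming the hostname master from above):

import socket

# Before the hosts fix this raises socket.gaierror, the same
# [Errno 11001] getaddrinfo failure as in the traceback; afterwards
# it prints the IP mapped in the hosts file.
print(socket.gethostbyname("master"))  # expected: 192.168.125.130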
Original article: https://www.smwenku.com/a/5b892b272b71775d1ce055f3/zh-cn