Reposting is permitted, but please retain the source:
http://blog.csdn.net/u011419453/article/details/39057711
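I have been running elasticsearch recently and found that the cluster had grown to more than 380 indices on only two nodes. Each node now holds about 1.8 TB of index data and 3,700 shards, and both servers have been hitting OOM frequently over the past couple of days. The official documentation says that an index can be closed, which greatly reduces the burden of maintaining the es cluster state, so I planned to write a shell script and have crond run it every day to close indices older than 14 days. Someone then pointed out that a ready-made script already exists on GitHub, so I simply used it: github.com/elasticsearch/curator/. This article describes the installation process on Linux (I am using RHEL 5.5).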
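To make the "close indices older than 14 days" idea concrete, here is a minimal Python sketch of the selection step only. It is not curator's actual implementation; the function name `indices_older_than` and its parameters are made up for illustration, and it assumes index names consist of a prefix followed by a date in the `%Y.%m.%d` format (for example logstash-2014.09.04).

```python
from datetime import datetime, timedelta


def indices_older_than(index_names, days, prefix="logstash-",
                       timestring="%Y.%m.%d", now=None):
    """Return the index names whose date suffix is more than `days` days old.

    Hypothetical helper for illustration; not part of curator.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=days)
    selected = []
    for name in index_names:
        # Skip indices that do not belong to the given prefix.
        if not name.startswith(prefix):
            continue
        try:
            stamp = datetime.strptime(name[len(prefix):], timestring)
        except ValueError:
            # The suffix is not a date in the expected format; leave it alone.
            continue
        if stamp < cutoff:
            selected.append(name)
    return selected
```

Each selected name could then be closed, for instance through the `indices.close(index=name)` call of the elasticsearch Python client, assuming a client object connected to the cluster.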
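1. Prerequisites: python and pip. I used python 2.7.3 and pip 1.5.6.
2. Run pip install elasticsearch-curator and let pip install everything automatically.
3. After the installation succeeded, the following problem cost me half a day. Every call to curator failed with: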
Traceback (most recent call last):
  File "/usr/local/bin/curator", line 5, in <module>
    from pkg_resources import load_entry_point
  File "/usr/local/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py", line 2603, in <module>
  File "/usr/local/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py", line 666, in require
  File "/usr/local/lib/python2.7/site-packages/setuptools-0.6c11-py2.7.egg/pkg_resources.py", line 565, in resolve
pkg_resources.DistributionNotFound: elasticsearch>=1.0.0,<2.0.0
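The DistributionNotFound message means that pkg_resources could not find an installed elasticsearch distribution satisfying >=1.0.0,<2.0.0. As a rough way to check what is actually installed, the sketch below looks up a distribution's version and tests it against that range. It assumes a modern Python 3.8+ interpreter (it uses `importlib.metadata`, which does not exist on the python 2.7.3 used in this article), the helper names are hypothetical, and the version comparison is deliberately simplified to leading numeric components.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def version_tuple(text):
    """Turn '1.2.0' into (1, 2, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def in_range(found, low="1.0.0", high="2.0.0"):
    """Check low <= found < high, mirroring the requirement in the error."""
    if found is None:
        return False
    return version_tuple(low) <= version_tuple(found) < version_tuple(high)
```

Calling `in_range(installed_version("elasticsearch"))` with the same interpreter that runs curator would show whether the requirement from the traceback is met in that environment.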
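In the end I had no choice but to invoke curator through its full path:
python /usr/local/lib/python2.7/site-packages/curator/curator.py close --timestring %Y.%m.%d --prefix (index prefix; the default is logstash-) --older-than 14
Even after following the author's guide, running the command above still failed: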
2014-09-04 16:49:12,174 INFO Job starting...
Traceback (most recent call last):
  File "curator.py", line 736, in <module>
    main()
  File "curator.py", line 714, in main
    check_version(client)
  File "curator.py", line 259, in check_version
    version_number = get_version(client)
  File "curator.py", line 254, in get_version
    version = client.info()['version']['number']
  File "/usr/local/lib/python2.7/site-packages/elasticsearch/client/utils.py", line 68, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/elasticsearch/client/__init__.py", line 159, in info
    _, data = self.transport.perform_request('GET', '/', params=params)
  File "/usr/local/lib/python2.7/site-packages/elasticsearch/transport.py", line 284, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/usr/local/lib/python2.7/site-packages/elasticsearch/connection/http_urllib3.py", line 51, in perform_request
    raise ConnectionError('N/A', str(e), e)
elasticsearch.exceptions.ConnectionError: ConnectionError(('Connection aborted.', error(111, 'Connection refused'))) caused by: ProtocolError(('Connection aborted.', error(111, 'Connection refused')))
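It kept reporting an es connection error. Eventually I noticed that the author's documentation mentions a --debug option for troubleshooting, so I tried it:
The log output contained the line xxx GET http://localhost:9200, which showed that curator was connecting to localhost, and es was not reachable at localhost on this machine (the connection was refused, most likely because es was listening only on the server's actual IP address). So I edited the curator.py being executed: on line 30, in 'host': 'localhost', I replaced localhost with the server's actual IP address. After that, running python /usr/local/lib/python2.7/site-packages/curator/curator.py close --timestring %Y.%m.%d --prefix (index prefix; the default is logstash-) --older-than 14 works.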
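Before editing any source file, a "Connection refused" like the one above can be narrowed down by probing the port directly. The following is a minimal sketch using only the Python 3 standard library; `can_connect` is a hypothetical helper, and port 9200 is taken from the URL in the debug log.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False
```

Comparing `can_connect("localhost", 9200)` with `can_connect("<actual server IP>", 9200)` on the es server would show whether es accepts connections on one address but not the other, which is the situation this article ran into.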