Symptom
After the machines in a CDH Hadoop cluster were rebooted, the Agent failed to start and logged the following error:
[17/Dec/2019 16:37:53 +0000] 6741 MainThread agent ERROR Could not determine hostname or ip address; proceeding.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.9.2-py2.6.egg/cmf/agent.py", line 2622, in parse_arguments
    ip_address = socket.gethostbyname(fqdn)
gaierror: [Errno -3] Temporary failure in name resolution
usage: cmf-agent [-h] [--agent_dir AGENT_DIR] [--daemon] [--lib_dir LIB_DIR]
[--orphan_process_directory_staleness_threshold ORPHAN_PROCESS_DIRECTORY_STALENESS_THRESHOLD]
[--orphan_process_directory_refresh_interval ORPHAN_PROCESS_DIRECTORY_REFRESH_INTERVAL]
[--agent_httpd_port AGENT_HTTPD_PORT] --package_dir
PACKAGE_DIR [--parcel_dir PARCEL_DIR]
[--supervisord_path SUPERVISORD_PATH]
[--supervisord_httpd_port SUPERVISORD_HTTPD_PORT]
[--standalone STANDALONE] [--master MASTER]
[--environment ENVIRONMENT] [--host_id HOST_ID]
[--disable_supervisord_events] --hostname HOSTNAME
--ip_address IP_ADDRESS
[--reported_hostname REPORTED_HOSTNAME] [--use_tls]
[--client_key_file CLIENT_KEY_FILE]
[--client_cert_file CLIENT_CERT_FILE]
[--verify_cert_file VERIFY_CERT_FILE]
[--verify_cert_dir VERIFY_CERT_DIR]
[--client_keypw_file CLIENT_KEYPW_FILE]
[--client_keypw_cmd CLIENT_KEYPW_CMD]
[--max_cert_depth MAX_CERT_DEPTH] [--logfile LOGFILE]
[--logdir LOGDIR] [--optional_token] [--clear_agent_dir]
[--sudo_command SUDO_COMMAND] [--pidfile PIDFILE]
[--comm_name COMM_NAME]
cmf-agent: error: argument --hostname is required
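Analysis
The message "Could not determine hostname or ip address; proceeding." shows that the Agent could not determine the host's hostname or IP address. The traceback fails in socket.gethostbyname(fqdn) with a name-resolution error, and the Agent then aborts with "argument --hostname is required". Check the following:
- Whether /etc/hosts is configured correctly
- Whether /etc/sysconfig/network is configured correctly
- Whether the hostname command returns the expected name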
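The resolution step that fails in the traceback can be reproduced outside the Agent. The following is a minimal sketch, not the Agent's own code: the only call taken from the traceback is socket.gethostbyname(fqdn), while the use of socket.getfqdn() to obtain the name and the helper check_resolution are assumptions made for illustration.

```python
import socket


def check_resolution(fqdn, resolver=socket.gethostbyname):
    """Return (True, ip) if fqdn resolves, else (False, error text).

    check_resolution is a hypothetical helper; the resolver argument
    exists only so the lookup can be replaced when testing.
    """
    try:
        return True, resolver(fqdn)
    except socket.gaierror as exc:
        # Same exception class as in the Agent traceback.
        return False, str(exc)


if __name__ == "__main__":
    # Assumption: the local FQDN is taken from socket.getfqdn().
    fqdn = socket.getfqdn()
    ok, detail = check_resolution(fqdn)
    print("fqdn=%s resolved=%s detail=%s" % (fqdn, ok, detail))
```

If the script reports resolved=False for the local name, the problem lies in the host's name configuration rather than in the Agent itself.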
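Resolution
These checks showed that /etc/hosts was correct, but both the HOSTNAME entry in /etc/sysconfig/network and the output of hostname were wrong: the default name localhost.localdomain had been appended directly to the real hostname.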
[root@test02 cloudera-scm-agent]# cat /etc/sysconfig/network
HOSTNAME=test02.xxx.cnlocalhost.localdomain
NETWORKING=yes
NISDOMAIN=xxx.cn
[root@test02 cloudera-scm-agent]# hostname
test02.esgyn.cnlocalhost.localdomain
After correcting HOSTNAME in /etc/sysconfig/network and resetting the hostname, the Agent restarted normally.
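The kind of malformed value seen in /etc/sysconfig/network can also be detected with a small script before restarting the Agent. This is a rough sketch under stated assumptions: parse_sysconfig and has_fused_default are hypothetical helpers, and the rule "the name ends with localhost.localdomain without a separating dot" is a heuristic drawn only from this incident, not a general validity check.

```python
import os

DEFAULT_NAME = "localhost.localdomain"
NETWORK_FILE = "/etc/sysconfig/network"


def parse_sysconfig(text):
    """Parse KEY=VALUE lines, skipping blank lines and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"')
    return settings


def has_fused_default(hostname):
    """True if DEFAULT_NAME is glued onto the end of another name."""
    return (
        hostname != DEFAULT_NAME
        and hostname.endswith(DEFAULT_NAME)
        and not hostname.endswith("." + DEFAULT_NAME)
    )


if __name__ == "__main__":
    if os.path.exists(NETWORK_FILE):
        with open(NETWORK_FILE) as handle:
            name = parse_sysconfig(handle.read()).get("HOSTNAME", "")
        print("HOSTNAME=%s suspicious=%s" % (name, has_fused_default(name)))
```

Applied to the HOSTNAME value shown above, the heuristic flags it as suspicious, while a plain name such as test02.xxx.cn passes.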