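I recently installed Airflow 1.8 to schedule MapReduce (MR) and Spark jobs, and ran into the following problem in practice:
When a script contains Chinese characters, the Airflow worker node hangs while recording the job's log. The job itself completes successfully, but Airflow stays stuck on that task, so the downstream jobs cannot run.
The worker log shows the following error: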
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/local/python2.7.11/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/local/python2.7.11/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/python2.7.11/lib/python2.7/site-packages/airflow/task_runner/base_task_runner.py", line 95, in _read_task_logs
    self.logger.info('Subtask: {}'.format(line.rstrip('\n')))
UnicodeEncodeError: 'ascii' codec can't encode characters in position 74-76: ordinal not in range(128)
self.logger.info("Subtask: %s"%(line.rstrip('\n')))
So the problem is in /usr/local/python2.7.11/lib/python2.7/site-packages/airflow/task_runner/base_task_runner.py.
The code around line 95 looks like this:
    def _read_task_logs(self, stream):
        while True:
            line = stream.readline().decode('utf-8')
            if len(line) == 0:
                break
            self.logger.info('Subtask: {}'.format(line.rstrip('\n')))
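I changed the last line to self.logger.info(u'Subtask: {}'.format(line.rstrip('\n'))), that is, made the format template a unicode literal. After that, Airflow jobs ran normally again.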
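
Why the u prefix helps: in Python 2, calling .format() on a byte-string template with a unicode argument implicitly encodes that argument with the ASCII codec, which fails on Chinese characters. With a unicode template, no such encoding takes place. Below is a minimal standalone sketch of the patched loop, not Airflow's actual code. The names read_task_lines and ascii_encode_fails and the sample input are made up for illustration, and the lines are collected in a list instead of being passed to a logger.

```python
# -*- coding: utf-8 -*-
import io


def read_task_lines(stream):
    """Mimic the patched loop: decode each line as UTF-8 and format it
    with a unicode template (hypothetical helper, not part of Airflow)."""
    messages = []
    while True:
        line = stream.readline().decode('utf-8')
        if len(line) == 0:
            break
        # Unicode template: the argument is never encoded to ASCII.
        messages.append(u'Subtask: {}'.format(line.rstrip('\n')))
    return messages


def ascii_encode_fails(text):
    """Return True if text cannot be encoded as ASCII. This is the step
    Python 2 performs implicitly when the template is a byte string."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError:
        return True
    return False


# Sample task output containing Chinese characters (made-up data).
stream = io.BytesIO(u'作业开始\nstep 1 done\n'.encode('utf-8'))
messages = read_task_lines(stream)
```

The sketch runs on both Python 2.7 and Python 3. On Python 3 the original failure does not occur, because every str is already unicode, so ascii_encode_fails only makes the implicit Python 2 step explicit.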