Error
When writing data to ES with Spark, a few records kept going missing. A closer look at the logs turned up the following error:
{"index":"*****","type":"_doc","id":"1267565827","cause":{"type":"exception","reason":"Elasticsearch exception [type=illegal_argument_exception, reason=Document contains at least one immense term in field=\"medical_record_info\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[91, 123, 112, 97, 116, 105, 101, 110, 116, 95, 117, 105, 110, 58, 49, 50, 54, 55, 53, 54, 53, 56, 50, 55, 44, 116, 114, 101, 97, 116]...', original message: bytes can be at most 32766 in length; got 37920]","caused_by":{"type":"exception","reason":"Elasticsearch exception [type=max_bytes_length_exceeded_exception, reason=max_bytes_length_exceeded_exception: bytes can be at most 32766 in length; got 37920]"}},"status":400}
{"index":"*****","type":"_doc","id":"1396085925","cause":{"type":"exception","reason":"Elasticsearch exception [type=illegal_argument_exception, reason=Document contains at least one immense term in field=\"medical_record_info\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[91, 123, 112, 97, 116, 105, 101, 110, 116, 95, 117, 105, 110, 58, 49, 51, 57, 54, 48, 56, 53, 57, 50, 53, 44, 116, 114, 101, 97, 116]...', original message: bytes can be at most 32766 in length; got 56516]","caused_by":{"type":"exception","reason":"Elasticsearch exception [type=max_bytes_length_exceeded_exception, reason=max_bytes_length_exceeded_exception: bytes can be at most 32766 in length; got 56516]"}},"status":400}
{"index":"*****","type":"_doc","id":"1455758148","cause":{"type":"exception","reason":"Elasticsearch exception [type=illegal_argument_exception, reason=Document contains at least one immense term in field=\"medical_record_info\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[91, 123, 112, 97, 116, 105, 101, 110, 116, 95, 117, 105, 110, 58, 49, 52, 53, 53, 55, 53, 56, 49, 52, 56, 44, 116, 114, 101, 97, 116]...', original message: bytes can be at most 32766 in length; got 41352]","caused_by":{"type":"exception","reason":"Elasticsearch exception [type=max_bytes_length_exceeded_exception, reason=max_bytes_length_exceeded_exception: bytes can be at most 32766 in length; got 41352]"}},"status":400}
{"index":"*****","type":"_doc","id":"20000063963","cause":{"type":"exception","reason":"Elasticsearch exception [type=illegal_argument_exception, reason=Document contains at least one immense term in field=\"medical_record_info\" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped. Please correct the analyzer to not produce such terms. The prefix of the first immense term is: '[91, 123, 112, 97, 116, 105, 101, 110, 116, 95, 117, 105, 110, 58, 50, 48, 48, 48, 48, 48, 54, 51, 57, 54, 51, 44, 116, 114, 101, 97]...', original message: bytes can be at most 32766 in length; got 3178084]","caused_by":{"type":"exception","reason":"Elasticsearch exception [type=max_bytes_length_exceeded_exception, reason=max_bytes_length_exceeded_exception: bytes can be at most 32766 in length; got 3178084]"}},"status":400}
whose UTF8 encoding is longer than the max length 32766
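The cause is that the value of one field (medical_record_info here) is too long. A keyword field is indexed as a single term, and that term exceeded the maximum length of 32766 bytes for the keyword type. Setting the field type to text fixed it.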
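
The check and the fix can be sketched in Python as below. This is a minimal sketch, not the original job's code: the helper name oversized_fields, the constant names, and the sample document are hypothetical, and the mapping body assumes the typeless format used by Elasticsearch 7 and later (older versions nest properties under the type name). The mapping of an existing field generally cannot be changed in place, so the body would typically be applied when creating a new index.

```python
import json

# Lucene's hard limit on the UTF-8 byte length of a single indexed term,
# as reported in the error message above.
MAX_TERM_BYTES = 32766


def oversized_fields(doc, limit=MAX_TERM_BYTES):
    """Return {field: byte_length} for string values whose UTF-8
    encoding is longer than `limit` bytes (hypothetical helper)."""
    result = {}
    for name, value in doc.items():
        if isinstance(value, str):
            size = len(value.encode("utf-8"))
            if size > limit:
                result[name] = size
    return result


# Index body that declares the long field as text instead of keyword.
# Assumption: typeless mapping format (Elasticsearch 7+).
TEXT_MAPPING = {
    "mappings": {
        "properties": {
            "medical_record_info": {"type": "text"}
        }
    }
}

if __name__ == "__main__":
    # Made-up document: each CJK character here takes 3 bytes in UTF-8,
    # so 12640 characters encode to 37920 bytes, over the limit.
    doc = {"id": "1267565827", "medical_record_info": "病" * 12640}
    print(oversized_fields(doc))
    print(json.dumps(TEXT_MAPPING, indent=2))
```

Running a check like this on the rows before the Spark write makes it possible to see which records would be rejected, instead of discovering the loss afterwards in the logs.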