Source: the Apache Flume issue tracker. Original URL:
https://issues.apache.org/jira/browse/FLUME-2052
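According to the original report, a user hit an error while using Flume to read data with garbled encoding, then filed an improvement request on the issue tracker asking that Flume ignore such errors.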
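The exception in the report comes from the JDK's charset decoding machinery rather than from anything Flume-specific, so the behavior can be reproduced with the JDK alone. The following is a minimal sketch, not Flume's actual code: the class name `DecodePolicyDemo` and the sample bytes are made up for illustration, and UTF-8 is assumed as the input charset. It decodes the same malformed input under the three `CodingErrorAction` policies: report the error, replace the bad bytes, or ignore them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not use or mirror any Flume source code.
class DecodePolicyDemo {

    // Decode bytes as UTF-8, applying the given action to malformed or unmappable input.
    static String decode(byte[] bytes, CodingErrorAction action) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) throws Exception {
        // 'a', then 0xFF (a byte that is never valid in UTF-8), then 'b'
        byte[] bytes = {'a', (byte) 0xFF, 'b'};

        try {
            decode(bytes, CodingErrorAction.REPORT);
        } catch (CharacterCodingException e) {
            // Same exception type as in the reported stack trace
            System.out.println("REPORT  -> " + e);
        }
        // The malformed byte becomes U+FFFD, the replacement character
        System.out.println("REPLACE -> " + decode(bytes, CodingErrorAction.REPLACE));
        // The malformed byte is dropped
        System.out.println("IGNORE  -> " + decode(bytes, CodingErrorAction.IGNORE));
    }
}
```

With `REPORT`, the decoder throws `java.nio.charset.MalformedInputException`, which matches the failure in the report. With `REPLACE` or `IGNORE`, decoding continues past the bad byte. The reporter asks for the second kind of behavior, and the Flume user guide documents a `decodeErrorPolicy` setting on the spooling directory source (values `FAIL`, `REPLACE`, `IGNORE`) that exposes a similar choice.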
Details:
<pre name="code" class="html">Details
Type:Improvement
Status:RESOLVED
Priority:Major
Resolution:Fixed
Affects Version/s:
v1.4.0
Fix Version/s:
v1.5.0
Component/s:None
Labels:
MalformedInputException charset flume
Environment:
centOS 6.3
Flume 1.3.0
Status: resolved
Affected versions: Flume 1.4.0 and earlier
Fixed in: Flume 1.5.0 and later
Error encountered by the reporter:
When parsing a file with messed up encoding flume spits this error:
23 May 2013 22:06:29,446 ERROR [pool-12-thread-1] (org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run:164) - Uncaught exception in Runnable
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:162)
at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:134)
at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:72)
at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:91)
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:221)
at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:154)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:351)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
It would be good to skip such characters, ignore them or delete. Corrupt signs come from spamming engines, flume cant handle them at all.
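The reporter suggests skipping such characters, either by ignoring them or by deleting them. The corrupt characters come from spam engines, and Flume cannot handle them at all.
Mike Percy made the corresponding change to Flume.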
Mike Percy made changes -
28/Sep/13 04:58
Issue review:
Summary:Spooling directory source should be able to replace or ignore malformed characters
Review Request #14396 - Created September 28, 2013 and updated 2 years, 1 month ago
Information
Submitter: Mike Percy
Repository: flume-git
Branch:
Bugs: FLUME-2052
Depends On:
Reviewers
Groups: Flume
People:
Description
Spooling directory source should be able to replace or ignore malformed characters instead of hanging.
Testing Done
Added several new unit tests.