电影评分大作业,flume启动后报如此异常
16/07/11 10:48:08 ERROR source.SpoolDirectorySource: FATAL: Spool Directory source r1: { spoolDir: /mnt/project/bootcamp/practise/movielens/data/ml-1m/ }: Uncaught exception in SpoolDirectorySource thread. Restart or reconfigure Flume to continue processing.
java.nio.charset.MalformedInputException: Input length = 1
at java.nio.charset.CoderResult.throwException(CoderResult.java:277)
at org.apache.flume.serialization.ResettableFileInputStream.readChar(ResettableFileInputStream.java:282)
at org.apache.flume.serialization.LineDeserializer.readLine(LineDeserializer.java:133)
at org.apache.flume.serialization.LineDeserializer.readEvent(LineDeserializer.java:71)
at org.apache.flume.serialization.LineDeserializer.readEvents(LineDeserializer.java:90)
at org.apache.flume.client.avro.ReliableSpoolingFileEventReader.readEvents(ReliableSpoolingFileEventReader.java:252)
at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:228)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /mnt/project/bootcamp/practise/movielens/data/ml-1m/
a1.sources.r1.deserializer.outputCharset=UTF-8
a1.sources.r1.deserializer.maxLineLength = 3000
a1.sources.r1.fileHeader = true
a1.sources.r1.fileHeaderKey = key
spool的配置:
# Describe the sink
a1.sinks.k1.type = org.apache.flume.sink.kafka.KafkaSink
a1.sinks.k1.topic = hbasetopic
# k1.metadata.broker.list
a1.sinks.k1.brokerList =sunxiong-2:9092
# a1.sinks.k1.type = logger
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
网上说是编码格式的问题,指定了outputCharset,还是不行