flume

快快快看看你

Category: hadoop. Tags: flume exception, Java heap space, java.lang.OutOfMemor, flume-conf.propertie

Article link: https://blog.csdn.net/u012744265/article/details/51456496

First, configure flume-conf.properties:

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# The configuration file needs to define the sources,
# the channels and the sinks.
# Sources, channels and sinks are defined per agent,
# in this case called 'agent1'

# Define source, channel, sink
agent1.sources = spool-source1
agent1.channels = ch1
agent1.sinks = hdfs-sink1

# Define and configure a spooling directory source
agent1.sources.spool-source1.channels = ch1
agent1.sources.spool-source1.type = spooldir
agent1.sources.spool-source1.spoolDir = /home/hadoop/flumedata
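# Backslashes are doubled because the properties loader drops a single backslash before d, - or .
agent1.sources.spool-source1.ignorePattern = event(_\\d{4}\\-\\d{2}\\-\\d{2}_\\d{2}_\\d{2})?\\.log(\\.COMPLETED)?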
agent1.sources.spool-source1.deserializer.maxLineLength = 20480
# Configure channel
agent1.channels.ch1.type = file
agent1.channels.ch1.checkpointDir = /home/hadoop/work/flume/checkpointDir
agent1.channels.ch1.dataDirs = /home/hadoop/work/flume/dataDirs
# Define and configure an HDFS sink
agent1.sinks.hdfs-sink1.channel = ch1
agent1.sinks.hdfs-sink1.type = hdfs
agent1.sinks.hdfs-sink1.hdfs.path = hdfs://mycluster/flume/%Y%m%d
agent1.sinks.hdfs-sink1.hdfs.useLocalTimeStamp = true
agent1.sinks.hdfs-sink1.hdfs.rollInterval = 300
agent1.sinks.hdfs-sink1.hdfs.rollSize = 67108864
agent1.sinks.hdfs-sink1.hdfs.rollCount = 0
agent1.sinks.hdfs-sink1.hdfs.codeC = snappy

Run bin/flume-ng agent -n agent1 -f conf/flume-conf.properties

When the collected data files are too large, the agent throws the following exception:

java.lang.OutOfMemoryError: Java heap space

   at java.util.HashMap.resize(HashMap.java:703)
   at java.util.HashMap.putVal(HashMap.java:662)
   at java.util.HashMap.put(HashMap.java:611)
   at org.apache.flume.channel.file.EventQueueBackingStoreFile.put(EventQueueBackingStoreFile.java:326)
   at org.apache.flume.channel.file.FlumeEventQueue.set(FlumeEventQueue.java:285)
   at org.apache.flume.channel.file.FlumeEventQueue.add(FlumeEventQueue.java:315)
   at org.apache.flume.channel.file.FlumeEventQueue.addTail(FlumeEventQueue.java:209)
   at org.apache.flume.channel.file.FileChannel$FileBackedTransaction.doCommit(FileChannel.java:570)
   at org.apache.flume.channel.BasicTransactionSemantics.commit(BasicTransactionSemantics.java:151)
   at org.apache.flume.channel.ChannelProcessor.processEventBatch(ChannelProcessor.java:192)
   at org.apache.flume.source.SpoolDirectorySource$SpoolDirectoryRunnable.run(SpoolDirectorySource.java:232)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
   at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
   at java.lang.Thread.run(Thread.java:745)
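
The trace shows the file channel running out of heap while committing a batch from the spooling directory source (FileChannel$FileBackedTransaction.doCommit down to HashMap.resize). The original post does not state its fix, so the following is only a sketch of one common remedy: raising the agent's JVM heap through JAVA_OPTS in conf/flume-env.sh. The heap sizes are illustrative placeholders and should be sized for the actual data volume.

```shell
# conf/flume-env.sh
# Assumed values for illustration only: 512 MB initial heap, 2 GB maximum heap.
export JAVA_OPTS="-Xms512m -Xmx2048m"
```

flume-ng reads flume-env.sh from the configuration directory, so the agent may need to be started with that directory given explicitly, for example bin/flume-ng agent -c conf -n agent1 -f conf/flume-conf.properties.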