With Flume, you can collect logs and store them in HDFS.
First, create an HDFS directory for the data, for example "hadoop fs -mkdir /flume_data_pool".
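For example, assuming the Hadoop client on the Flume host points at the same cluster as the NameNode address used in the sink below, you can create the directory and confirm it exists:
hadoop fs -mkdir /flume_data_pool
hadoop fs -ls /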
Second, create a configuration file in Flume's conf directory, for example conf/flume_hdfs_exec.conf (the file used by the start command below):
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
#define a memory channel called c1 on a1
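# capacity is the maximum number of events the channel buffers in memory;
# transactionCapacity is the maximum number of events per source put / sink take transaction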
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
#Describe/configure the source
a1.sources.r1.channels = c1
a1.sources.r1.type = exec
a1.sources.r1.command = tail -f /share/flume_test/log/logserver.log
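# The exec source runs the command above and turns each output line into an event;
# tail -F (instead of -f) would keep following the file across log rotation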
#Describe the sink
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://master:9000/flume_data_pool
a1.sinks.k1.hdfs.filePrefix = events-
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
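# fileType = DataStream with writeFormat = Text writes plain text instead of the default SequenceFile
# Roll a new HDFS file every 60000 events or every 600 seconds; rollSize = 0 disables size-based rolling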
a1.sinks.k1.hdfs.rollSize = 0
a1.sinks.k1.hdfs.rollCount = 60000
a1.sinks.k1.hdfs.rollInterval = 600
Then, start the agent with "./bin/flume-ng agent --conf conf --conf-file ./conf/flume_hdfs_exec.conf --name a1 -Dflume.root.logger=INFO,console"; any new lines appended to logserver.log will be saved to hdfs://master:9000/flume_data_pool.
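To check that events are actually reaching HDFS, append a line to the tailed log and then inspect the sink directory (the paths below are just the ones used in this example; the file currently being written keeps a .tmp suffix until Flume rolls it):
echo "hello flume" >> /share/flume_test/log/logserver.log
hadoop fs -ls hdfs://master:9000/flume_data_pool
hadoop fs -cat hdfs://master:9000/flume_data_pool/events-*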