Collected Notes on Massive Log Data

1. Requirements research and architecture comparison

1. Data collection tool: Flume

2. Kafka messaging

3. ELS (Elasticsearch): data storage and retrieval

4. Hadoop

 

Flume + Hadoop + ELS

Collection, persistent storage, query and display (one role per component, in that order)
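As a rough illustration of the collection-to-storage half of this pipeline, the sketch below writes a minimal Flume agent configuration that tails a log file and persists events to HDFS. The agent and component names (a1, r1, c1, k1), the log path, and the hdfs://hadoop0:9000 address are all placeholders chosen here, not values from these notes.

```shell
#!/bin/sh
# Sketch: write a minimal Flume agent config (exec source -> memory channel -> HDFS sink).
# CONF_FILE, the log path and the namenode address are hypothetical placeholders.
CONF_FILE="${CONF_FILE:-./flume-hdfs.conf}"

# Quoted 'EOF' so the shell leaves %Y-%m-%d and everything else untouched.
cat > "$CONF_FILE" <<'EOF'
# One agent (a1) with one source, one channel and one sink
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: follow an application log (hypothetical path)
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/app/app.log
a1.sources.r1.channels = c1

# Channel: in-memory buffer between source and sink
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000

# Sink: persist events to HDFS, one directory per day
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = hdfs://hadoop0:9000/flume/logs/%Y-%m-%d
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.useLocalTimeStamp = true
EOF

echo "wrote $CONF_FILE"
```

With Flume installed, such a file would typically be started with something like flume-ng agent --conf conf --conf-file flume-hdfs.conf --name a1.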

 

Reference articles:

Elasticsearch as a database for full-text search (supports tens of billions of records and PB-scale data volumes)

https://blog.csdn.net/aisemi/article/details/80212836

 

ELK installation and usage

https://www.cnblogs.com/soar1688/p/6849183.html

 

Replacing the Logstash input

/opt/logstash/bin/logstash -e 'input { redis { batch_count => 1 data_type => "list" key => "yx_231_177" host => "47.98.231.177" port => 3389 password => "yxlab3b5e8v2" db => 6 threads => 1 } } output { elasticsearch { hosts => ["localhost"] } }'
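The one-liner above is hard to read and puts the Redis password on the command line. A possible alternative, sketched here, writes the same Redis-to-Elasticsearch pipeline into a config file and takes the host and password from environment variables. REDIS_HOST, REDIS_PASSWORD and LOGSTASH_CONF are names made up for this sketch, and their defaults are placeholders.

```shell
#!/bin/sh
# Sketch: same pipeline as the -e one-liner, kept in a config file instead.
# REDIS_HOST, REDIS_PASSWORD and LOGSTASH_CONF are hypothetical variable names.
REDIS_HOST="${REDIS_HOST:-127.0.0.1}"
REDIS_PASSWORD="${REDIS_PASSWORD:-changeme}"
LOGSTASH_CONF="${LOGSTASH_CONF:-./redis-to-es.conf}"

# Unquoted EOF so the shell expands the two variables above.
cat > "$LOGSTASH_CONF" <<EOF
input {
  redis {
    data_type   => "list"
    key         => "yx_231_177"
    host        => "$REDIS_HOST"
    port        => 3389
    password    => "$REDIS_PASSWORD"
    db          => 6
    batch_count => 1
    threads     => 1
  }
}
output {
  elasticsearch {
    hosts => ["localhost"]
  }
}
EOF

chmod 600 "$LOGSTASH_CONF"   # the file contains a password
echo "run with: /opt/logstash/bin/logstash -f $LOGSTASH_CONF"
```

The script only writes the file and prints the command; Logstash itself is started separately with the -f option.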

 

1. Quickly building a multi-node Hadoop cluster with Docker

http://dockone.io/article/395

 

Building a Hadoop 3.0 cluster on Docker

https://blog.csdn.net/yangym2002/article/details/79014378

https://blog.csdn.net/xu470438000/article/details/50512442

https://www.cnblogs.com/linux-wangkun/p/5745154.html

 


 

1. My actual Hadoop 3.0 installation process

The host runs Ubuntu.

 

mkdir hadoopbuild

cd hadoopbuild

wget 'http://download.oracle.com/otn-pub/java/jdk/8u181-b13/96a7b8442fe848ef90c96a2fad6ed6d1/jdk-8u181-linux-x64.tar.gz?AuthParam=1534306800_1bfa0f93f011feb77b533aa38a93056d'

mv jdk-8u181-linux-x64.tar.gz\?AuthParam\=1534306800_1bfa0f93f011feb77b533aa38a93056d jdk-8u181-linux-x64.tar.gz

docker pull centos

wget http://mirrors.hust.edu.cn/apache/hadoop/common/hadoop-3.1.1/hadoop-3.1.1.tar.gz
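The Oracle URL appears to carry a time-limited AuthParam token, so a download can end up as something other than the real archive. Before building, it may be worth confirming that both files are valid gzip archives. The sketch below does that with gzip -t; check_archives is a hypothetical helper name introduced here.

```shell
#!/bin/sh
# Sketch: verify that the archives the Dockerfile will ADD exist and are
# valid gzip files. check_archives is a hypothetical helper, not part of the notes.
check_archives() {
  bad=0
  for f in "$@"; do
    # gzip -t tests integrity without extracting; it also fails on missing files
    if ! gzip -t "$f" 2>/dev/null; then
      echo "missing or not a valid gzip archive: $f" >&2
      bad=1
    fi
  done
  return "$bad"
}

if check_archives jdk-8u181-linux-x64.tar.gz hadoop-3.1.1.tar.gz; then
  echo "build context looks ready"
else
  echo "re-download the archives before running docker build"
fi
```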

vi Dockerfile

# centos
# Use an existing OS image as the base
FROM centos
# Image author
MAINTAINER hzy
# Add the JDK and Hadoop archives to the image (ADD unpacks local tar archives)
ADD jdk-8u181-linux-x64.tar.gz /usr/local/
ADD hadoop-3.1.1.tar.gz /usr/local
RUN mv /usr/local/jdk1.8.0_181 /usr/local/jdk1.8
RUN mv /usr/local/hadoop-3.1.1 /usr/local/hadoop-3.1


ENV JAVA_HOME /usr/local/jdk1.8
ENV HADOOP_HOME /usr/local/hadoop-3.1
ENV PATH $JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH
# Install the openssh-server and sudo packages, and set sshd's UsePAM option to no
RUN yum install -y openssh-server sudo
RUN sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config
# Install openssh-clients
RUN yum install -y openssh-clients


# Set the password of the test user root to "root" and add the user to sudoers
RUN echo "root:root" | chpasswd
RUN echo "root   ALL=(ALL)       ALL" >> /etc/sudoers
# The next two lines are special: on CentOS 6 they are required, otherwise sshd in the resulting container cannot accept logins
RUN ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key
RUN ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key


# Start the sshd service and expose port 22
RUN mkdir /var/run/sshd
EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]

docker build -t centos-ssh-root-java-hadoop .
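Once the image is built, a multi-node setup needs several containers that can reach each other by hostname. The sketch below shows one way that might look: a user-defined Docker network plus one master and two workers. The network name, the container names hadoop0 to hadoop2, and the DRY_RUN switch are assumptions made for illustration. With DRY_RUN=1 (the default) the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: start three containers from the image built above.
# NETWORK, the container names and DRY_RUN are hypothetical choices.
IMAGE="centos-ssh-root-java-hadoop"
NETWORK="${NETWORK:-hadoop-net}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

start_cluster() {
  # A user-defined network lets containers resolve each other by name
  run docker network create "$NETWORK"
  for name in hadoop0 hadoop1 hadoop2; do
    run docker run -d --name "$name" --hostname "$name" --network "$NETWORK" "$IMAGE"
  done
}

start_cluster
```

Setting DRY_RUN=0 runs the commands for real; afterwards docker exec -it hadoop0 bash opens a shell in the first container for the Hadoop configuration steps.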

 

 

 

 
