Log in to Linux as a regular user, switch to root with sudo -s, and create a tool directory under /usr/local (personal habit)
I. Passwordless SSH login
cd /root/.ssh and generate an RSA key pair with ssh-keygen -t rsa, pressing Enter through every prompt
cat id_rsa.pub >> authorized_keys
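The two steps above can also be scripted non-interactively. A minimal sketch, using a scratch directory by default so it can be rehearsed safely (set dir=/root/.ssh for the real setup):

```shell
#!/bin/sh
# Non-interactive version of the key setup above.
# Uses a scratch directory unless dir is set to /root/.ssh.
dir=${dir:-$(mktemp -d)}
ssh-keygen -t rsa -N "" -q -f "$dir/id_rsa"       # no passphrase, no prompts
cat "$dir/id_rsa.pub" >> "$dir/authorized_keys"   # authorize our own key
chmod 600 "$dir/authorized_keys"                  # sshd refuses looser modes
echo "key installed under $dir"
```

Afterwards, ssh localhost should log in without a password prompt.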
II. JDK installation and configuration (the jdk-8u171-linux-x64.tar.gz package is assumed to be on hand locally; likewise for the packages below)
1. Upload the JDK with rz and extract it into /usr/local/tool
tar -zxvf jdk-8u171-linux-x64.tar.gz
2. vim /etc/profile and add the paths
export JAVA_HOME=/usr/local/tool/jdk1.8.0_171
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
3. source /etc/profile to reload, then test with java -version
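A quick way to sanity-check that the intended JDK is first on the PATH is to parse the major version out of the `java -version` output. `java_major` below is a hypothetical helper, and the sample string mirrors what jdk1.8.0_171 reports:

```shell
#!/bin/sh
# Extract the "1.8"-style major version from `java -version` output.
# java_major is a hypothetical helper, not part of the JDK.
java_major() {
  printf '%s\n' "$1" | sed -n 's/.*"\(1\.[0-9]*\)\..*/\1/p'
}
v=$(java_major 'java version "1.8.0_171"')
echo "$v"   # prints 1.8
```

In practice you would feed it the live output, e.g. `java_major "$(java -version 2>&1 | head -1)"`.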
III. Hadoop installation and configuration (single node)
1. Upload hadoop with rz, extract it into /usr/local/tool, and mv the directory to rename it hadoop
tar -zxvf hadoop-2.6.0-cdh5.4.0.tar.gz
2.vim hadoop-env.sh
export JAVA_HOME=/usr/local/tool/jdk1.8.0_171
3.vim core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/tool/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://ip:8020</value>
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
4.vim hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///usr/local/tool/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///usr/local/tool/hadoop/dfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>ip:50090</value>
</property>
<property>
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.root.groups</name>
<value>*</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
</configuration>
5. vim mapred-site.xml (first cp mapred-site.xml.template mapred-site.xml)
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<final>true</final>
</property>
<property>
<name>mapred.reduce.max.attempts</name>
<value>2</value>
</property>
<property>
<name>mapreduce.jobtracker.http.address</name>
<value>ip:50030</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>ip:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>ip:19888</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>ip:9001</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/history/done</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/tmp/hadoop-yarn/staging</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/history/done_intermediate</value>
</property>
<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<description>The amount of memory the map/reduce appmaster needs.</description>
<value>4096</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<description>Java opts for map/reduce appmaster</description>
<value>-Xmx3276M</value>
</property>
<property>
<name>mapreduce.task.timeout</name>
<value>480000</value>
</property>
</configuration>
6.vim yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>5</value>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>ip</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>ip:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>ip:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>ip:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>ip:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>ip:8088</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://ip:19888/jobhistory/logs/</value>
</property>
<!-- log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- maximum time, in seconds, that aggregated logs are retained on HDFS; 604800 = 7 days -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.local-dirs</name>
<value>/usr/local/tool/nodemanagerTmp</value>
</property>
<property>
<description>Classpath for typical applications.</description>
<name>yarn.application.classpath</name>
<value>$HADOOP_CONF_DIR,
$HADOOP_COMMON_HOME/share/hadoop/common/*,
$HADOOP_COMMON_HOME/contrib/capacity-scheduler/*,
$HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
$HADOOP_HDFS_HOME/share/hadoop/hdfs/*,$HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
$YARN_HOME/share/hadoop/yarn/*,$YARN_HOME/share/hadoop/yarn/lib/*,
$YARN_HOME/share/hadoop/mapreduce/*,$YARN_HOME/share/hadoop/mapreduce/lib/*,
$YARN_HOME/share/hadoop/mapreduce1/*,$YARN_HOME/share/hadoop/mapreduce1/lib/*,
$YARN_HOME/share/hadoop/mapreduce2/*,$YARN_HOME/share/hadoop/mapreduce2/lib/*,
$HBASE_HOME/*,$HBASE_HOME/lib/*</value>
</property>
<property>
<description>Amount of physical memory, in MB, that can be allocated
for containers.</description>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>16384</value>
</property>
<property>
<name>yarn.tracking.url.generator</name>
<value>org.apache.hadoop.mapreduce.v2.hs.webapp.MapReduceTrackingUriPlugin</value>
</property>
<property>
<name>yarn.client.nodemanager-connect.max-wait-ms</name>
<value>180000</value>
</property>
<!-- scheduler configuration: allow multiple queued jobs to run concurrently, avoiding the case where a MapReduce job runs while a PySpark job sits in ACCEPTED and never starts -->
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>true</value>
</property>
<!-- threshold of cluster utilization above which preemption starts; the default is 0.8f, i.e. preemption may reclaim up to 80% of all cluster resources -->
<property>
<name>yarn.scheduler.fair.preemption.cluster-utilization-threshold</name>
<value>0.9</value>
</property>
</configuration>
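A mis-paired `<property>` tag in any of the four XML files above makes the daemons die at startup with an obscure parse error, so a crude balance check before formatting saves time. This is only a sketch that counts tags, not a real XML parser:

```shell
#!/bin/sh
# Crude well-formedness check: count <property> open/close tags.
# Just enough to catch a missing </property>, nothing more.
check_props() {
  open=$(grep -c '<property>' "$1")
  close=$(grep -c '</property>' "$1")
  if [ "$open" -eq "$close" ]; then
    echo "$1: balanced ($open properties)"
  else
    echo "$1: MISMATCH ($open open, $close close)"
  fi
}
# e.g.: for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do check_props "$f"; done
```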
7. Format the filesystem: bin/hdfs namenode -format
8. Configure the Hadoop environment variables (remember to source /etc/profile afterwards)
export HADOOP_HOME=/usr/local/tool/hadoop
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:${HADOOP_HOME}/bin:$PATH
9. Write a start.sh startup script and launch Hadoop; run jps to check the daemons, then open http://ip:50070 to browse the HDFS web UI
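The start.sh mentioned in step 9 is not shown in the original notes. A minimal sketch, assuming the stock sbin scripts that ship with hadoop-2.6.0-cdh5.4.0 and the install path used above:

```shell
#!/bin/sh
# Write a minimal start.sh that brings up HDFS, YARN and the job
# history server in order. Paths follow the layout in these notes.
cat > start.sh <<'EOF'
#!/bin/sh
HADOOP_HOME=/usr/local/tool/hadoop
"$HADOOP_HOME/sbin/start-dfs.sh"
"$HADOOP_HOME/sbin/start-yarn.sh"
"$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh" start historyserver
EOF
chmod +x start.sh
```

After ./start.sh, jps should list NameNode, DataNode, SecondaryNameNode, ResourceManager, NodeManager, and JobHistoryServer.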
IV. MySQL installation and configuration
1. A version problem can come up when installing MySQL: on Ubuntu 14, apt-get install pulls in 5.5 by default, but we need 5.7; an older version causes conflicts with our SQL statements.
Workarounds:
1) Download the deb-bundle.tar package from the official site, e.g. mysql-server_5.7.10-1ubuntu14.04_amd64.deb-bundle.tar, mv it to the target path, grant execute permission with chmod +x, then extract.
2) Update the MySQL apt source:
sudo wget http://dev.mysql.com/get/mysql-apt-config_0.7.3-1_all.deb
sudo dpkg -i mysql-apt-config_0.7.3-1_all.deb
sudo apt-get update
sudo apt-get install mysql-server-5.7
3) (recommended)
sudo wget http://dev.mysql.com/get/mysql-apt-config_0.3.5-1ubuntu14.04_all.deb
sudo dpkg -i mysql-apt-config_0.3.5-1ubuntu14.04_all.deb
sudo apt-get update
sudo apt-get install mysql-server-5.7
2. Remote login
Check which hosts each user may log in from: select host,user from mysql.user;
Allow root to log in from any host: update mysql.user set host='%' where user='root';
Refresh privileges: flush privileges;
3. Create the MySQL databases and tables
V. Hive installation and configuration
1. Upload hive with rz and extract it into /usr/local/tool
tar -zxvf hive-1.1.0-cdh5.4.0.tar.gz
2. Create the HDFS directories that will hold Hive data, and adjust their permissions
hdfs dfs -mkdir /tmp
hdfs dfs -mkdir -p /user/hive/warehouse
hdfs dfs -chmod g+w /tmp
hdfs dfs -chmod g+w /user/hive/warehouse
3.vim hive-env.sh (cp hive-env.sh.template hive-env.sh)
export HADOOP_HOME=/usr/local/tool/hadoop
export HIVE_HOME=/usr/local/tool/hive
export HIVE_CONF_DIR=/usr/local/tool/hive/conf
4. vim hive-site.xml (first cp hive-default.xml.template hive-site.xml)
<configuration>
<property>
<name>hive.exec.dynamic.partition</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
<description>location of default database for the warehouse</description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://ip/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>hive.exec.dynamic.partition.mode</name>
<value>nonstrict</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://ip:9083</value>
<description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>
</configuration>
5. The JDBC driver jar must be placed under hive/lib: mysql-connector-java-5.1.35.jar
6. Initialize the Hive metadata: bin/schematool -dbType mysql -initSchema. If a new metastore database appears in MySQL, the configuration worked.
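Step 6's success condition can be checked from the shell. This one-liner assumes the MySQL credentials configured in hive-site.xml above (root / 123456) and a running local MySQL server, so it is shown as a command fragment only:

```shell
# Confirm schematool created the metastore schema; credentials are
# the ones from hive-site.xml above -- adjust to your setup.
mysql -uroot -p123456 -e 'show databases' | grep -q '^metastore$' && echo "metastore OK"
```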
7. Create the Hive tables: rz the file below onto the machine where Hive is installed and run: hive -f dmf.txt
For example dmf.txt (a prepared file of database/table DDL; the databases Hive needs also include campaign_success, dde, pubs, video, etc.)
8. Configure the Hive environment variables (remember to source /etc/profile afterwards)
export HIVE_HOME=/usr/local/tool/hive
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export PATH=${JAVA_HOME}/bin:${HIVE_HOME}/bin:$PATH
9. Start Hive; it is best to verify that ports 10000 and 9083 are listening: netstat -ntlp | grep 10000 and netstat -ntlp | grep 9083
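The port check in step 9 can be wrapped in a small retry loop, since HiveServer2 in particular takes a while to bind. `wait_port` is a hypothetical helper built on the same netstat invocation as in the text:

```shell
#!/bin/sh
# Poll until a local port shows up in netstat's listening set.
# wait_port is a hypothetical helper; returns 0 once the port is bound.
wait_port() {
  port=$1; tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    netstat -ntlp 2>/dev/null | grep -q ":$port " && return 0
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
# e.g.: wait_port 10000 60 && echo "HiveServer2 up"
```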
VI. ZooKeeper installation and configuration
1. Upload zookeeper with rz and extract it into /usr/local/tool
tar -zxvf zookeeper-3.4.9.tar.gz
2.vim zoo.cfg
dataDir=/usr/local/tool/zookeeper/tmp/zookeeper
dataLogDir=/usr/local/tool/zookeeper/tmp/dataLogDir
3. Configure the ZooKeeper environment variables (remember to source /etc/profile afterwards)
export ZOOKEEPER_HOME=/usr/local/tool/zookeeper
export PATH=$ZOOKEEPER_HOME/bin:$PATH
4. Start the ZooKeeper service: zkServer.sh start; check its status: zkServer.sh status
VII. HBase installation and configuration
1. Upload hbase with rz and extract it into /usr/local/tool
tar -zxvf hbase-1.0.0-cdh5.4.0.tar.gz
2.vim hbase-env.sh
export HBASE_OPTS="-XX:+UseConcMarkSweepGC"
export JAVA_HOME=/usr/local/tool/jdk1.8.0_171
export HBASE_MANAGES_ZK=false
3.vim hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ip:8020/data_team/hbase</value>
<description>The directory shared by RegionServers.</description>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>ip:2181</value>
</property>
<property>
<name>hbase.master.maxclockskew</name>
<value>300000</value>
<description>Time difference of regionserver from master</description>
</property>
</configuration>
4. Configure the HBase environment variables (remember to source /etc/profile afterwards)
export HBASE_HOME=/usr/local/tool/hbase
export PATH=$HBASE_HOME/bin:$PATH
5. Start HBase with the start.sh startup script; launch the shell with: bin/hbase shell
VIII. Hue installation and configuration
1. Upload hue with rz and extract it into /usr/local/tool
tar -zxvf hue-3.11.0.tgz
2. Install Hue's third-party dependencies: sudo apt-get install ant gcc g++ libkrb5-dev libffi-dev libmysqlclient-dev libssl-dev libsasl2-dev libsasl2-modules-gssapi-mit libsqlite3-dev libtidy-0.99-0 libxml2-dev libxslt-dev make libldap2-dev maven python-dev python-setuptools libgmp3-dev
3. Reference: https://www.cnblogs.com/xupccc/p/9583656.html
Notes:
1) hue.ini needs quite a few changes; pay particular attention to the hadoop, mysql, etc. sections
2) Initialize the MySQL database with bin/hue syncdb and bin/hue migrate to generate the tables
3) Never run Hue as root: create a dedicated user (any name works), switch to it, and start with sudo; as root, Hue will not start and cannot access the root directory
4. Start Hue: ./build/env/bin/supervisor, then visit http://ip:8888
IX. RabbitMQ installation and configuration
1. Add the following source to /etc/apt/sources.list
deb http://www.rabbitmq.com/debian/ testing main
2. Update: apt-get update, then install RabbitMQ: apt-get install rabbitmq-server
3. Add users (from cd /usr/lib/rabbitmq/bin) (add whichever your setup needs)
rabbitmqctl add_user app_dpaas_dmf_user app_dpaas_dmf_user
rabbitmqctl add_user app_dpaas_ebee_user app_dpaas_ebee_user
4. Grant administrator rights
rabbitmqctl set_user_tags app_dpaas_dmf_user administrator
rabbitmqctl set_user_tags app_dpaas_ebee_user administrator
5. Grant read/write and other permissions
rabbitmqctl set_permissions -p / app_dpaas_ebee_user '.*' '.*' '.*'
rabbitmqctl set_permissions -p / app_dpaas_dmf_user '.*' '.*' '.*'
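Steps 3–5 repeat the same three rabbitmqctl calls per user, so they fold naturally into one helper. `setup_mq_user` is a hypothetical wrapper and needs a running rabbitmq-server, so it is shown as a command fragment only:

```shell
#!/bin/sh
# Create a RabbitMQ user, tag it administrator, and grant full
# configure/write/read permissions on the default vhost.
# setup_mq_user is a hypothetical wrapper around steps 3-5 above.
setup_mq_user() {
  user=$1; pass=$2
  rabbitmqctl add_user "$user" "$pass"
  rabbitmqctl set_user_tags "$user" administrator
  rabbitmqctl set_permissions -p / "$user" '.*' '.*' '.*'
}
setup_mq_user app_dpaas_dmf_user app_dpaas_dmf_user
setup_mq_user app_dpaas_ebee_user app_dpaas_ebee_user
```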
6. Install the RabbitMQ web management plugin
rabbitmq-plugins enable rabbitmq_management
7. Edit ebin/rabbit.app under the install directory and restart the service
Change {loopback_users, [<<"guest">>]} in the rabbit.app config to {loopback_users, []}
Restart the service: service rabbitmq-server restart
8. Visit the web UI: http://ip:15672
X. Kafka installation and configuration
1. Upload kafka with rz and extract it into /usr/local/tool
tar -zxvf kafka_2.11-1.1.0.tgz
2.vim server.properties
listeners=PLAINTEXT://ip:9092
3. Start ZooKeeper
bin/zookeeper-server-start.sh config/zookeeper.properties &
4. Start Kafka
bin/kafka-server-start.sh config/server.properties &
5. Check the services
netstat -tunlp | egrep "(2181|9092)"
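Once both ports answer, a quick end-to-end check is to create a topic and push one message through it. The commands below are the stock Kafka 1.1 CLI tools, with ip standing in for the broker address as elsewhere in these notes, so they are shown as a command fragment only:

```shell
# Smoke test: create a topic, produce one message, consume it back.
bin/kafka-topics.sh --create --zookeeper ip:2181 \
  --replication-factor 1 --partitions 1 --topic smoke
echo "hello" | bin/kafka-console-producer.sh \
  --broker-list ip:9092 --topic smoke
bin/kafka-console-consumer.sh --bootstrap-server ip:9092 \
  --topic smoke --from-beginning --max-messages 1
```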
XI. Vertica installation and configuration
1. Upload vertica with rz into /usr/local/tool
Reference install docs:
https://blog.csdn.net/weixin_40366684/article/details/109461400
https://www.dbjungle.com/installing-a-single-node-hpe-vertica-8-cluster-on-ubuntu-14-04/
Connect to Vertica:
Username: mydba
Password: mydba
2. Create the Vertica databases and tables