JDK installation
Scala installation
Download, extract, and add to the system environment variables.
Maven installation
Hadoop installation
Download, extract, add to the system environment variables, then verify the install.
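The download-extract-configure step is the same for every tool. A sketch of the ~/.bash_profile additions, where the JDK and Hadoop paths match the ones used later in these notes, while the Scala and Maven paths are placeholders:

```shell
# Sketch of ~/.bash_profile additions.
# JAVA_HOME and HADOOP_HOME match the paths used later in these notes;
# SCALA_HOME and MAVEN_HOME are placeholder paths -- adjust to your install.
export JAVA_HOME=/home/hadoop/app/jdk1.8.0_11
export SCALA_HOME=/home/hadoop/app/scala      # placeholder path
export MAVEN_HOME=/home/hadoop/app/maven      # placeholder path
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.7.0
export PATH=$JAVA_HOME/bin:$SCALA_HOME/bin:$MAVEN_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
```

Run `source ~/.bash_profile` afterwards, then verify with `java -version`, `scala -version`, `mvn -v`, and `hadoop version`.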
Configure passwordless SSH: run ssh-keygen -t rsa and press Enter at every prompt.
ll -a then shows a .ssh directory:
[hadoop@hadoop000 .ssh]$ ls
id_rsa id_rsa.pub known_hosts
[hadoop@hadoop000 .ssh]$ cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
[hadoop@hadoop000 .ssh]$ ll
total 16
-rw-r--r--. 1 hadoop hadoop  398 May  1 01:02 authorized_keys
-rw-------. 1 hadoop hadoop 1675 May  1 00:58 id_rsa
-rw-r--r--. 1 hadoop hadoop  398 May  1 00:58 id_rsa.pub
-rw-r--r--. 1 hadoop hadoop  787 May  1 00:29 known_hosts
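The same setup can be done non-interactively; a sketch (single-node, assumes standard OpenSSH tools):

```shell
# Non-interactive version of the SSH setup above (single-node).
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa   # no prompts
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys   # sshd ignores keys writable by group/others
# Verify: `ssh hadoop000 date` should log in without asking for a password.
```

The explicit chmod matters: with StrictModes (the sshd default), keys in a group- or world-writable authorized_keys are silently ignored and every ssh still prompts for a password.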
Edit the configuration files under $HADOOP_HOME/etc/hadoop:
1.hadoop-env.sh
export JAVA_HOME=/home/hadoop/app/jdk1.8.0_11
2.core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop000:8020</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/app/tmp</value>
</property>
</configuration>
3.hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
4.slaves
hadoop000
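Optionally, the NameNode and DataNode storage locations can be pinned explicitly in hdfs-site.xml instead of being derived from hadoop.tmp.dir; the paths below are illustrative, chosen to match the tmp dir configured above:

```xml
<!-- Optional hdfs-site.xml additions; paths are illustrative -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/hadoop/app/tmp/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/hadoop/app/tmp/dfs/data</value>
</property>
```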
The executables in the bin directory:
[hadoop@hadoop000 bin]$ ls
hadoop hadoop.cmd hdfs hdfs.cmd mapred mapred.cmd rcc yarn yarn.cmd
[hadoop@hadoop000 bin]$ ./hdfs
Usage: hdfs [--config confdir] COMMAND
where COMMAND is one of:
dfs run a filesystem command on the file systems supported in Hadoop.
namenode -format format the DFS filesystem
secondarynamenode run the DFS secondary namenode
namenode run the DFS namenode
journalnode run the DFS journalnode
zkfc run the ZK Failover Controller daemon
datanode run a DFS datanode
dfsadmin run a DFS admin client
haadmin run a DFS HA admin client
fsck run a DFS filesystem checking utility
balancer run a cluster balancing utility
jmxget get JMX exported values from NameNode or DataNode.
mover run a utility to move block replicas across
storage types
oiv apply the offline fsimage viewer to an fsimage
oiv_legacy apply the offline fsimage viewer to an legacy fsimage
oev apply the offline edits viewer to an edits file
fetchdt fetch a delegation token from the NameNode
getconf get config values from configuration
groups get the groups which users belong to
snapshotDiff diff two snapshots of a directory or diff the
current directory contents with a snapshot
lsSnapshottableDir list all snapshottable dirs owned by the current user
Use -help to see options
portmap run a portmap service
nfs3 run an NFS version 3 gateway
cacheadmin configure the HDFS cache
crypto configure HDFS encryption zones
storagepolicies list/get/set block storage policies
version print the version
Most commands print help when invoked w/o parameters.
./hdfs namenode -format    (format the HDFS filesystem)
[hadoop@hadoop000 app]$ cd tmp
[hadoop@hadoop000 tmp]$ ls
dfs kafka-logs kafka-logs-1 kafka-logs-2 kafka-logs-3 zk
[hadoop@hadoop000 tmp]$ cd dfs
[hadoop@hadoop000 dfs]$ ls
name
[hadoop@hadoop000 dfs]$ cd name
[hadoop@hadoop000 name]$ ls
current
[hadoop@hadoop000 name]$ cd current
[hadoop@hadoop000 current]$ ls
fsimage_0000000000000000000 seen_txid
fsimage_0000000000000000000.md5 VERSION
[hadoop@hadoop000 current]$
./start-dfs.sh
19/05/01 01:33:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [hadoop000]
hadoop000: starting namenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-namenode-hadoop000.out
hadoop000: starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-datanode-hadoop000.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is 09:4d:bc:56:fd:5a:da:98:c4:ae:9c:6c:a4:30:0d:8c.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.7.0/logs/hadoop-hadoop-secondarynamenode-hadoop000.out
19/05/01 01:33:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Three new processes appear: NameNode, DataNode, SecondaryNameNode.
Web UI: hadoop000:50070
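The process check can be scripted; a sketch that assumes `jps` prints one `<pid> <ClassName>` pair per line, as in the transcripts in these notes:

```shell
# Succeeds iff all three HDFS daemons appear in the given `jps` output.
check_hdfs_daemons() {
    for d in NameNode DataNode SecondaryNameNode; do
        # -w: whole-word match, so "SecondaryNameNode" does not satisfy "NameNode"
        printf '%s\n' "$1" | grep -qw "$d" || { echo "missing: $d"; return 1; }
    done
    echo "HDFS daemons OK"
}
# Usage: check_hdfs_daemons "$(jps)"
```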
Setting up YARN
[hadoop@hadoop000 hadoop]$ ls
capacity-scheduler.xml hdfs-site.xml mapred-env.cmd
configuration.xsl hdfs-site.xml~ mapred-env.sh
container-executor.cfg httpfs-env.sh mapred-queues.xml.template
core-site.xml httpfs-log4j.properties mapred-site.xml.template
core-site.xml~ httpfs-signature.secret slaves
hadoop-env.cmd httpfs-site.xml slaves~
hadoop-env.sh kms-acls.xml ssl-client.xml.example
hadoop-env.sh~ kms-env.sh ssl-server.xml.example
hadoop-metrics2.properties kms-log4j.properties yarn-env.cmd
hadoop-metrics.properties kms-site.xml yarn-env.sh
hadoop-policy.xml log4j.properties yarn-site.xml
[hadoop@hadoop000 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@hadoop000 hadoop]$ vi mapred-site.xml
1.mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
2.yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Start YARN:
[hadoop@hadoop000 hadoop-2.6.0-cdh5.7.0]$ cd sbin
[hadoop@hadoop000 sbin]$ ./start-yarn.sh
Two new processes appear:
[hadoop@hadoop000 sbin]$ jps
51328 NodeManager
51231 ResourceManager
Web UI: hadoop000:8088
A simple HDFS test:
[hadoop@hadoop000 sbin]$ hadoop fs -ls /    (list the HDFS root directory)
19/05/01 01:52:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop000 sbin]$ hadoop fs -mkdir /data
19/05/01 01:52:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop000 sbin]$ hadoop fs -ls /data
19/05/01 01:53:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop000 sbin]$ ls
distribute-exclude.sh mr-jobhistory-daemon.sh start-secure-dns.sh stop-secure-dns.sh
hadoop-daemon.sh refresh-namenodes.sh start-yarn.cmd stop-yarn.cmd
hadoop-daemons.sh slaves.sh start-yarn.sh stop-yarn.sh
hdfs-config.cmd start-all.cmd stop-all.cmd yarn-daemon.sh
hdfs-config.sh start-all.sh stop-all.sh yarn-daemons.sh
httpfs.sh start-balancer.sh stop-balancer.sh
kms.sh start-dfs.cmd stop-dfs.cmd
Linux start-dfs.sh stop-dfs.sh
[hadoop@hadoop000 sbin]$ hadoop fs -put mr-jobhistory-daemon.sh /data/
19/05/01 01:53:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
[hadoop@hadoop000 sbin]$ hadoop fs -ls /data
19/05/01 01:53:51 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r-- 1 hadoop supergroup 4080 2019-05-01 01:53 /data/mr-jobhistory-daemon.sh
Testing YARN with the bundled example jobs under $HADOOP_HOME/share:
[hadoop@hadoop000 hadoop-2.6.0-cdh5.7.0]$ ls
bin cloudera examples include libexec logs README.txt share
bin-mapreduce1 etc examples-mapreduce1 lib LICENSE.txt NOTICE.txt sbin src
[hadoop@hadoop000 hadoop-2.6.0-cdh5.7.0]$ cd share
[hadoop@hadoop000 share]$ ls
doc hadoop
[hadoop@hadoop000 share]$ cd hadoop/
[hadoop@hadoop000 hadoop]$ ls
common hdfs httpfs kms mapreduce mapreduce1 mapreduce2 tools yarn
[hadoop@hadoop000 hadoop]$ cd mapreduce
[hadoop@hadoop000 mapreduce]$ ls
hadoop-mapreduce-client-app-2.6.0-cdh5.7.0.jar
hadoop-mapreduce-client-common-2.6.0-cdh5.7.0.jar
hadoop-mapreduce-client-core-2.6.0-cdh5.7.0.jar
hadoop-mapreduce-client-hs-2.6.0-cdh5.7.0.jar
hadoop-mapreduce-client-hs-plugins-2.6.0-cdh5.7.0.jar
hadoop-mapreduce-client-jobclient-2.6.0-cdh5.7.0.jar
hadoop-mapreduce-client-jobclient-2.6.0-cdh5.7.0-tests.jar
hadoop-mapreduce-client-nativetask-2.6.0-cdh5.7.0.jar
hadoop-mapreduce-client-shuffle-2.6.0-cdh5.7.0.jar
hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar
lib
lib-examples
sources
[hadoop@hadoop000 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar pi 2 3
Number of Maps = 2
Samples per Map = 3
19/05/01 01:57:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Wrote input for Map #0
Wrote input for Map #1
Starting Job
19/05/01 01:57:05 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
19/05/01 01:57:05 INFO input.FileInputFormat: Total input paths to process : 2
19/05/01 01:57:05 INFO mapreduce.JobSubmitter: number of splits:2
19/05/01 01:57:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1556700560942_0001
19/05/01 01:57:05 INFO impl.YarnClientImpl: Submitted application application_1556700560942_0001
19/05/01 01:57:06 INFO mapreduce.Job: The url to track the job: http://hadoop000:8088/proxy/application_1556700560942_0001/
19/05/01 01:57:06 INFO mapreduce.Job: Running job: job_1556700560942_0001
19/05/01 01:57:13 INFO mapreduce.Job: Job job_1556700560942_0001 running in uber mode : false
19/05/01 01:57:13 INFO mapreduce.Job: map 0% reduce 0%
19/05/01 01:57:18 INFO mapreduce.Job: map 50% reduce 0%
19/05/01 01:57:19 INFO mapreduce.Job: map 100% reduce 0%
19/05/01 01:57:23 INFO mapreduce.Job: map 100% reduce 100%
19/05/01 01:57:23 INFO mapreduce.Job: Job job_1556700560942_0001 completed successfully
19/05/01 01:57:23 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=50
FILE: Number of bytes written=335397
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=530
HDFS: Number of bytes written=215
HDFS: Number of read operations=11
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=2
Launched reduce tasks=1
Data-local map tasks=2
Total time spent by all maps in occupied slots (ms)=7159
Total time spent by all reduces in occupied slots (ms)=2746
Total time spent by all map tasks (ms)=7159
Total time spent by all reduce tasks (ms)=2746
Total vcore-seconds taken by all map tasks=7159
Total vcore-seconds taken by all reduce tasks=2746
Total megabyte-seconds taken by all map tasks=7330816
Total megabyte-seconds taken by all reduce tasks=2811904
Map-Reduce Framework
Map input records=2
Map output records=4
Map output bytes=36
Map output materialized bytes=56
Input split bytes=294
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=56
Reduce input records=4
Reduce output records=0
Spilled Records=8
Shuffled Maps =2
Failed Shuffles=0
Merged Map outputs=2
GC time elapsed (ms)=208
CPU time spent (ms)=2040
Physical memory (bytes) snapshot=513949696
Virtual memory (bytes) snapshot=8296464384
Total committed heap usage (bytes)=384827392
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=236
File Output Format Counters
Bytes Written=97
Job Finished in 18.542 seconds
Estimated value of Pi is 4.00000000000000000000
[hadoop@hadoop000 mapreduce]$
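For intuition: the pi example estimates π by sampling points in the unit square and counting those inside the quarter circle (π ≈ 4 × hits/total), which is why 2 maps × 3 samples give an estimate as crude as the 4.0 above. A local awk sketch of the same idea (the Hadoop job itself uses a Halton sequence rather than rand()):

```shell
# Monte Carlo estimate of pi, same idea as the example job but run locally.
awk 'BEGIN {
    srand(42); n = 100000; inside = 0
    for (i = 0; i < n; i++) {
        x = rand(); y = rand()
        if (x*x + y*y <= 1.0) inside++   # point falls inside the quarter circle
    }
    printf "%.4f\n", 4 * inside / n      # close to 3.14 at this sample size
}'
```

With n = 100000 the estimate lands within a few hundredths of π; with n = 6, as in the job above, almost any value between 0 and 4 can come out.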
HBase installation
Download, extract, and add to the system environment variables.
Edit the configuration files:
1.hbase-env.sh
export JAVA_HOME=/home/hadoop/app/jdk1.8.0_11
# Tell HBase whether it should manage its own instance of ZooKeeper or not.
export HBASE_MANAGES_ZK=false
2.hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://hadoop000:8020/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>hadoop000:2181</value>
</property>
</configuration>
Problems tend to crop up at this step!!!
Start HBase (run ./start-hbase.sh in $HBASE_HOME/bin), then check the processes:
[hadoop@hadoop000 bin]$ jps
51328 NodeManager
54592 Jps
49937 MainGenericRunner
54147 HMaster
50678 DataNode
50873 SecondaryNameNode
54298 HRegionServer
47724 RemoteMavenServer
47581 Main
50589 NameNode
51231 ResourceManager
43215 QuorumPeerMain
The two extra processes (HMaster and HRegionServer) mean the setup succeeded.
Web UI: hadoop000:60010
[hadoop@hadoop000 bin]$ ./hbase shell
hbase(main):001:0> version
1.2.0-cdh5.7.0, rUnknown, Wed Mar 23 11:46:29 PDT 2016
hbase(main):002:0> status
1 active master, 0 backup masters, 1 servers, 0 dead, 2.0000 average load
hbase(main):003:0> create 'member','info','address'
0 row(s) in 1.3750 seconds
=> Hbase::Table - member
hbase(main):007:0> describe 'member'
Table member is ENABLED
member
COLUMN FAMILIES DESCRIPTION
{NAME => 'address', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS =
> 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0
', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => '
FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0',
BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.1400 seconds
With that, a table has been created.