Install the JDK
1. Download the JDK.
2. Make the installer executable: chmod 701 jdk-6u14-linux-i586.bin
3. Run the installer: sudo ./jdk-6u14-linux-i586.bin
Configure environment variables
1. Open /etc/profile: vi /etc/profile
Add the following at the end of the file:
export JAVA_HOME=/home/bb/jdk1.6.0_31
export JRE_HOME=/home/bb/jdk1.6.0_31/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
export HADOOP_HOME=/home/hadoop/tools/hadoop-1.0.3
export PATH=$PATH:$HADOOP_HOME/bin
2. Run source /etc/profile in a terminal to apply the changes, then run java -version to confirm the JDK is installed correctly.
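The exports above can be sanity-checked in the current shell before editing /etc/profile; a minimal sketch (the JDK path is the one assumed earlier in this guide):

```shell
# Simulate the /etc/profile additions in the current shell
export JAVA_HOME=/home/bb/jdk1.6.0_31
export JRE_HOME=$JAVA_HOME/jre
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

# Verify that the JDK's bin directory now leads the PATH
case "$PATH" in
  "$JAVA_HOME/bin"*) echo "PATH ok" ;;
  *)                 echo "PATH missing JAVA_HOME/bin" ;;
esac
```

If the check prints "PATH ok", java -version will resolve to the newly installed JDK rather than any system copy.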
Install SSH
sudo apt-get install ssh
Configure passwordless login
1. Generate a key pair:
ssh-keygen -t rsa    (press Enter at every prompt)
2. Fix the permissions on the key directory:
chmod 755 ~/.ssh
3. Inside ~/.ssh, copy id_rsa.pub into authorized_keys: cp id_rsa.pub authorized_keys
4. Copy authorized_keys to the machine you want to reach: scp authorized_keys 10.108.34.85:/home/hadoop/.ssh/
(10.108.34.85 is the remote server)
5. Connect: ssh 10.108.34.85
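sshd silently ignores keys whose files are too permissive, so the permissions in step 2 matter. A minimal sketch of the expected layout, using a throwaway directory instead of the real ~/.ssh:

```shell
# Build a throwaway stand-in for ~/.ssh to illustrate the expected permissions
dir=$(mktemp -d)
mkdir -p "$dir/.ssh"
touch "$dir/.ssh/authorized_keys"

chmod 755 "$dir/.ssh"                  # directory: no group/other write
chmod 600 "$dir/.ssh/authorized_keys"  # key file: owner read/write only

# stat -c prints the octal mode so the result can be checked
stat -c '%a' "$dir/.ssh"                  # -> 755
stat -c '%a' "$dir/.ssh/authorized_keys"  # -> 600
rm -rf "$dir"
```

If passwordless login still prompts for a password, these modes (plus the home directory not being group-writable) are the first things to check.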
Install Hadoop
1. Unpack the downloaded Hadoop tarball into /home/hadoop/tools.
2. Configuration
A. Edit the files under ./conf:
a. core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://10.108.32.97:9000</value> <!-- 10.108.32.97 is the NameNode's IP -->
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/yourname/tmp</value> <!-- Note: the tmp directory must be empty -->
</property>
</configuration>
b. hadoop-env.sh (point JAVA_HOME at your actual JDK; with the install above that would be /home/bb/jdk1.6.0_31):
# The java implementation to use. Required.
export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.06
c.hdfs-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/yourname/hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/yourname/hdfs/data</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
d. mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>10.108.32.97:9001</value>
</property>
</configuration>
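The daemons fail at startup if any of these XML files is malformed, so a quick well-formedness pass is worth running after editing them. A sketch, assuming python3 is on the PATH and run from the Hadoop directory so ./conf is the directory edited above:

```shell
# Well-formedness check for the Hadoop config files (assumes python3 on PATH)
check_xml() {
  python3 -c 'import sys, xml.dom.minidom as m; m.parse(sys.argv[1])' "$1" \
    && echo "$1: ok" || echo "$1: not well-formed"
}
# Check every config file under ./conf, if present
for f in conf/*.xml; do
  if [ -f "$f" ]; then check_xml "$f"; fi
done
```

This catches the usual copy-paste mistakes (stray angle brackets, unclosed property tags) before they surface as cryptic daemon errors.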
e. conf/masters:
the NameNode's IP address
f. conf/slaves:
the DataNodes' IP addresses, one per line
g. Copy the configured Hadoop tree to each slave:
scp -r /home/yourname/hadoop slave1:/home/dataname1/
scp -r /home/yourname/hadoop slave2:/home/dataname2/
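Step g copies the tree one scp at a time; with more nodes, a loop over the host list is less error-prone. A dry-run sketch (slave1 and slave2 are the example hostnames from above; drop the echo to actually copy):

```shell
# Dry run: print the scp command for each slave instead of executing it
for host in slave1 slave2; do
  echo scp -r /home/yourname/hadoop "$host:/home/yourname/"
done
```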
B. Format a new distributed filesystem:
$ bin/hadoop namenode -format
Start the Hadoop daemons:
$ bin/start-all.sh
Copy the input files into the distributed filesystem:
$ bin/hadoop fs -put conf input
Run one of the example programs shipped with the distribution:
$ bin/hadoop jar hadoop-*-examples.jar grep input output 'dfs[a-z.]+'
Examine the output files:
Copy the output files from the distributed filesystem to the local filesystem and view them:
$ bin/hadoop fs -get output output
$ cat output/*
or
view the output files directly on the distributed filesystem:
$ bin/hadoop fs -cat output/*
When everything is done, stop the daemons:
$ bin/stop-all.sh
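The example job run above is essentially a distributed grep, so its regex can be previewed locally before submitting a job (the sample input lines here are improvised):

```shell
# Preview what the job's regex would extract, using a few improvised lines
printf 'dfs.replication\ndfs.name.dir\nmapred.job.tracker\n' \
  | grep -oE 'dfs[a-z.]+'
# prints:
# dfs.replication
# dfs.name.dir
```

Only lines starting with "dfs" match, which is why running the job against the copied conf directory yields the dfs.* property names.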