BIOS: boot from the optical drive to install RedHat
Choose the normal Boot Menu
Of the four boot entries, pick Back USB
hd-a
The last two disks hold the OS and are set up as RAID 0; if ext4 is chosen, the disk write cache must be disabled.
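One way to disable the on-disk write cache is hdparm. The device names below are hypothetical (the notes further down identify /dev/sdx and /dev/sdy as the OS disks on hd-a/hd-b); as a safety measure this sketch only prints the commands to a review file rather than running them:

```shell
# Emit (do not execute) the write-cache-disable commands for the OS disks.
# -W 0 turns the drive's write cache off; device names are assumptions.
for dev in /dev/sdx /dev/sdy; do
    echo "hdparm -W 0 $dev"
done > /tmp/wcache_cmds
cat /tmp/wcache_cmds
```

Review /tmp/wcache_cmds, then run the lines by hand (or `sh /tmp/wcache_cmds`) once the device names are confirmed.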
RedHat Minimal install
Configure each node's IP and edit /etc/hosts:
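A sketch of the /etc/hosts entries for the four nodes; the addresses are hypothetical (substitute the cluster's real subnet), and the example writes to a scratch file rather than /etc/hosts itself:

```shell
# Scratch copy of the hosts entries; merge into /etc/hosts on every node.
HOSTS_FILE=/tmp/hosts.example
cat > "$HOSTS_FILE" <<'EOF'
192.168.1.101   hd-a
192.168.1.102   hd-b
192.168.1.103   hd-c
192.168.1.104   hd-d
EOF
cat "$HOSTS_FILE"
```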
ifconfig
cd /etc/sysconfig/network-scripts/
vi ifcfg-eth1
service network restart
ifconfig eth1 down
ifconfig eth1 up
ifdown eth1; ifup eth1
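The ifcfg-eth1 file edited above might look like the sketch below, assuming a static address per node (IPADDR here is hypothetical); the example writes to a scratch file instead of the real network-scripts directory:

```shell
# Stand-in for /etc/sysconfig/network-scripts/ifcfg-eth1.
# Adjust IPADDR for each node, then restart networking as above.
CFG=/tmp/ifcfg-eth1.example
cat > "$CFG" <<'EOF'
DEVICE=eth1
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.1.101
NETMASK=255.255.255.0
EOF
cat "$CFG"
```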
Mount the RedHat ISO on each node, and configure this ISO as a local yum repository:
mount -o loop /opt/app/rhel-server-6.3-x86_64-dvd.iso /media/iso
touch /etc/yum.repos.d/rhel.repo
vi /etc/yum.repos.d/rhel.repo
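The repo file might look like this, assuming the ISO is mounted at /media/iso as above (the section name and description are illustrative); the sketch writes to a scratch path instead of /etc/yum.repos.d/:

```shell
# Stand-in for /etc/yum.repos.d/rhel.repo pointing at the mounted ISO.
REPO=/tmp/rhel.repo.example
cat > "$REPO" <<'EOF'
[rhel-local]
name=RHEL 6.3 local ISO
baseurl=file:///media/iso
enabled=1
gpgcheck=0
EOF
cat "$REPO"
```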
yum search scp
1. Prepare the uuid_list file on each node with the command below:
ls -l /dev/disk/by-uuid/ | awk '{print $11" "$9}'
Then remove the line containing the /dev/sda device; on hd-a/hd-b, also remove the lines containing /dev/sdx and /dev/sdy.
2. Create the mount folders with the create_mount_folder.sh script.
3. Check all the file systems.
4. Execute fstab_edit.sh to mount all the disks.
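The scripts themselves are not reproduced in these notes; a minimal sketch of what fstab_edit.sh might do, assuming uuid_list holds the "symlink-target uuid" pairs produced by step 1 (mount-point naming and mount options are assumptions):

```shell
# Sample uuid_list as step 1 would produce it (symlink target, then UUID).
cat > /tmp/uuid_list <<'EOF'
../../sdc1 1111-aaaa
../../sdd1 2222-bbbb
EOF

# Emit one xfs fstab entry per data disk, mount points /mnt/disk1, /mnt/disk2, ...
i=1
: > /tmp/fstab.add
while read -r dev uuid; do
    mkdir -p "/tmp/mnt/disk$i"   # scratch stand-in for the real mount folders
    echo "UUID=$uuid /mnt/disk$i xfs defaults,noatime 0 0" >> /tmp/fstab.add
    i=$((i + 1))
done < /tmp/uuid_list
cat /tmp/fstab.add
```

The generated lines would then be appended to /etc/fstab before `mount -a`.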
Install openssh-clients and the xfs-related packages on every node:
yum search openssh
yum install -y openssh.x86_64
yum install -y openssh-clients.x86_64
intelhadoop/uninstall.sh hd-b hd-c hd-d hd-a (management node)
Cluster installation:
hd-a: 24x 10k 300GB SAS disks. The USB stick was detected as /dev/sda during installation, so the system's disk numbering starts at /dev/sdb.
hd-b: 24x 10k 600GB SAS disks; 0-23 are data disks, the last two are OS disks.
hd-c: 12x 7.2k 2TB SATA disks; disk 0 holds the OS.
hd-d: 12x 7.2k 3TB SATA disks; disk 0 holds the OS.
1. With this many disks, wrote scripts covering disk partitioning/formatting, mounting, and checking; the disks are mounted as xfs.
2. The RedHat minimal install does not include the parted tool needed to format large disks or the xfs filesystem support packages, so an IM install was done first to create a yum repository from which the missing packages were installed.
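A hedged sketch of the per-disk formatting step those scripts would perform (device names are hypothetical, and the commands are emitted to a review file rather than executed, since parted/mkfs.xfs are destructive):

```shell
# Print the GPT-label + single-partition + xfs-format commands for each
# data disk; /dev/sdc and /dev/sdd are placeholder device names.
for dev in /dev/sdc /dev/sdd; do
    echo "parted -s $dev mklabel gpt mkpart primary xfs 0% 100%"
    echo "mkfs.xfs -f ${dev}1"
done > /tmp/format_cmds
cat /tmp/format_cmds
```

Once the device list is verified against the real disk layout, the file can be run as a script.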
1. The IM and cluster installs went smoothly, but when starting HDFS the datanodes failed to start: some disks had had Hadoop installed on them before, and the file permissions did not match expectations.
2. After upgrading the license from the free edition to the trial edition, the resource-management page still displayed nothing. Remember to reconfigure the cluster and restart the Hadoop services, as well as the resource monitor and Intel Manager services.
Mount the USB disk:
mount -t vfat /dev/xxx4 /media/usb
/dev/xxx4 is the device identifier; /media/usb is the target directory.
Create a yum repo:
baseurl=xxx
gpgcheck=0
Packages scp depends on:
openssh-clients
mount -o loop /opt/app/rhel-server-6.3-x86_64-dvd.iso /media/iso/
NutchIndexing & Bayes:
13/05/31 11:15:47 INFO HiBench.NutchData: Initializing Nutch data generator...
curIndex: 334, total: 335
ERROR: number of words should be greater than 0
/usr/share/dict/words does not exist:
yum search words
yum install -y words.noarch
Hivebench run-aggregation.sh
rmr: org.apache.hadoop.security.AccessControlException: Permission denied: user=root, access=ALL, inode="system":mapred:hadoop:rwx------
Because the owner of /tmp/hadoop-mapred/mapred/system is mapred and its permissions are rwx------.
Modify ./run-aggregation.sh and ./run-join.sh:
sudo -u hive hadoop fs -rmr /tmp/hive
sudo -u hive hadoop fs -mkdir /tmp/hive
sudo -u hive hadoop fs -chmod -R 777 /tmp/hive
NutchIndexing: java heap size
100000 --> 1000000000
dfsioe: rd_file_size: 200 --> 20000, java heap size
wt_file_size: 100 --> 10000, mapred.task.timeout
PageRank: the number was too large, 5000000000
500000 --> 5000000000
set mapred.task.timeout
adjust java heap size to 1024m
Hivebench:
uservisits: 100000000 --> 10000000000
pages: 12000000 --> 120000000
Kmeans:
num_of_samples: 3000000 --> 300000000
samples_per_inputfile: 600000 --> 6000000
Sort:
datasize: 2400000000 --> 240000000000
Terasort:
datasize: 100000000 --> 10000000000
Wordcount:
datasize: 32000000000 --> 320000000000
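The size changes above are edits to each workload's conf/configure.sh; a sketch of applying one with sed (the variable name DATASIZE and the file path are assumptions to verify against the actual HiBench workload config; the example edits a scratch copy):

```shell
# Scratch stand-in for <workload>/conf/configure.sh.
CONF=/tmp/configure.sh
echo 'DATASIZE=100000000' > "$CONF"
# Bump terasort's data size as noted above (100000000 --> 10000000000).
sed -i 's/^DATASIZE=100000000$/DATASIZE=10000000000/' "$CONF"
cat "$CONF"
```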
hd-a is running low on disk capacity, and many MapReduce spill operations are starting to take longer.
Jobs behaving abnormally:
NutchIndexing
dfsioe
Hivebench
PageRank
Sort
Terasort
Jobs behaving normally (512M heap size):
Wordcount
Kmeans
Bayes
Changes:
In IM, set mapred.task.timeout to 0; raise the heap size from 512m to 1G.
Modify mapred.child.heapsize to 1024m.
PageRank: change ./conf/configure.sh to 500000000
NutchIndexing: change ./conf/configure.sh to 10000000