A. Three scripts, as follows
a. Check CPU frequency and free memory: sanity.sh
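# Report any core not running near the expected 2533 MHz, plus free memory.
for i in $(cat hosts_uniq)
do
    echo "$i"
    # grep -v hides cores at the expected frequency, so only outliers print
    ssh "$i" cat /proc/cpuinfo | grep "cpu MHz" | grep -v "2533" | grep -v "2534"
    ssh "$i" cat /proc/meminfo | grep "MemFree"
done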
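The two chained grep -v filters above match by substring, so they are tied to one specific frequency. A minimal sketch of a numeric check follows, assuming the usual "cpu MHz : value" layout of /proc/cpuinfo. The function name check_cpu_mhz and its expected/tolerance parameters are hypothetical and not part of the original scripts.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/cpuinfo text on stdin and print every
# "cpu MHz" line whose value differs from the expected frequency by more
# than a tolerance. Both values are in MHz; the tolerance defaults to 2.
check_cpu_mhz() {
    expected=$1
    tolerance=${2:-2}
    awk -F: -v want="$expected" -v tol="$tolerance" '
        /^cpu MHz/ {
            diff = $2 - want            # $2 is the value after the colon
            if (diff < 0) diff = -diff
            if (diff > tol) print       # only outliers are printed
        }'
}

# Possible use inside the loop of sanity.sh:
#   ssh "$i" cat /proc/cpuinfo | check_cpu_mhz 2533
```

As with the original filter, a healthy node prints nothing, so any output points at a core that needs attention.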
b. Check interconnect network performance
b.1. Client script
head_node=node001   # assumes the server script is running on node001
for i in $(cat hosts_uniq)
do
    echo "$i"
    ssh "$i" "hostname"
    ssh "$i" "ib_send_bw $head_node | grep 65536 | cut -c 40-60"
done
b.2. Server script
# Start one ib_send_bw server per host in the list; each waits for one client.
for i in $(cat hosts_uniq)
do
    echo "$i"
    ib_send_bw | grep 65536 | cut -c 40-60
done
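Both scripts pull the bandwidth figures out with cut -c 40-60, which depends on exact character positions in the ib_send_bw report. A sketch of a field-based alternative follows. It assumes the result row is whitespace-separated with the message size in column 1, the iteration count in column 2, and peak and average bandwidth in columns 3 and 4; check this against the output of the installed version. The function name bw_for_size is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read ib_send_bw output on stdin and print the peak
# and average bandwidth for one message size (default 65536 bytes).
# Assumed row layout: #bytes  #iterations  BW-peak  BW-average ...
bw_for_size() {
    size=${1:-65536}
    # Match on the first field only, so other lines that merely contain
    # the number are ignored.
    awk -v size="$size" '$1 == size { print $3, $4 }'
}

# Possible use in the server script:
#   ib_send_bw | bw_for_size 65536
```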
B. Usage
1. srun hostname > hosts
2. sort hosts | uniq > hosts_uniq
3. Run the frequency and memory script: ./sanity.sh
4. Log in to node001 and run the server script
5. Log in to any other node and run the client script
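Steps 1 and 2 above can be folded into one small helper. This is only a sketch: the name make_host_list is hypothetical, and it sorts before de-duplicating because uniq on its own only collapses adjacent repeats, while srun may print host names in any order.

```shell
#!/bin/sh
# Hypothetical helper: read host names on stdin (one per line), drop blank
# lines, and print each name once in sorted order.
make_host_list() {
    grep -v '^[[:space:]]*$' | sort -u
}

# Possible use inside a Slurm allocation:
#   srun hostname | make_host_list > hosts_uniq
```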
Commands to use ....
1. htop vs. top
2. gsh (group shell)