1. Install required software (optionally create a dedicated user first: adduser hadoop)
$ sudo apt-get install ssh
$ sudo apt-get install rsync
2. Prepare to Start the Hadoop Cluster
conf/hadoop-env.sh
export JAVA_HOME=/home/yangwm/Programs/jdk1.6.0_26
3. Try the following command:
$ bin/hadoop
4. Hadoop can also be run on a single node in pseudo-distributed mode. Configuration:
conf/core-site.xml:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/yangwm/Programs/hadoop_data</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
conf/hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
conf/mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
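A dropped tag in any of these files (for example a missing <property> element) makes the daemons fail at startup with a parse error, so it can be worth confirming each file is well-formed XML before launching anything. A minimal sketch, assuming python3 is on the PATH and using a throwaway copy rather than the real conf/ files:

```shell
# Write a sample config to a temp file, then parse it; any XML error aborts with a traceback.
cat > /tmp/core-site-check.xml <<'EOF'
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
EOF
python3 -c "import xml.dom.minidom; xml.dom.minidom.parse('/tmp/core-site-check.xml')" && echo "well-formed"
```

Run the same one-liner against each of core-site.xml, hdfs-site.xml, and mapred-site.xml in conf/.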
5. Set up passphraseless SSH
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa    # on OpenSSH 7.0+, DSA keys are disabled by default; use -t rsa there
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
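If `ssh localhost` still prompts for a password after this, the usual culprit is file permissions: sshd silently ignores an authorized_keys file that is group- or world-writable. A quick illustration of the modes sshd expects, run in a throwaway directory so it does not touch the real ~/.ssh:

```shell
# Demonstrate the required permissions: 700 on .ssh, 600 on authorized_keys (demo path only).
demo=$(mktemp -d)
mkdir -p "$demo/.ssh"
touch "$demo/.ssh/authorized_keys"
chmod 700 "$demo/.ssh"
chmod 600 "$demo/.ssh/authorized_keys"
stat -c '%a' "$demo/.ssh" "$demo/.ssh/authorized_keys"   # prints 700 then 600
```

Apply the same chmod values to your real ~/.ssh and ~/.ssh/authorized_keys if the login loop persists.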
6. Execution
Format a new distributed filesystem:
$ bin/hadoop namenode -format
7. Start the Hadoop daemons:
$ bin/start-all.sh
8. List files from the distributed filesystem
$ bin/hadoop fs -ls /
II. HBase Installation (local single node)
1. Loopback IP
/etc/hosts should look something like this:
127.0.0.1 localhost
127.0.0.1 myhostname   # not 127.0.1.1 myhostname; if other machines must reach this host, map the hostname to the machine's real interface IP instead: ethIP myhostname
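The daemons bind to whatever address the name resolves to, so a stray 127.0.1.1 entry shows up in name resolution before it shows up in logs. A small check, assuming a Linux system with getent:

```shell
# Print the IPv4 address that 'localhost' resolves to via /etc/hosts (or NSS).
getent ahostsv4 localhost | awk '{print $1}' | head -n1   # typically prints 127.0.0.1
# Replace 'localhost' with your real hostname to verify the second entry as well.
```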
2. Prepare to Start the HBase Cluster
conf/hbase-env.sh
export JAVA_HOME=/home/yangwm/Programs/jdk1.6.0_26
3. HBase can also be run on a single node in pseudo-distributed mode. Configuration:
conf/hbase-site.xml:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.master</name>
<value>localhost:60000</value>
</property>
</configuration>
III. Operating an HBase Server from the HBase Shell
1. HBase client environment
${hbase_home} $ cat conf/hbase-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
<name>hbase.zookeeper.quorum</name>
<value>zookeeperHostName1,zookeeperHostName2,zookeeperHostName3</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>6000</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
</configuration>
2. Using the HBase shell
${hbase_home} $ bin/hbase shell
or, with an alternate config directory: HBASE_CONF_DIR=conf-test bin/hbase shell   # HBASE_CONF_DIR must point at a directory containing hbase-site.xml, not at a single XML file
3. Common statements
Help:
help "command"
Cluster status:
status 'detailed'
Definition (DDL) statements:
create 'test_ascii', { NAME => 'i', BLOOMFILTER => 'ROW' }
disable 'test_ascii'
alter 'test_ascii', { NAME => 'i', BLOOMFILTER => 'ROW', COMPRESSION=>'GZ' }
drop 'test_ascii'
Data manipulation statements:
put 'test_ascii', '10001_1', 'i:id', 1
put 'test_ascii', '10001_1', 'i:uid', 10001
deleteall 'test_ascii', '10001_1'   # deletes the whole row; plain delete requires a column, e.g. delete 'test_ascii', '10001_1', 'i:id'
Inspection statements:
list
describe 'test_ascii'
count 'test_ascii'
Query statements:
scan 'test_ascii'
scan 'test_ascii', {LIMIT =>10, STARTROW => '10001_1', STOPROW=>'10001_1~'}
get 'test_ascii', '10001_1', {COLUMN => 'i:uid'}
get 'test_ascii', org.apache.hadoop.hbase.util.Bytes.toStringBinary(org.apache.hadoop.hbase.util.Bytes.toBytes("10001_1"))
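The STOPROW value '10001_1~' in the scan above works because HBase orders row keys by raw byte comparison: '~' (0x7E) sorts after the digits and '_' used in these keys, so the half-open range [STARTROW, STOPROW) covers every key beginning with the prefix 10001_1 while excluding 10001_2. The ordering can be illustrated with a plain byte-wise sort (the extra keys here are hypothetical):

```shell
# LC_ALL=C forces byte-wise ordering, matching HBase's row-key comparator.
printf '%s\n' '10001_2' '10001_1' '10001_1~' '10001_1_a' | LC_ALL=C sort
# → 10001_1, 10001_1_a, 10001_1~, 10001_2
```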
References:
http://hadoop.apache.org/common/docs/r1.0.2/single_node_setup.html
http://hadoop.apache.org/common/docs/current/api/overview-summary.html#overview_description
http://hbase.apache.org/book.html#configuration
hbase client api guide: http://www.spnguru.com/2010/07/hbase-client-api-guide/
http://www.oratea.net/?p=1087
http://wiki.apache.org/hadoop/UsingLzoCompression
http://www.packtpub.com/article/hbase-basic-performance-tuning
cloudera docs:
http://blog.cloudera.com/blog/2012/07/hbase-log-splitting/
http://blog.cloudera.com/blog/2012/06/hbase-io-hfile-input-output/
http://blog.cloudera.com/blog/2012/06/hbase-write-path/
http://blog.cloudera.com/blog/2012/01/caching-in-hbase-slabcache/
http://blog.cloudera.com/blog/2011/02/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-1/