Big Data Platform Setup


Server Confirmation

Host IP          Hostname    HDFS role
192.168.1.235    master      namenode
192.168.1.234    slave       datanode

Components on master: hadoop, hive, hbase, zookeeper, sqoop
Components on slave:  hadoop, hbase, zookeeper

Note: the two hosts must be separate, independent machines.

 

Create a New User and Grant It root Privileges

Create the user hadoop with password hadoop:

adduser hadoop
passwd hadoop

Grant root privileges:

Edit the /etc/sudoers file (conventionally with visudo), find the line below, and add a matching line for hadoop under the root entry:

## Allow root to run any commands anywhere
root    ALL=(ALL)     ALL
hadoop  ALL=(ALL)     ALL

Once this is done, you can log in as the hadoop account and run privileged commands with sudo (or switch to root with su - and the root password).

 

Change the Hostname

vi /etc/hostname
master or slave (matching the table above)

vi /etc/hosts

192.168.1.235 master
192.168.1.234 slave

 

Passwordless SSH Login

ssh-keygen -t rsa -P ""                       # generates the key pair; the public key is /root/.ssh/id_rsa.pub
vi /root/.ssh/authorized_keys                 # create the file authorized_keys
cat id_rsa.pub >> authorized_keys             # append master's id_rsa.pub to authorized_keys
scp authorized_keys root@slave:/root/.ssh     # copy master's authorized_keys to slave's /root/.ssh
cat id_rsa.pub >> authorized_keys             # on slave: append slave's id_rsa.pub to authorized_keys
scp authorized_keys root@master:/root/.ssh    # copy slave's authorized_keys back to master's /root/.ssh
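A quick sanity check: each node should now reach the other over SSH without a password prompt.

# run on master
ssh slave hostname     # should print "slave" with no password prompt
# run on slave
ssh master hostname    # should print "master" with no password prompt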

 

 

 

Install Java 1.7

1. Download jdk-7u25-linux-x64.tar.gz

2. Extract it under /home/hadoop and configure the Java environment:

tar -zxvf jdk-7u25-linux-x64.tar.gz
mv jdk1.7.0_25 jdk1.7          # the tarball unpacks to jdk1.7.0_25

vi /etc/profile

export JAVA_HOME=/home/hadoop/jdk1.7
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
export JAVA_BIN=$JAVA_HOME/bin
export JAVA_HOME JAVA_BIN PATH CLASSPATH
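Reload the profile and confirm the JDK is on the PATH (the version string assumes JDK 7u25):

source /etc/profile
java -version    # should report java version "1.7.0_25"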

 

Download hadoop-2.2.0.tar.gz

Extract it under /home/hadoop/:

tar -zxvf hadoop-2.2.0.tar.gz
mv hadoop-2.2.0 hadoop

vi /etc/profile

export HADOOP_HOME="/home/hadoop/hadoop"
export PATH=.:$HADOOP_HOME/bin:$JAVA_HOME/bin:$PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native

 

Configure hadoop-env.sh to point at the JDK:

cd /home/hadoop/hadoop/etc/hadoop
vi hadoop-env.sh

export JAVA_HOME=/home/hadoop/jdk1.7

 

Configure core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/data/hadoop/tmp</value>
    </property>
</configuration>

 

Configure mapred-site.xml (in Hadoop 2.2.0 this file does not exist yet; create it with cp mapred-site.xml.template mapred-site.xml):

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>

 

Configure hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
        <description>Default block replication.
        The actual number of replications can be specified when the file is created.
        The default is used if replication is not specified at create time.
        </description>
    </property>
</configuration>
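The steps above never touch yarn-site.xml, but with mapreduce.framework.name set to yarn a minimal one is normally required as well. A sketch, assuming master hosts the ResourceManager (place it in the same etc/hadoop directory):

<configuration>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>master</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>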

 

Configure the slaves file (etc/hadoop/slaves) so it lists the datanode host:

slave

Sync the hosts file, environment, and Hadoop installation to the slave node:

scp /etc/hosts root@slave:/etc/hosts
scp /etc/profile root@slave:/etc/profile
scp -r /home/hadoop/hadoop root@slave:/home/hadoop

 

 

Format HDFS

bin/hdfs namenode -format

Start Hadoop

sbin/start-all.sh    # in Hadoop 2.x the start scripts live under sbin/, not bin/

Verify with jps.
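If everything started cleanly, jps should show roughly the following daemons (a sketch assuming the master/slave role split above; process IDs omitted):

# jps on master
NameNode
SecondaryNameNode
ResourceManager

# jps on slave
DataNode
NodeManager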

 

 

 

 

Download apache-hive-0.13.1-bin.tar.gz

1. Install and start MySQL.

2. Extract Hive under /home/hadoop:

mv apache-hive-0.13.1-bin hive

vi /etc/profile

export HIVE_HOME=/home/hadoop/hive
export PATH=$PATH:$HIVE_HOME/bin

 

Configure hive-site.xml:

cp hive-default.xml.template hive-site.xml

Edit the following properties (note the & in the JDBC URL must be escaped as &amp; inside XML):

<configuration>
<property>
  <name>hive.metastore.warehouse.dir</name>
  <value>/hive/warehouse</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://localhost:3306/hive?characterEncoding=UTF-8&amp;createDatabaseIfNotExist=true</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
</property>

<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>123456</value>
</property>
</configuration>

 

Verify: run the hive command and check that the CLI starts (this requires the MySQL JDBC driver jar in $HIVE_HOME/lib so the metastore can connect; see section 11 below).
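A short smoke test in the Hive CLI; the table name matches the one the JDBC example in section 11 queries:

hive> create table testHiveDriverTable (key int, value string);
hive> show tables;
hive> select count(*) from testHiveDriverTable;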

 

 

 

Download sqoop-1.4.6.bin__hadoop-2.0.4.tar.gz

1. Extract it under /home/hadoop:

mv sqoop-1.4.6.bin__hadoop-2.0.4 sqoop

2. Configure environment variables and the config file:

cp conf/sqoop-env-template.sh conf/sqoop-env.sh

Add the following to sqoop-env.sh:

export HADOOP_COMMON_HOME=/home/hadoop/hadoop
export HADOOP_MAPRED_HOME=/home/hadoop/hadoop
export HBASE_HOME=/home/hadoop/hbase
export HIVE_HOME=/home/hadoop/hive
export ZOOCFGDIR=/home/hadoop/zookeeper

(If your transfers do not involve HBase or Hive, the corresponding settings can be omitted; set the ZooKeeper entry only if the cluster runs a standalone ZooKeeper ensemble, otherwise leave it out.)

 

3. Copy the required jars into $SQOOP_HOME/lib

Required: the MySQL JDBC driver (or the Oracle JDBC driver, etc.):

cp mysql-connector-java-5.1.18.jar /home/hadoop/sqoop/lib/

 

4. Add environment variables:

vi /etc/profile

export SQOOP_HOME=/home/hadoop/sqoop
export PATH=$SQOOP_HOME/bin:$PATH
export LOGDIR=$SQOOP_HOME/logs

5. Test and verify

# list all databases on the MySQL server
sqoop list-databases --connect jdbc:mysql://IPHOST:3306 --username xxx --password xxx
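For reference, a typical table import from MySQL into HDFS looks like this sketch; the database name testdb, table t_user, and target directory are hypothetical placeholders:

sqoop import --connect jdbc:mysql://IPHOST:3306/testdb \
  --username xxx --password xxx \
  --table t_user \
  --target-dir /user/hadoop/t_user \
  -m 1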

 

Download zookeeper-3.4.6.tar.gz

Extract it under /home/hadoop:

mv zookeeper-3.4.6 zookeeper

vi /etc/profile

export ZOOKEEPER_HOME=/home/hadoop/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin

 

Edit the configuration file zoo.cfg (created from the shipped zoo_sample.cfg):

cd /home/hadoop/zookeeper/conf
cp zoo_sample.cfg zoo.cfg
vi zoo.cfg

dataDir=/home/hadoop/zookeeper/zkdata
dataLogDir=/home/hadoop/zookeeper/zkdatalog
server.1=master:2888:3888
server.2=slave:2888:3888

 

Copy to the other node:

scp -r zookeeper root@slave:/home/hadoop/

On both master and slave, create a myid file under /home/hadoop/zookeeper/zkdata whose content matches the number in the corresponding server.N line of zoo.cfg: master's myid contains 1 (for server.1), slave's contains 2 (for server.2), as shown below.
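A minimal sketch (the mkdir lines assume the dataDir and dataLogDir from zoo.cfg do not exist yet):

# on master
mkdir -p /home/hadoop/zookeeper/zkdata /home/hadoop/zookeeper/zkdatalog
echo 1 > /home/hadoop/zookeeper/zkdata/myid

# on slave
mkdir -p /home/hadoop/zookeeper/zkdata /home/hadoop/zookeeper/zkdatalog
echo 2 > /home/hadoop/zookeeper/zkdata/myid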

 

Verify ZooKeeper:

cd /home/hadoop/zookeeper/bin
zkServer.sh start    # run on both master and slave
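After starting it on both nodes, each server should report its role, one leader and one follower (note that a two-node ensemble only has quorum while both are up):

zkServer.sh status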

 

 

Download hbase-0.98.4-hadoop2-bin.tar.gz

HBase depends on Hadoop and ZooKeeper.

Extract it under /home/hadoop:

mv hbase-0.98.4-hadoop2 hbase    # the tarball unpacks to hbase-0.98.4-hadoop2

vi /etc/profile

export HBASE_HOME=/home/hadoop/hbase
export PATH=$PATH:$HBASE_HOME/bin

 

Configure hbase-site.xml:

cd /home/hadoop/hbase/conf
vi hbase-site.xml

<configuration>
<property>
    <name>hbase.tmp.dir</name>
    <value>/home/hadoop/data/var/hbase</value>
</property>
<property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
</property>
<property>
    <name>hbase.zookeeper.quorum</name>
    <value>master,slave</value>
</property>
<property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
</property>
<property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>/home/hadoop/zookeeper/zkdata</value>
</property>
<property>
    <name>hbase.master.maxclockskew</name>
    <value>180000</value>
</property>
</configuration>
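Because this setup runs its own ZooKeeper ensemble rather than one managed by HBase, hbase-env.sh normally needs the following two lines as well (the JAVA_HOME path matches the JDK installed earlier):

vi /home/hadoop/hbase/conf/hbase-env.sh

export JAVA_HOME=/home/hadoop/jdk1.7
export HBASE_MANAGES_ZK=false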

 

vi regionservers

slave

Copy to the other node:

scp -r hbase root@slave:/home/hadoop/

Verify by starting HBase: start-hbase.sh

 

 

Check the status from the HBase shell:

hbase shell
status    # the shell command is lowercase
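To confirm reads and writes work end to end, a brief illustrative session in the shell (the table 'test' and the sample cell are placeholders):

create 'test', 'cf'
put 'test', 'row1', 'cf:a', 'value1'
scan 'test'
disable 'test'
drop 'test'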

 

 

11. Connecting to Hive from MyEclipse

Import into the project the jars under Hadoop/share/hadoop/common, Hadoop/share/hadoop/common/lib, and hive/lib.

Start HiveServer2:

hive --service hiveserver2

Copy the MySQL driver jar mysql-connector-java-5.1.22-bin.jar into $HIVE_HOME/lib/ and into the new Hive project's lib directory.
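Before wiring up Java, it can help to confirm HiveServer2 accepts connections, for example with the beeline client that ships with Hive (IP and port match the JDBC URL used below):

beeline -u jdbc:hive2://192.168.1.235:10000 -n root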

 

The connection code:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;

public class hive {
    private static ResultSet res;

    public static void main(String[] args) throws Exception {
        // Register the HiveServer2 JDBC driver and connect (no password is set here)
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection conn = DriverManager.getConnection("jdbc:hive2://192.168.1.235:10000", "root", "");
        Statement stmt = conn.createStatement();
        try {
            String tablename = "testHiveDriverTable";

            // Full-table query, printing each (key, value) pair
            String sql = "select * from " + tablename;
            System.out.println("Running: " + sql);
            res = stmt.executeQuery(sql);
            System.out.println("Result of the 'select *' query:");
            while (res.next()) {
                System.out.println(res.getInt(1) + "\t" + res.getString(2));
            }

            // Count query through a PreparedStatement
            sql = "select count(*) from testHiveDriverTable";
            PreparedStatement pstmt = conn.prepareStatement(sql);
            ResultSet rs = pstmt.executeQuery();
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            conn.close();
        }
    }
}

 

 

12. Connecting to HBase from MyEclipse

1) Add the JAR files

There are two ways to add the JARs; the simpler one is: right-click the HBase project, choose Build Path -> Configure Build Path, click the Libraries tab in the dialog, click the Add External JARs button, and select all the jars under $HBASE_HOME/lib plus all the jars under Hadoop/share/hadoop.

2) Add the hbase-site.xml configuration file

Create a Conf folder in the project directory and copy $HBASE_HOME/conf/hbase-site.xml into it. Then right-click the project and choose Properties -> Java Build Path -> Libraries -> Add Class Folder, and tick the Conf folder to add it.

 

The connection code is as follows:

package hbase;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.MasterNotRunningException;
import org.apache.hadoop.hbase.ZooKeeperConnectionException;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseTestCase {

    // Picks up hbase-site.xml from the Conf folder added to the classpath
    static Configuration cfg = HBaseConfiguration.create();

    // Create a table with a single column family, exiting if it already exists
    public static void create(String tablename, String columnFamily)
            throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        HBaseAdmin admin = new HBaseAdmin(cfg);
        System.out.println(admin.toString());
        if (admin.tableExists(tablename)) {
            System.out.println("table exists");
            System.exit(0);
        } else {
            HTableDescriptor tableDesc = new HTableDescriptor(tablename);
            tableDesc.addFamily(new HColumnDescriptor(columnFamily));
            admin.createTable(tableDesc);
            System.out.println("create table success");
        }
    }

    // Insert a single cell: (row, columnFamily:column) -> data
    public static void put(String tablename, String row, String columnFamily,
            String column, String data) throws IOException {
        HTable table = new HTable(cfg, tablename);
        Put p1 = new Put(Bytes.toBytes(row));
        p1.add(Bytes.toBytes(columnFamily), Bytes.toBytes(column), Bytes.toBytes(data));
        table.put(p1);
        System.out.println("put '" + row + "','" + columnFamily + ":" + column + "','" + data + "'");
    }

    // Fetch one row by key
    public static void get(String tablename, String row) throws IOException {
        HTable table = new HTable(cfg, tablename);
        Get g = new Get(Bytes.toBytes(row));
        Result result = table.get(g);
        System.out.println("Get: " + result);
    }

    // Full-table scan, printing each row
    public static void scan(String tablename) throws IOException {
        HTable table = new HTable(cfg, tablename);
        Scan s = new Scan();
        ResultScanner rs = table.getScanner(s);
        for (Result r : rs) {
            System.out.println("Scan: " + r);
        }
    }

    // Disable and drop the table if it exists
    public static boolean delete(String tablename)
            throws MasterNotRunningException, ZooKeeperConnectionException, IOException {
        HBaseAdmin admin = new HBaseAdmin(cfg);
        if (admin.tableExists(tablename)) {
            try {
                admin.disableTable(tablename);
                admin.deleteTable(tablename);
            } catch (Exception ex) {
                ex.printStackTrace();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String tablename = "hbase_tb";
        String columnFamily = "cf";
        try {
            HBaseTestCase.create(tablename, columnFamily);
            HBaseTestCase.put(tablename, "row1", columnFamily, "cl1", "data");
            HBaseTestCase.get(tablename, "row1");
            HBaseTestCase.scan(tablename);
            if (HBaseTestCase.delete(tablename)) {
                System.out.println("Delete table " + tablename + " success!");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

 

 

 

 
