Install and Configure Hive and MySQL, and Write HiveQL Statements for Simple CRUD Operations

I. Installing MySQL

1. Download the installation package

wget http://dev.mysql.com/get/mysql-community-release-el7-5.noarch.rpm

2. Install the repository package, then the MySQL server

rpm -ivh mysql-community-release-el7-5.noarch.rpm

yum install mysql-community-server

3. Restart the MySQL service

service mysqld restart
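If the restart succeeded, the service should report as running:

service mysqld status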

4. Set a password (root) for the root user

mysql -u root
mysql> set password for 'root'@'localhost' = password('root');
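To confirm the new password, reconnect with it (passing the password inline with -p is convenient here, though it does end up in the shell history):

mysql -u root -proot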

5. Edit the configuration file /etc/my.cnf

vi /etc/my.cnf

6. Add the following setting:

[mysql]
default-character-set=utf8

Then, back in the mysql client, allow root to connect from any host:

grant all privileges on *.* to 'root'@'%' identified by 'root';

7. Flush the privileges

flush privileges;
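To confirm the grant took effect, list the account entries from inside the mysql client; a root row with host % should now be present:

select host, user from mysql.user;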

II. Installing and Configuring Hive

1. Download the installation package
http://mirror.bit.edu.cn/apache/hive/
2. Upload the package to the virtual machine with Xftp and extract it into the target directory /opt/module
3. Edit /etc/profile to add the HIVE_HOME installation path, then make the change take effect.
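The exact lines depend on where the package was extracted; a minimal sketch, assuming the Hive directory under /opt/module was renamed to hive (matching the HIVE_CONF_DIR used below):

export HIVE_HOME=/opt/module/hive
export PATH=$PATH:$HIVE_HOME/bin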

source /etc/profile
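Once the profile has been sourced, a quick check that the hive command resolves (this only prints the version string and does not touch the metastore yet):

hive --version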

4. Configure hive-env.sh

cp hive-env.sh.template hive-env.sh
Set the Hadoop installation path:
HADOOP_HOME=/opt/module/hadoop-2.7.3
Set the path of Hive's conf directory:
export HIVE_CONF_DIR=/opt/module/hive/conf

5. Configure hive-site.xml

The properties below go inside the <configuration> element. Note that bigdata is this machine's hostname (replace it with your own), the ConnectionPassword must match the MySQL root password set earlier, and the scratch directories should point at your actual Hive installation.

 <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://127.0.0.1:3306/hive?characterEncoding=UTF-8&amp;serverTimezone=GMT%2B8</value>
    <description>
      JDBC connect string for a JDBC metastore.
      To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.
      For example, jdbc:postgresql://myhost/db?ssl=true for postgres database.
    </description>
  </property>
 
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.cj.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
 
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
    <description>Username to use against metastore database</description>
  </property>
 
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>root</value>
    <description>password to use against metastore database</description>
  </property>
 
 <property>
    <name>hive.exec.local.scratchdir</name>
    <value>/opt/module/hive/tmp/${user.name}</value>
    <description>Local scratch space for Hive jobs</description>
  </property>
 
  <property>
    <name>hive.downloaded.resources.dir</name>
    <value>/opt/module/hive/iotmp/${hive.session.id}_resources</value>
    <description>Temporary local directory for added resources in the remote file system.</description>
  </property>
 
  <property>
    <name>hive.querylog.location</name>
    <value>/opt/module/hive/iotmp/${system:user.name}</value>
    <description>Location of Hive run time structured log file</description>
  </property>
 
  <property>
    <name>hive.server2.logging.operation.log.location</name>
    <value>/opt/module/hive/iotmp/${system:user.name}/operation_logs</value>
    <description>Top level directory where operation logs are stored if logging functionality is enabled</description>
  </property>
 
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>bigdata</value>
    <description>Bind host on which to run the HiveServer2 Thrift service.</description>
  </property>
 
  <property>
    <name>system:java.io.tmpdir</name>
    <value>/opt/module/hive/iotmp</value>
    <description/>
  </property>
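Two prerequisites before the schema initialization in the next step: the hive database named in the JDBC URL must already exist in MySQL, and the MySQL Connector/J jar that provides com.mysql.cj.jdbc.Driver must be copied into Hive's lib directory, since Hive does not ship it. The jar version below is only an example; use the one you downloaded:

mysql> create database hive;

cp mysql-connector-java-8.0.17.jar /opt/module/hive/lib/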

6. Initialize the metastore schema

schematool -dbType mysql -initSchema
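If initialization succeeds, schematool creates the metastore tables (DBS, TBLS, and so on) inside the hive database, which you can confirm from the mysql client:

use hive;
show tables;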

III. Writing HiveQL Statements to Implement a WordCount Program

1. First, upload the file to be counted to HDFS

vim 1.txt
hdfs dfs -mkdir /input
hdfs dfs -put 1.txt /input
hdfs dfs -ls /input
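1.txt can contain any space-separated text; a hypothetical sample, assumed when reading the results later:

hello world
hello hive
hello hadoop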

2. Open beeline and create the internal table words (HiveServer2 must be running first; see the sketch below)
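A minimal sketch of starting HiveServer2 and connecting, assuming the bind host bigdata from hive-site.xml, the default Thrift port 10000, and root as an example login user:

hive --service hiveserver2 &
beeline -u jdbc:hive2://bigdata:10000 -n root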

create table words(line string);

3. Load the file contents into the table (load data inpath moves the file from /input into the table's warehouse directory on HDFS; use load data local inpath instead to copy from the local filesystem)

load data inpath '/input/1.txt' overwrite into table words;
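A quick sanity check that the lines arrived in the table:

select * from words limit 5;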

4. Run the WordCount query and save the results into a new table, wordcount

create table wordcount as
select word, count(1) as count
from (select explode(split(line, ' ')) as word from words) w
group by word
order by word;
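The inner query does the heavy lifting: split(line, ' ') breaks each line into an array of words, and explode() turns that array into one row per word. It can be run on its own to inspect the intermediate rows:

select explode(split(line, ' ')) as word from words limit 10;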

5. View the word-count results

select * from wordcount;