Analyzing Hadoop cluster logs with Hive, and troubleshooting an error (running from Eclipse fails with: Access denied for user 'hivenew'@'172.16.2.36' (using password: YES)):
Design approach: use Hive to load the specified Hadoop logs into a Hive table, then filter the rows saved in that table and export them into a MySQL database (on an externally accessible host) so that other people can query them. (This article mainly walks through the setup and the details of running the program, including how it behaves both from Eclipse and on the Hadoop cluster, plus the troubleshooting; the program itself is adapted from the book <Hadoop in Action (Hadoop实战)>.)
The implementation in detail:
(1) In Eclipse, create a Map/Reduce project named HiveHadoopMysqlPro. Before creating it, copy the jar files under hive/lib into hadoop/lib;
(2) Create GetConnect.java, which manages the connections to Hive and to MySQL:
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
public class GetConnect {

    private static Connection conn = null;
    private static Connection conntomysql = null;

    private GetConnect() {
    }

    // Returns a lazily created, shared connection to HiveServer.
    public static Connection getHiveConn() throws SQLException {
        if (conn == null) {
            try {
                Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
                System.out.println("Success loading Hive Driver!");
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                System.exit(1);
            }
            // 172.16.2.42 is the host running the Hadoop cluster and Hive
            conn = DriverManager.getConnection(
                    "jdbc:hive://172.16.2.42:50031/default", "hivenew", "hivenew");
        }
        return conn;
    }

    // Returns a lazily created, shared connection to MySQL.
    public static Connection getMysqlConn() throws SQLException {
        if (conntomysql == null) {
            try {
                Class.forName("com.mysql.jdbc.Driver");
                System.out.println("Success loading Mysql Driver!");
            } catch (ClassNotFoundException e) {
                e.printStackTrace();
                System.exit(1);
            }
            // 172.16.2.42 is the externally accessible host that stores the MySQL data;
            // the parameters below are the database name, user and password.
            // In this setup, Hive and MySQL run on the same server.
            conntomysql = DriverManager.getConnection(
                    "jdbc:mysql://172.16.2.42:3306/hivenew?useUnicode=true&characterEncoding=GBK",
                    "hivenew", "hivenew");
        }
        return conntomysql;
    }

    public static void closeHive() throws SQLException {
        if (conn != null)
            conn.close();
    }

    public static void closemysql() throws SQLException {
        if (conntomysql != null)
            conntomysql.close();
    }
}
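One caveat with this design: the connections are only closed by the explicit closeHive()/closemysql() calls at the end of the driver, so if an exception is thrown earlier (as happens with the access-denied failure shown later), the connections leak. On Java 7+, try-with-resources is one way to guarantee cleanup. A minimal sketch with stand-in resources (the resource() helper below is hypothetical, not part of the project):

```java
public class CloseSketch {
    // Records open/close order so the cleanup guarantee is visible.
    static final StringBuilder log = new StringBuilder();

    static AutoCloseable resource(String name) {
        log.append("open:").append(name).append(";");
        return () -> log.append("close:").append(name).append(";");
    }

    static String run() throws Exception {
        try (AutoCloseable hive = resource("hive");
             AutoCloseable mysql = resource("mysql")) {
            log.append("work;"); // both resources are closed (in reverse order)
                                 // even if this block throws
        }
        return log.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
        // → open:hive;open:mysql;work;close:mysql;close:hive;
    }
}
```

java.sql.Connection implements AutoCloseable, so the same pattern applies directly to the Hive and MySQL connections.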
(3) Create HiveUtil.java, which creates the Hive table, loads the data, and runs the filtering queries:
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
public class HiveUtil {

    // Creates the Hive table; the old HiveServer1 JDBC driver used here
    // accepts DDL statements through executeQuery as well.
    public static void createTable(String hiveql) throws SQLException {
        Connection con = GetConnect.getHiveConn();
        Statement stmt = con.createStatement();
        stmt.executeQuery(hiveql);
    }

    // Runs a query and returns its result set.
    public static ResultSet queryHive(String hiveql) throws SQLException {
        Connection con = GetConnect.getHiveConn();
        Statement stmt = con.createStatement();
        return stmt.executeQuery(hiveql);
    }

    // Loads the log files into the Hive table.
    public static void loadDate(String hiveql) throws SQLException {
        Connection con = GetConnect.getHiveConn();
        Statement stmt = con.createStatement();
        stmt.executeQuery(hiveql);
    }

    // Walks the Hive result set and inserts each row into MySQL.
    public static void hiveTomysql(ResultSet Hiveres) throws SQLException {
        Connection con = GetConnect.getMysqlConn();
        Statement stmt = con.createStatement();
        while (Hiveres.next()) {
            String rdate = Hiveres.getString(1);
            String time = Hiveres.getString(2);
            String type = Hiveres.getString(3);
            String relateclass = Hiveres.getString(4);
            // the three information columns are concatenated into one MySQL column
            String information = Hiveres.getString(5) + Hiveres.getString(6) + Hiveres.getString(7);
            System.out.println(rdate + " " + time + " " + type + " " + relateclass + " " + information);
            // the id column is AUTO_INCREMENT, so 0 is passed as a placeholder
            stmt.executeUpdate("insert into hadooplog values(0,'" + rdate + "','" + time
                    + "','" + type + "','" + relateclass + "','" + information + "')");
        }
    }
}
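One weak point in hiveTomysql: the INSERT is built by string concatenation, so any log message containing a single quote produces a malformed statement (and, in general, invites SQL injection). A minimal sketch of the problem, with a PreparedStatement alternative sketched in the comments (the naive() helper is hypothetical, simplified to one column):

```java
public class InsertSketch {
    // Builds the INSERT the same way hiveTomysql does, reduced to one column.
    static String naive(String information) {
        return "insert into hadooplog values(0,'" + information + "')";
    }

    public static void main(String[] args) {
        String msg = "can't connect";   // a single quote inside the log text
        String sql = naive(msg);
        // The embedded quote leaves the statement with an odd (unbalanced)
        // number of single quotes:
        long quotes = sql.chars().filter(c -> c == '\'').count();
        System.out.println(sql + "  (single quotes: " + quotes + ")");
        // A PreparedStatement lets the JDBC driver do the quoting instead:
        // PreparedStatement ps = con.prepareStatement(
        //         "insert into hadooplog values(0,?,?,?,?,?)");
        // ps.setString(1, rdate); /* ... bind the rest ... */ ps.executeUpdate();
    }
}
```

Since the MySQL side uses a regular JDBC connection, switching hiveTomysql to a PreparedStatement would be a small, local change.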
(4) Create ExeHiveQL.java, which implements the main function and acts as the driver class. It takes two arguments at run time: a log level and a date. The program first creates the table in Hive, then loads the Hadoop logs, and finally transfers the filtered rows into the MySQL database (into the hadooplog table of the hivenew database):
import java.sql.ResultSet;
import java.sql.SQLException;
public class ExeHiveQL {
    public static void main(String[] args) throws SQLException {
        if (args.length < 2) {
            System.out.println("Usage: <log level (e.g. ERROR)> <date (e.g. 2015-02-03)>");
            System.exit(1);
        }
        String type = args[0];
        String date = args[1];
        // Create the table in Hive; time is an ARRAY<string> because the log
        // timestamp "11:20:35,243" splits on ',' into time and milliseconds
        HiveUtil.createTable("create table if not exists hivelog ( rdate String,time ARRAY<string>,type STRING,relateclass STRING,information1 STRING,information2 STRING,information3 STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ' ' COLLECTION ITEMS TERMINATED BY ',' MAP KEYS TERMINATED BY ':'");
        // Load every log file under hadoop/logs whose name starts with "hadoop"
        HiveUtil.loadDate("load data local inpath '/usr/hadoop/logs/hadoop*' overwrite into table hivelog");
        // Filter the useful rows by log level and date
        ResultSet res1 = HiveUtil.queryHive("select rdate,time[0],type,relateclass,information1,information2,information3 from hivelog where type='" + type + "' and rdate='" + date + "' ");
        // Transfer the filtered rows into MySQL
        HiveUtil.hiveTomysql(res1);
        // Finally, close the Hive connection for this session
        GetConnect.closeHive();
        // and the MySQL connection
        GetConnect.closemysql();
    }
}
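The table layout mirrors the standard Hadoop log4j line format: a line splits on spaces into rdate, time, type and relateclass, and the timestamp splits again on the comma, which is why time is declared as ARRAY<string> and the query reads time[0]. A quick sketch of that split (the sample line is made up, in the usual Hadoop 1.x format):

```java
public class LogLineSketch {
    public static void main(String[] args) {
        // A hypothetical line in the usual Hadoop 1.x log format
        String line = "2015-03-22 11:20:35,243 ERROR "
                + "org.apache.hadoop.security.UserGroupInformation: "
                + "PriviledgedActionException as:hadoop";
        String[] fields = line.split(" ");      // FIELDS TERMINATED BY ' '
        String rdate = fields[0];
        String[] time = fields[1].split(",");   // COLLECTION ITEMS TERMINATED BY ','
        String type = fields[2];
        String relateclass = fields[3];
        System.out.println(rdate + " " + time[0] + " " + type + " " + relateclass);
        // → 2015-03-22 11:20:35 ERROR org.apache.hadoop.security.UserGroupInformation:
    }
}
```

The remaining space-separated tokens land in information1..information3, which is why hiveTomysql concatenates those three columns back into a single message.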
(5) With the program ready, it is time to test. First open another terminal and start the HiveServer service on a listening port, here 50031 (matching the Hive connection string): hive --service hiveserver 50031
Then run the program from Eclipse via Run Configurations: enter the two arguments ERROR 2015-03-01 under the Arguments tab, click Apply and then Run. The following error appears:
Success loading Hive Driver!
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/E:/HadoopWorkPlat/hadoop/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/E:/HadoopWorkPlat/hadoop/lib/slf4j-log4j12-1.4.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
......
Success loading Mysql Driver!
Exception in thread "main" java.sql.SQLException: Access denied for user 'hivenew'@'172.16.2.36' (using password: YES)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:996)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3887)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3823)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:870)
at com.mysql.jdbc.MysqlIO.secureAuth411(MysqlIO.java:4332)
172.16.2.36 is the IP address of my local machine. How can that be? Checking Hive on the Hadoop cluster at this point shows that the hivelog table holding the logs already exists and contains data, which proves the Hadoop logs did make it into Hive:
hive> desc hivelog;
OK
rdate string
time array<string>
type string
relateclass string
information1 string
information2 string
information3 string
Time taken: 0.207 seconds
Transferring the data from Hive to MySQL failed, however. The likely cause: when the program runs from Eclipse, the host connecting to MySQL is the local machine (172.16.2.36), which has no access rights. So I logged in to the MySQL host (172.16.2.42) and granted privileges for 172.16.2.36, but the same error persisted. (In hindsight, the GRANT below has the user and host swapped: it creates a user named '172.16.2.36' at localhost. The intended statement would have been grant all privileges on hivenew.* to 'hivenew'@'172.16.2.36' identified by 'hivenew'; which would explain why the error kept occurring.)
mysql> grant all privileges on hivenew.* to '172.16.2.36'@localhost identified by '';
Query OK, 0 rows affected (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
Later I tried another approach: package the program and run it on the Hadoop cluster itself, which avoids the local-host access problem entirely.
First export the classes as a jar named ExeHiveQL.jar, upload it to the Hadoop cluster, and run it with the following command:
[hadoop@Masterpc test]$ hadoop jar ExeHiveQL.jar ExeHiveQL ERROR 2015-03-22
Success loading Hive Driver!
Success loading Mysql Driver!
2015-03-22 11:20:35 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionExceptionas:hadoopcause:org.apache.hadoop.hdfs.server.namenode.SafeModeException:
2015-03-22 11:20:35 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionExceptionas:hadoopcause:org.apache.hadoop.hdfs.server.namenode.SafeModeException:
......
Checking the HiveServer console output:
Copying data from file:/usr/hadoop/logs/hadoop*
Copying file: file:/usr/hadoop/logs/hadoop-hadoop-balancer-Master.Hadoop.log
......
Copying file: file:/usr/hadoop/logs/hadoop-hadoop-tasktracker-localhost.localdomain.out.5
Loading data to table default.hivelog
Deleted hdfs://172.16.2.42:9000/user/hive/warehouse/hivelog
Table default.hivelog stats: [num_partitions: 0, num_files: 249, num_rows: 0, total_size: 44511392, raw_data_size: 0]
OK
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201503232255_0016, Tracking URL = http://Masterpc.hadoop:50030/jobdetails.jsp?jobid=job_201503232255_0016
Kill Command = /usr/hadoop/libexec/../bin/hadoop job -kill job_201503232255_0016
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 0
2015-03-24 09:48:55,243 Stage-1 map = 0%, reduce = 0%
2015-03-24 09:49:01,274 Stage-1 map = 33%, reduce = 0%, Cumulative CPU 2.16 sec
......
2015-03-24 09:49:07,318 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 5.92 sec
2015-03-24 09:49:09,329 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 5.92 sec
MapReduce Total cumulative CPU time: 5 seconds 920 msec
Ended Job = job_201503232255_0016
MapReduce Jobs Launched:
Job 0: Map: 3 Cumulative CPU: 5.92 sec HDFS Read: 44534827 HDFS Write: 16896 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 920 msec
OK
This time it works. Then open the mysql console and look at the contents of the hadooplog table in the hivenew database:
mysql> use hivenew;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
mysql> select * from hadooplog where rdate = '2015-03-22';
+------+------------+----------+-------+--------------------------------------------------+----------------------------------------------------------------------------------------------------+
| id | rdate | time | type | relateclass | information |
+------+------------+----------+-------+--------------------------------------------------+----------------------------------------------------------------------------------------------------+
| 2458 | 2015-03-22 | 11:20:35 | ERROR | org.apache.hadoop.security.UserGroupInformation: | PriviledgedActionExceptionas:hadoopcause:org.apache.hadoop.hdfs.server.namenode.SafeModeException: |
| 2459 | 2015-03-22 | 11:20:35 | ERROR | org.apache.hadoop.security.UserGroupInformation: | PriviledgedActionExceptionas:hadoopcause:org.apache.hadoop.hdfs.server.namenode.SafeModeException: |
| 2460 | 2015-03-22 | 11:20:35 | ERROR | org.apache.hadoop.security.UserGroupInformation: | PriviledgedActionExceptionas:hadoopcause:org.apache.hadoop.hdfs.server.namenode.SafeModeException: |
| 2461 | 2015-03-22 | 11:20:36 | ERROR | org.apache.hadoop.security.UserGroupInformation: | PriviledgedActionExceptionas:hadoopcause:org.apache.hadoop.hdfs.server.namenode.SafeModeException: |
...... (remaining rows omitted)