1. Download Eclipse on CentOS 6.5.
2. Convert the Hive Maven project into Eclipse Java projects:
mkdir workspace
cd workspace
git clone https://github.com/apache/hive.git
cd hive
mvn clean install -DskipTests -Phadoop-2
mvn eclipse:clean
mvn eclipse:eclipse -DdownloadSources -DdownloadJavadocs -Phadoop-2
cd itests
mvn clean install -DskipTests -Phadoop-2
mvn eclipse:clean
mvn eclipse:eclipse -DdownloadSources -DdownloadJavadocs -Phadoop-2
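The root module and itests go through the same three Maven commands, so the sequence can be wrapped in a small helper. This is only a minimal sketch, assuming a POSIX shell with mvn on the PATH; the function name gen_eclipse_project is hypothetical, and the profile defaults to hadoop-2 as in the commands above.

```shell
# Hypothetical helper: build one Maven module tree and generate its
# Eclipse project files.
# $1 = directory to build, $2 = Maven profile (defaults to hadoop-2)
# The body runs in a subshell, so the caller's working directory is unchanged.
gen_eclipse_project() (
  dir="${1:?directory required}"
  profile="${2:-hadoop-2}"
  cd "$dir" &&
    mvn clean install -DskipTests "-P${profile}" &&
    mvn eclipse:clean &&
    mvn eclipse:eclipse -DdownloadSources -DdownloadJavadocs "-P${profile}"
)

# Usage, from inside the cloned hive directory:
#   gen_eclipse_project . && gen_eclipse_project itests
```

Chaining with && stops at the first failing Maven step, so a broken build does not go on to generate stale Eclipse metadata.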
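3. Import the Hive Java projects into Eclipse.
4. Set the options below so that the Hive source can be debugged without a Hadoop cluster: files are stored on the local file system instead of HDFS, the metastore uses a local file-backed Derby database instead of MySQL, and MapReduce jobs run locally.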
export HIVE_OPTS='--hiveconf mapred.job.tracker=local --hiveconf fs.default.name=file:///tmp \
--hiveconf hive.metastore.warehouse.dir=file:///tmp/warehouse \
--hiveconf javax.jdo.option.ConnectionURL=jdbc:derby:;databaseName=/tmp/metastore_db;create=true'
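The option string is easier to adapt when the scratch directory is a parameter. The following is a minimal sketch, assuming a POSIX shell; the function name local_hive_opts and its base-directory argument are hypothetical, while the four properties are exactly the ones set above.

```shell
# Hypothetical helper: print a HIVE_OPTS value for a given scratch directory.
# $1 = absolute base directory (defaults to /tmp, as in the export above)
local_hive_opts() (
  base="${1:-/tmp}"
  # Local MapReduce, local file system, local warehouse directory
  printf '%s ' \
    "--hiveconf mapred.job.tracker=local" \
    "--hiveconf fs.default.name=file://${base}" \
    "--hiveconf hive.metastore.warehouse.dir=file://${base}/warehouse"
  # Embedded Derby metastore stored under the same base directory
  printf '%s\n' \
    "--hiveconf javax.jdo.option.ConnectionURL=jdbc:derby:;databaseName=${base}/metastore_db;create=true"
)

HIVE_OPTS="$(local_hive_opts /tmp)"
export HIVE_OPTS
```

Keeping the warehouse and the Derby database under one base directory makes it simple to wipe the debugging state and start over.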
5. Start the hive program from the source directory:
./bin/hive
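6. Initialize the table schema and data.
Find the initial data needed for debugging in a file such as /home/hadoop/git/hive/data/scripts/q_test_init.sql, then copy the relevant statements and run them in hive.
For example, run the following Hive statements: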
set hive.stats.dbclass=fs;
DROP TABLE IF EXISTS src;
CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 'default') STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH "/home/hadoop/git/hive/data/files/kv1.txt" INTO TABLE src;
ANALYZE TABLE src COMPUTE STATISTICS;
ANALYZE TABLE src COMPUTE STATISTICS FOR COLUMNS key,value;
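Instead of pasting the statements into the CLI, they can be kept in a file and run with hive -f. This is a minimal sketch, assuming the checkout lives at the path used above; HIVE_SRC, write_src_init and the output path /tmp/init_src.sql are hypothetical names chosen for illustration.

```shell
# Assumed location of the Hive checkout; adjust to your own path.
HIVE_SRC="${HIVE_SRC:-/home/hadoop/git/hive}"

# Hypothetical helper: write the statements that create and load table src.
# $1 = output file
write_src_init() {
  cat > "$1" <<EOF
set hive.stats.dbclass=fs;
DROP TABLE IF EXISTS src;
CREATE TABLE src (key STRING COMMENT 'default', value STRING COMMENT 'default') STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH "${HIVE_SRC}/data/files/kv1.txt" INTO TABLE src;
ANALYZE TABLE src COMPUTE STATISTICS;
ANALYZE TABLE src COMPUTE STATISTICS FOR COLUMNS key,value;
EOF
}

write_src_init /tmp/init_src.sql
# Then, from the source directory with HIVE_OPTS exported:
#   ./bin/hive -f /tmp/init_src.sql
```

Because the metastore and warehouse live under /tmp in this setup, the script may need to be rerun after the directory is cleaned.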
7. Start hive in debug mode:
./bin/hive --debug
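8. As soon as hive has started in debug mode, attach Eclipse to the process. Set the breakpoints in the source code before doing so.
In Eclipse, open Run/Debug Configurations.../Remote Java Application, set Connection Type to Standard (Socket Attach), and click Debug.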
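The port Eclipse attaches to must match the one hive listens on. As far as the help text printed by ./bin/hive --help --debug describes it, --debug accepts a comma-separated parameter list in which port defaults to 8000 and mainSuspend defaults to y; treat those details as an assumption and check them against your checkout. A minimal sketch with a hypothetical helper name, hive_debug_arg:

```shell
# Hypothetical helper: build the --debug argument for bin/hive.
# $1 = debug port (assumed default 8000)
# $2 = y|n, whether the main JVM waits for the debugger (assumed default y)
hive_debug_arg() {
  printf '%s\n' "--debug:port=${1:-8000},mainSuspend=${2:-y}"
}

# Usage, from the source directory:
#   ./bin/hive "$(hive_debug_arg 8000 y)"
# then set Host to localhost and Port to 8000 in the Eclipse
# Remote Java Application configuration.
```

With mainSuspend=y the JVM waits for the debugger before running, which is why the attach has to happen right after startup.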
9. Run a query in hive. It hits the breakpoint and Eclipse switches to its debug mode:
hive> select count(1) from src;
References:
1. https://cwiki.apache.org/confluence/display/Hive/HiveDeveloperFAQ#HiveDeveloperFAQ-HowdoIimportintoEclipse?
2. https://cwiki.apache.org/confluence/display/Hive/DeveloperGuide#DeveloperGuide-DebuggingHiveCode
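3. TestCliDriver.vm and the generated TestCliDriver.java // this template and the generated source file outline the overall flow of the Hive CLI integration tests
4. QTestUtil.java // contains the detailed steps that the Hive integration tests execute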