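•Hive is a data warehouse built on Hadoop
•It translates SQL-like syntax (HiveQL) into MapReduce programs
•Early versions have no indexes or transactions, and query latency is on the order of minutes
•Early versions do not support the SQL HAVING clause
•Supported data types
–Primitive types: STRING, INT, DOUBLE, BOOLEAN, etc.
–Complex types: ARRAY, MAP, STRUCT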
•% export HIVE_HOME=/home/my/hive
•Run: bin/hive
•hive> SHOW TABLES;
•hive -f script.q
•hive -e 'SELECT * FROM dummy'
•Create a table:
•CREATE TABLE records
–(year STRING, temperature INT, quality INT)
–ROW FORMAT DELIMITED
– FIELDS TERMINATED BY '\t';
•Load data from a local file:
–LOAD DATA LOCAL INPATH 'input/ncdc/micro-tab/sample.txt'
–OVERWRITE INTO TABLE records;
Error while making MR scratch directory:
•In Hadoop's configuration file core-site.xml:
•change the value of fs.default.name to the host name defined in the hosts file
•then restart Hadoop and Hive
•If the message "namenode is in safe mode" appears:
–hadoop dfsadmin -safemode leave
–or create the required directories on HDFS and make them writable:
–% hadoop fs -mkdir /tmp
–% hadoop fs -chmod a+w /tmp
–% hadoop fs -mkdir /user/hive/warehouse
–% hadoop fs -chmod a+w /user/hive/warehouse
•Managed table: loading moves the data into Hive's warehouse directory
–CREATE TABLE managed_table (dummy STRING);
–LOAD DATA INPATH '/user/tom/data.txt' INTO TABLE managed_table;
•External table:
–CREATE EXTERNAL TABLE external_table (dummy STRING) LOCATION '/user/tom/external_table';
–LOAD DATA INPATH '/user/tom/data.txt' INTO TABLE external_table;
–Dropping an external table does not delete the data, only the metadata
•CREATE TABLE bucketed_users (id INT, name STRING)
•CLUSTERED BY (id) INTO 4 BUCKETS;
•The table is split into 4 buckets by id, so the work can be divided among multiple MapReduce tasks
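How a row ends up in one of the 4 buckets can be sketched in a few lines of Java. This is a minimal illustration, assuming the bucket is the non-negative hash of the clustering column modulo the bucket count, and that the hash of an INT is the value itself. The names BucketSketch and bucketFor are hypothetical and are not part of Hive's API.

```java
// Sketch of bucket assignment for CLUSTERED BY (id) INTO 4 BUCKETS.
// Assumption: bucket = (hash & Integer.MAX_VALUE) % numBuckets, where the
// hash of an INT column is the int value itself.
// BucketSketch and bucketFor are illustrative names, not Hive classes.
class BucketSketch {

    // Returns the bucket index in the range [0, numBuckets).
    static int bucketFor(int id, int numBuckets) {
        int hash = Integer.hashCode(id);            // for an int this is the value itself
        // Clearing the sign bit keeps the modulo result non-negative.
        return (hash & Integer.MAX_VALUE) % numBuckets;
    }

    public static void main(String[] args) {
        int[] ids = {0, 1, 2, 3, 4, 5, 10};
        for (int id : ids) {
            System.out.println("id=" + id + " -> bucket " + bucketFor(id, 4));
        }
    }
}
```

Under this assumption, rows with the same id always land in the same bucket, which is what lets each bucket be handled as an independent unit of work.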
•select myFun(age) from tab3;
•public class MyFun extends UDF {
•}
•After writing the class, register it as a function:
–create temporary function myFun as 'com.MyFun';
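The notes leave the body of MyFun empty, so the following is only a hedged sketch of what such a class might contain. A UDF of this style exposes a method named evaluate, which Hive locates by reflection. To keep the example compilable without Hive on the classpath, the class below is plain Java: in a real deployment it would also declare extends UDF (org.apache.hadoop.hive.ql.exec.UDF) and be packaged in a JAR. The class name MyFunSketch and the age-group rule are hypothetical.

```java
// Stand-alone sketch of the logic a UDF such as myFun(age) might carry.
// Assumption: in Hive the class would additionally "extend UDF"
// (org.apache.hadoop.hive.ql.exec.UDF); that is omitted here so the
// example compiles on its own. The age-group rule is made up.
class MyFunSketch {

    // Hive calls the method named "evaluate" once per input row.
    public String evaluate(Integer age) {
        if (age == null) {
            return null;                 // NULL in, NULL out
        }
        return age < 18 ? "minor" : "adult";
    }

    public static void main(String[] args) {
        MyFunSketch f = new MyFunSketch();
        System.out.println(f.evaluate(17));
        System.out.println(f.evaluate(42));
    }
}
```

Returning null for a null input mirrors how SQL functions usually treat NULL, which keeps the function safe to call on columns with missing values.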