1、Hive查询
知识点总结有以下几点:
(1)表别名
hive> SELECT name, salary FROM employees;
hive> SELECT e.name, e.salary FROM employees e;
(2)列别名
hive> SELECT upper(name), salary, deductions["Federal Taxes"] as fed_taxes,
> round(salary * (1 - deductions["Federal Taxes"])) as salary_minus_fed_taxes
> FROM employees LIMIT 2;
(3)子查询
hive> FROM (
> SELECT upper(name), salary, deductions["Federal Taxes"] as fed_taxes,
> round(salary * (1 - deductions["Federal Taxes"])) as salary_minus_fed_taxes
> FROM employees
> ) e
> SELECT e.name, e.salary_minus_fed_taxes
> WHERE e.salary_minus_fed_taxes > 70000;
(4)关键字case
hive> SELECT name, salary,
> CASE
> WHEN salary < 50000.0 THEN 'low'
> WHEN salary >= 50000.0 AND salary < 70000.0 THEN 'middle'
> WHEN salary >= 70000.0 AND salary < 100000.0 THEN 'high'
> ELSE 'very high'
> END AS bracket FROM employees;
(5)Where (筛选)and Group By
hive> SELECT year(ymd), avg(price_close) FROM stocks
> WHERE exchange = 'NASDAQ' AND symbol = 'AAPL'
> GROUP BY year(ymd);
参考文章:
HIVE Queryes 查询
PS:文章中有一处错误,现在hive里没有having函数。
2、Hive函数
(1)cast:类型转换函数。
例如将字符”1″转换为整数:cast(’1′ as bigint),如果转换失败返回NULL。
(2)explode(array<TYPE> a):数组一条记录中有多个参数,将参数拆分,每个参数生成一列。
示例:
myCol [1,2] [3,4] | SELECT explode(myCol) AS myNewCol FROM myTable | myNewCol 1 2 3 4 |
里面还有很多内置运算符,可以直接查找对应函数意义。
3、Join用法
我都是参考的这篇文章(Hive的JOIN用法),里面说的很详细,就不在此赘述了。