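1. SQL: four or more ways to deduplicate
2. SQL: convert rows to columns without using explode
3. SQL: sample without using sample; for each value of the type field, sample 5% of the rows
4. Python: given a sorted array such as [-1,0,1,2], return the number of distinct values after squaring, without using set
5. Sort a dictionary by value
{a:16,z:2,c:4} -> {a:16,c:4,z:2}
6. [[1,2],[2,3],[3,4,5]] -> [1,2,3,4,5]; state the time and space complexity
7. Multithreading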
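Items 4 to 6 above can be sketched in Python as follows. This is a minimal sketch, not a reference answer: the function names are made up for illustration, item 5 is assumed to want descending order (as the example suggests), and item 6 is assumed to have sorted inner lists with duplicates dropped from the result.

```python
import heapq


def count_distinct_squares(nums):
    """Item 4: count distinct squares in a sorted list with two pointers, no set."""
    left, right = 0, len(nums) - 1
    count = 0
    last = None  # absolute value counted most recently
    while left <= right:
        # The largest remaining absolute value is always at one of the two ends.
        if abs(nums[left]) > abs(nums[right]):
            current = abs(nums[left])
            left += 1
        else:
            current = abs(nums[right])
            right -= 1
        if current != last:
            count += 1
            last = current
    return count


def sort_dict_by_value(d, reverse=True):
    """Item 5: return a new dict ordered by value (insertion order is kept in Python 3.7+)."""
    return dict(sorted(d.items(), key=lambda kv: kv[1], reverse=reverse))


def merge_sorted_unique(lists):
    """Item 6: merge sorted lists into one sorted list, skipping duplicates."""
    merged = []
    # heapq.merge yields values in sorted order, assuming each input is sorted.
    for value in heapq.merge(*lists):
        if not merged or merged[-1] != value:
            merged.append(value)
    return merged
```

With n total elements and k inner lists, the two-pointer count runs in O(n) time and O(1) extra space, the dictionary sort in O(n log n) time, and the merge in O(n log k) time with O(k) extra space besides the output list.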
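2.1 The three normal forms of database design
2.2 What data structures are there
2.3 Which sorting algorithms do you know, and what are their time complexities
2.4 Python sets
2.5 Python multithreading and multiprocessing
2.6 How MapReduce (MR) works; the shuffle process
2.7 Tree traversal: pre-order, in-order, post-order, level-order
2.8 SQL: a table with two columns, the x and y coordinates. (1) Find the x,y coordinates of the highest and lowest values. (2) Find the coordinates of the peaks and troughs.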
(1)
select a.x,a.y from t a join
(
select max(y) max_y ,min(y) min_y from t
)b
on a.y=max_y or a.y=min_y
(2)
select x,y from
(select
x,
y,
lag(y,1) over(order by x) as lag_y,
lead(y,1) over(order by x) as lead_y
from t) s
where (y>lag_y and y>lead_y) or (y<lag_y and y<lead_y)
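The lag/lead comparison in query (2) can be cross-checked with a small Python sketch. The function name and the sample points are made up for illustration, and x values are assumed to be unique. As in the SQL, the first and last points are never returned, because they have only one neighbour.

```python
def peaks_and_troughs(points):
    """Return the (x, y) points strictly above or strictly below both neighbours."""
    ordered = sorted(points)  # order by x, like ORDER BY x in the window
    result = []
    for i in range(1, len(ordered) - 1):
        prev_y = ordered[i - 1][1]  # plays the role of lag(y, 1)
        y = ordered[i][1]
        next_y = ordered[i + 1][1]  # plays the role of lead(y, 1)
        if (y > prev_y and y > next_y) or (y < prev_y and y < next_y):
            result.append(ordered[i])
    return result


sample = [(1, 1), (2, 3), (3, 2), (4, 5), (5, 4)]
print(peaks_and_troughs(sample))  # [(2, 3), (3, 2), (4, 5)]
```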
2.9 SQL: a table with three columns, send, rcv, msg; find the three pairs of users that exchanged the most messages
select fromer,receiver,msg_num from
(select fromer,receiver,msg_num,row_number() over(order by msg_num desc) as rn from
(select fromer,receiver,count(1) as msg_num from
(select
-- greatest/least put each pair in a fixed order, so A->B and B->A count as one group
greatest(cast(send_uid as string),cast(received_uid as string)) as fromer,
least(cast(send_uid as string),cast(received_uid as string)) as receiver,
msg from t_message
)a
group by fromer,receiver)b
)c
where rn <=3
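The same pair-normalising idea, sketched in Python for comparison. The function name and the sample rows are hypothetical. User ids are compared as plain values here rather than cast to strings as in the SQL.

```python
from collections import Counter


def top_pairs(messages, n=3):
    """Count messages per unordered pair of users and return the n busiest pairs."""
    counts = Counter()
    for sender, receiver, _msg in messages:
        # Put the larger id first so (1, 2) and (2, 1) land in the same group.
        pair = (max(sender, receiver), min(sender, receiver))
        counts[pair] += 1
    return counts.most_common(n)


rows = [
    (1, 2, "a"), (2, 1, "b"), (1, 2, "c"),
    (3, 4, "d"), (4, 3, "e"),
    (5, 6, "f"), (5, 6, "g"), (6, 5, "h"), (5, 6, "i"),
    (7, 8, "j"),
]
print(top_pairs(rows))  # [((6, 5), 4), ((2, 1), 3), ((4, 3), 2)]
```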
Summary of six types of Hive SQL interview questions: https://blog.csdn.net/lightupworld/article/details/108583548