```sql
SELECT
date_format(a.start_time, 'yyyy-MM-dd') AS `date`,
COUNT(*) AS `排课课节数`,
SUM(a.stu_num) AS `排课学生人次`,
count(case when a.category=0 then 1 end) as `1v1课节数`,
sum(case when a.category=0 then a.stu_num end) as `1v1学生人次`,
count(case when a.category=7 then 1 end) as `小班课课节数`,
sum(case when a.category=7 then a.stu_num end) as `小班课学生人次`,
count(case when a.category in (1,2) then 1 end) as `大班课课节数`,
sum(case when a.category in (1,2) then a.stu_num end) as `大班课学生人次`
FROM
epg_ods.zby_api_cloudclass_lesson a
WHERE
a.start_time > CURRENT_DATE()
AND a.start_time < date_sub(CURRENT_DATE(),-8)
AND a.deleted_at is NULL
GROUP BY date_format(a.start_time, 'yyyy-MM-dd')
ORDER BY `date` ASC;
```
The query above failed with this error:
Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
I then checked the job logs on YARN and found the underlying exception:
```
Caused by: java.lang.ClassCastException: java.sql.Timestamp cannot be cast to java.lang.String
    at java.lang.String.compareTo(String.java:111)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.compareToRange(RecordReaderImpl.java:2295)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicateRange(RecordReaderImpl.java:2444)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.evaluatePredicate(RecordReaderImpl.java:2383)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.pickRowGroups(RecordReaderImpl.java:2595)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.readStripe(RecordReaderImpl.java:2658)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceStripe(RecordReaderImpl.java:3095)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.advanceToNextRow(RecordReaderImpl.java:3137)
    at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.<init>(RecordReaderImpl.java:289)
    at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:566)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:227)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.<init>(OrcInputFormat.java:159)
    at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1027)
    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:68)
    ... 16 more
```
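The frames in the trace (`pickRowGroups`, `evaluatePredicate`, `compareToRange`) belong to the ORC reader's row-group filtering, which suggests the exception is thrown while a pushed-down predicate is checked against the column's timestamp statistics, not while rows are being processed. A possible way to test that guess, assuming the pushdown on this cluster is controlled by the standard `hive.optimize.index.filter` setting, is to switch it off for the session and rerun a reduced version of the query. This is only a diagnostic sketch, not something I verified on this cluster:

```sql
-- Assumption: the ClassCastException comes from ORC predicate pushdown.
-- Disable it for this session only; this may make scans slower.
SET hive.optimize.index.filter=false;

-- Reduced version of the failing query, keeping only the suspect condition.
SELECT COUNT(*)
FROM epg_ods.zby_api_cloudclass_lesson a
WHERE a.start_time < date_sub(CURRENT_DATE(), -8);
```

If the reduced query succeeds with the setting off and fails with it on, the type mismatch sits in the pushed-down comparison and not in the query logic itself.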
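Next I commented out the conditions one at a time and reran the query. Once the `a.start_time < date_sub(CURRENT_DATE(),-8)` condition in the WHERE clause was commented out, the query ran fine: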
```sql
SELECT
date_format(a.start_time, 'yyyy-MM-dd') AS `date`,
COUNT(*) AS `排课课节数`,
SUM(a.stu_num) AS `排课学生人次`,
count(case when a.category=0 then 1 end) as `1v1课节数`,
sum(case when a.category=0 then a.stu_num end) as `1v1学生人次`,
count(case when a.category=7 then 1 end) as `小班课课节数`,
sum(case when a.category=7 then a.stu_num end) as `小班课学生人次`,
count(case when a.category in (1,2) then 1 end) as `大班课课节数`,
sum(case when a.category in (1,2) then a.stu_num end) as `大班课学生人次`
FROM
epg_ods.zby_api_cloudclass_lesson a
WHERE
a.start_time > CURRENT_DATE()
-- AND a.start_time < date_sub(CURRENT_DATE(),-8)
AND a.deleted_at is NULL
GROUP BY date_format(a.start_time, 'yyyy-MM-dd')
ORDER BY `date` ASC;
```
My workaround was to convert the left-hand side of the `a.start_time < date_sub(CURRENT_DATE(),-8)` condition to the date type before comparing, and that worked:
```sql
SELECT
date_format(a.start_time, 'yyyy-MM-dd') AS `date`,
COUNT(*) AS `排课课节数`,
SUM(a.stu_num) AS `排课学生人次`,
count(case when a.category=0 then 1 end) as `1v1课节数`,
sum(case when a.category=0 then a.stu_num end) as `1v1学生人次`,
count(case when a.category=7 then 1 end) as `小班课课节数`,
sum(case when a.category=7 then a.stu_num end) as `小班课学生人次`,
count(case when a.category in (1,2) then 1 end) as `大班课课节数`,
sum(case when a.category in (1,2) then a.stu_num end) as `大班课学生人次`
FROM
epg_ods.zby_api_cloudclass_lesson a
WHERE
a.start_time > CURRENT_DATE()
AND date(a.start_time) < date_sub(CURRENT_DATE(),-8)
AND a.deleted_at is NULL
GROUP BY date_format(a.start_time, 'yyyy-MM-dd')
ORDER BY `date` ASC;
```
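To summarize: when Hive compares a timestamp value against a string value, convert the timestamp to the date type first and then compare. (This is not the only fix; presumably any form that lets Hive apply automatic type conversion will work.)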
Follow-up finding:
A timestamp can be compared directly with a date; it is the comparison with a string that raises the error.
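This matches how `date_sub` behaves in older Hive: as far as I know from the Hive UDF documentation, releases before 2.1.0 return a string from `date_sub`, while `CURRENT_DATE()` returns a date, which would explain why only the upper bound failed. Building on the finding above, another possible fix is to leave `start_time` untouched and cast the right-hand side to a date instead. The sketch below is untested on the original cluster, and the alias `lesson_count` is a placeholder I introduced for brevity:

```sql
-- Sketch: make the upper bound an explicit DATE so that the comparison
-- is timestamp vs. date, not timestamp vs. string.
SELECT
  date_format(a.start_time, 'yyyy-MM-dd') AS `date`,
  COUNT(*) AS lesson_count  -- placeholder alias, not from the original query
FROM epg_ods.zby_api_cloudclass_lesson a
WHERE a.start_time > CURRENT_DATE()
  AND a.start_time < CAST(date_sub(CURRENT_DATE(), -8) AS DATE)
  AND a.deleted_at IS NULL
GROUP BY date_format(a.start_time, 'yyyy-MM-dd')
ORDER BY `date` ASC;
```

Both forms select the same rows, since `date(a.start_time) < d` and `a.start_time < d` agree when `d` is a date at midnight. Keeping the column bare on the left may also let the storage layer keep filtering on `start_time`, though I have not measured that.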