函数
-
函数
Oralce
REGEXP_LIKE
Spark
RLIKE、REGEXP
Hive
RLIKE、REGEXP
建表Oracle
CREATE TABLE TEST_REGEXP (
ID VARCHAR2(100),
NAME VARCHAR2(100)
) ;
INSERT INTO TEST_REGEXP (ID, NAME) VALUES('001', '张三');
INSERT INTO TEST_REGEXP (ID, NAME) VALUES('002', '张三1');
INSERT INTO TEST_REGEXP (ID, NAME) VALUES('003', '2张三');
Hive
CREATE TABLE TEST_REGEXP (
ID string,
NAME string
) ;
INSERT INTO TEST_REGEXP VALUES('001', '张三');
INSERT INTO TEST_REGEXP VALUES('002', '张三1');
INSERT INTO TEST_REGEXP VALUES('003', '2张三');
示例
查询NAME含有数字的记录
Oracle
SELECT * FROM TEST_REGEXP WHERE REGEXP_LIKE(NAME,'\d+');
SELECT * FROM TEST_REGEXP WHERE REGEXP_LIKE(NAME,'[0-9]+');
select * from TEST_REGEXP where name regexp '\\d+';
select * from TEST_REGEXP where name regexp '[0-9]+';
Hive
select * from TEST_REGEXP where name rlike '\\d+'
select * from TEST_REGEXP where name rlike '[0-9]+';
Spark
Spark 和 Hive一样
select * from TEST_REGEXP where name rlike '\\d+'
select * from TEST_REGEXP where name rlike '[0-9]+';
select * from TEST_REGEXP where name regexp '\\d+';
select * from TEST_REGEXP where name regexp '[0-9]+';
不过在代码里 \ 需要转义
package com.dkl.blog.spark.sql
import org.apache.spark.sql.SparkSession
/**
* Created by dongkelun on 2019/12/2 19:27
*/
object Test_RegExp {
def main(args: Array[String]): Unit = {
val spark = SparkSession.builder().appName("NewUVDemo").master("local").getOrCreate()
import spark.implicits._
import spark.sql
val df = spark.sparkContext.parallelize(
Array(
("001", "张三"), ("002", "张三1"), ("003", "2张三")
)).toDF("ID","NAME")
df.createOrReplaceTempView("TEST_REGEXP")
sql("select * from TEST_REGEXP where name rlike '\\\\d+'").show()
sql("select * from TEST_REGEXP where name rlike '[0-9]+'").show()
sql("select * from TEST_REGEXP where name regexp '\\\\d+'").show()
sql("select * from TEST_REGEXP where name regexp '[0-9]+'").show()
spark.close()
}
}
小结
Oralce和Hive、Spark除了函数不同外,正则也多少有不同,比如上例中Oraqlce只有一个 \ 而Hive和Spark有两个\,具体的正则匹配规则可参考网上的资料,其中下面参考的资料中也有一些规则可以参考