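How to parse raw logs in various formats.
I. Text format
1. Fields are separated by spaces; a field that contains spaces must be enclosed in double quotes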
52.212.126.146 [11/Aug/2023:16:59:50+0800] "GET /?prefix=tran&max=2 HTTP/1.1"
-- Parse the fields with spark-sql (read the file as CSV; columns get the default names _c0, _c1, ...)
create temporary view tmp using csv options ('path'='oss_path', 'delimiter'=' ');
select
_c0 as remote_ip
,_c1 as time
,_c2 as request_url
from
tmp
;
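The quoting rule can also be checked locally, without Spark. The sketch below is only an illustration, not part of the spark-sql job: it assumes Python's standard csv module, a space delimiter, and a double-quote quote character, and the helper name parse_line is hypothetical. It splits the sample line above into the same three fields.

```python
import csv
import io

# Sample log line copied from the text above.
sample = '52.212.126.146 [11/Aug/2023:16:59:50+0800] "GET /?prefix=tran&max=2 HTTP/1.1"\n'


def parse_line(line):
    """Split one log line on spaces, treating double-quoted text as one field.

    parse_line is a hypothetical helper name used only for this example.
    """
    reader = csv.reader(io.StringIO(line), delimiter=" ", quotechar='"')
    return next(reader)


fields = parse_line(sample)

# Same column order as _c0, _c1, _c2 in the query above.
remote_ip, time, request_url = fields
print(remote_ip)
print(time)
print(request_url)
```

The quoted request keeps its inner spaces and loses the surrounding double quotes, which is why request_url arrives as a single column in the query above.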