0. Related Articles
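1. Environment Setup
1.1. Setting Up the Server Environment
To set up the server environment that Spark uses to write data into Hudi, see another post by the author; installing HDFS on CentOS 7 is enough. Post link: 数据湖之Hudi(6):Hudi与Spark和HDFS的集成安装使用 (Data Lake Hudi (6): Installing and Using Hudi Integrated with Spark and HDFS)
1.2. Creating the Maven Project and Writing Data
This post demonstrates using Spark code to update data in an existing Hudi table. You first need to create a Maven project and insert some mock data into Hudi; for both steps, see another post by the author. Post link: 数据湖之Hudi(9):使用Spark向Hudi中插入数据 (Data Lake Hudi (9): Inserting Data into Hudi with Spark)
2. Maven Dependencies
The Maven dependencies already appear in the other post, but they are repeated here for completeness.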
<repositories>
    <repository>
        <id>aliyun</id>
        <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
    </repository>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
    <repository>
        <id>jboss</id>
        <url>http://repository.jboss.com/nexus/content/groups/public</url>
    </repository>
</repositories>
<properties>
    <scala.version>2.12.10</scala.version>
    <scala.binary.version>2.12</scala.binary.version>
    <spark.version>3.0.0</spark.version>
    <hadoop.version>3.0.0</hadoop.version>
    <hudi.version>0.9.0</hudi.version>
</properties>
<dependencies>
    <!-- Scala language library -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>${scala.version}</version>
    </dependency>
    <!-- Spark Core -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <!-- Spark SQL -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_${scala.binary.version}</artifactId>
        <version>${spark.version}</version>
    </dependency>
    <!-- Hadoop client -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-hdfs</artifactId>
        <version>${hadoop.version}</version>
    </dependency>
    <!-- hudi-spark3 -->
    <dependency>
        <groupId>org.apache.hudi</groupId>
        <artifactId>hudi-spark3-bundle_2.12</artifactId>
        <version>${hudi.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-avro_2.12</artifactId>
        <version>${spark.version}</version>
    </dependency>
</dependencies>
<build>
    <outputDirectory>target/classes</outputDirectory>
    <testOutputDirectory>target/test-classes</testOutputDirectory>
    <resources>
        <resource>
            <directory>${project.basedir}/src/main/resources</directory>
        </resource>
    </resources>
    <!-- Maven compiler plugins -->
    <plugins>
        <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>3.0</version>
            <configuration>
                <source>1.8</source>
                <target>1.8</target>
                <encoding>UTF-8</encoding>
            </configuration>
        </plugin>
        <plugin>
            <groupId>net.alchim31.maven</groupId>
            <artifactId>scala-maven-plugin</artifactId>
            <version>3.2.0</version>
            <executions>
                <execution>
                    <goals>
                        <goal>compile</goal>
                        <goal>testCompile</goal>
                    </goals>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
3. Core Code
Step 1: Generate mock insert data
Step 2: Write the insert data to Hudi
Step 3: Generate mock update data
Step 4: Write the update data to Hudi using Append mode
package com.ouyang.hudi.crud

import scala.collection.JavaConverters._

import org.apache.hudi.QuickstartUtils._
import org.apache.hudi.DataSourceWriteOptions._
import org.apache.hudi.config.HoodieWriteConfig._
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}

/**
 * @ date: 2022/2/23
 * @ author: yangshibiao
 * @ desc: Update data
 * Step 1: Generate mock insert data
 * Step 2: Write the insert data to Hudi
 * Step 3: Generate mock update data
 * Step 4: Write the update data to Hudi using Append mode
 */
object Demo03_Update {

    def main(args: Array[String]): Unit = {

        System.setProperty("HADOOP_USER_NAME", "root")

        // Create the SparkSession instance and set its properties
        val spark: SparkSession = {
            SparkSession.builder()
                .appName(this.getClass.getSimpleName.stripSuffix("$"))
                .master("local[4]")
                // Use Kryo serialization
                .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                .getOrCreate()
        }

        // Define variables: table name and storage path
        val tableName: String = "tbl_trips_cow"
        val tablePath: String = "/hudi-warehouse/tbl_trips_cow"

        // Import implicit conversions and related methods
        import spark.implicits._

        // Create the data generator; the same generator is reused so that the updates target existing keys
        val dataGen: DataGenerator = new DataGenerator()

        // Step 1: generate mock trip data and convert it to a DataFrame
        val inserts = convertToStringList(dataGen.generateInserts(20))
        val insertDF: DataFrame = spark.read.json(
            spark.sparkContext.parallelize(inserts.asScala, 2).toDS()
        )

        // Step 2: write the data to the Hudi table in Overwrite mode,
        // so the table is rewritten whether or not the directory already holds data
        // Printed label means "inserting data"
        println("插入数据中:" + System.currentTimeMillis())
        insertDF.write
            .mode(SaveMode.Overwrite)
            .format("hudi")
            .option("hoodie.insert.shuffle.parallelism", "2")
            .option("hoodie.upsert.shuffle.parallelism", "2")
            // Hudi table properties
            .option(PRECOMBINE_FIELD.key(), "ts")
            .option(RECORDKEY_FIELD.key(), "uuid")
            .option(PARTITIONPATH_FIELD.key(), "partitionpath")
            .option(TBL_NAME.key(), tableName)
            .save(tablePath)

        // Read and print the data in Hudi before the update
        // Printed label means "reading Hudi data before the update"
        println("获取更新前Hudi中的数据中:" + System.currentTimeMillis())
        val updateBeforeDF: DataFrame = spark.read.format("hudi").load(tablePath)
        updateBeforeDF.printSchema()
        updateBeforeDF.show(100, truncate = false)

        // Printed label means "divider"
        println("==================== 分割线 ====================")

        // Step 3: generate update data with the same generator and convert it to a DataFrame
        val updates = convertToStringList(dataGen.generateUpdates(20))
        val updateDF: DataFrame = spark.read.json(
            spark.sparkContext.parallelize(updates.asScala, 2).toDS()
        )

        // Step 4: write the update data to the Hudi table; Append mode is required for the update to take effect
        // Printed label means "updating data"
        println("更新数据中:" + System.currentTimeMillis())
        updateDF.write
            .mode(SaveMode.Append)
            .format("hudi")
            .option("hoodie.insert.shuffle.parallelism", "2")
            .option("hoodie.upsert.shuffle.parallelism", "2")
            // Hudi table properties
            .option(PRECOMBINE_FIELD.key(), "ts")
            .option(RECORDKEY_FIELD.key(), "uuid")
            .option(PARTITIONPATH_FIELD.key(), "partitionpath")
            .option(TBL_NAME.key(), tableName)
            .save(tablePath)

        // Read and print the data in Hudi after the update
        // Printed label means "reading Hudi data after the update"
        println("获取更新后Hudi中的数据中:" + System.currentTimeMillis())
        val updateAfterDF: DataFrame = spark.read.format("hudi").load(tablePath)
        updateAfterDF.printSchema()
        updateAfterDF.show(100, truncate = false)
    }
}
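To make the effect of step 4 more concrete, here is a minimal in-memory sketch of upsert semantics keyed by `uuid`, with `ts` used to deduplicate records inside one incoming batch. It is only a simplified model and not Hudi's implementation: `Trip`, `UpsertSketch`, `upsert`, and the sample values are hypothetical names introduced for illustration. The sketch assumes that an incoming record always replaces the stored one, which seems consistent with the run output in this post, where key 6b3794f7-b26d-4c6e-8ea6-2bd8ed6992df is replaced even though its `ts` goes from 1645603972111 to 1645461680678.

```scala
// Simplified in-memory model of upsert semantics; NOT Hudi's implementation.
// Trip, UpsertSketch and the sample data are hypothetical, for illustration only.
final case class Trip(uuid: String, ts: Long, fare: Double)

object UpsertSketch {

  /** Merge `incoming` into `table`, using uuid as the record key. */
  def upsert(table: Map[String, Trip], incoming: Seq[Trip]): Map[String, Trip] = {
    // Precombine step: when one batch holds several records for the same key,
    // keep only the record with the largest ts.
    val deduped: Map[String, Trip] =
      incoming.groupBy(_.uuid).map { case (key, recs) => key -> recs.maxBy(_.ts) }
    // Existing keys are overwritten by the incoming record; new keys are inserted.
    table ++ deduped
  }

  def main(args: Array[String]): Unit = {
    val table = Map("a" -> Trip("a", 1L, 10.0), "b" -> Trip("b", 1L, 20.0))
    val batch = Seq(Trip("a", 2L, 11.0), Trip("a", 3L, 12.0), Trip("c", 1L, 30.0))
    val merged = upsert(table, batch)
    // Print the merged table in key order.
    merged.toSeq.sortBy(_._1).foreach { case (_, trip) => println(trip) }
  }
}
```

In this model, key "a" ends up with the `ts = 3` record, "b" is untouched, and "c" is added, which mirrors how some rows in the table change while others keep their original commit.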
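After the first insert, the code reads the table back and prints it; after the update, it reads and prints the table again. In the output below, 13 of the 20 rows carry the new commit time 20220224015547 together with driver-192/rider-192 values, while the other 7 rows keep commit time 20220224015538, which confirms that the update was applied. Fewer than 20 rows change most likely because generateUpdates picks existing keys at random, so some keys are chosen more than once: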
插入数据中:1645638938444
获取更新前Hudi中的数据中:1645638946581
root
|-- _hoodie_commit_time: string (nullable = true)
|-- _hoodie_commit_seqno: string (nullable = true)
|-- _hoodie_record_key: string (nullable = true)
|-- _hoodie_partition_path: string (nullable = true)
|-- _hoodie_file_name: string (nullable = true)
|-- begin_lat: double (nullable = true)
|-- begin_lon: double (nullable = true)
|-- driver: string (nullable = true)
|-- end_lat: double (nullable = true)
|-- end_lon: double (nullable = true)
|-- fare: double (nullable = true)
|-- rider: string (nullable = true)
|-- ts: long (nullable = true)
|-- uuid: string (nullable = true)
|-- partitionpath: string (nullable = true)
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+-------------------+----------+-------------------+--------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key |_hoodie_partition_path |_hoodie_file_name |begin_lat |begin_lon |driver |end_lat |end_lon |fare |rider |ts |uuid |partitionpath |
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+-------------------+----------+-------------------+--------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|20220224015538 |20220224015538_1_7 |c4c672c4-bc22-4954-94ec-8ad80aa3664a|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.5731835407930634 |0.4923479652912024 |driver-213|0.08988581780930216|0.42520899698713666 |64.27696295884016 |rider-213|1645569108280|c4c672c4-bc22-4954-94ec-8ad80aa3664a|americas/united_states/san_francisco|
|20220224015538 |20220224015538_1_8 |0299897c-a852-4129-ae52-0dfc3d76b5c2|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.023755167724156978|0.6322099740212305 |driver-213|0.2171902015800108 |0.2132173852420407 |15.330847537835645|rider-213|1645125971122|0299897c-a852-4129-ae52-0dfc3d76b5c2|americas/united_states/san_francisco|
|20220224015538 |20220224015538_1_9 |6b3794f7-b26d-4c6e-8ea6-2bd8ed6992df|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.8742041526408587 |0.7528268153249502 |driver-213|0.9197827128888302 |0.362464770874404 |19.179139106643607|rider-213|1645603972111|6b3794f7-b26d-4c6e-8ea6-2bd8ed6992df|americas/united_states/san_francisco|
|20220224015538 |20220224015538_1_10 |902ecdf8-640e-4847-834b-7e483f5adcf4|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.8675932789048282 |0.9563153782052657 |driver-213|0.8534087075068594 |0.4153669760172203 |64.12151064878266 |rider-213|1645222316086|902ecdf8-640e-4847-834b-7e483f5adcf4|americas/united_states/san_francisco|
|20220224015538 |20220224015538_1_11 |599d5efd-9a84-4232-b871-225258cb8520|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.11488393157088261 |0.6273212202489661 |driver-213|0.7454678537511295 |0.3954939864908973 |27.79478688582596 |rider-213|1645100687224|599d5efd-9a84-4232-b871-225258cb8520|americas/united_states/san_francisco|
|20220224015538 |20220224015538_1_12 |6ef20b48-3403-496e-81c5-6964f0c170bd|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.21624150367601136 |0.14285051259466197|driver-213|0.5890949624813784 |0.0966823831927115 |93.56018115236618 |rider-213|1645207425020|6ef20b48-3403-496e-81c5-6964f0c170bd|americas/united_states/san_francisco|
|20220224015538 |20220224015538_1_13 |873407d3-8824-49b4-98aa-a597a0240d45|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.2947661370147079 |0.8039197581711358 |driver-213|0.8248244842522374 |0.3873920783955822 |84.9600214569341 |rider-213|1645435050727|873407d3-8824-49b4-98aa-a597a0240d45|americas/united_states/san_francisco|
|20220224015538 |20220224015538_1_14 |cb5d57f3-e44e-42b7-b35f-7de3047acfb0|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.1856488085068272 |0.9694586417848392 |driver-213|0.38186367037201974|0.25252652214479043 |33.92216483948643 |rider-213|1645473662083|cb5d57f3-e44e-42b7-b35f-7de3047acfb0|americas/united_states/san_francisco|
|20220224015538 |20220224015538_0_1 |43cb0114-c7a2-4b56-aecf-d73a49c0345e|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-21-27_20220224015538.parquet|0.0750588760043035 |0.03844104444445928|driver-213|0.04376353354538354|0.6346040067610669 |66.62084366450246 |rider-213|1645059243518|43cb0114-c7a2-4b56-aecf-d73a49c0345e|americas/brazil/sao_paulo |
|20220224015538 |20220224015538_0_2 |cc224f56-5b5c-4d85-b56a-87c74c1a7b2e|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-21-27_20220224015538.parquet|0.983428192817987 |0.3961523475372767 |driver-213|0.20548299593469077|0.9836743920572577 |60.047501243947934|rider-213|1645597679831|cc224f56-5b5c-4d85-b56a-87c74c1a7b2e|americas/brazil/sao_paulo |
|20220224015538 |20220224015538_0_3 |97391e4d-350d-4d67-93c7-9e1a2ac60fc0|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-21-27_20220224015538.parquet|0.6372504913279929 |0.04241635032425073|driver-213|0.36284275950041867|0.6591829686989255 |44.839244944180244|rider-213|1645629063846|97391e4d-350d-4d67-93c7-9e1a2ac60fc0|americas/brazil/sao_paulo |
|20220224015538 |20220224015538_0_4 |2ec62676-7a1c-4f02-8b11-791f7847eabd|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-21-27_20220224015538.parquet|0.4726905879569653 |0.46157858450465483|driver-213|0.754803407008858 |0.9671159942018241 |34.158284716382845|rider-213|1645203465242|2ec62676-7a1c-4f02-8b11-791f7847eabd|americas/brazil/sao_paulo |
|20220224015538 |20220224015538_0_5 |3552981d-38c3-4a2f-8f89-7d2d3be1d341|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-21-27_20220224015538.parquet|0.9025710109008239 |0.2693250504574297 |driver-213|0.6357677757664507 |0.25770004462445395 |87.08158608552242 |rider-213|1645597722203|3552981d-38c3-4a2f-8f89-7d2d3be1d341|americas/brazil/sao_paulo |
|20220224015538 |20220224015538_0_6 |7eb0cf33-b411-40a1-9066-c0a67738b4af|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-21-27_20220224015538.parquet|0.6100070562136587 |0.8779402295427752 |driver-213|0.3407870505929602 |0.5030798142293655 |43.4923811219014 |rider-213|1645155397506|7eb0cf33-b411-40a1-9066-c0a67738b4af|americas/brazil/sao_paulo |
|20220224015538 |20220224015538_2_15 |9fb42eaa-8b5c-4b70-bc90-ff3e298b659b|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-27-29_20220224015538.parquet|0.09384124531808036 |0.9623582692596406 |driver-213|0.44485904691083133|0.5550300795070142 |53.69977335639399 |rider-213|1645255841245|9fb42eaa-8b5c-4b70-bc90-ff3e298b659b|asia/india/chennai |
|20220224015538 |20220224015538_2_16 |c68d347f-5e21-4e4b-8f8c-382c57100f3f|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-27-29_20220224015538.parquet|0.8679173655153939 |0.17992665967365185|driver-213|0.7721097247931136 |0.9662606385568611 |70.59591659793207 |rider-213|1645253155280|c68d347f-5e21-4e4b-8f8c-382c57100f3f|asia/india/chennai |
|20220224015538 |20220224015538_2_17 |7c6d6dd1-fe81-481b-8e74-6284cff7f3d2|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-27-29_20220224015538.parquet|0.40613510977307 |0.5644092139040959 |driver-213|0.798706304941517 |0.02698359227182834 |17.851135255091155|rider-213|1645469686991|7c6d6dd1-fe81-481b-8e74-6284cff7f3d2|asia/india/chennai |
|20220224015538 |20220224015538_2_18 |61e28fd7-f0ac-4637-a64e-5285ab83538f|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-27-29_20220224015538.parquet|0.49527694252432053 |0.28072552620450797|driver-213|0.44848221556652057|0.565791994047955 |93.00604432281203 |rider-213|1645481507252|61e28fd7-f0ac-4637-a64e-5285ab83538f|asia/india/chennai |
|20220224015538 |20220224015538_2_19 |5e432421-b019-4354-9929-61895cdaa213|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-27-29_20220224015538.parquet|0.9090538095331541 |0.8801105093619153 |driver-213|0.5873040159790485 |0.028263672792464445|40.211140833035394|rider-213|1645638754639|5e432421-b019-4354-9929-61895cdaa213|asia/india/chennai |
|20220224015538 |20220224015538_2_20 |02ca3ce3-4925-480a-8b70-73111e35afff|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-27-29_20220224015538.parquet|0.651058505660742 |0.8192868687714224 |driver-213|0.20714896002914462|0.06224031095826987 |41.06290929046368 |rider-213|1645485296066|02ca3ce3-4925-480a-8b70-73111e35afff|asia/india/chennai |
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+-------------------+----------+-------------------+--------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
==================== 分割线 ====================
更新数据中:1645638947885
获取更新后Hudi中的数据中:1645638951488
root
|-- _hoodie_commit_time: string (nullable = true)
|-- _hoodie_commit_seqno: string (nullable = true)
|-- _hoodie_record_key: string (nullable = true)
|-- _hoodie_partition_path: string (nullable = true)
|-- _hoodie_file_name: string (nullable = true)
|-- begin_lat: double (nullable = true)
|-- begin_lon: double (nullable = true)
|-- driver: string (nullable = true)
|-- end_lat: double (nullable = true)
|-- end_lon: double (nullable = true)
|-- fare: double (nullable = true)
|-- rider: string (nullable = true)
|-- ts: long (nullable = true)
|-- uuid: string (nullable = true)
|-- partitionpath: string (nullable = true)
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+--------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key |_hoodie_partition_path |_hoodie_file_name |begin_lat |begin_lon |driver |end_lat |end_lon |fare |rider |ts |uuid |partitionpath |
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+--------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|20220224015547 |20220224015547_1_24 |c4c672c4-bc22-4954-94ec-8ad80aa3664a|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-70-86_20220224015547.parquet|0.47255932910824583 |0.09835174451313866 |driver-192|0.8768271062363665 |0.391583018565109 |82.6183030502974 |rider-192|1645611981331|c4c672c4-bc22-4954-94ec-8ad80aa3664a|americas/united_states/san_francisco|
|20220224015547 |20220224015547_1_25 |0299897c-a852-4129-ae52-0dfc3d76b5c2|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-70-86_20220224015547.parquet|0.7885334532337877 |0.8573824804430561 |driver-192|0.47332186591003045|0.9927159674996295 |50.45582154226707 |rider-192|1645344506822|0299897c-a852-4129-ae52-0dfc3d76b5c2|americas/united_states/san_francisco|
|20220224015547 |20220224015547_1_26 |6b3794f7-b26d-4c6e-8ea6-2bd8ed6992df|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-70-86_20220224015547.parquet|0.584204225520771 |0.7212263680879302 |driver-192|0.5501675314928346 |0.6226833057042072 |60.704347025098535|rider-192|1645461680678|6b3794f7-b26d-4c6e-8ea6-2bd8ed6992df|americas/united_states/san_francisco|
|20220224015538 |20220224015538_1_10 |902ecdf8-640e-4847-834b-7e483f5adcf4|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.8675932789048282 |0.9563153782052657 |driver-213|0.8534087075068594 |0.4153669760172203 |64.12151064878266 |rider-213|1645222316086|902ecdf8-640e-4847-834b-7e483f5adcf4|americas/united_states/san_francisco|
|20220224015547 |20220224015547_1_27 |599d5efd-9a84-4232-b871-225258cb8520|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-70-86_20220224015547.parquet|0.04327820937619131 |0.8562530975462316 |driver-192|0.4539370966816483 |0.5535762898838785 |75.48086309564754 |rider-192|1645588556978|599d5efd-9a84-4232-b871-225258cb8520|americas/united_states/san_francisco|
|20220224015547 |20220224015547_1_28 |6ef20b48-3403-496e-81c5-6964f0c170bd|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-70-86_20220224015547.parquet|0.5142305232303094 |0.30495686857778403 |driver-192|0.29666655980198253|0.16768228612130764|24.070894571476064|rider-192|1645452618245|6ef20b48-3403-496e-81c5-6964f0c170bd|americas/united_states/san_francisco|
|20220224015538 |20220224015538_1_13 |873407d3-8824-49b4-98aa-a597a0240d45|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet|0.2947661370147079 |0.8039197581711358 |driver-213|0.8248244842522374 |0.3873920783955822 |84.9600214569341 |rider-213|1645435050727|873407d3-8824-49b4-98aa-a597a0240d45|americas/united_states/san_francisco|
|20220224015547 |20220224015547_1_29 |cb5d57f3-e44e-42b7-b35f-7de3047acfb0|americas/united_states/san_francisco|9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-70-86_20220224015547.parquet|0.4878809010360382 |0.07610014905198248 |driver-192|0.9334457064050349 |0.6330100459693088 |90.84944020139248 |rider-192|1645359179566|cb5d57f3-e44e-42b7-b35f-7de3047acfb0|americas/united_states/san_francisco|
|20220224015538 |20220224015538_0_1 |43cb0114-c7a2-4b56-aecf-d73a49c0345e|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-21-27_20220224015538.parquet|0.0750588760043035 |0.03844104444445928 |driver-213|0.04376353354538354|0.6346040067610669 |66.62084366450246 |rider-213|1645059243518|43cb0114-c7a2-4b56-aecf-d73a49c0345e|americas/brazil/sao_paulo |
|20220224015547 |20220224015547_0_21 |cc224f56-5b5c-4d85-b56a-87c74c1a7b2e|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-64-85_20220224015547.parquet|0.4925455806562906 |0.5324426130133701 |driver-192|0.964861920281932 |0.4727110150355711 |72.67793086410465 |rider-192|1645341842778|cc224f56-5b5c-4d85-b56a-87c74c1a7b2e|americas/brazil/sao_paulo |
|20220224015538 |20220224015538_0_3 |97391e4d-350d-4d67-93c7-9e1a2ac60fc0|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-21-27_20220224015538.parquet|0.6372504913279929 |0.04241635032425073 |driver-213|0.36284275950041867|0.6591829686989255 |44.839244944180244|rider-213|1645629063846|97391e4d-350d-4d67-93c7-9e1a2ac60fc0|americas/brazil/sao_paulo |
|20220224015538 |20220224015538_0_4 |2ec62676-7a1c-4f02-8b11-791f7847eabd|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-21-27_20220224015538.parquet|0.4726905879569653 |0.46157858450465483 |driver-213|0.754803407008858 |0.9671159942018241 |34.158284716382845|rider-213|1645203465242|2ec62676-7a1c-4f02-8b11-791f7847eabd|americas/brazil/sao_paulo |
|20220224015547 |20220224015547_0_22 |3552981d-38c3-4a2f-8f89-7d2d3be1d341|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-64-85_20220224015547.parquet|0.14503019204958845 |0.5281436198246144 |driver-192|0.3291184473506418 |0.772134626462835 |85.36791718953374 |rider-192|1645104198479|3552981d-38c3-4a2f-8f89-7d2d3be1d341|americas/brazil/sao_paulo |
|20220224015547 |20220224015547_0_23 |7eb0cf33-b411-40a1-9066-c0a67738b4af|americas/brazil/sao_paulo |f5a4fc01-eeb7-4129-a898-b892a8ec27ab-0_0-64-85_20220224015547.parquet|0.024995362119815567|0.5120368636375937 |driver-192|0.21729959707372848|0.08151154133724581|19.873758263401708|rider-192|1645597329495|7eb0cf33-b411-40a1-9066-c0a67738b4af|americas/brazil/sao_paulo |
|20220224015547 |20220224015547_2_30 |9fb42eaa-8b5c-4b70-bc90-ff3e298b659b|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-70-87_20220224015547.parquet|0.6228854864580208 |0.8315496170667523 |driver-192|0.6281051198140281 |0.9312237784651692 |67.243450582925 |rider-192|1645075234625|9fb42eaa-8b5c-4b70-bc90-ff3e298b659b|asia/india/chennai |
|20220224015547 |20220224015547_2_31 |c68d347f-5e21-4e4b-8f8c-382c57100f3f|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-70-87_20220224015547.parquet|0.970612666616691 |0.017082935178053815|driver-192|0.11178708874754062|0.1450793330198833 |20.404106962358203|rider-192|1645312560622|c68d347f-5e21-4e4b-8f8c-382c57100f3f|asia/india/chennai |
|20220224015538 |20220224015538_2_17 |7c6d6dd1-fe81-481b-8e74-6284cff7f3d2|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-27-29_20220224015538.parquet|0.40613510977307 |0.5644092139040959 |driver-213|0.798706304941517 |0.02698359227182834|17.851135255091155|rider-213|1645469686991|7c6d6dd1-fe81-481b-8e74-6284cff7f3d2|asia/india/chennai |
|20220224015547 |20220224015547_2_32 |61e28fd7-f0ac-4637-a64e-5285ab83538f|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-70-87_20220224015547.parquet|0.8945841313717807 |0.3945018779685283 |driver-192|0.8920584575412743 |0.9759079698192936 |71.07035158051175 |rider-192|1645336706237|61e28fd7-f0ac-4637-a64e-5285ab83538f|asia/india/chennai |
|20220224015547 |20220224015547_2_33 |5e432421-b019-4354-9929-61895cdaa213|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-70-87_20220224015547.parquet|0.26636532270940916 |0.6539904550963876 |driver-192|0.27262593896775367|0.4292589705152693 |69.9025398548803 |rider-192|1645057355007|5e432421-b019-4354-9929-61895cdaa213|asia/india/chennai |
|20220224015538 |20220224015538_2_20 |02ca3ce3-4925-480a-8b70-73111e35afff|asia/india/chennai |81ee8ecf-1087-401c-ba32-e939b3c23050-0_2-27-29_20220224015538.parquet|0.651058505660742 |0.8192868687714224 |driver-213|0.20714896002914462|0.06224031095826987|41.06290929046368 |rider-213|1645485296066|02ca3ce3-4925-480a-8b70-73111e35afff|asia/india/chennai |
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+--------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
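In addition, because this is a copy-on-write (COW) table, the update rewrites the affected files instead of modifying them in place, so 2 Parquet files can be seen in the HDFS file system, as the screenshot in the original post shows. The _hoodie_file_name column in the output above reflects this: the same file ID appears with both commit times, 20220224015538 and 20220224015547.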
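The two file versions can also be read off the file names themselves. The sketch below assumes that base file names follow the pattern `<fileId>_<writeToken>_<instantTime>.parquet`, which matches the `_hoodie_file_name` values printed above; `BaseFile` and `BaseFileNames.parse` are hypothetical helpers written for this post and are not part of the Hudi API.

```scala
// Hypothetical helper for illustration; not part of the Hudi API.
final case class BaseFile(fileId: String, writeToken: String, instant: String)

object BaseFileNames {

  /** Split "<fileId>_<writeToken>_<instantTime>.parquet" into its three parts. */
  def parse(name: String): Option[BaseFile] =
    name.stripSuffix(".parquet").split("_") match {
      case Array(fileId, token, instant) => Some(BaseFile(fileId, token, instant))
      case _                             => None // name does not match the assumed pattern
    }

  def main(args: Array[String]): Unit = {
    // Two names copied from the _hoodie_file_name column in the output above.
    val names = Seq(
      "9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-27-28_20220224015538.parquet",
      "9dfc33ef-edc7-463c-8a4a-fd78c6f2372b-0_1-70-86_20220224015547.parquet"
    )
    // Group the versions by file ID and list the commit instants of each group.
    val versions = names.flatMap(parse).groupBy(_.fileId)
    versions.foreach { case (fileId, files) =>
      println(s"$fileId -> ${files.map(_.instant).sorted.mkString(", ")}")
    }
  }
}
```

Both names share one file ID and differ only in write token and commit instant, which is what one would expect when a COW update writes a new version of an existing file.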
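Note: The Hudi posts in this series are study notes based on the official Hudi website, with some personal interpretation added. Please bear with any shortcomings ☺☺☺
Note: Links to other related articles (posts on Hudi and other data lake technologies) can be found here -> 数据湖 文章汇总 (Data Lake Article Index)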