Data Lake: Hudi (10): Querying Hudi Data with Spark

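This article shows how to query data in a Hudi data lake with Spark, covering a direct (snapshot) query and a query constrained by time. After setting up a Spark project and its Maven dependencies, it uses the DataFrame DSL to filter the Hudi table and then reads the table as of a given instant.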

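Contents

0. Related articles

1. Environment setup

1.1. Setting up the server environment

1.2. Creating the Maven project and writing data

2. Maven dependencies

3. Core code

3.1. Direct query

3.2. Conditional query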


0. Related articles

Data lake article index

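1. Environment setup

1.1. Setting up the server environment

For the server environment needed to write data from Spark into Hudi, see another post by the author; installing HDFS on CentOS 7 is sufficient. Post: Data Lake: Hudi (6): Installing and Using Hudi with Spark and HDFS

1.2. Creating the Maven project and writing data

This post demonstrates querying an existing Hudi table from Spark code. First create a Maven project and insert some mock data into Hudi; both steps are covered in another post by the author: Data Lake: Hudi (9): Inserting Data into Hudi with Spark

2. Maven dependencies

The Maven dependencies are already listed in that post, but they are repeated here for completeness.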

    <repositories>
        <repository>
            <id>aliyun</id>
            <url>http://maven.aliyun.com/nexus/content/groups/public/</url>
        </repository>
        <repository>
            <id>cloudera</id>
            <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
        </repository>
        <repository>
            <id>jboss</id>
            <url>http://repository.jboss.com/nexus/content/groups/public</url>
        </repository>
    </repositories>
 
    <properties>
        <scala.version>2.12.10</scala.version>
        <scala.binary.version>2.12</scala.binary.version>
        <spark.version>3.0.0</spark.version>
        <hadoop.version>2.7.3</hadoop.version>
        <hudi.version>0.9.0</hudi.version>
    </properties>
 
    <dependencies>
 
        <!-- Scala language dependency -->
        <dependency>
            <groupId>org.scala-lang</groupId>
            <artifactId>scala-library</artifactId>
            <version>${scala.version}</version>
        </dependency>
 
        <!-- Spark Core dependency -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
        <!-- Spark SQL dependency -->
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-sql_${scala.binary.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
 
        <!-- Hadoop Client dependency -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-client</artifactId>
            <version>${hadoop.version}</version>
        </dependency>
 
        <!-- hudi-spark3 -->
        <dependency>
            <groupId>org.apache.hudi</groupId>
            <artifactId>hudi-spark3-bundle_2.12</artifactId>
            <version>${hudi.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-avro_2.12</artifactId>
            <version>${spark.version}</version>
        </dependency>
 
    </dependencies>
 
    <build>
        <outputDirectory>target/classes</outputDirectory>
        <testOutputDirectory>target/test-classes</testOutputDirectory>
        <resources>
            <resource>
                <directory>${project.basedir}/src/main/resources</directory>
            </resource>
        </resources>
        <!-- Maven compiler plugins -->
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>3.0</version>
                <configuration>
                    <source>1.8</source>
                    <target>1.8</target>
                    <encoding>UTF-8</encoding>
                </configuration>
            </plugin>
            <plugin>
                <groupId>net.alchim31.maven</groupId>
                <artifactId>scala-maven-plugin</artifactId>
                <version>3.2.0</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

3. Core code

3.1. Direct query

Read the Hudi table with a snapshot query, then write DSL code to analyze the data for the business question at hand.

package com.ouyang.hudi.crud

import org.apache.spark.sql.{DataFrame, SparkSession}

/**
 * @ date: 2022/2/23
 * @ author: yangshibiao
 * @ desc: Snapshot query against a Hudi table, using the DataFrame DSL
 */
object Demo02_SnapshotQuery {

    def main(args: Array[String]): Unit = {

        // Create the SparkSession and set its properties
        val spark: SparkSession = {
            SparkSession.builder()
                .appName(this.getClass.getSimpleName.stripSuffix("$"))
                .master("local[4]")
                // Use Kryo serialization
                .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                .getOrCreate()
        }

        // Storage path of the Hudi table
        val tablePath: String = "/hudi-warehouse/tbl_trips_cow"

        // Needed for the $"column" syntax used below
        import spark.implicits._

        // Snapshot query: load the current state of the table
        val tripsDF: DataFrame = spark.read.format("hudi").load(tablePath)
        tripsDF.printSchema()
        tripsDF.show(10, truncate = false)

        // Trips whose fare is between 20 and 50, both bounds inclusive
        tripsDF
            .filter($"fare" >= 20 && $"fare" <= 50)
            .select($"driver", $"rider", $"fare", $"begin_lat", $"begin_lon", $"partitionpath", $"_hoodie_commit_time")
            .orderBy($"fare".desc, $"_hoodie_commit_time".desc)
            .show(100, truncate = false)

        // Release local Spark resources
        spark.stop()
    }
}

Running the code above queries all data under that path and prints the schema and some of the rows, as shown below:

root
 |-- _hoodie_commit_time: string (nullable = true)
 |-- _hoodie_commit_seqno: string (nullable = true)
 |-- _hoodie_record_key: string (nullable = true)
 |-- _hoodie_partition_path: string (nullable = true)
 |-- _hoodie_file_name: string (nullable = true)
 |-- begin_lat: double (nullable = true)
 |-- begin_lon: double (nullable = true)
 |-- driver: string (nullable = true)
 |-- end_lat: double (nullable = true)
 |-- end_lon: double (nullable = true)
 |-- fare: double (nullable = true)
 |-- rider: string (nullable = true)
 |-- ts: long (nullable = true)
 |-- uuid: string (nullable = true)
 |-- partitionpath: string (nullable = true)

+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key                  |_hoodie_partition_path              |_hoodie_file_name                                                    |begin_lat           |begin_lon          |driver    |end_lat            |end_lon            |fare              |rider    |ts           |uuid                                |partitionpath                       |
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|20220223222328     |20220223222328_1_33 |bd6d99d0-107e-4891-9da6-f243b51323bc|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5655712287397079  |0.8032800489802543 |driver-213|0.18240785532240533|0.869159296395892  |92.0536330577404  |rider-213|1645625676345|bd6d99d0-107e-4891-9da6-f243b51323bc|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_34 |99bb3a25-669f-4d55-a36f-4ae0b76f76de|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.6626987497394154  |0.22504711188369042|driver-213|0.35712946224267583|0.244841817279154  |10.72756362186601 |rider-213|1645326839179|99bb3a25-669f-4d55-a36f-4ae0b76f76de|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_35 |bd4ae628-3885-4b26-8a50-c14f8e42a265|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.11488393157088261 |0.6273212202489661 |driver-213|0.7454678537511295 |0.3954939864908973 |27.79478688582596 |rider-213|1645094601577|bd4ae628-3885-4b26-8a50-c14f8e42a265|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_36 |59d2ddd0-e836-4443-a816-0ce489c004f2|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5751612868373159  |0.46940431249093517|driver-213|0.6855658616896665 |0.12686440203574556|11.212022663263122|rider-213|1645283606578|59d2ddd0-e836-4443-a816-0ce489c004f2|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_37 |5d149bc7-78a8-46df-b2b0-a038dc79e378|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.1856488085068272  |0.9694586417848392 |driver-213|0.38186367037201974|0.25252652214479043|33.92216483948643 |rider-213|1645133755620|5d149bc7-78a8-46df-b2b0-a038dc79e378|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_38 |d64b94ec-d8e8-44f3-a5c0-e205e034aa5d|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5731835407930634  |0.4923479652912024 |driver-213|0.08988581780930216|0.42520899698713666|64.27696295884016 |rider-213|1645298902122|d64b94ec-d8e8-44f3-a5c0-e205e034aa5d|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_39 |f0d208fb-b5aa-4236-acbc-a6ec283c5693|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.30057620949299213 |0.3883212395069259 |driver-213|0.8529563766655098 |0.18417876489592633|57.62896261799536 |rider-213|1645483784517|f0d208fb-b5aa-4236-acbc-a6ec283c5693|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_40 |61602de6-6839-4eb2-88ed-75fdf28bbd1f|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.023755167724156978|0.6322099740212305 |driver-213|0.2171902015800108 |0.2132173852420407 |15.330847537835645|rider-213|1645026565110|61602de6-6839-4eb2-88ed-75fdf28bbd1f|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_41 |6b8c7cdd-0302-4110-bced-a996d56828e8|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5692544178629111  |0.610843492129245  |driver-213|0.366234158145209  |0.2051302267345806 |77.05976291070496 |rider-213|1645519660912|6b8c7cdd-0302-4110-bced-a996d56828e8|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_42 |3732e4e6-2095-4eb8-903b-8daf3d307607|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|9.544772278234914E-4|0.7150696027624646 |driver-213|0.4142563844059821 |0.1214902298018885 |24.65031205441023 |rider-213|1645112245071|3732e4e6-2095-4eb8-903b-8daf3d307607|americas/united_states/san_francisco|
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+--------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
only showing top 10 rows
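The metadata columns in this output follow a visible pattern: the last segment of each `_hoodie_file_name` and the first segment of each `_hoodie_commit_seqno` both equal the row's `_hoodie_commit_time`. The sketch below splits the two values into their parts using only the Scala standard library. The object `HoodieMetaFields`, both case classes, and the field names are hypothetical and chosen for illustration. The layouts `<fileId>_<writeToken>_<instantTime>.<ext>` and `<instantTime>_<partitionId>_<recordIndex>` are an assumption based on the rows printed above, so check them against your Hudi version before relying on them.

```scala
object HoodieMetaFields {

  // Parts of a base file name such as
  // 42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet
  // (assumed layout: <fileId>_<writeToken>_<instantTime>.<ext>)
  final case class BaseFileName(fileId: String, writeToken: String, instantTime: String, ext: String)

  // Parts of a _hoodie_commit_seqno value such as 20220223222328_1_33
  // (assumed layout: <instantTime>_<partitionId>_<recordIndex>)
  final case class CommitSeqNo(instantTime: String, partitionId: Int, recordIndex: Long)

  def parseBaseFileName(name: String): BaseFileName = {
    val dot = name.lastIndexOf('.')
    require(dot > 0, s"missing file extension: $name")
    // The file ID contains hyphens but no underscores, so "_" separates the three parts
    val parts = name.substring(0, dot).split("_")
    require(parts.length == 3, s"expected <fileId>_<writeToken>_<instantTime>: $name")
    BaseFileName(parts(0), parts(1), parts(2), name.substring(dot + 1))
  }

  def parseCommitSeqNo(value: String): CommitSeqNo = {
    val parts = value.trim.split("_")
    require(parts.length == 3, s"expected <instantTime>_<partitionId>_<recordIndex>: $value")
    CommitSeqNo(parts(0), parts(1).toInt, parts(2).toLong)
  }

  def main(args: Array[String]): Unit = {
    // Values copied from the first row of the output above
    println(parseBaseFileName("42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet"))
    println(parseCommitSeqNo("20220223222328_1_33"))
  }
}
```

Splitting the values this way makes it easy to check which commit produced a given file or row without running another query.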

The result can then be filtered and projected in Spark with the DSL. The printed result is shown below:

+----------+---------+------------------+--------------------+-------------------+------------------------------------+-------------------+
|driver    |rider    |fare              |begin_lat           |begin_lon          |partitionpath                       |_hoodie_commit_time|
+----------+---------+------------------+--------------------+-------------------+------------------------------------+-------------------+
|driver-213|rider-213|49.899171213436844|0.49054633351061006 |0.8716474406347761 |americas/united_states/san_francisco|20220223222328     |
|driver-213|rider-213|49.57985534250222 |0.13036108279724024 |0.2365242449257826 |americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|49.121690071563506|0.3880100101379198  |0.8750494376540229 |americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|46.971815642308016|0.6325393869124881  |0.7723215898397776 |americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|46.65992353549729 |0.9924142645535157  |0.3157934820865995 |americas/united_states/san_francisco|20220223222328     |
|driver-213|rider-213|44.839244944180244|0.6372504913279929  |0.04241635032425073|americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|43.4923811219014  |0.6100070562136587  |0.8779402295427752 |americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|42.76921664939422 |0.20404106962358204 |0.41452263884832685|americas/united_states/san_francisco|20220223222328     |
|driver-213|rider-213|42.46412330377599 |0.8918316400031095  |0.11580010866153201|americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|41.076686078636236|0.5712378196458244  |0.4559336764388273 |americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|41.06290929046368 |0.651058505660742   |0.8192868687714224 |asia/india/chennai                  |20220223222328     |
|driver-213|rider-213|40.211140833035394|0.9090538095331541  |0.8801105093619153 |asia/india/chennai                  |20220223222328     |
|driver-213|rider-213|39.31163975206524 |0.7548086309564753  |0.9049457113019617 |asia/india/chennai                  |20220223222328     |
|driver-213|rider-213|38.697902072535484|0.9199515909032545  |0.2895800693712469 |americas/united_states/san_francisco|20220223222328     |
|driver-213|rider-213|38.61457381408665 |0.39253605282983284 |0.5761097193536119 |asia/india/chennai                  |20220223222328     |
|driver-213|rider-213|34.158284716382845|0.4726905879569653  |0.46157858450465483|americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|33.92216483948643 |0.1856488085068272  |0.9694586417848392 |americas/united_states/san_francisco|20220223222328     |
|driver-213|rider-213|31.32477949501916 |0.7267793086410466  |0.2202009625132143 |americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|30.80177695413958 |0.3613216010259426  |0.8750683366449247 |asia/india/chennai                  |20220223222328     |
|driver-213|rider-213|30.47844781909017 |0.10509642405359532 |0.07682825311613706|asia/india/chennai                  |20220223222328     |
|driver-213|rider-213|30.24821012722806 |0.6437496229932878  |0.3259549255934986 |americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|28.874644702723472|0.04316839215753254 |0.49689215534636744|americas/united_states/san_francisco|20220223222328     |
|driver-213|rider-213|28.53709038726113 |0.132849613764075   |0.2370254092732652 |asia/india/chennai                  |20220223222328     |
|driver-213|rider-213|27.911375263393268|0.9461601725825765  |0.07097928915812768|americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|27.79478688582596 |0.11488393157088261 |0.6273212202489661 |americas/united_states/san_francisco|20220223222328     |
|driver-213|rider-213|27.66236301605771 |0.7527035644196625  |0.7525032121800279 |americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|25.216729525590676|0.48687190581855855 |0.03482702091010481|americas/united_states/san_francisco|20220223222328     |
|driver-213|rider-213|24.65031205441023 |9.544772278234914E-4|0.7150696027624646 |americas/united_states/san_francisco|20220223222328     |
|driver-213|rider-213|22.991770617403628|0.699025398548803   |0.8105360506582145 |americas/brazil/sao_paulo           |20220223222328     |
|driver-213|rider-213|22.85729206746916 |0.5378950285504629  |0.14011059922351543|americas/brazil/sao_paulo           |20220223222328     |
+----------+---------+------------------+--------------------+-------------------+------------------------------------+-------------------+

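3.2. Conditional query

A Hudi table can also be queried by time. Set the read option "as.of.instant" to a value in either the form "20220223222328" or "2022-02-23 22:23:28"; the query then returns the table as it was at that instant, that is, only data committed at or before it.

The code is as follows: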

package com.ouyang.hudi.crud

import org.apache.spark.sql.SparkSession

/**
 * @ date: 2022/2/23
 * @ author: yangshibiao
 * @ desc: Query a Hudi table as of a given instant ("as.of.instant"), using the DataFrame DSL
 */
object Demo02_SnapshotQuery {

    def main(args: Array[String]): Unit = {

        // Create the SparkSession and set its properties
        val spark: SparkSession = {
            SparkSession.builder()
                .appName(this.getClass.getSimpleName.stripSuffix("$"))
                .master("local[4]")
                // Use Kryo serialization
                .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
                .getOrCreate()
        }

        // Storage path of the Hudi table
        val tablePath: String = "/hudi-warehouse/tbl_trips_cow"

        // Needed for col(...)
        import org.apache.spark.sql.functions._

        // Option 1: pass the instant in compact form (yyyyMMddHHmmss)
        val df1 = spark.read
            .format("hudi")
            .option("as.of.instant", "20220223222328")
            .load(tablePath)
            .sort(col("_hoodie_commit_time").desc)
        df1.printSchema()
        df1.show(numRows = 5, truncate = false)

        println("==================== separator ====================")

        // Option 2: pass the same instant in date-time form (yyyy-MM-dd HH:mm:ss)
        val df2 = spark.read
            .format("hudi")
            .option("as.of.instant", "2022-02-23 22:23:28")
            .load(tablePath)
            .sort(col("_hoodie_commit_time").desc)
        df2.printSchema()
        df2.show(numRows = 5, truncate = false)

        // Release local Spark resources
        spark.stop()
    }
}

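The printed schema and some of the rows are shown below. Both queries read the same instant; because every row shares the same _hoodie_commit_time, the sort leaves their relative order arbitrary, which is why the two five-row previews show different rows.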

root
 |-- _hoodie_commit_time: string (nullable = true)
 |-- _hoodie_commit_seqno: string (nullable = true)
 |-- _hoodie_record_key: string (nullable = true)
 |-- _hoodie_partition_path: string (nullable = true)
 |-- _hoodie_file_name: string (nullable = true)
 |-- begin_lat: double (nullable = true)
 |-- begin_lon: double (nullable = true)
 |-- driver: string (nullable = true)
 |-- end_lat: double (nullable = true)
 |-- end_lon: double (nullable = true)
 |-- fare: double (nullable = true)
 |-- rider: string (nullable = true)
 |-- ts: long (nullable = true)
 |-- uuid: string (nullable = true)
 |-- partitionpath: string (nullable = true)

+-------------------+--------------------+------------------------------------+----------------------+---------------------------------------------------------------------+-------------------+------------------+----------+--------------------+-------------------+-----------------+---------+-------------+------------------------------------+------------------+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key                  |_hoodie_partition_path|_hoodie_file_name                                                    |begin_lat          |begin_lon         |driver    |end_lat             |end_lon            |fare             |rider    |ts           |uuid                                |partitionpath     |
+-------------------+--------------------+------------------------------------+----------------------+---------------------------------------------------------------------+-------------------+------------------+----------+--------------------+-------------------+-----------------+---------+-------------+------------------------------------+------------------+
|20220223222328     |20220223222328_2_43 |c7c3c014-0dc4-42e3-a674-020ffc29a028|asia/india/chennai    |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.03154543220118411|0.2887009329948117|driver-213|0.7883536904111458  |0.629523587592623  |86.92639065900747|rider-213|1645123906580|c7c3c014-0dc4-42e3-a674-020ffc29a028|asia/india/chennai|
|20220223222328     |20220223222328_2_45 |c59fa19a-b76a-4477-8015-a49615305292|asia/india/chennai    |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.4805271604136475 |0.8630157667444018|driver-213|0.3272256283194892  |0.6298100777642365 |99.46343958295148|rider-213|1645259758661|c59fa19a-b76a-4477-8015-a49615305292|asia/india/chennai|
|20220223222328     |20220223222328_2_47 |1c73e11f-19f0-48cf-ba76-b79a75af9fd7|asia/india/chennai    |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.7413486368980094 |0.9417400045187958|driver-213|0.03903494276309427 |0.12892252065489862|5.585015784895486|rider-213|1645511312485|1c73e11f-19f0-48cf-ba76-b79a75af9fd7|asia/india/chennai|
|20220223222328     |20220223222328_2_49 |80e12a32-f802-469a-a072-f92d1ed1ca11|asia/india/chennai    |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.132849613764075  |0.2370254092732652|driver-213|0.012105237836192995|0.9180654821797201 |28.53709038726113|rider-213|1645556382792|80e12a32-f802-469a-a072-f92d1ed1ca11|asia/india/chennai|
|20220223222328     |20220223222328_2_50 |bb60dcb8-618c-444b-98ad-c22d0a128f33|asia/india/chennai    |7a997a16-fd0c-48b5-95dd-d50e5216dbab-0_2-28-30_20220223222328.parquet|0.770028447157646  |0.730140741480257 |driver-213|0.2776410021076544  |0.02677801967450366|8.123010514625829|rider-213|1645461203317|bb60dcb8-618c-444b-98ad-c22d0a128f33|asia/india/chennai|
+-------------------+--------------------+------------------------------------+----------------------+---------------------------------------------------------------------+-------------------+------------------+----------+--------------------+-------------------+-----------------+---------+-------------+------------------------------------+------------------+
only showing top 5 rows

==================== separator ====================
root
 |-- _hoodie_commit_time: string (nullable = true)
 |-- _hoodie_commit_seqno: string (nullable = true)
 |-- _hoodie_record_key: string (nullable = true)
 |-- _hoodie_partition_path: string (nullable = true)
 |-- _hoodie_file_name: string (nullable = true)
 |-- begin_lat: double (nullable = true)
 |-- begin_lon: double (nullable = true)
 |-- driver: string (nullable = true)
 |-- end_lat: double (nullable = true)
 |-- end_lon: double (nullable = true)
 |-- fare: double (nullable = true)
 |-- rider: string (nullable = true)
 |-- ts: long (nullable = true)
 |-- uuid: string (nullable = true)
 |-- partitionpath: string (nullable = true)

+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+-------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key                  |_hoodie_partition_path              |_hoodie_file_name                                                    |begin_lat          |begin_lon          |driver    |end_lat            |end_lon            |fare              |rider    |ts           |uuid                                |partitionpath                       |
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+-------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
|20220223222328     |20220223222328_1_33 |bd6d99d0-107e-4891-9da6-f243b51323bc|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5655712287397079 |0.8032800489802543 |driver-213|0.18240785532240533|0.869159296395892  |92.0536330577404  |rider-213|1645625676345|bd6d99d0-107e-4891-9da6-f243b51323bc|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_34 |99bb3a25-669f-4d55-a36f-4ae0b76f76de|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.6626987497394154 |0.22504711188369042|driver-213|0.35712946224267583|0.244841817279154  |10.72756362186601 |rider-213|1645326839179|99bb3a25-669f-4d55-a36f-4ae0b76f76de|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_35 |bd4ae628-3885-4b26-8a50-c14f8e42a265|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.11488393157088261|0.6273212202489661 |driver-213|0.7454678537511295 |0.3954939864908973 |27.79478688582596 |rider-213|1645094601577|bd4ae628-3885-4b26-8a50-c14f8e42a265|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_36 |59d2ddd0-e836-4443-a816-0ce489c004f2|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.5751612868373159 |0.46940431249093517|driver-213|0.6855658616896665 |0.12686440203574556|11.212022663263122|rider-213|1645283606578|59d2ddd0-e836-4443-a816-0ce489c004f2|americas/united_states/san_francisco|
|20220223222328     |20220223222328_1_37 |5d149bc7-78a8-46df-b2b0-a038dc79e378|americas/united_states/san_francisco|42e6c711-76e7-4b7c-a6d9-80b1e7aa61a1-0_1-28-29_20220223222328.parquet|0.1856488085068272 |0.9694586417848392 |driver-213|0.38186367037201974|0.25252652214479043|33.92216483948643 |rider-213|1645133755620|5d149bc7-78a8-46df-b2b0-a038dc79e378|americas/united_states/san_francisco|
+-------------------+--------------------+------------------------------------+------------------------------------+---------------------------------------------------------------------+-------------------+-------------------+----------+-------------------+-------------------+------------------+---------+-------------+------------------------------------+------------------------------------+
only showing top 5 rows
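The two "as.of.instant" values used above denote the same moment in two notations. As a small illustration of how they relate, the sketch below normalizes either notation to the 14-digit form that appears in `_hoodie_commit_time`, using only `java.time`. The object `InstantFormats` and the method `toCompactInstant` are hypothetical names; this is not Hudi's own parsing code, only a sketch that assumes the two formats named in this section.

```scala
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object InstantFormats {

  // Compact form, as seen in the _hoodie_commit_time column
  private val compact = DateTimeFormatter.ofPattern("yyyyMMddHHmmss")
  // Human-readable form accepted by the "as.of.instant" option in this article
  private val readable = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")

  // Hypothetical helper: returns the 14-digit form for either input notation
  // and rejects anything that is not a valid date-time in one of the two.
  def toCompactInstant(value: String): String = {
    val v = value.trim
    v.length match {
      case 14 => LocalDateTime.parse(v, compact).format(compact)
      case 19 => LocalDateTime.parse(v, readable).format(compact)
      case n  => throw new IllegalArgumentException(s"unsupported instant format (length $n): $value")
    }
  }

  def main(args: Array[String]): Unit = {
    println(toCompactInstant("2022-02-23 22:23:28"))
    println(toCompactInstant("20220223222328"))
  }
}
```

Validating the value before passing it to `.option("as.of.instant", ...)` catches typos such as an impossible month or a missing digit before the query runs.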

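Note: The posts in this Hudi series are study notes based on the official Hudi website, with some personal interpretation added. Please bear with any shortcomings ☺☺☺

Note: Links to other related articles (all data lake posts, including those on Hudi) can be found here -> Data lake article index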

