Iceberg Data Lake in Practice, Lesson 13: Metadata Many Times Larger Than the Data Files

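This article documents problems encountered while building a data lake with Iceberg: the metadata files are far larger than the data files, compacting small files does not shrink the metadata, and the snapshot-expiration code does not clean the metadata up. By inspecting the table's directory layout and the cleanup logs, the author finds a large number of metadata.json files under the metadata directory that snapshot expiration does not delete. The fix is left for further study.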

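Articles in this series

Iceberg Data Lake in Practice, Lesson 1: Getting started
Iceberg Data Lake in Practice, Lesson 2: Iceberg's underlying data format on Hadoop
Iceberg Data Lake in Practice, Lesson 3: Reading from Kafka into Iceberg with SQL in the SQL client
Iceberg Data Lake in Practice, Lesson 4: Reading from Kafka into Iceberg with SQL in the SQL client (upgraded to Flink 1.12.7)
Iceberg Data Lake in Practice, Lesson 5: Characteristics of the Hive catalog
Iceberg Data Lake in Practice, Lesson 6: Fixing failed writes from Kafka to Iceberg
Iceberg Data Lake in Practice, Lesson 7: Real-time writes to Iceberg
Iceberg Data Lake in Practice, Lesson 8: Integrating Hive with Iceberg
Iceberg Data Lake in Practice, Lesson 9: Compacting small files
Iceberg Data Lake in Practice, Lesson 10: Deleting snapshots
Iceberg Data Lake in Practice, Lesson 11: End-to-end test of a partitioned table (generating data, creating the table, compaction, deleting snapshots)
Iceberg Data Lake in Practice, Lesson 12: What is a catalog
Iceberg Data Lake in Practice, Lesson 13: Metadata many times larger than the data files
Iceberg Data Lake in Practice, Lesson 14: Metadata compaction (solving metadata bloat that grows over time)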



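The problem

Data is written to Iceberg continuously, and compaction and snapshot expiration are run as well. The snapshot (snap-*.avro) and manifest files do get cleaned up, but there is no sign that the metadata.json files are ever removed.

The data directory holds only 6.3 M in 21 files, while the metadata directory holds 33.1 G in 8715 files. Expiring every snapshot older than five minutes before the latest snapshot made no difference to these numbers.

The solution? Still to be worked out; watch for updates in later lessons.

How the affected table was created

The table was created in the SQL client on top of a Hive catalog; see Lesson 11 for the CREATE TABLE statement.
The problem already showed up at the end of Lesson 11. It gets its own article here because of how important it is.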

The problem after compacting small files in Iceberg (current state)

File sizes

[root@hadoop103 ~]# hadoop fs -du -h   /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/
6.3 M   /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data
33.1 G  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata

File counts

[root@hadoop101 ~]# hadoop fs -du -h /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data|wc
     21      61    2940
[root@hadoop101 ~]# hadoop fs -du -h /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata|wc
   8715   26144 1246221

metadata directory

-rw-r--r--   2 root supergroup    8118751 2022-01-26 11:19 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08690-b9a3c862-443e-4f6b-a1fc-c17fe3e517dc.metadata.json
-rw-r--r--   2 root supergroup    8119685 2022-01-26 11:20 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08691-34894f4a-d881-4b8f-b228-7adba992a08f.metadata.json
-rw-r--r--   2 root supergroup    8120615 2022-01-26 11:21 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08692-1ce25766-4ca5-473e-945f-3fd848cae5e3.metadata.json
-rw-r--r--   2 root supergroup    8121549 2022-01-26 11:22 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08693-4bd481a5-f32b-4f15-aad7-4cd3a5af6b39.metadata.json
-rw-r--r--   2 root supergroup    8122483 2022-01-26 11:23 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08694-4f3554aa-4db7-443d-bbb9-ac0871ec02da.metadata.json
-rw-r--r--   2 root supergroup    8123417 2022-01-26 11:24 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08695-e8bf9bda-44e7-4624-83a2-d64db09f5660.metadata.json
-rw-r--r--   2 root supergroup    8124351 2022-01-26 11:25 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08696-2b95f1d4-6843-41e6-9e16-77bbe1875b7f.metadata.json
-rw-r--r--   2 root supergroup    8125285 2022-01-26 11:26 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08697-f11c1b8f-f987-4589-8159-521c65328163.metadata.json
-rw-r--r--   2 root supergroup    8126219 2022-01-26 11:27 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08698-fb8b744a-db03-4b80-8612-15de1d6278cc.metadata.json
-rw-r--r--   2 root supergroup    8127153 2022-01-26 11:28 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08699-a6b6683d-d9f1-45a1-a09b-b242a8284b96.metadata.json
-rw-r--r--   2 root supergroup    8128087 2022-01-26 11:29 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08700-cad78b24-8cd7-464f-95fe-296e96bfd648.metadata.json
-rw-r--r--   2 root supergroup    8129021 2022-01-26 11:30 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08701-0f702902-b2ae-4029-b8cd-97b5df0474ff.metadata.json
-rw-r--r--   2 root supergroup    8129955 2022-01-26 11:31 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08702-91dbcc1f-9d40-4662-874e-8f1091c0a52f.metadata.json
-rw-r--r--   2 root supergroup    8130889 2022-01-26 11:32 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08703-2c78ad8f-69ff-408f-afec-8d707ff944e8.metadata.json
-rw-r--r--   2 root supergroup    8131823 2022-01-26 11:33 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08704-84085a27-b185-468f-9c23-2984a9330762.metadata.json
-rw-r--r--   2 root supergroup    8132757 2022-01-26 11:34 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08705-edc7f661-0ed2-4e46-82a0-a2006dd01ad5.metadata.json
-rw-r--r--   2 root supergroup    8133691 2022-01-26 11:35 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08706-9c3378aa-21cb-48bf-be52-70b25ea59308.metadata.json
-rw-r--r--   2 root supergroup    8343948 2022-01-27 11:52 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08707-afd79c3c-e280-45c4-9797-2fa9a4fa27f4.metadata.json
-rw-r--r--   2 root supergroup    8344913 2022-01-27 14:16 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08708-75efd8f6-ba3f-47dc-8b89-b3177c477a62.metadata.json
-rw-r--r--   2 root supergroup    8345875 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08709-78209251-777c-4a4f-9292-64cf3f2190ae.metadata.json
-rw-r--r--   2 root supergroup      23219 2022-01-27 15:17 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08710-d69a0a2b-959e-488d-8443-471986f49e32.metadata.json
-rw-r--r--   2 root supergroup       5777 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m0.avro
-rw-r--r--   2 root supergroup       6441 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m1.avro
-rw-r--r--   2 root supergroup       5771 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m2.avro
-rw-r--r--   2 root supergroup       3844 2022-01-27 14:38 /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/snap-7762404597294868190-1-6c6d7719-74a9-4817-914a-b0df5eb8f6ba.avro
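
A rough check on where the 33.1 G comes from, as a minimal sketch: in the listing above each metadata.json is about 930 bytes larger than the one before it, which would fit every commit appending one more entry to the file. Assuming (a simplification, not something measured here) that version k is about k times that increment in size and that none of versions 1 to 8709 was ever deleted, the total is an arithmetic series. The object and method names are made up for this illustration.

```scala
object MetadataGrowthEstimate {
  // Sizes in bytes of three consecutive metadata.json files from the listing
  // (versions 08690, 08691 and 08692).
  val observedSizes: Seq[Long] = Seq(8118751L, 8119685L, 8120615L)

  /** Average growth in bytes between consecutive metadata.json versions. */
  def averageGrowth(sizes: Seq[Long]): Double = {
    val deltas = sizes.zip(sizes.tail).map { case (prev, next) => next - prev }
    deltas.sum.toDouble / deltas.size
  }

  /** Total bytes on disk if version k has size k * growthPerCommit
    * and every version from 1 to `versions` is still present. */
  def totalBytes(versions: Long, growthPerCommit: Double): Double =
    growthPerCommit * versions * (versions + 1) / 2.0

  def main(args: Array[String]): Unit = {
    val growth = averageGrowth(observedSizes)
    // hadoop fs -du -h reports binary units, so convert to GiB for comparison.
    val totalGiB = totalBytes(8709L, growth) / math.pow(1024, 3)
    println(f"growth per commit: $growth%.0f bytes, estimated total: $totalGiB%.1f GiB")
  }
}
```

Under those assumptions the estimate lands within a few percent of the 33.1 G reported by hadoop fs -du, which suggests that the bulk of the directory is simply every old metadata.json still sitting on disk.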

Sizes in human-readable form

7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08684-d4af58ae-4967-48a6-ac40-9308a075fe00.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08685-89f09f2f-6cdf-43d8-acc2-79496dcaf18d.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08686-9be5033f-2592-4696-9c2f-5d1d408910c6.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08687-f111331a-599f-4068-9590-e57c76e46c31.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08688-18779a1c-fd2d-43c2-9c62-4d1efb4caed2.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08689-a1bfd5ea-23a1-431b-8208-a82f2561952e.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08690-b9a3c862-443e-4f6b-a1fc-c17fe3e517dc.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08691-34894f4a-d881-4b8f-b228-7adba992a08f.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08692-1ce25766-4ca5-473e-945f-3fd848cae5e3.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08693-4bd481a5-f32b-4f15-aad7-4cd3a5af6b39.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08694-4f3554aa-4db7-443d-bbb9-ac0871ec02da.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08695-e8bf9bda-44e7-4624-83a2-d64db09f5660.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08696-2b95f1d4-6843-41e6-9e16-77bbe1875b7f.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08697-f11c1b8f-f987-4589-8159-521c65328163.metadata.json
7.7 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08698-fb8b744a-db03-4b80-8612-15de1d6278cc.metadata.json
7.8 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08699-a6b6683d-d9f1-45a1-a09b-b242a8284b96.metadata.json
7.8 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08700-cad78b24-8cd7-464f-95fe-296e96bfd648.metadata.json
7.8 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08701-0f702902-b2ae-4029-b8cd-97b5df0474ff.metadata.json
7.8 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08702-91dbcc1f-9d40-4662-874e-8f1091c0a52f.metadata.json
7.8 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08703-2c78ad8f-69ff-408f-afec-8d707ff944e8.metadata.json
7.8 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08704-84085a27-b185-468f-9c23-2984a9330762.metadata.json
7.8 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08705-edc7f661-0ed2-4e46-82a0-a2006dd01ad5.metadata.json
7.8 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08706-9c3378aa-21cb-48bf-be52-70b25ea59308.metadata.json
8.0 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08707-afd79c3c-e280-45c4-9797-2fa9a4fa27f4.metadata.json
8.0 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08708-75efd8f6-ba3f-47dc-8b89-b3177c477a62.metadata.json
8.0 M     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08709-78209251-777c-4a4f-9292-64cf3f2190ae.metadata.json
22.7 K    /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08710-d69a0a2b-959e-488d-8443-471986f49e32.metadata.json
5.6 K     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m0.avro
6.3 K     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m1.avro
5.6 K     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/6c6d7719-74a9-4817-914a-b0df5eb8f6ba-m2.avro
3.8 K     /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/snap-7762404597294868190-1-6c6d7719-74a9-4817-914a-b0df5eb8f6ba.avro

data directory:

[root@hadoop101 ~]# hadoop fs -du -h /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data
169.1 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-3c21e5b1-54e8-42b1-8bdc-a0b8f1514ee1-00001.parquet
169.0 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-3c21e5b1-54e8-42b1-8bdc-a0b8f1514ee1-00002.parquet
169.1 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-3c21e5b1-54e8-42b1-8bdc-a0b8f1514ee1-00003.parquet
3.1 M    /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-cdcc5019-0c59-41e4-80c6-1d4185455065-00001.parquet
508      /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00000-0-dd8bc29f-831a-4904-830e-2ef56e4a4743-08707.parquet
169.0 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00001-0-139af0f5-d3ee-4f35-bd2e-73ce2aaf4792-00001.parquet
169.1 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00001-0-139af0f5-d3ee-4f35-bd2e-73ce2aaf4792-00002.parquet
169.1 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00001-0-139af0f5-d3ee-4f35-bd2e-73ce2aaf4792-00003.parquet
552      /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00001-0-e9e8a782-fa82-4c4d-9786-c05b8aab251a-08707.parquet
5.9 K    /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-a0f46641-b14d-4f8b-a16e-4c768bcba775-00109.parquet
169.1 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-fe001b68-3753-44a7-adb4-63d43c8b3226-00001.parquet
164.7 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-fe001b68-3753-44a7-adb4-63d43c8b3226-00002.parquet
169.2 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-fe001b68-3753-44a7-adb4-63d43c8b3226-00003.parquet
169.0 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00002-0-fe001b68-3753-44a7-adb4-63d43c8b3226-00004.parquet
169.2 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00003-0-1d71db79-abf1-4088-9282-bc907e45e262-00001.parquet
169.0 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00003-0-1d71db79-abf1-4088-9282-bc907e45e262-00002.parquet
168.9 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00003-0-1d71db79-abf1-4088-9282-bc907e45e262-00003.parquet
168.9 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00003-0-1d71db79-abf1-4088-9282-bc907e45e262-00004.parquet
527.5 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00004-0-fea6f5d5-759f-4769-9ced-b3ecca214e36-00001.parquet
169.0 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00004-0-fea6f5d5-759f-4769-9ced-b3ecca214e36-00002.parquet
168.8 K  /user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/data/00004-0-fea6f5d5-759f-4769-9ced-b3ecca214e36-00003.parquet

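Code that expires every snapshot older than five minutes before the latest snapshot

The program below first compacts the data files, then expires all snapshots older than five minutes before the latest one.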


import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment
import org.apache.hadoop.conf.Configuration
import org.apache.iceberg.catalog.{Namespace, TableIdentifier}
import org.apache.iceberg.flink.actions.Actions
import org.apache.iceberg.flink.{CatalogLoader, TableLoader}
import org.apache.log4j.{Level, Logger}

import java.util
import java.util.concurrent.TimeUnit

object FlinkDataStreamSmallFileCompactTest {

  def main(args: Array[String]): Unit = {
    Logger.getLogger("org.apache").setLevel(Level.INFO)
    Logger.getLogger("hive.metastore").setLevel(Level.WARN)
    Logger.getLogger("akka").setLevel(Level.WARN)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    System.setProperty("HADOOP_USER_NAME", "root")

    // Hive catalog properties
    val map = new util.HashMap[String, String]()
    map.put("type", "iceberg")
    map.put("catalog-type", "hive")
    map.put("property-version", "2")
    map.put("warehouse", "/user/hive/warehouse") // the key was misspelled as "/warehouse"
    map.put("uri", "thrift://hadoop101:9083")
    // Pass the properties built above. The original passed an empty map, so the
    // catalog was configured only by the hive-site.xml found on the classpath.
    val icebergCatalog = CatalogLoader.hive(
      "hive_catalog6", // catalog name
      new Configuration(),
      map
    )
    val identifier = TableIdentifier.of(
      Namespace.of("iceberg_db6"), // database name
      "behavior_log_ib6")          // table name (the other test table is behavior_with_date_log_ib)
    val loader = TableLoader.fromCatalog(icebergCatalog, identifier)
    loader.open()
    try {
      val table = loader.loadTable()

      // Compact small data files
      Actions.forTable(env, table)
        .rewriteDataFiles
        .maxParallelism(5)
        .targetSizeInBytes(128 * 1024 * 1024)
        .execute

      // Expire snapshots older than five minutes before the current snapshot.
      // Check for null before reading the timestamp: an empty table has no snapshot.
      val snapshot = table.currentSnapshot
      if (snapshot != null) {
        val old = snapshot.timestampMillis - TimeUnit.MINUTES.toMillis(5)
        table.expireSnapshots
          .expireOlderThan(old)
          .commit()
        println(s" ${identifier.name} table cleanup finished!!!")
      }
    } finally {
      loader.close()
    }
  }
}

Cleanup log:
Finding: nothing was cleaned up.

22/02/10 19:48:51 INFO conf.HiveConf: Found configuration file file:/E:/workspace/jt_workspace/iceberg-learning/flink-iceberg-learning/target/classes/hive-site.xml
22/02/10 19:48:51 WARN conf.HiveConf: HiveConf of name hive.metastore.event.db.notification.api.auth does not exist
22/02/10 19:48:51 INFO security.JniBasedUnixGroupsMapping: Error getting groups for root: Unknown error.
22/02/10 19:48:51 WARN security.UserGroupInformation: No groups available for user root
22/02/10 19:48:51 INFO iceberg.BaseMetastoreTableOperations: Refreshing table metadata from new version: hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_log_ib6/metadata/08710-d69a0a2b-959e-488d-8443-471986f49e32.metadata.json
22/02/10 19:48:56 INFO iceberg.BaseMetastoreCatalog: Table loaded by catalog: hive_catalog6.iceberg_db6.behavior_log_ib6
22/02/10 19:48:56 INFO iceberg.BaseTableScan: Scanning table hive_catalog6.iceberg_db6.behavior_log_ib6 snapshot 7762404597294868190 created at 2022-01-27 14:38:10.105 with filter true
22/02/10 19:48:56 INFO iceberg.RemoveSnapshots: Expiring snapshots older than: Thu Jan 27 14:33:10 CST 2022 (1643265190105)
22/02/10 19:48:56 INFO iceberg.BaseMetastoreTableOperations: Nothing to commit.
22/02/10 19:48:56 INFO iceberg.RemoveSnapshots: Committed snapshot changes
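
One possible reading of the log above: the table seems to hold only its current snapshot (the metadata listing shows a single snap-*.avro file), and the current snapshot is never expired, so there is nothing left for expireOlderThan to remove. The sketch below models that rule in plain Scala. It is a simplification for illustration, not Iceberg's implementation, and `Snap` and `expired` are hypothetical names.

```scala
object ExpireSnapshotsModel {
  final case class Snap(id: Long, timestampMillis: Long)

  /** Simplified rule: expire snapshots strictly older than the cutoff,
    * but never the current snapshot. */
  def expired(snapshots: Seq[Snap], currentId: Long, cutoffMillis: Long): Seq[Snap] =
    snapshots.filter(s => s.id != currentId && s.timestampMillis < cutoffMillis)

  def main(args: Array[String]): Unit = {
    // The snapshot id comes from the log; its timestamp is the logged cutoff plus five minutes.
    val current = Snap(7762404597294868190L, 1643265490105L)
    val cutoff = current.timestampMillis - 5 * 60 * 1000L

    // Only the current snapshot exists, so nothing qualifies.
    println(expired(Seq(current), current.id, cutoff).size)

    // A hypothetical older snapshot (id 1 is invented) would be expired.
    val older = Snap(1L, cutoff - 1)
    println(expired(Seq(older, current), current.id, cutoff).map(_.id))
  }
}
```

The cutoff computed here equals the 1643265190105 printed in the log, which is consistent with "Nothing to commit" being a result of the table's state rather than of a wrong cutoff.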

Log from expiring snapshots on another table, for comparison:

22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=6848801094803890889, timestamp_ms=1644485336293, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18788, added-data-files=3, added-records=5961, added-files-size=51810, changed-partition-count=2, total-records=4985317, total-files-size=43416360, total-data-files=105, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6848801094803890889-1-d96ba7dc-7ff2-40ad-a582-f33c987a6740.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=5895976650901516425, timestamp_ms=1644485396286, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18789, added-data-files=2, added-records=5960, added-files-size=50611, changed-partition-count=1, total-records=4991277, total-files-size=43466971, total-data-files=107, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5895976650901516425-1-a9f423cc-0133-4118-9292-016d5227f57a.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=3903341502082098658, timestamp_ms=1644485457083, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18790, added-data-files=2, added-records=5960, added-files-size=50631, changed-partition-count=1, total-records=4997237, total-files-size=43517602, total-data-files=109, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3903341502082098658-1-7a86b5d3-8c5e-4a9c-96c3-85a0c5fa3df0.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=1095796975631658317, timestamp_ms=1644485516288, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18791, added-data-files=2, added-records=5959, added-files-size=51052, changed-partition-count=1, total-records=5003196, total-files-size=43568654, total-data-files=111, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1095796975631658317-1-b071bfb7-3109-4a92-972d-c620138f7220.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=451594432613548689, timestamp_ms=1644485576287, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18792, added-data-files=2, added-records=5959, added-files-size=50810, changed-partition-count=1, total-records=5009155, total-files-size=43619464, total-data-files=113, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-451594432613548689-1-4de192bc-1b21-445b-903f-a88137b930c5.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=22739922463920002, timestamp_ms=1644485636293, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18793, added-data-files=2, added-records=5962, added-files-size=50713, changed-partition-count=1, total-records=5015117, total-files-size=43670177, total-data-files=115, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-22739922463920002-1-1c513718-42d8-41a0-82ea-486d2a4a3bbb.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=5013785705895265232, timestamp_ms=1644485696292, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18794, added-data-files=2, added-records=5961, added-files-size=50652, changed-partition-count=1, total-records=5021078, total-files-size=43720829, total-data-files=117, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5013785705895265232-1-03d4f1b3-c4ee-4217-b0f2-19168a8ed28e.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=2526947968329093048, timestamp_ms=1644485756306, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18795, added-data-files=4, added-records=5961, added-files-size=52941, changed-partition-count=2, total-records=5027039, total-files-size=43773770, total-data-files=121, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2526947968329093048-1-8336f6a1-7039-41a7-b736-229ce5bcf10a.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=2484166318625325659, timestamp_ms=1644485816296, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18796, added-data-files=2, added-records=5959, added-files-size=50849, changed-partition-count=1, total-records=5032998, total-files-size=43824619, total-data-files=123, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2484166318625325659-1-02b973a7-2012-4661-9147-145ea82b5126.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=1992367331685787804, timestamp_ms=1644485876293, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18797, added-data-files=2, added-records=5464, added-files-size=46683, changed-partition-count=1, total-records=5038462, total-files-size=43871302, total-data-files=125, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1992367331685787804-1-c0b03758-41c3-46bc-b157-8e846674b1e2.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Expired snapshot: BaseSnapshot{id=3398467964620293154, timestamp_ms=1644485936300, operation=append, summary={flink.job-id=78930f941991e19112d3917fd4dd4cb2, flink.max-committed-checkpoint-id=18798, added-data-files=3, added-records=5960, added-files-size=52223, changed-partition-count=2, total-records=5044422, total-files-size=43923525, total-data-files=128, total-delete-files=0, total-position-deletes=0, total-equality-deletes=0}, manifest-list=hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3398467964620293154-1-d10ea8c5-3986-45b1-bde6-6ed75148dce2.avro, schema-id=0}
22/02/10 17:44:27 INFO iceberg.RemoveSnapshots: Committed snapshot changes; cleaning up expired manifests and data files.
22/02/10 17:44:31 WARN iceberg.RemoveSnapshots: Manifests to delete: hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/278e6825-3381-47aa-a08b-4d86a1a0f0e6-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/c00444d4-86e4-4df9-b7b1-29bc15e203a5-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m5.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m1.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m4.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/c7fb0523-a144-4bcd-89f3-56c0984561d1-m21.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/92f9c63e-bc85-4965-9a75-b346fe797ad9-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m3.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/b98bd620-63f7-4cc5-8b77-1c6b4ba1cf95-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9380e713-bd4a-41b4-9140-704a7624d2bf-m6.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/836c917c-2207-400a-a74c-edc562a9603a-m0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/9305f3e1-9e54-4499-ae7e-8bacc7816c31-m0.avro
22/02/10 17:44:31 WARN iceberg.RemoveSnapshots: Manifests Lists to delete: hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-541878440800103826-1-1b8107b6-6f58-41b3-bca3-21bf624c4719.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7791706873901858756-1-b98bd620-63f7-4cc5-8b77-1c6b4ba1cf95.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5657396929463700436-1-c00444d4-86e4-4df9-b7b1-29bc15e203a5.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3398467964620293154-1-d10ea8c5-3986-45b1-bde6-6ed75148dce2.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6570416090976553560-1-e329179f-e202-41f6-852e-2585b46eee2e.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-4278660516617569111-1-d6a355d7-f7d4-4be6-b640-674aedea38d0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7075499429808392849-1-9ade184e-f771-4413-81a8-a968785638f9.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2342072444431983976-1-8a281c33-2d42-4828-adb9-0fcbc49cbacd.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1095796975631658317-1-b071bfb7-3109-4a92-972d-c620138f7220.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6005578883465127048-1-7ccf0fb1-9472-4ec4-8198-0dc6b911bdf7.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5532078125138954836-1-6bb16325-09df-4638-8a61-e02a6f5e53f6.avro, 
hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-4262237804586276768-1-e7c26525-29d1-4a1a-867e-cd9790a55068.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1722651238361119409-1-df33539d-c13f-4d07-aa7e-657b42df1f78.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-22739922463920002-1-1c513718-42d8-41a0-82ea-486d2a4a3bbb.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-468106048969373971-1-5d4446d8-d779-426a-8243-8b857383fd3e.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-8018294029736388458-1-c208faf8-7d30-474a-880c-8191db9cd448.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1470299392035948712-1-d9dd390e-ee51-4ffd-ae30-e91a6d019757.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1992367331685787804-1-c0b03758-41c3-46bc-b157-8e846674b1e2.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7058941970938557666-1-9602f8d7-4638-4d23-af5b-64d3382e1644.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5361802753278781380-1-9b81fdec-ecee-4330-8541-aea40c878268.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2183998972431095493-1-1b41dec9-c9e6-441e-bdfd-a5cbd52b11fc.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1078972720570425309-1-be9504a7-fb98-48df-9999-c2857d856af7.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3495751966676651473-1-5dfc4f7b-c16d-4429-8182-a375db8ec903.avro, 
hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2843486521572234923-1-754d9e82-2175-4762-8737-d95ae98200d4.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-8309783644936857381-1-9305f3e1-9e54-4499-ae7e-8bacc7816c31.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2484166318625325659-1-02b973a7-2012-4661-9147-145ea82b5126.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6848801094803890889-1-d96ba7dc-7ff2-40ad-a582-f33c987a6740.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-1559852676159610002-1-615b94f7-5a3a-42a3-9181-1ad9a2425427.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5487640863335657501-1-cf9145ca-9184-4095-9af5-625307270cde.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3606588897957810627-1-92f9c63e-bc85-4965-9a75-b346fe797ad9.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6539673233134379517-1-345c9b77-be31-449c-a67e-970b80078069.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5895976650901516425-1-a9f423cc-0133-4118-9292-016d5227f57a.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3903341502082098658-1-7a86b5d3-8c5e-4a9c-96c3-85a0c5fa3df0.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-9158469320395181971-1-7674e415-c2f5-4566-b251-20c2636dfc1f.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-2526947968329093048-1-8336f6a1-7039-41a7-b736-229ce5bcf10a.avro, 
hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7955617778669899471-1-1049aed0-3215-4267-82fe-e37df441957f.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-7923280809105826466-1-40b70bd0-e8fb-4186-8c1f-97a427649160.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-878441999283792062-1-68955262-5444-4898-8d98-f93736abcd9b.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-4364834558723325257-1-00fb7f19-4224-4da8-b1f0-d85ed241d7eb.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-451594432613548689-1-4de192bc-1b21-445b-903f-a88137b930c5.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5974895447555666685-1-36baa0b7-0f1f-4e1c-9595-69179fb09aa9.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6532101506813450600-1-5898b192-1821-48bd-9c6c-98cd496ba37a.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-8489936993001945197-1-353c552c-1595-495f-8e44-641f47ebf250.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-8127473867318873076-1-f0905056-841f-496b-a6fd-133ca6f121d2.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3511541622291330360-1-9380e713-bd4a-41b4-9140-704a7624d2bf.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-4031948148957742647-1-278e6825-3381-47aa-a08b-4d86a1a0f0e6.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-3212606031402422010-1-a70f8e62-8c34-432e-b752-9063ed2c902f.avro, 
hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-5013785705895265232-1-03d4f1b3-c4ee-4217-b0f2-19168a8ed28e.avro, hdfs://ns/user/hive/warehouse/hive_catalog6/iceberg_db6.db/behavior_with_date_log_ib/metadata/snap-6026165152411827559-1-5ca27510-3e7c-446e-8f69-eddd80bb2b66.avro
 behavior_with_date_log_ib table cleanup finished!!!

Process finished with exit code 0


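Summary

Characteristics of Iceberg file compaction and snapshot expiration:

Compaction: writes new files.
Snapshot expiration: deletes the snap-*.avro and manifest files, but it neither merges the metadata.json files nor cleans up the old ones.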
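
Iceberg documents two table properties aimed at this kind of build-up: write.metadata.delete-after-commit.enabled and write.metadata.previous-versions-max. The sketch below is a toy model of how they appear to interact: the newest metadata file tracks up to previous-versions-max earlier versions, and versions that fall out of that window are deleted after a commit only when the flag is on. It is not Iceberg code, the function name is hypothetical, and it has not been verified against this table.

```scala
object MetadataRetentionModel {
  /** Number of metadata.json files left on disk after `commits` commits. */
  def filesOnDisk(commits: Int, previousVersionsMax: Int, deleteAfterCommit: Boolean): Int = {
    var onDisk = Set.empty[Int]   // versions physically present in the metadata directory
    var log = Vector.empty[Int]   // previous versions tracked by the newest metadata file
    for (version <- 0 until commits) {
      if (version > 0) log = log :+ (version - 1)
      onDisk = onDisk + version
      if (log.size > previousVersionsMax) {
        // Versions beyond the window drop out of the log on every commit...
        val (dropped, kept) = log.splitAt(log.size - previousVersionsMax)
        log = kept
        // ...but they are removed from disk only when the flag is enabled.
        if (deleteAfterCommit) onDisk = onDisk -- dropped
      }
    }
    onDisk.size
  }

  def main(args: Array[String]): Unit = {
    println(filesOnDisk(8711, previousVersionsMax = 100, deleteAfterCommit = false))
    println(filesOnDisk(8711, previousVersionsMax = 10, deleteAfterCommit = true))
  }
}
```

With the flag off, all 8711 versions (00000 to 08710) stay on disk; together with the 3 manifests and 1 manifest list in the listing, that accounts for the 8715 entries counted earlier. With the flag on and a window of 10, the model leaves 11 files. On a real table the two properties would be set with something like table.updateProperties().set(...).commit(); files that had already dropped out of the window before the flag was enabled would presumably still need separate cleanup.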
