Spark Tech Stack Notes

Latest recommended article published on 2023-10-13 13:18:36

Haven.Liu

Categories: Big Data, Spark. Tags: SparkSQL

Article link: https://blog.csdn.net/LiuKingJia/article/details/107644800

I. REFRESH TABLE

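Problem 1. With Spark on Hive, Spark cannot read the data in a Hive table, even though the same query returns rows in Hive SQL.

Problem 2. Spark SQL and Hive SQL return different results for the same query.

Fix: refresh the table metadata that Spark has cached: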

REFRESH TABLE test.dws_d_driver

Refreshing from code:

[1] Refresh via HiveContext

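import org.apache.spark.sql.hive.HiveContext

// sc is an existing SparkContext. HiveContext is deprecated since Spark 2.0 in favor of SparkSession.
val hiveContext = new HiveContext(sc)
hiveContext.refreshTable("tableName")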

[2] Refresh via Spark SQL

sql_context.sql("REFRESH TABLE table_name")

Official documentation:

refreshTable(String tableName)

Invalidate and refresh all the cached the metadata of the given table. For performance reasons, Spark SQL or the external data source library it uses might cache certain metadata about a table, such as the location of blocks. When those change outside of Spark SQL, users should call this function to invalidate the cache.

http://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/hive/HiveContext.html#refreshTable-java.lang.String-
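
On Spark 2.x and later, where SparkSession replaces HiveContext, the same refresh can be sketched in PySpark roughly as follows. This is a minimal sketch, not the author's code: the helper names `hive_session` and `refresh_and_count` and the app name are hypothetical, and it assumes PySpark is installed and a Hive metastore is reachable.

```python
def hive_session(app_name="refresh-table-demo"):
    """Create (or reuse) a Hive-enabled SparkSession. The app name is hypothetical."""
    # Imported lazily so the helper below can be used without PySpark at import time.
    from pyspark.sql import SparkSession

    return (
        SparkSession.builder
        .appName(app_name)
        .enableHiveSupport()
        .getOrCreate()
    )


def refresh_and_count(spark, table_name):
    """Invalidate Spark's cached metadata for the table, then count its rows."""
    # Catalog.refreshTable is the SparkSession-era counterpart of
    # HiveContext.refreshTable.
    spark.catalog.refreshTable(table_name)
    # SQL form of the same refresh: spark.sql(f"REFRESH TABLE {table_name}")
    return spark.table(table_name).count()
```

With a session in hand, `refresh_and_count(hive_session(), "test.dws_d_driver")` refreshes the table before reading it, so files written by Hive after Spark cached the metadata should be picked up.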

II. spark-submit parameter reference

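Parameter	Description
--master	Master URL, i.e. where the job is submitted to run, e.g. spark://host:port, yarn, local
--deploy-mode	Whether to launch the driver locally (client) or inside the cluster (cluster). Default: client
--class	The application's main class (Java/Scala applications only)
--name	Name of the application
--jars	Comma-separated list of jars to include on the driver and executor classpaths
--packages	Comma-separated list of Maven coordinates of jars to include on the driver and executor classpaths
--exclude-packages	Comma-separated list of groupId:artifactId entries to exclude when resolving --packages, to avoid dependency conflicts
--repositories	Comma-separated list of additional remote repositories to search for the coordinates given with --packages
--conf PROP=VALUE	Sets a Spark configuration property, e.g. --conf spark.executor.extraJavaOptions="-XX:MaxPermSize=256m"
--properties-file	File to load extra properties from. Default: conf/spark-defaults.conf
--driver-memory	Driver memory. Default: 1G
--driver-java-options	Extra Java options to pass to the driver
--driver-library-path	Extra library path entries to pass to the driver
--driver-class-path	Extra classpath entries to pass to the driver
--driver-cores	Number of cores used by the driver. Default: 1. Used on YARN or standalone, in cluster deploy mode only
--executor-memory	Memory per executor. Default: 1G
--total-executor-cores	Total number of cores across all executors. Used on Mesos or standalone only
--num-executors	Number of executors to launch. Default: 2. Used on YARN
--executor-cores	Number of cores per executor. Used on YARN or standalone. Default: 1 on YARN, all available cores on the worker in standalone mode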
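
To show how the options in the table combine on one command line, here is a small sketch that assembles a spark-submit invocation. The helper `build_spark_submit`, the jar path, the class name, and all resource values are hypothetical and chosen only for illustration.

```python
import shlex


def build_spark_submit(app_jar, main_class, options=None, conf=None, app_args=()):
    """Assemble a spark-submit argument list.

    options: mapping of flag name (without leading dashes) to its value
    conf:    mapping of Spark property to value, each emitted as --conf key=value
    """
    cmd = ["spark-submit", "--class", main_class]
    for flag, value in (options or {}).items():
        cmd += [f"--{flag}", str(value)]
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    # spark-submit options go before the application jar; anything after
    # the jar is passed to the application itself.
    cmd.append(app_jar)
    cmd += [str(arg) for arg in app_args]
    return cmd


cmd = build_spark_submit(
    app_jar="/path/to/app.jar",      # hypothetical path
    main_class="com.example.MyApp",  # hypothetical class
    options={
        "master": "yarn",
        "deploy-mode": "cluster",
        "name": "demo-job",
        "driver-memory": "2g",
        "executor-memory": "4g",
        "executor-cores": 2,
        "num-executors": 10,
    },
    conf={"spark.sql.shuffle.partitions": 200},
    app_args=["2023-10-13"],
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

Building the command as a list keeps each flag and its value as separate arguments, so it can be handed to `subprocess.run(cmd)` without shell quoting problems; `shlex.join` is used only for display.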