stSahana · Big Data
Welcome to follow; comments and exchange are appreciated.
Hive: deleting a __HIVE_DEFAULT_PARTITION__ partition
Dirty data written into Hive produced a `__HIVE_DEFAULT_PARTITION__` partition, and trying to remove it with `alter table tmp.test drop partition(eventdate>20210329)` fails with an error. Hive version: 1.1.0, CDH: 5.13.0. Locate the partition in the metastore database: log in to Hive's metastore and run `select * from PARTITIONS where PART_NAME like '%__HIVE_DEFAULT_PARTITION__%';` to find the PART_ID… Original · 2021-04-01 21:29:14 · 3459 reads · 1 comment
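A sketch of the metastore-side cleanup the excerpt starts to describe, assuming a MySQL-backed Hive 1.x metastore with the default CDH 5.x table names; `1234` is a placeholder for the PART_ID the first query returns, and depending on the schema other tables (e.g. `SDS`) may also reference the partition:

```sql
-- Find the bad partition's PART_ID (placeholder result: 1234).
SELECT PART_ID, PART_NAME
  FROM PARTITIONS
 WHERE PART_NAME LIKE '%__HIVE_DEFAULT_PARTITION__%';

-- Delete child rows first, then the partition row itself.
DELETE FROM PARTITION_KEY_VALS WHERE PART_ID = 1234;
DELETE FROM PARTITION_PARAMS  WHERE PART_ID = 1234;
DELETE FROM PARTITIONS        WHERE PART_ID = 1234;
```

The partition's directory on HDFS, if any, has to be removed separately; the metastore delete only drops the metadata.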
Frequently used Spark snippets
Spark snippets I use day to day, kept handy for copy-pasting. Writing to ES:

```scala
val options = Map(
  "es.index.auto.create" -> "true",
  "es.nodes.wan.only"    -> "true",
  "es.nodes"             -> "192.168.3.1:9200",
  "es.port"              -> "9200",
  "es.mapping.id"        -> "id"
)
import org.apache.spark.sql.SaveMode
….withColumn("i…
```

Original · 2021-03-23 17:18:30 · 946 reads · 0 comments
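A minimal sketch of how those ES options are typically used with the elasticsearch-spark connector (`org.elasticsearch:elasticsearch-spark-20`); the host, index name, and the `id` column are placeholders, and on ES 7.x the `index/type` path becomes just the index name:

```scala
import org.apache.spark.sql.SparkSession

object EsWriteSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("es-write").getOrCreate()
    import spark.implicits._

    val options = Map(
      "es.index.auto.create" -> "true",
      "es.nodes.wan.only"    -> "true",
      "es.nodes"             -> "192.168.3.1:9200",
      "es.port"              -> "9200",
      "es.mapping.id"        -> "id"   // use the "id" column as the document _id
    )

    val df = Seq((1, "a"), (2, "b")).toDF("id", "value")
    df.write
      .format("org.elasticsearch.spark.sql")
      .options(options)
      .mode("append")
      .save("demo_index/doc")          // placeholder index/type

    spark.stop()
  }
}
```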
Flink side outputs
Reading the same stream twice with two filters:

```java
DataStream<String> text = env.socketTextStream(hostname, port, "\n");
text.filter(e -> Integer.parseInt(e) > 20).print();
text.filter(e -> Integer.parseInt(e) > 10).print();
env.execute();
```

The execution graph looks like… `DataStream<String> text = env.socketTextS…` Original · 2020-07-11 17:05:03 · 215 reads · 0 comments
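The two `filter()` passes above traverse the stream twice; a hedged sketch of the side-output alternative the title refers to, where one `ProcessFunction` parses each element once and routes it to a main and a side output (host and port are placeholders):

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> text = env.socketTextStream("localhost", 9999, "\n");

        // Anonymous subclass so Flink can capture the tag's type information.
        final OutputTag<String> over10 = new OutputTag<String>("over-10") {};

        SingleOutputStreamOperator<String> over20 = text.process(
            new ProcessFunction<String, String>() {
                @Override
                public void processElement(String value, Context ctx, Collector<String> out) {
                    int n = Integer.parseInt(value);   // parsed once per element
                    if (n > 20) out.collect(value);        // main output
                    if (n > 10) ctx.output(over10, value); // side output
                }
            });

        over20.print();                        // values > 20
        over20.getSideOutput(over10).print();  // values > 10
        env.execute();
    }
}
```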
工具类--hdfs小文件合并
package cn.ac.iieimport org.apache.hadoop.conf.Configurationimport org.apache.hadoop.fs.{FileStatus, Path}import org.apache.spark.sql.SparkSessionobject MergerFile { def main(args: Array[String]): Unit = { val spark: SparkSession = SparkSessio原创 2020-06-22 21:05:22 · 631 阅读 · 0 评论 -
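The excerpt cuts off before the merge logic, so here is a hedged sketch of the usual approach such a utility takes, not the article's exact code: read the directory, `coalesce` to a small partition count, write to a temporary path, then swap directories through the HDFS API. Paths and the target file count are placeholders:

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

object MergeSmallFilesSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("merge-small-files").getOrCreate()
    val src = "/data/input"          // placeholder source directory
    val tmp = "/data/input_merged"   // placeholder staging directory

    // Rewrite the directory's contents into a few large files.
    spark.read.textFile(src)
      .coalesce(4)                   // target number of output files
      .write.text(tmp)

    // Replace the original directory with the merged one.
    val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
    fs.delete(new Path(src), true)
    fs.rename(new Path(tmp), new Path(src))
    spark.stop()
  }
}
```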
hive简易配置
需要提前配置好hadooptar -zxvf apache-hive-2.3.1-bin.tar.gz mv apache-hive-2.3.1-bin /usr/local/hiveecho export HIVE_HOME=/usr/local/hive>>~/.bashrcecho export PATH=$PATH:$HIVE_HOME/bin>>~/.bashrcecho expo原创 2017-11-28 16:20:23 · 250 阅读 · 0 评论 -
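One caveat with the steps above: unquoted, `$PATH` expands at `echo` time, so the current PATH gets frozen into `~/.bashrc`. A sketch with single quotes so the variables are written literally (the metastore-init line is an assumption based on standard Hive 2.x setup, not part of the excerpt):

```shell
tar -zxvf apache-hive-2.3.1-bin.tar.gz
mv apache-hive-2.3.1-bin /usr/local/hive
echo 'export HIVE_HOME=/usr/local/hive' >> ~/.bashrc
echo 'export PATH=$PATH:$HIVE_HOME/bin' >> ~/.bashrc
source ~/.bashrc
# Hive 2.x also needs the metastore schema initialized before first use:
schematool -dbType derby -initSchema
```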
Hadoop 伪分布式环境搭建
参考网址: http://hadoop.apache.org/docs/r2.8.2/hadoop-project-dist/hadoop-common/SingleCluster.html#Pseudo-Distributed_Operation 需提前配置好java环境,下载安装包并解压好,进入加压后的文件夹配置信息vi etc/hadoop/core-site.xml 内容如下<c原创 2017-11-12 23:15:04 · 249 阅读 · 0 评论 -
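The excerpt truncates at the `core-site.xml` contents; per the linked Apache guide, the minimal pseudo-distributed configuration is:

```xml
<!-- etc/hadoop/core-site.xml: single-node HDFS endpoint -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
```

The same guide pairs this with an `etc/hadoop/hdfs-site.xml` that sets `dfs.replication` to `1`, since there is only one DataNode.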
spark读写HBASE
环境配置scala -> 2.11.12spark->2.2.0HBASE ->1.4.0 注意:用2.0的jar包写入不进去,但也不报错/** * spark直接读写Hbase,已测试 * @Author: stsahana * @Date: 2019-8-21 18:27 **/object HbaseDemo { def main(ar...原创 2019-08-26 21:51:31 · 400 阅读 · 0 评论 -
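The excerpt cuts off at `main`, so here is a hedged sketch of a common way to do the write side with the HBase 1.4 client API (the version the note above says works); the ZooKeeper quorum, table, and column names are placeholders, not the article's values:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.sql.SparkSession

object HbaseWriteSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("hbase-write").getOrCreate()

    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "localhost")       // placeholder quorum
    conf.set(TableOutputFormat.OUTPUT_TABLE, "demo_table") // placeholder table

    val job = Job.getInstance(conf)
    job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])

    spark.sparkContext.parallelize(Seq(("row1", "v1"), ("row2", "v2")))
      .map { case (rowKey, value) =>
        val put = new Put(Bytes.toBytes(rowKey))
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
        (new ImmutableBytesWritable, put)
      }
      .saveAsNewAPIHadoopDataset(job.getConfiguration)

    spark.stop()
  }
}
```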
spark自定义函数
参考:1. https://spark.apache.org/docs/latest/sql-getting-started.html#untyped-user-defined-aggregate-functions2.https://spark.apache.org/docs/latest/sql-getting-started.html#type-safe-user-defined-agg...原创 2019-08-26 22:08:38 · 378 阅读 · 0 评论
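Before the aggregate functions those links cover, the simplest case is a scalar UDF; a minimal sketch (the function name `upperCase` and the sample data are placeholders):

```scala
import org.apache.spark.sql.SparkSession

object UdfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("udf-demo").master("local[1]").getOrCreate()
    import spark.implicits._

    // Register a scalar UDF usable from SQL.
    spark.udf.register("upperCase", (s: String) => s.toUpperCase)

    Seq("a", "b").toDF("letter").createOrReplaceTempView("letters")
    spark.sql("SELECT upperCase(letter) AS u FROM letters").show()

    spark.stop()
  }
}
```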