Big Data
Average article quality score: 52
Fei-joe
Connecting a Databricks Spark cluster to AWS S3 data
dbutils.fs.unmount("/mnt/s3data")
access_key = "***********"
secret_key = "*****************"
encoded_secret_key = secret_key.replace("/", "%2F")
aws_bucket_name = "feia**est"
mount_name = "s3data"
dbutils.fs.mount("s3a://%s:%s@%s" % (access_key, encoded.
Original · 2021-02-24 05:43:58 · 803 views · 0 comments
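The excerpt above percent-encodes the secret key before embedding it in the s3a:// URI, because a raw "/" in the key would break the URI. A minimal sketch of that step in plain Python follows. The helper name build_s3a_source and the sample credentials are hypothetical, and urllib.parse.quote is swapped in for the excerpt's secret_key.replace("/", "%2F") so that other reserved characters such as "+" are encoded as well.

```python
from urllib.parse import quote


def build_s3a_source(access_key: str, secret_key: str, bucket: str) -> str:
    """Return an s3a:// URI with the credentials embedded (hypothetical helper)."""
    # safe="" tells quote() to encode "/" too; the excerpt does this by hand
    # with secret_key.replace("/", "%2F").
    encoded_secret = quote(secret_key, safe="")
    return "s3a://%s:%s@%s" % (access_key, encoded_secret, bucket)


if __name__ == "__main__":
    # Placeholder values for illustration only, not real credentials.
    print(build_s3a_source("AKIAEXAMPLE", "abc/def+ghi", "example-bucket"))
```

On a Databricks cluster the returned string would presumably serve as the source argument of dbutils.fs.mount; that call is not exercised here because dbutils exists only inside Databricks.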
2021-01-26
Csv to Excel
$ pip install pyexcel pyexcel-xlsx
You can do it in one command line:
from pyexcel.cookbook import merge_all_to_a_book
# import pyexcel.ext.xlsx # no longer required if you use pyexcel >= 0.2.2
import glob
merge_all_to_a_book(glob....
Original · 2021-01-26 19:01:07 · 205 views · 0 comments
spark on hive
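This article records how to install and configure Hive on Spark. Before following the steps below, make sure a Hadoop cluster, Hive, MySQL, the JDK, and Scala are already installed; those installation steps are not repeated here.
Background
Hive uses MapReduce as its default execution engine, i.e. Hive on MR. Hive can also use Tez or Spark as its execution engine, giving Hive on Tez and Hive on Spark respectively. Because MapReduce's intermediate...
Original · 2018-10-30 12:07:49 · 1070 views · 0 comments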
Oracle: splitting a comma-separated field into multiple rows
Oracle: converting a comma-separated column into multiple rows (dynamic)
Oracle APEX
Data table below:
Test SQL:
SELECT Active_yn, REGEXP_SUBSTR(DAYS, '[^,]+', 1, LEVEL) NAME FROM Joe CONNECT BY LEVEL <= REGEXP_COUNT(DAYS, '[^,]+') AND ROWID = PRIOR ROWID AND PRIOR DBMS_RANDO...
Original · 2020-12-15 21:53:10 · 741 views · 0 comments
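To make the intent of the query concrete, here is a rough Python analogue of the same transformation. It is not Oracle code, and the sample rows and the function name explode_csv_field are invented for illustration. Each input row is repeated once per comma-separated token, using the same [^,]+ pattern that the SQL passes to REGEXP_SUBSTR and REGEXP_COUNT.

```python
import re


def explode_csv_field(rows, field):
    """Yield one copy of each row per non-empty comma-separated token in row[field]."""
    for row in rows:
        # [^,]+ matches each maximal run of non-comma characters,
        # mirroring the pattern used in the SQL excerpt.
        for token in re.findall(r"[^,]+", row[field]):
            out = dict(row)      # copy so the input rows stay untouched
            out[field] = token   # replace the list with a single token
            yield out


# Hypothetical sample data; the real table contents are not shown in the excerpt.
sample = [
    {"ACTIVE_YN": "Y", "DAYS": "MON,TUE,WED"},
    {"ACTIVE_YN": "N", "DAYS": "SAT"},
]
result = list(explode_csv_field(sample, "DAYS"))
for r in result:
    print(r)
```

Unlike the SQL, which returns the token in a separate NAME column, this sketch overwrites the original field; that is a simplification chosen to keep the example short.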
Deploying Superset on CentOS 7: a complete guide, including tuning and connecting to ClickHouse
# Install dependencies
yum upgrade python-setuptools
yum install gcc gcc-c++ libffi-devel python-devel python-pip python-wheel openssl-devel libsasl2-devel openldap-devel
yum groupinstall "Development tools"
yum ins...
Original · 2020-04-14 22:55:09 · 865 views · 1 comment