Spark wide table and MySQL: selected Spark code, part 1. Row/column transposition, i.e. converting between wide and narrow tables

Latest recommended article published on 2023-03-07 18:12:35

小花学姐

Tags: spark, wide table, mysql

Article link: https://blog.csdn.net/weixin_30478241/article/details/113381145

Practical code snippets, posted from time to time.

Spark: converting columns to rows
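To make the target shape concrete, here is a minimal plain-Python sketch of what a columns-to-rows (wide-to-narrow) conversion produces. It is not Spark code. The function name `columns_to_rows` and the output field names `key` and `val` are assumptions chosen for illustration. Values are turned into strings so that every `val` has the same type, which mirrors the string cast in the PySpark function of this post.

```python
def columns_to_rows(records, by):
    """Reshape wide records (dicts) into narrow (by..., key, val) records."""
    narrow = []
    for record in records:
        # Columns listed in `by` are copied unchanged onto every output row.
        fixed = {name: record[name] for name in by}
        for name, value in record.items():
            if name in by:
                continue
            # Each remaining column becomes one row: its name and its value.
            narrow.append({**fixed, "key": name, "val": str(value)})
    return narrow


wide = [{"id": 1, "a": 10, "b": 20}, {"id": 2, "a": 30, "b": 40}]
narrow = columns_to_rows(wide, by=["id"])
for row in narrow:
    print(row)
```

Each wide record with two value columns yields two narrow records, so the two input rows become four output rows.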

from pyspark import SparkContext, SparkConf
from pyspark.sql import SparkSession, SQLContext, Row, functions as F
from pyspark.sql.functions import array, col, explode, struct, lit

conf = SparkConf().setAppName("test").setMaster("local[*]")
sc = SparkContext(conf=conf)
spark = SQLContext(sc)

# df is the source DataFrame; by lists the columns excluded from the conversion
def df_columns_to_line(df, by):
    # Cast every column to string so the converted columns share one type
    df_a = df.select([col(c).cast("string") for c in df.columns])
    # Filter dtypes and split into column names and type descriptions
    cols, dtypes = zip(*((c, t) for (c, t) in df_a.dtypes if c not in by))
    # Spark SQL supports only homogeneous columns
    assert len(set(dtypes)) == 1, "All columns have to be of the same type"
    # Create and explode an array of (column_name, column_value) str
