PySpark study notes (1): changing a column's dtype

Latest recommended article published on 2024-06-05 11:00:00

冰色的圆


Article link: https://blog.csdn.net/weixin_42099369/article/details/84950896


First, take a look at the columns:

df.printSchema()
root
 |-- Id: string (nullable = true)
 |-- groupId: string (nullable = true)
 |-- matchId: string (nullable = true)
 |-- assists: string (nullable = true)
 |-- boosts: string (nullable = true)
 |-- damageDealt: string (nullable = true)
 |-- DBNOs: string (nullable = true)
 |-- headshotKills: string (nullable = true)
 |-- heals: string (nullable = true)
 |-- killPlace: string (nullable = true)
 |-- killPoints: string (nullable = true)
 |-- kills: string (nullable = true)
 |-- killStreaks: string (nullable = true)
 |-- longestKill: string (nullable = true)
 |-- maxPlace: string (nullable = true)
 |-- numGroups: string (nullable = true)
 |-- revives: string (nullable = true)
 |-- rideDistance: string (nullable = true)
 |-- roadKills: string (nullable = true)
 |-- swimDistance: string (nullable = true)
 |-- teamKills: string (nullable = true)
 |-- vehicleDestroys: string (nullable = true)
 |-- walkDistance: string (nullable = true)
 |-- weaponsAcquired: string (nullable = true)
 |-- winPoints: string (nullable = true)
 |-- winPlacePerc: string (nullable = true)

The dtype of kills is string.

Following the official documentation, try to change it:

df.kills.astype("int")
Out[29]: Column<b'CAST(kills AS INT)'>

Check the column type again, and it has not changed:

df.select("kills").dtypes
Out[34]: [('kills', 'string')]

One approach that works:

df = df.withColumn("kills", df.kills.astype("int"))
df.select("kills").dtypes
Out[36]: [('kills', 'int')]

It worked.
