Spark problems encountered [1]: DataFrame object has no attribute col

DataFrame object has no attribute ‘col’
In Spark: The Definitive Guide it says:

If you need to refer to a specific DataFrame’s column, you can use the col method on the specific DataFrame.

Problem description

For example (in Python/Pyspark):

The key point is that the following statement is not valid in Python; it is the Scala usage:

df.col("count")

However, when I run the latter code on a dataframe containing a column count I get the error

‘DataFrame’ object has no attribute ‘col’.

If I try column I get a similar error.

Is the book wrong, or how should I go about doing this?

I’m on Spark 2.3.1. The dataframe was created with the following:

df = spark.read.format("json").load("/Users/me/Documents/Books/Spark-The-Definitive

Solution:

The book you’re referring to describes the Scala / Java API. In PySpark, use [] on the DataFrame instead:

df["count"]

However, this should be distinguished from the following case.

For example:

df.where(col('ORIGIN_COUNTRY_NAME') != 'United States').show()

If this raises:

name ‘col’ is not defined

it is probably because you have not imported the col function.

Add the import:

from pyspark.sql.functions import col

References:
[1] DataFrame object has no attribute ‘col’
