I am using Python 3 with Spark 2.2.0. I want to pass a given list of strings to my UDF as an argument.
df = ['Apps A','Chrome', 'BBM', 'Apps B', 'Skype']
def calc_app(app, app_list):
    browser_list = ['Chrome', 'Firefox', 'Opera']
    chat_list = ['WhatsApp', 'BBM', 'Skype']
    sum = 0
    for data in app:
        name = data['name']
        if name in app_list:
            sum += 1
    return sum
calc_appUDF = udf(calc_app)
df = df.withColumn('app_browser', calc_appUDF(df['apps'], browser_list))
df = df.withColumn('app_chat', calc_appUDF(df['apps'], chat_list))
But it fails with the error: 'Unsupported literal type class java.util.ArrayList'
Solution
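Every argument passed to a UDF call has to be a column, so Spark tries to turn the plain Python list into a literal and raises the error above. One common workaround is to capture the list in a closure, so that only the column is passed at call time. The following is a minimal sketch, assuming the 'apps' column holds an array of structs with a 'name' field; the names count_apps and make_calc_app_udf are hypothetical and not part of the original code.

```python
def count_apps(apps, app_list):
    """Count the entries in `apps` whose 'name' appears in `app_list`."""
    return sum(1 for data in apps if data['name'] in app_list)


def make_calc_app_udf(app_list):
    """Build a UDF that closes over `app_list` (hypothetical helper)."""
    # Imported here so count_apps can be tried out without a Spark install.
    from pyspark.sql.functions import udf
    from pyspark.sql.types import IntegerType

    # The list lives in the closure; only the column is a UDF argument.
    return udf(lambda apps: count_apps(apps, app_list), IntegerType())


browser_list = ['Chrome', 'Firefox', 'Opera']
chat_list = ['WhatsApp', 'BBM', 'Skype']

# Assumed usage, given a DataFrame `df` with an 'apps' column:
# df = df.withColumn('app_browser', make_calc_app_udf(browser_list)(df['apps']))
# df = df.withColumn('app_chat', make_calc_app_udf(chat_list)(df['apps']))
```

Declaring IntegerType() as the return type keeps the new columns numeric; without it, udf defaults to a string return type.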
I