Dependencies used by Scala and Spark: How do I create a Spark/Scala project in IntelliJ IDEA (unable to resolve dependencies in build.sbt)?

weixin_39653761

Article link: https://blog.csdn.net/weixin_39653761/article/details/111952688

I'm trying to build and run a Scala/Spark project in IntelliJ IDEA.

I have added org.apache.spark:spark-sql_2.11:2.0.0 to the global libraries, and my build.sbt looks like this:

name := "test"

version := "1.0"

scalaVersion := "2.11.8"

libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.0.0"

libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.0.0"

I still get an error that says

unknown artifact. unable to resolve or indexed

under spark-sql.

When I tried to build the project, the error was

Error:(19, 26) not found: type sqlContext, val sqlContext = new sqlContext(sc)

I have no idea what the problem could be. How do I create a Spark/Scala project in IntelliJ IDEA?

Update:

Following the suggestions, I updated the code to use SparkSession, but it is still unable to read a CSV file. What am I doing wrong here? Thank you!

val spark = SparkSession
  .builder()
  .appName("Spark example")
  .config("spark.some.config.option", "some value")
  .getOrCreate()

import spark.implicits._

val testdf = spark.read.csv("/Users/H/Desktop/S_CR_IP_H.dat")

testdf.show() //it doesn't show anything

//pdf.select("DATE_KEY").show()

Solution

The class name is SQLContext, with SQL in upper case, so the line should read as below

val sqlContext = new SQLContext(sc)

Constructing SQLContext directly is deprecated in newer versions of Spark, so I would suggest using SparkSession instead

val spark = SparkSession.builder().appName("testings").getOrCreate

val sqlContext = spark.sqlContext
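SparkSession lives in the spark-sql module, so the build has to put that artifact on the classpath. The following is a minimal build.sbt sketch, assuming the Spark 2.0.0 and Scala 2.11.8 versions from the question. It uses %% so that sbt appends the Scala binary version (_2.11 here) to the artifact name, which avoids hand-typing the suffix. If the IDE still flags the artifact afterwards, refreshing the sbt project may help, on the assumption (not confirmed by the original post) that the "unknown artifact" message comes from the IDE's artifact index rather than from sbt itself.

```scala
// build.sbt: a sketch, with versions copied from the question
name := "test"

version := "1.0"

scalaVersion := "2.11.8"

// %% appends the Scala binary version (_2.11) to each artifact name,
// so these resolve to spark-core_2.11 and spark-sql_2.11
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.0.0",
  "org.apache.spark" %% "spark-sql"  % "2.0.0"
)
```

With the dependencies declared here, adding the same jars by hand under global libraries should not be necessary.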

If you want to set the master in code rather than through the spark-submit command, you can call .master as well (and set configs the same way)

val spark = SparkSession.builder().appName("testings").master("local").config("configuration key", "configuration value").getOrCreate

val sqlContext = spark.sqlContext
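For reference, here is a minimal sketch of a complete application built around that SparkSession, including the import that the snippets above leave out. The object name SparkExample, the helper sampleFrame, and the in-memory sample row are assumptions made for illustration; they are not part of the original post.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SparkExample {
  // Hypothetical helper: builds a one-row DataFrame shaped like the sample data
  def sampleFrame(spark: SparkSession): DataFrame = {
    import spark.implicits._ // enables toDF on local Scala collections
    Seq(("8/03/2017", 10199786L, "O")).toDF("DATE", "PID", "TYPE")
  }

  def main(args: Array[String]): Unit = {
    // master("local") runs Spark inside this JVM, which suits runs from the IDE
    val spark = SparkSession.builder()
      .appName("testings")
      .master("local")
      .getOrCreate()

    sampleFrame(spark).show()

    spark.stop()
  }
}
```

Running main from IntelliJ IDEA is a quick way to confirm that the dependencies resolve and that a session starts before reading any files.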

Update

Looking at your sample data

DATE|PID|TYPE

8/03/2017|10199786|O

and testing your code

val testdf = spark.read.csv("/Users/H/Desktop/S_CR_IP_H.dat")

testdf.show()

I got this output

+--------------------+
|                 _c0|
+--------------------+
|       DATE|PID|TYPE|
|8/03/2017|10199786|O|
+--------------------+

Now adding .option calls for the delimiter and the header

val testdf2 = spark.read.option("delimiter", "|").option("header", true).csv("/Users/H/Desktop/S_CR_IP_H.dat")

testdf2.show()

Output was

+---------+--------+----+
|     DATE|     PID|TYPE|
+---------+--------+----+
|8/03/2017|10199786|   O|
+---------+--------+----+

Note: I used .master("local") when building the SparkSession object
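The delimiter and header options can be wrapped in a small self-contained program to try the read end to end. This is a sketch under assumptions: the object name PipeDelimitedCsv, the helper readPipeDelimited, and the temporary file are hypothetical, and the file holds only the two sample lines shown above. It also selects a single column, as the commented-out line in the question tried to do; note that the sample header contains DATE, so a select on DATE_KEY would only work if the real file has a column with that name.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}

object PipeDelimitedCsv {
  // Hypothetical helper: reads a pipe-delimited file whose first line is a header
  def readPipeDelimited(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("delimiter", "|")
      .option("header", "true")
      .csv(path)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("pipe-delimited csv")
      .master("local")
      .getOrCreate()

    // Temporary file standing in for the .dat file from the question
    val tmp = Files.createTempFile("sample", ".dat")
    Files.write(
      tmp,
      "DATE|PID|TYPE\n8/03/2017|10199786|O\n".getBytes(StandardCharsets.UTF_8)
    )

    val df = readPipeDelimited(spark, tmp.toString)
    df.printSchema()         // without inferSchema, every column is read as a string
    df.select("DATE").show() // select by the header name actually present in the file

    spark.stop()
  }
}
```

If typed columns are needed, adding .option("inferSchema", "true") asks Spark to guess the column types at the cost of an extra pass over the data.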
