Spark 04: Using Spark SQL

1 Using SQLContext

In Spark 1.x, the entry point into Spark SQL is SQLContext.

The entry point into all functionality in Spark SQL is the SQLContext class, or one of its descendants. To create a basic SQLContext, all you need is a SparkContext.

val sc: SparkContext // An existing SparkContext.
val sqlContext = new org.apache.spark.sql.SQLContext(sc)

// this is used to implicitly convert an RDD to a DataFrame.
import sqlContext.implicits._
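
With those implicits in scope, an RDD of case classes can be converted to a DataFrame via toDF() and then queried with SQL. A minimal sketch, where the Person case class and the sample rows are purely illustrative:

case class Person(name: String, age: Int)

// toDF() is supplied by the sqlContext.implicits._ import above
val df = sc.parallelize(Seq(Person("Michael", 29), Person("Andy", 30))).toDF()

// register the DataFrame so it can be queried with SQL
// (registerTempTable in Spark 1.x; createOrReplaceTempView in Spark 2.x)
df.registerTempTable("people")
sqlContext.sql("SELECT name FROM people WHERE age > 29").show()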

1.1 Programming

1) Add the dependencies to pom.xml

<properties>
    <scala.version>2.11.8</scala.version>
    <spark.version>2.2.0</spark.version>
</properties>

<!-- scala -->
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>${scala.version}</version>
</dependency>

<!-- sparksql -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>${spark.version}</version>
</dependency>

2) SQLContextApp.scala

package com.lihaogn.spark

import org.apache.spark.sql.SQLContext
import org.apache.spark.{SparkConf, SparkContext}

/**
  * Usage of SQLContext
  */

object SQLContextApp {

  def main(args: Array[String]): Unit = {
    val path = args(0)

    // 1) Create the SparkContext and SQLContext
    val sparkConf = new SparkConf()
    val sc = new SparkContext(sparkConf)
    val sqlContext = new SQLContext(sc)

    // 2) Processing: read a JSON file
    val people = sqlContext.read.format("json").load(path)
    people.printSchema()
    people.show()

    // 3) Release resources
    sc.stop()
  }
}
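
Note that sqlContext.read.format("json").load(path) can also be written with the json shortcut on DataFrameReader:

val people = sqlContext.read.json(path)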

3) Submit and run

spark-submit \
--name SQLContextApp \
--class com.lihaogn.spark.SQLContextApp \
--master local[2] \
/Users/Mac/my_lib/sql-1.0.jar \
/Users/Mac/app/spark-2.2.0-bin-2.6.0-cdh5.7.0/examples/src/main/resources/people.json

4) Result
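
With the people.json bundled with Spark, the printed schema and rows should look roughly like this:

root
 |-- age: long (nullable = true)
 |-- name: string (nullable = true)

+----+-------+
| age|   name|
+----+-------+
|null|Michael|
|  30|   Andy|
|  19| Justin|
+----+-------+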

2 Using HiveContext

In addition to the basic SQLContext, you can also create a HiveContext, which provides a superset of the functionality provided by the basic SQLContext. Additional features include the ability to write queries using the more complete HiveQL parser, access to Hive UDFs, and the ability to read data from Hive tables. To use a HiveContext, you do not need to have an existing Hive setup, and all of the data sources available to a SQLContext are still available.

In Spark 1.x, the Hive-aware entry point into Spark SQL is HiveContext.

2.1 Programming

1) Add the dependency to pom.xml

<!-- dependency required for HiveContext -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-hive_2.11</artifactId>
  <version>${spark.version}</version>
</dependency>

2) HiveContextApp.scala

package com.lihaogn.spark

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

/**
  * Usage of HiveContext
  * When submitting, pass the MySQL driver JAR to the classpath via --jars
  */
object HiveContextApp {

  def main(args: Array[String]): Unit = {

    // 1) Create the SparkContext and HiveContext
    val sparkConf = new SparkConf()
    val sc = new SparkContext(sparkConf)
    val hiveContext = new HiveContext(sc)

    // 2) Processing: load a Hive table
    hiveContext.table("student").show()

    // 3) Release resources
    sc.stop()
  }
}
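
Since HiveContext also exposes the HiveQL parser, the same table can be read with a SQL query instead of the table() call:

hiveContext.sql("SELECT * FROM student").show()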

3) Build

mvn clean package -DskipTests

4) Submit and run

spark-submit \
--name HiveContextApp \
--class com.lihaogn.spark.HiveContextApp \
--master local[2] \
--jars /Users/Mac/software/mysql-connector-java-5.1.27-bin.jar \
/Users/Mac/my_lib/sql-1.0.jar

5) Result

3 Using SparkSession

In Spark 2.x, the entry point into Spark SQL is SparkSession.

The entry point into all functionality in Spark is the SparkSession class. To create a basic SparkSession, just use SparkSession.builder():

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("Spark SQL basic example")
  .config("spark.some.config.option", "some-value")
  .getOrCreate()

// For implicit conversions like converting RDDs to DataFrames
import spark.implicits._
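
In Spark 2.x, SparkSession also subsumes HiveContext: Hive support is switched on through the builder. A minimal sketch, assuming the same student table as in section 2:

import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .appName("SparkSessionHiveApp")
  .enableHiveSupport()  // replaces new HiveContext(sc) from Spark 1.x
  .getOrCreate()

spark.table("student").show()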

3.1 Programming

1) SparkSessionApp.scala

package com.lihaogn.spark

import org.apache.spark.sql.SparkSession

/**
  * Usage of SparkSession
  */
object SparkSessionApp {

  def main(args: Array[String]): Unit = {

    val spark = SparkSession.builder()
      .appName("SparkSessionApp")
      .master("local[2]")
      .getOrCreate()

    val people = spark.read.json("/Users/Mac/app/spark-2.2.0-bin-2.6.0-cdh5.7.0/examples/src/main/resources/people.json")

    people.show()

    spark.stop()
  }
}
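
As an aside, the loaded DataFrame can also be registered as a temporary view and queried with SQL through the session. A minimal sketch:

people.createOrReplaceTempView("people")
spark.sql("SELECT name FROM people WHERE age IS NOT NULL").show()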

2) Result
