Integrating Spark with Spring Boot and using Spark SQL

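This article shows how to integrate Spark into a Spring Boot application and use Spark SQL to process data. It adds the required dependencies, defines a configuration class and a startup class, and then runs the Spark job on YARN to query data from Hive and print the result.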

First, add the required dependencies:

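    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>

      <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.5.6.RELEASE</version>
      </parent>

      <groupId>com.cord</groupId>
      <artifactId>spark-example</artifactId>
      <version>1.0-SNAPSHOT</version>
      <name>spark-example</name>
      <url>http://www.example.com</url>

      <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.8</java.version>
        <scala.version>2.10.3</scala.version>
        <maven.compiler.source>1.8</maven.compiler.source>
        <maven.compiler.target>1.8</maven.compiler.target>
      </properties>

      <dependencies>
        <dependency>
          <groupId>org.springframework.boot</groupId>
          <artifactId>spring-boot-starter</artifactId>
          <version>1.5.6.RELEASE</version>
          <exclusions>
            <exclusion>
              <groupId>org.springframework.boot</groupId>
              <artifactId>spring-boot-starter-logging</artifactId>
            </exclusion>
          </exclusions>
        </dependency>
        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-core_2.10</artifactId>
          <version>1.6.1</version>
          <scope>provided</scope>
          <exclusions>
            <exclusion>
              <groupId>org.slf4j</groupId>
              <artifactId>slf4j-log4j12</artifactId>
            </exclusion>
            <exclusion>
              <groupId>log4j</groupId>
              <artifactId>log4j</artifactId>
            </exclusion>
          </exclusions>
        </dependency>
        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-sql_2.10</artifactId>
          <version>1.6.1</version>
          <scope>provided</scope>
        </dependency>
        <dependency>
          <groupId>org.apache.spark</groupId>
          <artifactId>spark-hive_2.10</artifactId>
          <version>1.6.1</version>
          <scope>provided</scope>
        </dependency>
        <dependency>
          <groupId>org.scala-lang</groupId>
          <artifactId>scala-library</artifactId>
          <version>${scala.version}</version>
          <scope>provided</scope>
        </dependency>
        <dependency>
          <groupId>mysql</groupId>
          <artifactId>mysql-connector-java</artifactId>
          <version>5.1.22</version>
        </dependency>
      </dependencies>

      <build>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-shade-plugin</artifactId>
            <dependencies>
              <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
                <version>1.5.6.RELEASE</version>
              </dependency>
            </dependencies>
            <configuration>
              <keepDependenciesWithProvidedScope>false</keepDependenciesWithProvidedScope>
              <createDependencyReducedPom>false</createDependencyReducedPom>
              <filters>
                <filter>
                  <artifact>*:*</artifact>
                  <excludes>
                    <exclude>META-INF/*.SF</exclude>
                    <exclude>META-INF/*.DSA</exclude>
                    <exclude>META-INF/*.RSA</exclude>
                  </excludes>
                </filter>
              </filters>
              <transformers>
                <transformer
                    implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                  <resource>META-INF/spring.handlers</resource>
                </transformer>
                <transformer
                    implementation="org.springframework.boot.maven.PropertiesMergingResourceTransformer">
                  <resource>META-INF/spring.factories</resource>
                </transformer>
                <transformer
                    implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                  <resource>META-INF/spring.schemas</resource>
                </transformer>
                <transformer
                    implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer" />
                <transformer
                    implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <mainClass>com.cord.StartApplication</mainClass>
                </transformer>
              </transformers>
            </configuration>
            <executions>
              <execution>
                <phase>package</phase>
                <goals>
                  <goal>shade</goal>
                </goals>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>
    </project>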

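Two things to note here: the logging modules are excluded from the dependencies, and the jar is packaged in a non-default way, with maven-shade-plugin.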
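The logging exclusions matter because Spring Boot's default starter brings in its own logging stack, while Spark 1.6 depends on the `slf4j-log4j12` binding; with both present, SLF4J typically reports multiple bindings. As a rough diagnostic sketch (the class name `Slf4jBindingCheck` is hypothetical and not part of the original project), the following lists every SLF4J 1.x binding visible to a class loader:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original project.
final class Slf4jBindingCheck {

    // Resource path that each SLF4J 1.x binding jar provides.
    static final String BINDER_PATH = "org/slf4j/impl/StaticLoggerBinder.class";

    // Returns the location of every binding class the given loader can see.
    static List<URL> findBindings(ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(BINDER_PATH));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<URL> bindings = findBindings(ClassLoader.getSystemClassLoader());
        System.out.println("SLF4J bindings found: " + bindings.size());
        // More than one entry here suggests an exclusion is missing.
        bindings.forEach(System.out::println);
    }
}
```

Running it with the shaded jar on the classpath should show whether a second binding slipped through the exclusions.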

Define the configuration class:

SparkContextBean.class

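    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.hive.HiveContext;
    import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class SparkContextBean {

        private String appName = "sparkExp";

        // Note: a master set on SparkConf overrides spark-submit's --master flag.
        private String master = "local";

        // Base Spark configuration shared by the contexts below.
        @Bean
        @ConditionalOnMissingBean(SparkConf.class)
        public SparkConf sparkConf() throws Exception {
            return new SparkConf().setAppName(appName).setMaster(master);
        }

        @Bean
        @ConditionalOnMissingBean
        public JavaSparkContext javaSparkContext() throws Exception {
            return new JavaSparkContext(sparkConf());
        }

        // HiveContext lets Spark SQL query tables registered in Hive.
        @Bean
        @ConditionalOnMissingBean
        public HiveContext hiveContext() throws Exception {
            return new HiveContext(javaSparkContext());
        }

        // ......
    }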
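One caveat with the configuration class: a master set directly on `SparkConf` takes precedence over the `--master` flag passed to `spark-submit`, so a fixed "local" value would keep the job off YARN. The sketch below shows one possible way around this, assuming a hypothetical helper `MasterResolver` (not in the original code) that falls back to local mode only when `spark.master` has not already been supplied as a JVM system property, which is how `spark-submit` is expected to pass it to the driver:

```java
import java.util.Properties;

// Hypothetical helper, not part of the original project.
final class MasterResolver {

    // Prefers an externally supplied spark.master; otherwise uses the fallback.
    static String resolveMaster(Properties props, String fallback) {
        String configured = props.getProperty("spark.master");
        if (configured == null || configured.trim().isEmpty()) {
            return fallback;
        }
        return configured.trim();
    }

    public static void main(String[] args) {
        // Prints "local" unless the JVM was started with -Dspark.master=...
        System.out.println(resolveMaster(System.getProperties(), "local"));
    }
}
```

In `sparkConf()`, the resolved value could then be passed to `setMaster(...)` in place of the fixed field.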

The startup class:

StartApplication.class

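    import java.util.List;

    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.hive.HiveContext;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;

    @SpringBootApplication
    public class StartApplication implements CommandLineRunner {

        @Autowired
        private HiveContext hc;

        public static void main(String[] args) {
            SpringApplication.run(StartApplication.class, args);
        }

        @Override
        public void run(String... args) throws Exception {
            // Count the rows of the Hive table; the result is a single-row DataFrame.
            DataFrame df = hc.sql("select count(1) from LCS_DB.STAFF_INFO");
            // Extract the count from each row and collect the values on the driver.
            List<Long> result = df.javaRDD()
                    .map((Function<Row, Long>) row -> row.getLong(0))
                    .collect();
            result.forEach(System.out::println);
        }
    }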

How to run it:

    spark-submit \
      --class com.cord.StartApplication \
      --executor-memory 4G \
      --num-executors 8 \
      --master yarn-client \
      /data/cord/spark-example-1.0-SNAPSHOT.jar
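If the job has to be launched from another JVM process rather than from a shell, the same invocation can be assembled programmatically. The following is only an illustrative sketch using the JDK's `ProcessBuilder`; the class `SubmitCommand` and its `build` method are hypothetical, and it assumes `spark-submit` is available on the `PATH`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original project.
final class SubmitCommand {

    // Builds the argument list in the same order as the shell command.
    static List<String> build(String mainClass, String executorMemory,
                              int numExecutors, String master, String jarPath) {
        return Arrays.asList(
                "spark-submit",
                "--class", mainClass,
                "--executor-memory", executorMemory,
                "--num-executors", String.valueOf(numExecutors),
                "--master", master,
                jarPath);
    }

    public static void main(String[] args) throws Exception {
        List<String> command = build("com.cord.StartApplication", "4G", 8,
                "yarn-client", "/data/cord/spark-example-1.0-SNAPSHOT.jar");
        // Run spark-submit as a child process and forward its output.
        Process process = new ProcessBuilder(command).inheritIO().start();
        System.exit(process.waitFor());
    }
}
```

Passing the arguments as a list avoids shell quoting issues, since each flag and value reaches `spark-submit` as a separate argument.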
