Spark tutorial

Installing and deploying Spark on Linux:

https://www.cnblogs.com/tijun/p/7561718.html

https://blog.csdn.net/heartsdance/article/details/119751588

https://blog.csdn.net/weixin_43854358/article/details/90666193

(this one is good; it includes a worked example)

How to troubleshoot startup failures:

https://blog.csdn.net/C_time/article/details/100023332

Testing with the Pi example:

bin/spark-submit --master spark://10.153.110.18:8077 --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.11-2.4.7.jar 100 (the trailing 100 is the number of tasks and can be changed)

PySpark SparkConf explained:

https://blog.csdn.net/weixin_40161254/article/details/87916880

Spark install location on the b9b machine:

/home/disk1/software/spark-2.4.7-bin-hadoop2.7/

Spark log location on b9b:

/home/disk1/software/spark-2.4.7-bin-hadoop2.7/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-yq01-aip-aip07a73b9b.yq01.baidu.com.out

Spark web UI on b9b (for viewing job logs, etc.):

http://yq01-aip-aip07a73b9b.yq01.baidu.com:8078/

Troubleshooting locally submitted Spark jobs:

https://www.freesion.com/article/7551171582/

Spark statistics:

RDD statistics:

https://blog.csdn.net/liangzelei/article/details/80573015
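The one-pass RDD stats (count / mean / stdev via `rdd.stats()`) work by building a running counter per partition and then merging the counters. A minimal pure-Python sketch of that merge logic, no Spark required; the class and field names here are illustrative, not pyspark's actual internals:

```python
# Sketch of how per-partition statistics can be merged in one pass.
# Each "partition" accumulates (n, mean, m2) Welford-style, then the
# partial counters are combined with the parallel-merge formula.

class StatCounter:
    def __init__(self):
        self.n = 0        # number of values seen
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return self

    def merge(self, other):
        # combine two partial counters without revisiting the data
        if other.n == 0:
            return self
        total = self.n + other.n
        delta = other.mean - self.mean
        self.mean = (self.n * self.mean + other.n * other.mean) / total
        self.m2 += other.m2 + delta * delta * self.n * other.n / total
        self.n = total
        return self

    def variance(self):
        return self.m2 / self.n if self.n else float("nan")

# simulate two partitions of one dataset [1.0 .. 5.0]
p1, p2 = StatCounter(), StatCounter()
for x in [1.0, 2.0, 3.0]:
    p1.add(x)
for x in [4.0, 5.0]:
    p2.add(x)
merged = p1.merge(p2)
print(merged.n, merged.mean, merged.variance())  # 5 3.0 2.0
```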

DataFrame statistics:

https://cloud.tencent.com/developer/article/1031061

https://blog.csdn.net/suzyu12345/article/details/79673557

https://zhuanlan.zhihu.com/p/237637848

Fix for insufficient cache memory (WARN MemoryStore: Not enough space to cache rdd):

https://www.playpi.org/2020012201.html
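Beyond the article above, the usual first-line responses to that warning are giving executors more memory or caching with a storage level that can spill to disk. A hedged sketch; the flag values and the script name `my_job.py` are illustrative, not from this setup:

```shell
# Illustrative values only; tune for your cluster.
bin/spark-submit \
  --master spark://10.153.110.18:8077 \
  --conf spark.executor.memory=4g \
  --conf spark.memory.fraction=0.6 \
  my_job.py

# Inside the job, prefer a storage level that can spill to disk
# instead of plain cache():
#   rdd.persist(pyspark.StorageLevel.MEMORY_AND_DISK)
```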

On whether cached data gets evicted:

https://www.jianshu.com/p/761fa2ee868e

Submitting PySpark jobs with spark-submit:

https://www.jianshu.com/p/df0a189ff28b

https://www.cnblogs.com/piperck/p/10121097.html

https://zhuanlan.zhihu.com/p/101740397

PySpark with YARN mode:

https://www.cnblogs.com/yanshw/p/12083488.html

On setting the master inside the Python script itself (if you want to avoid spark-submit: for yarn-cluster mode this simply cannot be done):

https://stackoverflow.com/questions/31327275/pyspark-on-yarn-cluster-mode

PySpark with yarn-cluster mode:

Meaning of the spark-submit parameters:

https://www.malaoshi.top/show_1IXnhwPEDg0.html

https://xujiyou.work/%E5%A4%A7%E6%95%B0%E6%8D%AE/Spark/spark-submit%E8%AF%A6%E8%A7%A3.html
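The flags covered in those two articles can be summarized in one annotated invocation. A sketch only; the resource numbers, `deps.zip`, and `my_job.py` are placeholders, not values from this cluster:

```shell
# Annotated spark-submit for a PySpark job (placeholder values).
bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 4g \
  --driver-memory 2g \
  --py-files deps.zip \
  my_job.py arg1 arg2

# --master          cluster manager: spark://host:port, yarn, or local[N]
# --deploy-mode     cluster (driver runs on the cluster) or client (driver runs locally)
# --num-executors   executor processes to request (YARN)
# --executor-cores  CPU cores per executor
# --executor-memory / --driver-memory   heap sizes
# --py-files        extra Python files/zips shipped to the executors
```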

Logs on b9b for debugging errors in Spark Python tasks (check stdout):

/home/work/hadoop-2.10.0/logs/userlogs/application_1629377733083_3679/

or

/home/work/hadoop-2.10.0/logs/userlogs

Spark custom partitioning:

https://blog.csdn.net/weixin_45102492/article/details/104726795
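In PySpark a custom partitioner is just a function from key to an integer: `rdd.partitionBy(numPartitions, partitionFunc)` sends each (key, value) pair to partition `partitionFunc(key) % numPartitions`. A pure-Python sketch of that routing (no Spark needed); `domain_partitioner` and `route` are illustrative names, not pyspark API:

```python
# Custom partitioner: group keys by their domain suffix, so all ".com"
# keys land together, all ".org" keys land together, etc.

def domain_partitioner(key):
    return hash(key.rsplit(".", 1)[-1])

def route(pairs, num_partitions, partition_func):
    # mimics what partitionBy does with the partition function
    partitions = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        partitions[partition_func(key) % num_partitions].append((key, value))
    return partitions

data = [("a.com", 1), ("b.org", 2), ("c.com", 3), ("d.org", 4)]
parts = route(data, 2, domain_partitioner)
# keys with the same suffix always land in the same partition
```

In real PySpark the equivalent call would be `rdd.partitionBy(2, domain_partitioner)` on a key-value RDD.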

Spark actions, transformations, etc.:

https://spark.apache.org/docs/latest/rdd-programming-guide.html#actions
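The key distinction in that guide: transformations (map, filter, ...) are lazy and only record the computation, while actions (collect, count, reduce, ...) force it to run. A plain-Python analogy using a generator, no Spark required:

```python
# Generators model Spark's laziness: building the pipeline runs nothing;
# consuming it (the "action") triggers the work.

calls = []

def trace(x):
    calls.append(x)   # record that the function actually ran
    return x * 2

data = range(5)
mapped = (trace(x) for x in data)   # like rdd.map(trace): nothing runs yet
assert calls == []                  # still lazy

result = list(mapped)               # like collect(): forces evaluation
assert calls == [0, 1, 2, 3, 4]
assert result == [0, 2, 4, 6, 8]
```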

Running BMR from the in-house b9b machine:

./bin/run-example --master yarn --deploy-mode cluster --files /home/aicu-tob/software/baidu_spark_emr/output_afs_agent/conf/yarn-site.xml SparkPi

./bin/run-example --master yarn --deploy-mode cluster --class org.apache.spark.examples.SparkPi examples/jars/spark-examples_2.11-2.4.3.2-baidu.jar

yarn-site.xml configuration:

https://ifeve.com/spark-yarn-run-spark/

https://blog.csdn.net/Jerry_991/article/details/85042305 

Support ticket link:

https://console.cloud.baidu-int.com/ticket/new/?productId=217

Assorted problems with the in-house Spark setup:

http://wiki.baidu.com/pages/viewpage.action?pageId=324590584#id-6.%E5%B8%B8%E8%A7%81%E9%97%AE%E9%A2%98-9.%E7%94%A8hadoopfs-ls%E6%9F%A5%E7%9C%8Bafs%EF%BC%8C%E6%8A%A5ls:NoFileSystemforscheme:afs%E7%9A%84%E9%94%99%E8%AF%AF

spark-env.sh tutorial:

https://blog.csdn.net/u010199356/article/details/89056304

Spark join tutorial:

https://www.sohu.com/a/427258627_315839

https://zhuanlan.zhihu.com/p/317226768
