Running Spark 3 on NVIDIA GPUs

Preface

First, look at the performance-comparison chart NVIDIA publishes; it shows a genuinely qualitative leap in performance.

[Figure: NVIDIA's official CPU vs. GPU performance/cost comparison (perf-cost[1].png)]

Since I don't have a datacenter-class GPU, I tested with a consumer RTX 2060. The test environment:

  1. CentOS 7
  2. CPU: i7-10700
  3. GPU: RTX 2060 (6 GB VRAM)
  4. RAM: 16 GB

Environment preparation

  1. Spark 3.1+
  2. NVIDIA GPU driver (Linux)
  3. CUDA 11.8
  4. spark-rapids
  5. TPC-DS
  6. Miniconda (Python 3.9+)

This article uses NVIDIA's official spark-rapids plugin for the GPU-acceleration tests.
The official environment requirements:
To enable GPU processing acceleration you will need:

  • Apache Spark 3.1+
  • A Spark cluster configured with GPUs that comply with the requirements for RAPIDS.
    • One GPU per executor.
  • The RAPIDS Accelerator for Apache Spark plugin jar.
  • To set the config spark.plugins to com.nvidia.spark.SQLPlugin
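
For reference, a minimal spark-submit invocation that satisfies these requirements might look like the sketch below. The jar version matches the one used later in this article; your_app.py and the paths are placeholders to adapt:

spark-submit \
  --master local[*] \
  --jars rapids-4-spark_2.12-23.04.1.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.executor.resource.gpu.amount=1 \
  --conf spark.executor.resource.gpu.discoveryScript=./getGpusResources.sh \
  your_app.py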

Environment installation

Installing the NVIDIA GPU driver

Check the environment:
sudo yum install pciutils
[root@nebula3 nds]# lspci | grep -i vga
01:00.0 VGA compatible controller: NVIDIA Corporation TU106 [GeForce RTX 2060 Rev. A] (rev a1)

Then download the NVIDIA driver for CentOS from the official site, choosing your GPU model: https://www.nvidia.cn/geforce/drivers/
Alternatively, install the CUDA toolkit directly; its runfile bundles a compatible driver.
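
For example, the CUDA 11.8 runfile can be fetched as below; the exact filename changes between releases, so check NVIDIA's download page for the current URL:

wget https://developer.download.nvidia.com/compute/cuda/11.8.0/local_installers/cuda_11.8.0_520.61.05_linux.run
sudo sh cuda_11.8.0_520.61.05_linux.run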

See also: switching between multiple CUDA versions on Linux (beginner tutorial).
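
When several toolkits are installed side by side under /usr/local, the usual way to switch is to repoint the /usr/local/cuda symlink; a minimal sketch:

# List the installed toolkits
ls -d /usr/local/cuda-*
# Point the default symlink at 11.8
sudo ln -sfn /usr/local/cuda-11.8 /usr/local/cuda
# With /usr/local/cuda/bin on PATH, verify the active version
nvcc --version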

Install the required environment:
# 1. Disable the nouveau driver (use tee so the append runs with root privileges;
#    a plain `sudo echo ... >>` fails because the redirect runs as the normal user):
echo "blacklist nouveau" | sudo tee -a /etc/modprobe.d/blacklist.conf
#    Rebuild the initramfs so the blacklist takes effect after the next reboot:
sudo dracut --force

# 2. Update the kernel:
sudo yum update

# 3. Install the build dependencies:
sudo yum install gcc kernel-devel kernel-headers

# 4. Stop the X server:
sudo service lightdm stop

# 5. Switch to text mode:
sudo init 3
# If the NVIDIA installer fails at this step, reboot and check that nouveau is fully unloaded:
# lsmod | grep nouveau

# 6. Run the installer:
sudo sh NVIDIA-Linux-x86_64-*.run

# 7. Reboot when installation completes:
sudo reboot

# 8. After the system comes back up, verify the driver installed correctly:
nvidia-smi

For a full walkthrough, see: Installing the NVIDIA discrete GPU driver on CentOS 7.6 (complete guide).
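
For a slightly deeper sanity check than a bare nvidia-smi, the following verifies that nouveau is gone and queries the driver directly (these are standard nvidia-smi query fields):

# Should print nothing once nouveau is fully disabled
lsmod | grep nouveau
# GPU and driver details in CSV form
nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv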

Installing cudf (optional; mainly for Python)

rapidsai / cudf

conda install -c rapidsai -c conda-forge -c nvidia \
    cudf=23.06 python=3.10 cudatoolkit=11.8
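
Once the conda install finishes, a one-liner inside that environment is enough to confirm cudf actually runs on the GPU:

python -c "import cudf; print(cudf.Series([1, 2, 3]).sum())"
# prints 6 if the GPU and driver are set up correctly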

Test procedure

Reference tutorials

  1. How to get started with GPU-accelerated Spark 3? | NVIDIA
  2. GitHub - NVIDIA/spark-rapids-benchmarks: Spark RAPIDS Benchmarks – benchmark sets and utilities for the RAPIDS Accelerator for Apache Spark
  3. spark-rapids

Just follow the official tutorial at https://github.com/NVIDIA/spark-rapids-benchmarks/tree/main/nds.
The rough steps:

  1. Install TPC-DS and generate random data.
  2. Download the spark-rapids jar from https://nvidia.github.io/spark-rapids/docs/archive.html and set the environment variables; for the Spark run parameters, see base.template and convert_submit_*.template. (A command sketch follows below.)
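
As a sketch of that flow (the script names and arguments come from the spark-rapids-benchmarks README and may drift between versions; the scale factor and paths here are illustrative):

# Generate raw TPC-DS data locally, e.g. scale factor 10 with 10 parallel chunks
python nds_gen_data.py local 10 10 ./raw_sf10
# Transcode the raw data to Parquet on the CPU ...
./spark-submit-template convert_submit_cpu.template \
    nds_transcode.py ./raw_sf10 ./parquet_sf10 report_cpu.txt
# ... and with the RAPIDS plugin on the GPU (parameters come from base.template)
./spark-submit-template convert_submit_gpu.template \
    nds_transcode.py ./raw_sf10 ./parquet_sf10_gpu report_gpu.txt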

Performance report

CPU (i7-10700)

Spark configuration follows:

('spark.driver.extraJavaOptions', '-XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED')
('spark.executor.id', 'driver')
('spark.driver.memory', '10G')
('spark.executor.instances', '8')
('spark.driver.port', '42825')
('spark.app.submitTime', '1684138774104')
('spark.app.id', 'local-1684138775340')
('spark.executor.cores', '12')
('spark.executor.memory', '16G')
('spark.rdd.compress', 'True')
('spark.executor.extraJavaOptions', '-XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED')
('spark.app.name', 'NDS - transcode - parquet')
('spark.serializer.objectStreamReset', '100')
('spark.sql.warehouse.dir', 'file:/root/spark/test/spark-rapids-benchmarks/nds/spark-warehouse')
('spark.sql.shuffle.partitions', '200')
('spark.master', 'local[*]')
('spark.submit.pyFiles', '')
('spark.submit.deployMode', 'client')
('spark.app.startTime', '1684138774633')

[Screenshot: CPU run output (image.png)]

GPU (RTX 2060, 6 GB)

Spark configuration follows:

('spark.app.startTime', '1684293945663')
('spark.driver.extraJavaOptions', '-XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED')
('spark.jars', 'file:///root/spark/test/spark-rapids-benchmarks/nds/rapids-4-spark_2.12-23.04.1.jar')
('spark.plugins.internal.conf.com.nvidia.spark.SQLPlugin.spark.rapids.sql.incompatibleOps.enabled', 'true')
('spark.plugins.internal.conf.com.nvidia.spark.SQLPlugin.spark.rapids.driver.user.timezone', 'Asia/Shanghai')
('spark.driver.port', '37812')
('spark.rapids.driver.user.timezone', 'Asia/Shanghai')
('spark.app.submitTime', '1684293945280')
('spark.executor.resource.gpu.discoveryScript', './getGpusResources.sh')
('spark.executor.cores', '12')
('spark.executor.memory', '16G')
('spark.sql.warehouse.dir', 'file:/root/spark/test/spark-rapids-benchmarks/nds/spark-warehouse')
('spark.plugins.internal.conf.com.nvidia.spark.SQLPlugin.spark.rapids.sql.explain', 'NOT_ON_GPU')
('spark.app.name', 'NDS - transcode - parquet')
('spark.serializer.objectStreamReset', '100')
('spark.files', 'file:///root/spark/spark-3.3.2-bin-hadoop3/examples/src/main/scripts/getGpusResources.sh')
('spark.sql.shuffle.partitions', '200')
('spark.master', 'local[*]')
('spark.submit.deployMode', 'client')
('spark.sql.legacy.parquet.datetimeRebaseModeInWrite', 'CORRECTED')
('spark.sql.extensions', 'com.nvidia.spark.rapids.SQLExecPlugin,com.nvidia.spark.udf.Plugin,com.nvidia.spark.rapids.optimizer.SQLOptimizerPlugin')
('spark.plugins.internal.conf.com.nvidia.spark.SQLPlugin.spark.rapids.sql.variableFloatAgg.enabled', 'true')
('spark.plugins.internal.conf.com.nvidia.spark.SQLPlugin.spark.rapids.sql.concurrentGpuTasks', '2')
('spark.rapids.memory.pinnedPool.size', '8g')
('spark.rapids.sql.incompatibleOps.enabled', 'true')
('spark.executor.id', 'driver')
('spark.app.id', 'local-1684293946417')
('spark.executor.instances', '8')
('spark.driver.memory', '10G')
('spark.app.initial.file.urls', 'file:///root/spark/spark-3.3.2-bin-hadoop3/examples/src/main/scripts/getGpusResources.sh')
('spark.executor.resource.gpu.amount', '1')
('spark.plugins', 'com.nvidia.spark.SQLPlugin')
('spark.rapids.sql.variableFloatAgg.enabled', 'true')
('spark.rapids.sql.concurrentGpuTasks', '2')
('spark.rdd.compress', 'True')
('spark.executor.extraJavaOptions', '-XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED')
('spark.submit.pyFiles', '')
('spark.plugins.internal.conf.com.nvidia.spark.SQLPlugin.spark.rapids.memory.pinnedPool.size', '8g')
('spark.plugins.internal.conf.com.nvidia.spark.SQLPlugin.spark.rapids.sql.multiThreadedRead.numThreads', '20')
('spark.rapids.sql.multiThreadedRead.numThreads', '20')
('spark.repl.local.jars', 'file:///root/spark/test/spark-rapids-benchmarks/nds/rapids-4-spark_2.12-23.04.1.jar')
('spark.sql.files.maxPartitionBytes', '2g')
('spark.rapids.sql.explain', 'NOT_ON_GPU')
('spark.app.initial.jar.urls', 'spark://nebula3:37812/jars/rapids-4-spark_2.12-23.04.1.jar')
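
The GPU run points spark.executor.resource.gpu.discoveryScript at getGpusResources.sh; the stock script shipped with the Spark distribution under examples/src/main/scripts (the same file referenced by spark.files above) is essentially just:

#!/usr/bin/env bash
# Print the GPU indices visible to nvidia-smi as the resource JSON Spark expects,
# e.g. {"name": "gpu", "addresses":["0"]}
ADDRS=`nvidia-smi --query-gpu=index --format=csv,noheader | sed -e ':a' -e 'N' -e'$!ba' -e 's/\n/","/g'`
echo {\"name\": \"gpu\", \"addresses\":[\"$ADDRS\"]}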

[Screenshot: GPU run output (image.png)]

Comparison table

All results below are transform (transcode) durations only; data-read time is excluded.

Table                  | Size (MB) | CPU (s) | GPU (s)
-----------------------|-----------|---------|--------
catalog_sales          | 29593.6   | 258     | 168
web_sales              | 14745.6   | 120     | 96
inventory              | 8192      | 72      | 45
store_returns          | 3276.8    | 27      | 35
catalog_returns        | 2150.4    | 17      | 5
web_returns            | 1004.7    | 10      | 4
customer               | 257       | 15      | 0.002
customer_address       | 106       | 9       | 0.047
customer_demographics  | 76.9      | 10      | 0.005
item                   | 56.5      | 2       | 0.004
date_dim               | 10        | 0.8     | 0.004
time_dim               | 4.9       | 0.5     | 0.003
catalog_page           | 2.7       | 0.2     | 0.002
web_page               | 0.1898438 | 0.038   | 0.002
household_demographics | 0.1446289 | 0.054   | 0.002
promotion              | 0.1191406 | 0.046   | 0.002
store                  | 0.1018555 | 0.043   | 0.002
call_center            | 0.0088867 | 0.038   | 0.003
web_site               | 0.006543  | 0.03    | 0.003
reason                 | 0.0018682 | 0.023   | 0.004
warehouse              | 0.0016994 | 0.041   | 0.004
ship_mode              | 0.0010614 | 0.033   | 0.003
income_band            | 0.0003128 | 0.024   | 0.003
  1. Tables larger than 100 MB

[Chart: CPU vs. GPU transcode times for tables larger than 100 MB (image.png)]

  2. Tables smaller than 100 MB

[Chart: CPU vs. GPU transcode times for tables smaller than 100 MB (image.png)]

Summary

As the charts show, even with just 6 GB of VRAM the GPU cuts runtimes by nearly half. But, presumably because the VRAM is so small, the GPU times creep back toward the CPU times as the data volume grows. At 48 GB of data the GPU even threw an OOM; after several rounds of parameter tuning, no run succeeded without cutting the executor and partition counts, so to keep the tests consistent I dropped the 48 GB experiment.

Conclusion: GPU-accelerated data analytics genuinely works, but professional-grade GPUs are priced in a different league from consumer cards, so GPU acceleration is mainly a good choice for enterprises chasing performance.

(For reference, the officially supported GPU models are NVIDIA P100, V100, T4 and the A2/A10/A30/A100.)
