GPU Clustering of Very Large Data Sets (Billion-Point Scale)

mrsj88

Category: NVIDIA CUDA. Tags: performance, dataset, algorithm, report

Article link: https://blog.csdn.net/mrsj88/article/details/4969458

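This article looks at using GPUs to accelerate clustering of very large data sets at the billion-point scale. Taking the K-Means algorithm as the example, the GPU-accelerated implementation achieved an order-of-magnitude performance gain over a highly optimized CPU version running on 8 cores, and more than two orders of magnitude over the MineBench benchmark running on a single core.

(Summary generated automatically by CSDN.)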

To start, here is a repost of an earlier paper.

Clustering billions of data points using GPUs

"In this paper, we report our research on using GPUs to accelerate clustering of very large data sets, which are common in today's real world applications. While many published works have shown that GPUs can be used to accelerate various general purpose applications with respectable performance gains, few attempts have been made to tackle very large problems. Our goal here is to investigate if GPUs can be useful accelerators even with very large data sets that cannot fit into GPU's onboard memory.

Using a popular clustering algorithm, K-Means, as an example, our results have been very positive. On a data set with a billion data points, our GPU-accelerated implementation achieved an order of magnitude performance gain over a highly optimized CPU-only version running on 8 cores, and more than two orders of magnitude gain over a popular benchmark, MineBench, running on a single core."

http://portal.acm.org/citation.cfm?id=1531668
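
The abstract does not say how the authors handle data that does not fit in GPU memory, so the following is only a minimal sketch of one common approach, not the paper's implementation. It assumes the data is streamed in fixed-size chunks and that each chunk contributes per-cluster sums and counts, which are combined once the full pass is done. It is written in plain Python on the CPU to show the logic only; the names `kmeans_pass`, `chunked`, and `kmeans` are hypothetical, and in a real GPU version each chunk would be one batch copied to the device.

```python
from typing import Iterable, List, Sequence

Point = Sequence[float]


def nearest(point: Point, centroids: List[List[float]]) -> int:
    """Return the index of the centroid closest to `point` (squared Euclidean distance)."""
    best_idx, best_dist = 0, float("inf")
    for idx, centroid in enumerate(centroids):
        dist = sum((p - q) ** 2 for p, q in zip(point, centroid))
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx


def kmeans_pass(chunks: Iterable[Sequence[Point]],
                centroids: List[List[float]]) -> List[List[float]]:
    """Run one K-Means iteration over data that arrives chunk by chunk."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]  # running coordinate sums per cluster
    counts = [0] * k                        # running point counts per cluster
    for chunk in chunks:  # assumption: each chunk stands in for one device-sized batch
        for point in chunk:
            j = nearest(point, centroids)
            counts[j] += 1
            for d in range(dim):
                sums[j][d] += point[d]
    # New centroid = mean of assigned points; an empty cluster keeps its old centroid.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(k)
    ]


def chunked(points: Sequence[Point], size: int) -> Iterable[Sequence[Point]]:
    """Yield consecutive slices of `points` holding at most `size` items each."""
    for start in range(0, len(points), size):
        yield points[start:start + size]


def kmeans(points: Sequence[Point], centroids: List[List[float]],
           chunk_size: int, iterations: int = 10) -> List[List[float]]:
    """Repeat chunked passes for a fixed number of iterations (no convergence test)."""
    for _ in range(iterations):
        centroids = kmeans_pass(chunked(points, chunk_size), centroids)
    return centroids
```

For example, `kmeans([[0, 0], [0, 2], [10, 10], [10, 12]], [[0.0, 0.0], [10.0, 10.0]], chunk_size=3)` returns `[[0.0, 1.0], [10.0, 11.0]]`. Because only the sums and counts carry over between chunks, the result of a pass does not depend on the chunk size, which is what makes this style of update a natural fit for data larger than device memory.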
