End-to-End Data Analytics Performance


I came across an article from NVIDIA discussing their TPCx-BB benchmark results on the A100. As a data scientist, I was immediately intrigued because I’m a big fan of the Transaction Processing Performance Council (TPC) benchmarks, which provide reasonable and objective performance metrics. The TPC also has clear rules about how its benchmarks are used and how results are reported, so that results from different vendors can be directly compared. I’ll say more about this later, but first let’s talk about the end-to-end data analytics workflow.


I’ve drawn a rough sketch of the end-to-end data analytics workflow based on my experience as a data scientist (Figure 1). Not all of my data science projects pass through every stage of this workflow, but taken together they cover all of its stages. Consequently, my computing environment must be able to handle every stage, especially the early ones: OLTP (online transaction processing) and OLAP (online analytical processing). As every data scientist knows, by the time you get to modeling, the hard work is already done. OLTP deals with managing data stores, while OLAP deals mainly with information retrieval. TPCx-BB is mainly an OLAP benchmark.
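To make the OLTP/OLAP distinction concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `sales` table, its columns, and the helper names `record_sale` and `revenue_by_region` are hypothetical and chosen only for illustration; they are not taken from TPCx-BB or from the NVIDIA article. The write path stands in for an OLTP-style operation (a small, transactional update to the data store), and the aggregate query stands in for an OLAP-style operation (read-only retrieval across many rows).

```python
import sqlite3


def build_store():
    """Create an in-memory database with a hypothetical `sales` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    return conn


def record_sale(conn, region, amount):
    """OLTP-style operation: one small write, committed as a transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO sales (region, amount) VALUES (?, ?)", (region, amount)
        )
    return cur.lastrowid


def revenue_by_region(conn):
    """OLAP-style operation: a read-only aggregate over the whole table."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()
    return dict(rows)


conn = build_store()
for region, amount in [("east", 10.0), ("west", 5.0), ("east", 2.5)]:
    record_sale(conn, region, amount)

totals = revenue_by_region(conn)
print(totals)  # {'east': 12.5, 'west': 5.0}
```

In a real environment the two workloads usually run on different systems tuned for different access patterns; the single SQLite database here only illustrates the shape of each kind of query.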

