High-Performance Computing -- HPCC -- Commentary by Others

Original: http://gigaom.com/cloud/lexisnexis-open-sources-its-hadoop-killer/
Translated by: 那海蓝蓝

LexisNexis is releasing a set of open-source data-processing tools that it says outperforms Hadoop and even handles workloads Hadoop presently can’t. The technology (and new business line) is called HPCC Systems, and was created 10 years ago within the LexisNexis Risk Solutions division, which analyzes huge amounts of data for its customers in intelligence, financial services and other high-profile industries. There have been calls for a legitimate alternative to Hadoop, and this certainly looks like one.

According to Armando Escalante, CTO of LexisNexis Risk Solutions, the company decided to release HPCC now because it wanted to get the technology into the community before Hadoop became the de facto option for big data processing. Escalante told me during a phone call that he thinks of Hadoop as “a guy with a machete in front of a jungle; they made a trail,” but that he thinks HPCC is superior.

But in order to compete for mindshare and developers, he said, the company felt it had to open-source the technology. One big thing Hadoop has going for it is its open-source model, Escalante explained, which attracts a lot of developers and a lot of innovation. If his company wanted HPCC to “remain relevant” and keep improving through new use cases and ideas from a new community, the time for release was now and open source had to be the model.

Hadoop, of course, is the Apache Software Foundation project created several years ago by then-Yahoo employee Doug Cutting. It has become a critical tool for web companies, including Yahoo and Facebook, to process their ever-growing volumes of unstructured data, and is fast making its way into organizations of all types and sizes. Hadoop has spawned a number of commercial distributions and products, too, including from Cloudera, EMC and IBM.

How HPCC works

Hadoop relies on two core components to store and process huge amounts of data: the Hadoop Distributed File System and Hadoop MapReduce. However, as Cloudant CEO Mike Miller explained in a post over the weekend, MapReduce is still a relatively complex language for writing parallel-processing workflows. HPCC seeks to remedy this with its Enterprise Control Language (ECL).
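To make the complexity point concrete, here is a minimal single-process sketch of the map, shuffle and reduce phases for a word count. It is plain Python, not Hadoop's API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. Even this toy version has to spell out three separate stages for one simple aggregation.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key, the step a framework runs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data big clusters", "data tools"])
```

In a real cluster the map and reduce calls would run on different machines and the shuffle would move data over the network; the sketch only shows the shape of the programming model.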

Escalante says ECL is a declarative, data-centric language that abstracts a lot of the work necessary within MapReduce. For certain tasks that take a thousand lines of code in MapReduce, he said, ECL only requires 99 lines. Furthermore, he explained, ECL doesn’t care how many nodes are in the cluster because the system automatically distributes data across however many nodes are present. Technically, though, HPCC could run on just a single virtual machine. And, says Escalante, HPCC is written in C++, like the original Google MapReduce on which Hadoop MapReduce is based, which he says makes it inherently faster than the Java-based Hadoop version.
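The idea that a job need not care about the node count can be sketched as follows. This is not ECL and not a description of how HPCC actually distributes data; it is a generic hash-partitioning illustration in Python, and the names `partition` and `count_by_key` are hypothetical. The point is only that the caller writes the same query whether the "cluster" has one node or many.

```python
import zlib


def partition(records, key, node_count):
    """Assign each record to a node by hashing its key (assumed scheme, for illustration)."""
    nodes = [[] for _ in range(node_count)]
    for record in records:
        digest = zlib.crc32(str(record[key]).encode("utf-8"))
        nodes[digest % node_count].append(record)
    return nodes


def count_by_key(records, key, node_count):
    """Count records per key; the result does not depend on node_count."""
    totals = {}
    for node in partition(records, key, node_count):
        # Each node's slice is processed separately, then merged into one result.
        for record in node:
            totals[record[key]] = totals.get(record[key], 0) + 1
    return totals


records = [
    {"name": "a", "state": "FL"},
    {"name": "b", "state": "GA"},
    {"name": "c", "state": "FL"},
]
single_node = count_by_key(records, "state", 1)
four_nodes = count_by_key(records, "state", 4)
```

Here `single_node` and `four_nodes` hold the same counts, which mirrors the claim that the same code can run on a single virtual machine or on a larger cluster.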

HPCC offers two options for processing and serving data: the Thor Data Refinery Cluster and the Roxie Rapid Data Delivery Cluster. Escalante said Thor, so named for its hammer-like approach to solving the problem, crunches, analyzes and indexes huge amounts of data à la Hadoop. Roxie, on the other hand, is more like a traditional relational database or data warehouse that can even serve transactions to a web front end.

We didn’t go into detail on HPCC’s storage component, but Escalante noted that it does utilize a distributed file system, although it can support a variety of off-node storage architectures and/or local solid-state drives.

He added that in order to ensure LexisNexis wasn’t blinded by “eating its own dogfood,” his team hired a Hadoop expert to kick the tires on HPCC. The consultant was impressed, Escalante said, but did note some shortcomings that the team addressed as it readied the technology for release. It also built a converter for migrating Hadoop applications written in the Pig language to ECL.

Can HPCC Systems actually compete?

The million-dollar question is whether HPCC Systems can actually attract an ecosystem of contributors and users that will help it rise above the status of big data also-ran. Escalante thinks it can, in large part because HPCC already has been proven in production dealing with LexisNexis Risk Solutions’ 35,000 data sources, 5,000 transactions per second and large, paying customers. He added that the company also will provide enterprise licenses and proprietary applications in addition to the open-source code. Plus, it already has potential customers lined up.

It’s often said that competition means validation. Hadoop has moved from a niche set of tools to the core of a potentially huge business that’s growing every day, and even Microsoft has a horse in this race with its Dryad set of big data tools. Hadoop has already proven itself, but the companies and organizations relying on it for their big data strategies can’t rest on their laurels.
