alluxio
xwc35047
Time is limited, but what you can do for yourself is unlimited. (WeChat official account: 水木之椿)
Leveraging Alluxio with Spark SQL to Speed Up Ad-hoc Analysis
Background: At present, hundreds of TB of data are processed in the Momo big data cluster every day. However, most of the data is repeatedly read from and written to disk, which is inefficient. In order to s... Original · 2018-01-23 18:46:16 · 1145 views · 0 comments
Pitfalls of Using Alluxio 1.6.1 with Spark SQL: A Summary
1. Table scan problem: the table does not exist in HDFS but is still present in the metadata. java.lang.RuntimeException: serious problem at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021) at org.apache.ha... Original · 2018-03-22 17:50:47 · 1501 views · 0 comments
Spark SQL with Alluxio: Environment Setup
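Background: This setup uses YARN's node label feature to isolate a test cluster environment and uses Spark Thriftserver to provide an ad-hoc query service; querying tables with the alluxio scheme is transparent to users. Environment configuration: Overall, the configuration follows the official documentation at https://www.alluxio.org/docs/1.6/en/Running-Spark-on-Alluxio.html and https://w... Original · 2018-03-22 19:08:15 · 1170 views · 0 comments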