Install the environment with the pre-configured Ansible scripts:
zookeeper
hadoop
spark
alluxio
First, adjust the configuration files.
Edit the Alluxio config template /opt/alluxio/roles/install-alluxio/templates/alluxio-site.properties.j2:
alluxio.zookeeper.enabled=false
# leave the ZooKeeper settings disabled for now; they are enabled in the advanced section below
#alluxio.zookeeper.address={{ groups['zookeeper'][0] }}:2181,{{ groups['zookeeper'][1] }}:2181,{{ groups['zookeeper'][2] }}:2181
Edit the Spark configuration file vars.yml and add the Alluxio settings:
#alluxio
alluxio_path: "/opt/alluxio-1.8.1"
spark_driver_path: "{{ alluxio_path }}/client/alluxio-1.8.1-client.jar"
spark_executor_path: "{{ alluxio_path }}/client/alluxio-1.8.1-client.jar"
Add the following to the spark_properties section:
- {
    "name": "spark.driver.extraClassPath",
    "value": "{{ spark_driver_path }}"
  }
- {
    "name": "spark.executor.extraClassPath",
    "value": "{{ spark_executor_path }}"
  }
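After re-running the playbook and restarting Spark, you can confirm from spark-shell that the client jar actually landed on the classpath; a minimal sanity check, assuming the paths configured above:
// Print the configured driver classpath entry
sc.getConf.get("spark.driver.extraClassPath")
// Loading the Alluxio Hadoop-compatible FileSystem class proves the jar is readable
Class.forName("alluxio.hadoop.FileSystem")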
Open the Alluxio master web UI in a browser:
http://192.168.6.12:19999/home
Verify the Alluxio installation:
./bin/alluxio validateEnv all
or
./bin/alluxio runTests
Check that the Spark integration is set up correctly:
integration/checker/bin/alluxio-checker.sh spark spark://192.168.6.12:17077
Testing Alluxio as input and output:
1. Access data that lives in Alluxio.
Upload a file:
cd /opt/alluxio-1.8.1
./bin/alluxio fs copyFromLocal LICENSE /LICENSE
./bin/alluxio fs ls /
./bin/alluxio fs cat /LICENSE
Read and write Alluxio data from spark-shell:
val s = sc.textFile("alluxio://192.168.6.12:19998/LICENSE")
val double = s.map(line => line + line)
double.saveAsTextFile("alluxio://192.168.6.12:19998/Output")
Write data to Alluxio from spark-shell:
val data = Array(1, 2, 3, 4, 5)
val distData = sc.parallelize(data)
distData.saveAsTextFile("alluxio://192.168.6.12:19998/distData")
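A quick way to confirm both writes succeeded is to read the results back in the same shell (paths as created above):
// Re-read the doubled LICENSE text and the parallelized array
val outCheck = sc.textFile("alluxio://192.168.6.12:19998/Output")
outCheck.count()
val distCheck = sc.textFile("alluxio://192.168.6.12:19998/distData")
distCheck.collect().foreach(println)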
2. Access data in the under storage (here the Alluxio root is backed by hdfs://192.168.6.12:9000/alluxio/backend, so the file uploaded below shows up as /Input_HDFS in Alluxio):
hdfs dfs -put -f LICENSE hdfs://192.168.6.12:9000/alluxio/backend/Input_HDFS
hdfs dfs -ls /alluxio/backend
./bin/alluxio fs ls /
val s = sc.textFile("alluxio://192.168.6.12:19998/Input_HDFS")
val double = s.map(line => line + line)
double.saveAsTextFile("alluxio://192.168.6.12:19998/Output_HDFS")
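Note that with Alluxio 1.8's default write type (MUST_CACHE), Output_HDFS is written to Alluxio memory only, not back to HDFS. If you want the result persisted to the under storage as well, switch the client write type to CACHE_THROUGH; a sketch, best run in a fresh spark-shell so the setting is applied before the Alluxio client initializes (Output_HDFS_persisted is a hypothetical path chosen for this example):
// Ask the Alluxio client to write through to the under storage
sc.hadoopConfiguration.set("alluxio.user.file.writetype.default", "CACHE_THROUGH")
val s2 = sc.textFile("alluxio://192.168.6.12:19998/Input_HDFS")
s2.map(line => line + line).saveAsTextFile("alluxio://192.168.6.12:19998/Output_HDFS_persisted")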
View the results in the Alluxio web UI, or list them with ./bin/alluxio fs ls / and hdfs dfs -ls /alluxio/backend.
Advanced: enabling ZooKeeper (fault-tolerant mode)
Edit the Alluxio config template /opt/alluxio/roles/install-alluxio/templates/alluxio-site.properties.j2 again:
alluxio.zookeeper.enabled=true
alluxio.zookeeper.address={{ groups['zookeeper'][0] }}:2181,{{ groups['zookeeper'][1] }}:2181,{{ groups['zookeeper'][2] }}:2181
Edit the Spark configuration file vars.yml and add the ZooKeeper address:
alluxio_zookeeper_address: "192.168.6.12:2181,192.168.6.13:2181,192.168.6.14:2181"
Add the following to the spark_properties section (reusing the variable defined above):
- {
    "name": "spark.driver.extraJavaOptions",
    "value": "-Dalluxio.zookeeper.address={{ alluxio_zookeeper_address }} -Dalluxio.zookeeper.enabled=true"
  }
- {
    "name": "spark.executor.extraJavaOptions",
    "value": "-Dalluxio.zookeeper.address={{ alluxio_zookeeper_address }} -Dalluxio.zookeeper.enabled=true"
  }
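If you prefer not to bake the ZooKeeper settings into extraJavaOptions, the same client properties can also be supplied through the Hadoop configuration inside a job; a minimal sketch, property names as in the template above, offered as an assumption rather than a drop-in replacement:
// Pass the Alluxio ZooKeeper settings via the Hadoop configuration
sc.hadoopConfiguration.set("alluxio.zookeeper.enabled", "true")
sc.hadoopConfiguration.set("alluxio.zookeeper.address", "192.168.6.12:2181,192.168.6.13:2181,192.168.6.14:2181")
val check = sc.textFile("alluxio:///Input_HDFS")
check.count()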
Test 1: spell out the ZooKeeper addresses in the URI:
val s = sc.textFile("alluxio://192.168.6.12:2181;192.168.6.13:2181;192.168.6.14:2181/Input_HDFS")
val double = s.map(line => line + line)
double.saveAsTextFile("alluxio://192.168.6.12:2181;192.168.6.13:2181;192.168.6.14:2181/Output_HDFS")
Test 2: omit the ZooKeeper addresses from the URI (they are picked up from the JVM options configured above):
val s = sc.textFile("alluxio:///Input_HDFS")
val double = s.map(line => line + line)
double.saveAsTextFile("alluxio:///Output_HDFS1")