Prerequisites
Install a TensorFlow environment
Download the TensorFlowOnSpark code
git clone https://github.com/yahoo/TensorFlowOnSpark.git
cd TensorFlowOnSpark
export TFoS_HOME=$(pwd)
Install Spark
TensorFlowOnSpark ships a script that downloads Spark; simply run it:
${TFoS_HOME}/scripts/local-setup-spark.sh
rm spark-1.6.0-bin-hadoop2.6.tar
export SPARK_HOME=$(pwd)/spark-1.6.0-bin-hadoop2.6
export PATH=${SPARK_HOME}/bin:${PATH}
Install TensorFlow and TensorFlowOnSpark
Here we install TensorFlow and TensorFlowOnSpark with pip. At the time of writing, the latest TensorFlow release is 1.2.x, but the tests in this post were run with version 0.12.1. A specific TensorFlow version can be pinned with tensorflow==${version}.
sudo pip install tensorflow==0.12.1
sudo pip install tensorflowonspark
Download the MNIST data
mkdir ${TFoS_HOME}/mnist
pushd ${TFoS_HOME}/mnist
curl -O "http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz"
curl -O "http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz"
popd
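The four downloaded archives are gzipped files in the IDX format: a big-endian header (magic number, record count) followed by raw bytes. As an optional sanity check, the header can be parsed to confirm the download is intact. This helper is purely illustrative and is not part of TensorFlowOnSpark:

```python
import gzip
import struct

def parse_idx_header(data):
    """Parse the first 8 bytes of an IDX file.

    Returns (magic, count): magic is 2051 for image files and
    2049 for label files; count is the number of records.
    """
    magic, count = struct.unpack(">II", data[:8])
    return magic, count

# Illustrative usage against a downloaded file:
# with gzip.open("mnist/train-images-idx3-ubyte.gz", "rb") as f:
#     magic, n = parse_idx_header(f.read(16))
#     # expect magic == 2051 and n == 60000 for the training images
```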
Start a standalone Spark cluster
export MASTER=spark://$(hostname):7077
export SPARK_WORKER_INSTANCES=2
export CORES_PER_WORKER=1
export TOTAL_CORES=$((${CORES_PER_WORKER}*${SPARK_WORKER_INSTANCES}))
${SPARK_HOME}/sbin/start-master.sh; ${SPARK_HOME}/sbin/start-slave.sh -c $CORES_PER_WORKER -m 3G ${MASTER}
Test pyspark, TensorFlow, and TensorFlowOnSpark
pyspark
>>> import tensorflow as tf
>>> from tensorflowonspark import TFCluster
>>> exit()
Use Spark to convert the MNIST archive files
cd ${TFoS_HOME}
# rm -rf examples/mnist/csv
${SPARK_HOME}/bin/spark-submit \
--master ${MASTER} \
${TFoS_HOME}/examples/mnist/mnist_data_setup.py \
--output examples/mnist/csv \
--format csv
ls -lR examples/mnist/csv
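With the csv format, each 28×28 image is presumably written as one text line of comma-separated pixel values (784 fields per record). The hypothetical helper below sketches that flattening for intuition only; the actual record layout is defined by mnist_data_setup.py:

```python
def image_to_csv_line(image):
    """Flatten a 28x28 image (a list of rows of ints) into one
    comma-separated line -- the general shape of a csv image record."""
    flat = [pixel for row in image for pixel in row]
    return ",".join(str(p) for p in flat)

# A 2x2 toy "image" for illustration:
# image_to_csv_line([[0, 255], [128, 0]]) -> "0,255,128,0"
```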
Run distributed MNIST training (using feed_dict)
# rm -rf mnist_model
${SPARK_HOME}/bin/spark-submit \
--master ${MASTER} \
--py-files ${TFoS_HOME}/examples/mnist/spark/mnist_dist.py \
--conf spark.cores.max=${TOTAL_CORES} \
--conf spark.task.cpus=${CORES_PER_WORKER} \
--conf spark.executorEnv.JAVA_HOME="$JAVA_HOME" \
${TFoS_HOME}/examples/mnist/spark/mnist_spark.py \
--cluster_size ${SPARK_WORKER_INSTANCES} \
--images examples/mnist/csv/train/images \
--labels examples/mnist/csv/train/labels \
--format csv \
--mode train \
--model mnist_model
ls -l mnist_model
Run distributed MNIST inference (using feed_dict)
# rm -rf predictions
${SPARK_HOME}/bin/spark-submit \
--master ${MASTER} \
--py-files ${TFoS_HOME}/examples/mnist/spark/mnist_dist.py \
--conf spark.cores.max=${TOTAL_CORES} \
--conf spark.task.cpus=${CORES_PER_WORKER} \
--conf spark.executorEnv.JAVA_HOME="$JAVA_HOME" \
${TFoS_HOME}/examples/mnist/spark/mnist_spark.py \
--cluster_size ${SPARK_WORKER_INSTANCES} \
--images examples/mnist/csv/test/images \
--labels examples/mnist/csv/test/labels \
--mode inference \
--format csv \
--model mnist_model \
--output predictions
less predictions/part-00000
The predictions look like this:
2017-02-10T23:29:17.009563 Label: 7, Prediction: 7
2017-02-10T23:29:17.009677 Label: 2, Prediction: 2
2017-02-10T23:29:17.009721 Label: 1, Prediction: 1
2017-02-10T23:29:17.009761 Label: 0, Prediction: 0
2017-02-10T23:29:17.009799 Label: 4, Prediction: 4
2017-02-10T23:29:17.009838 Label: 1, Prediction: 1
2017-02-10T23:29:17.009876 Label: 4, Prediction: 4
2017-02-10T23:29:17.009914 Label: 9, Prediction: 9
2017-02-10T23:29:17.009951 Label: 5, Prediction: 6
2017-02-10T23:29:17.009989 Label: 9, Prediction: 9
2017-02-10T23:29:17.010026 Label: 0, Prediction: 0
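Each output line pairs a true label with a predicted digit, so test-set accuracy can be computed by parsing the predictions/part-* files. The small parser below is a sketch that assumes the line format shown above:

```python
import re

LINE_RE = re.compile(r"Label: (\d+), Prediction: (\d+)")

def accuracy(lines):
    """Fraction of output lines where the prediction matches the label."""
    hits = total = 0
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            total += 1
            hits += m.group(1) == m.group(2)
    return hits / float(total)

# On the sample above, 10 of the 11 lines match (the one miss is
# "Label: 5, Prediction: 6"), so accuracy would be 10/11.
```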
Shut down the Spark cluster
${SPARK_HOME}/sbin/stop-slave.sh; ${SPARK_HOME}/sbin/stop-master.sh
Original link:
https://github.com/yahoo/TensorFlowOnSpark/wiki/GetStarted_standalone