DL4J hello world

Background: I had previously tried training a model in TensorFlow, saving it as a pb file, and serving it from Spark, but the performance was too slow. So I started looking for ways to run deep learning directly on Spark, and after weighing SparkNet against DL4J, I chose DL4J.


Following the official quickstart at https://deeplearning4j.org/cn/quickstart , I first set up an example:
Step 1: clone the repository
F:\spark project\dl4j-examples>git clone https://github.com/deeplearning4j/dl4j-examples.git
Cloning into 'dl4j-examples'...
remote: Enumerating objects: 201, done.
remote: Counting objects: 100% (201/201), done.
remote: Compressing objects: 100% (133/133), done.
error: RPC failed; curl 56 OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 10054
fatal: the remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed

This failure was resolved as follows, per https://stackoverflow.com/questions/21277806/fatal-early-eof-fatal-index-pack-failed :

F:\spark project\dl4j-examples>
F:\spark project\dl4j-examples>git config --global core.compression 0

F:\spark project\dl4j-examples>git clone --depth 1 https://github.com/deeplearning4j/dl4j-examples.git
Cloning into 'dl4j-examples'...
remote: Enumerating objects: 768, done.
remote: Counting objects: 100% (768/768), done.
remote: Compressing objects: 100% (547/547), done.
remote: Total 768 (delta 161), reused 491 (delta 97), pack-reused 0
Receiving objects: 100% (768/768), 22.94 MiB | 165.00 KiB/s, done.
Resolving deltas: 100% (161/161), done.

At this point the examples were already downloaded, which was really all I needed, but I went ahead and ran the remaining commands from that answer:
F:\spark project\dl4j-examples>git fetch --unshallow
fatal: not a git repository (or any of the parent directories): .git

F:\spark project\dl4j-examples>git fetch --depth=2147483647
fatal: not a git repository (or any of the parent directories): .git

F:\spark project\dl4j-examples>git init
Initialized empty Git repository in F:/spark project/dl4j-examples/.git/

F:\spark project\dl4j-examples>git fetch --unshallow
fatal: --unshallow on a complete repository does not make sense

F:\spark project\dl4j-examples>git fetch --depth=2147483647

F:\spark project\dl4j-examples>git pull --all
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.

    git pull <remote> <branch>

If you wish to set tracking information for this branch you can do so with:

    git branch --set-upstream-to=<remote>/<branch> master
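For reference, the sequence the Stack Overflow answer actually intends can be sketched as below. My fetches above failed with "not a git repository" because I ran them in the parent directory rather than inside the cloned repo (and `git init` in the parent directory just creates a new, unrelated repository):

```shell
# Sketch of the intended recovery sequence (note: the fetch runs *inside* the clone)
git config --global core.compression 0    # work around pack errors on flaky connections
git clone --depth 1 https://github.com/deeplearning4j/dl4j-examples.git
cd dl4j-examples                          # must be inside the repository
git fetch --unshallow                     # convert the shallow clone to a full history
```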

Step 2: build the examples (takes about half an hour and needs 5+ GB of disk space):
mvn clean install
I didn't install Maven separately here; instead I took a shortcut and used Execute Maven Goal from another IDEA project, running mvn clean install in F:\spark project\dl4j-examples\dl4j-examples.

Step 3: pick an example and run it,
such as org.deeplearning4j.examples.feedforward.classification.MLPClassifierLinear.
Running it from IDEA failed with:
Error running 'AuthServer': Command line is too long. Shorten command line for AuthServer or also for Application default configuration.
Fix:
Edit .idea\workspace.xml under the project, find the <component name="PropertiesComponent"> tag, and add one line inside it: <property name="dynamic.classpath" value="true" />
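For context, the resulting section of .idea\workspace.xml looks roughly like this (the other properties in the component vary per project and stay as they are):

```xml
<component name="PropertiesComponent">
  <!-- existing <property .../> entries unchanged -->
  <property name="dynamic.classpath" value="true" />
</component>
```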
Execution result: (screenshot of the example's output)

Step 4: run it in my own Spark project.
First add the dependencies:
<dependency>
    <groupId>org.datavec</groupId>
    <artifactId>datavec-api</artifactId>
    <version>1.0.0-beta5</version>
</dependency>
<dependency>
    <groupId>org.nd4j</groupId>
    <artifactId>nd4j-native-platform</artifactId>
    <version>1.0.0-beta5</version>
</dependency>
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>1.0.0-beta5</version>
</dependency>
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>dl4j-spark_2.11</artifactId>
    <version>1.0.0-beta5</version>
</dependency>
<dependency>
    <groupId>com.beust</groupId>
    <artifactId>jcommander</artifactId>
    <version>1.27</version>
</dependency>
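Since the DL4J, ND4J, and DataVec artifacts must all share the same version, it can help to centralize it in a Maven property (a common Maven convention, not something the quickstart requires):

```xml
<properties>
    <dl4j.version>1.0.0-beta5</dl4j.version>
</properties>
<!-- each <dependency> above then references it: -->
<dependency>
    <groupId>org.deeplearning4j</groupId>
    <artifactId>deeplearning4j-core</artifactId>
    <version>${dl4j.version}</version>
</dependency>
```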

I copied org.deeplearning4j.legacyExamples.mlp.MnistMLPExample and made small changes to get:
import org.apache.spark.SparkConf
import org.apache.spark.api.java.JavaSparkContext
import org.deeplearning4j.datasets.iterator.impl.MnistDataSetIterator
import org.deeplearning4j.eval.Evaluation
import org.deeplearning4j.nn.conf.NeuralNetConfiguration
import org.deeplearning4j.nn.conf.layers.DenseLayer
import org.deeplearning4j.nn.conf.layers.OutputLayer
import org.deeplearning4j.nn.weights.WeightInit
import org.deeplearning4j.spark.impl.multilayer.SparkDl4jMultiLayer
import org.deeplearning4j.spark.impl.paramavg.ParameterAveragingTrainingMaster
import org.nd4j.linalg.activations.Activation
import org.nd4j.linalg.dataset.DataSet
import org.nd4j.linalg.learning.config.Nesterovs
import org.nd4j.linalg.lossfunctions.LossFunctions
import java.util

object MnistMLPExample {
  val batchSizePerWorker=16
  val numEpochs = 2
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf
    System.setProperty("hadoop.home.dir", "D:\\火狐下载\\hadoop-common-2.2.0-bin-master")
    sparkConf.setMaster("local[*]")
    sparkConf.setAppName("DL4J Spark MLP Example")
    val sc = new JavaSparkContext(sparkConf)
    sc.setLogLevel("WARN")
    //Load the data into memory then parallelize
    //This isn't a good approach in general - but is simple to use for this example
    val iterTrain = new MnistDataSetIterator(batchSizePerWorker, true, 12345)
    val iterTest = new MnistDataSetIterator(batchSizePerWorker, false, 12345) //false: iterate the MNIST test set, not the training set
    val trainDataList = new util.ArrayList[DataSet]
    val testDataList = new util.ArrayList[DataSet]
    while (iterTrain.hasNext) trainDataList.add(iterTrain.next)
    while (iterTest.hasNext) testDataList.add(iterTest.next)
    val trainData = sc.parallelize(trainDataList)
    val testData = sc.parallelize(testDataList)
    //Create network configuration and conduct network training
    val conf = new NeuralNetConfiguration.Builder()
      .seed(12345)
      .activation(Activation.LEAKYRELU)
      .weightInit(WeightInit.XAVIER)
      .updater(new Nesterovs(0.1))// To configure: .updater(Nesterovs.builder().momentum(0.9).build())
      .l2(1e-4)
      .list()
      .layer(new DenseLayer.Builder().nIn(28 * 28).nOut(500).build())
      .layer(new DenseLayer.Builder().nOut(100).build())
      .layer(new OutputLayer.Builder(LossFunctions.LossFunction.NEGATIVELOGLIKELIHOOD)
        .activation(Activation.SOFTMAX).nOut(10).build())
      .build();
    //Configuration for Spark training: see http://deeplearning4j.org/spark for explanation of these configuration options
    val tm = new ParameterAveragingTrainingMaster.Builder(batchSizePerWorker)    //Each DataSet object: contains (by default) 32 examples
      .averagingFrequency(5)
      .workerPrefetchNumBatches(2)            //Async prefetching: 2 examples per worker
      .batchSizePerWorker(batchSizePerWorker)
      .build();
    //Create the Spark network
    val sparkNet = new SparkDl4jMultiLayer(sc, conf, tm)
    //Execute training:
    for (i <- 0 until numEpochs) {
      sparkNet.fit(trainData)
      println(s"Completed Epoch $i")
    }
    //Perform evaluation (distributed)
    //        Evaluation evaluation = sparkNet.evaluate(testData);
    val evaluation = sparkNet.doEvaluation(testData, 64, new Evaluation(10))(0) //Work-around for 0.9.1 bug: see https://deeplearning4j.org/releasenotes
    println("***** Evaluation *****")
    println(evaluation.stats)
    //Delete the temp training files, now that we are done with them
    tm.deleteTempFiles(sc)
    println("***** Example Complete *****")

  }
 
}
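To make the ParameterAveragingTrainingMaster settings above more concrete: each Spark worker trains on its own partition of the data, and every averagingFrequency minibatches the workers' parameter vectors are element-wise averaged into the next global model. A minimal JVM sketch of just that averaging step (hypothetical names; an illustration, not DL4J's actual implementation):

```java
// Illustration of the parameter-averaging step: each worker's parameter
// vector contributes equally to the next global model.
public class ParamAveragingSketch {
    static double[] average(double[][] workerParams) {
        int n = workerParams.length;      // number of workers
        int d = workerParams[0].length;   // number of parameters
        double[] avg = new double[d];
        for (double[] w : workerParams) {
            for (int j = 0; j < d; j++) {
                avg[j] += w[j] / n;       // accumulate each worker's share
            }
        }
        return avg;
    }

    public static void main(String[] args) {
        // two workers, two parameters each
        double[][] params = { {1.0, 2.0}, {3.0, 4.0} };
        double[] global = average(params);
        System.out.println(global[0] + " " + global[1]); // 2.0 3.0
    }
}
```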

The run produced:
......
15:03:14,619 INFO  ~ Completed Epoch 1
15:03:20,498 INFO  ~ ***** Evaluation *****
15:03:20,502 INFO  ~

========================Evaluation Metrics========================
 # of classes:    10
 Accuracy:        0.9608
 Precision:       0.9605
 Recall:          0.9607
 F1 Score:        0.9605
Precision, recall & F1: macro-averaged (equally weighted avg. of 10 classes)


=========================Confusion Matrix=========================
    0    1    2    3    4    5    6    7    8    9
---------------------------------------------------
 5807    0   10    3   11   17   24    3   40    8 | 0 = 0
    1 6583   50   17    9    8    1    7   54   12 | 1 = 1
   24   11 5759   24   27   11   19   31   47    5 | 2 = 2
   14   16   99 5717    3  113    8   29   96   36 | 3 = 3
    6   14   25    2 5615    1   31    5   16  127 | 4 = 4
   28    8   14   65   15 5180   39    8   43   21 | 5 = 5
   29    8   10    0   21   49 5775    0   26    0 | 6 = 6
   11   24   77   13   35    3    2 6006   14   80 | 7 = 7
   18   39   24   56   14   45   21    9 5600   25 | 8 = 8
   22   12    5   43  115   28    2   71   43 5608 | 9 = 9

Confusion matrix format: Actual (rowClass) predicted as (columnClass) N times
==================================================================
15:03:20,502 INFO  ~ Attempting to delete temporary directory: /tmp/hadoop-lenovo/dl4j/1572245786579_-2c362db/0/
15:03:22,346 INFO  ~ Deleted temporary directory: /tmp/hadoop-lenovo/dl4j/1572245786579_-2c362db/0/
15:03:22,347 INFO  ~ ***** Example Complete *****

DeepLearning4J environment setup and testing are complete; next up, digging into the details of DL4J...
