2.3.3.1 gradientAndScore();
This method obtains the gradient and the score for the current iteration.
@Override
public Pair<Gradient, Double> gradientAndScore() {
    oldScore = score;
    model.computeGradientAndScore();

    if (iterationListeners != null && iterationListeners.size() > 0) {
        for (IterationListener l : iterationListeners) {
            if (l instanceof TrainingListener) {
                ((TrainingListener) l).onGradientCalculation(model);
            }
        }
    }

    Pair<Gradient, Double> pair = model.gradientAndScore();
    score = pair.getSecond();
    updateGradientAccordingToParams(pair.getFirst(), model, model.batchSize());
    return pair;
}
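The listener-notification pattern used above (walk all IterationListeners, but invoke the richer callback only on those that also implement TrainingListener, filtered by an instanceof check) can be sketched stand-alone. The minimal interfaces and the notifyGradientCalculation helper below are hypothetical stand-ins, not DL4J's real types:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal listener interfaces sketching the pattern in
// gradientAndScore(): every listener is an IterationListener, but only
// those that also implement TrainingListener receive the gradient callback.
interface IterationListener { }

interface TrainingListener extends IterationListener {
    void onGradientCalculation(Object model);
}

public class ListenerDispatchSketch {
    // Returns how many listeners actually received the callback.
    static int notifyGradientCalculation(List<IterationListener> listeners, Object model) {
        int notified = 0;
        for (IterationListener l : listeners) {
            if (l instanceof TrainingListener) {
                ((TrainingListener) l).onGradientCalculation(model);
                notified++;
            }
        }
        return notified;
    }

    public static void main(String[] args) {
        List<IterationListener> listeners = new ArrayList<>();
        listeners.add(new IterationListener() { });  // plain listener: skipped by the instanceof filter
        listeners.add((TrainingListener) m -> System.out.println("gradient callback"));
        System.out.println(notifyGradientCalculation(listeners, new Object())); // only one listener qualifies
    }
}
```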
2.3.3.2 model.computeGradientAndScore()
@Override
public void computeGradientAndScore() {
    //Calculate activations (which are stored in each layer, and used in backprop)
    if (layerWiseConfigurations.getBackpropType() == BackpropType.TruncatedBPTT) {
        List<INDArray> activations = rnnActivateUsingStoredState(getInput(), true, true);
        if (trainingListeners.size() > 0) {
            for (TrainingListener tl : trainingListeners) {
                tl.onForwardPass(this, activations);
            }
        }
        truncatedBPTTGradient();
    } else {
        //First: do a feed-forward through the network
        //Note that we don't actually need to do the full forward pass through the output layer right now; but we do
        // need the input to the output layer to be set (such that backprop can be done)
        List<INDArray> activations = feedForwardToLayer(layers.length - 2, true);
        if (trainingListeners.size() > 0) {
            //TODO: We possibly do want output layer activations in some cases here...
            for (TrainingListener tl : trainingListeners) {
                tl.onForwardPass(this, activations);
            }
        }
        INDArray actSecondLastLayer = activations.get(activations.size() - 1);
        if (layerWiseConfigurations.getInputPreProcess(layers.length - 1) != null)
            actSecondLastLayer = layerWiseConfigurations.getInputPreProcess(layers.length - 1)
                            .preProcess(actSecondLastLayer, getInputMiniBatchSize());
        getOutputLayer().setInput(actSecondLastLayer);
        //Then: compute gradients
        backprop();
    }

    //Calculate score
    if (!(getOutputLayer() instanceof IOutputLayer)) {
        throw new IllegalStateException(
                        "Cannot calculate gradient and score with respect to labels: final layer is not an IOutputLayer");
    }
    score = ((IOutputLayer) getOutputLayer()).computeScore(calcL1(true), calcL2(true), true);

    //Listeners
    if (trainingListeners.size() > 0) {
        for (TrainingListener tl : trainingListeners) {
            tl.onBackwardPass(this);
        }
    }
}
In dl4j, unless you change the backpropagation type with the .backpropType(BackpropType.TruncatedBPTT) method while building the network, the default backpropagation type is always BackpropType.Standard (this holds for RNNs and LSTMs as well).
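For reference, switching to truncated BPTT would typically be done at configuration time, roughly as in the fragment below. This is a sketch only: the layer definitions are elided, the truncation lengths are arbitrary example values, and the builder method names (backpropType, tBPTTForwardLength, tBPTTBackwardLength) are from the DL4J configuration API.

```java
// Sketch: enabling truncated BPTT when building a MultiLayerNetwork configuration.
// Without the backpropType(...) call, the configuration defaults to BackpropType.Standard.
MultiLayerConfiguration conf = new NeuralNetConfiguration.Builder()
        .list()
        // ... layer(0, ...), layer(1, ...) etc. elided ...
        .backpropType(BackpropType.TruncatedBPTT)
        .tBPTTForwardLength(50)   // forward truncation length (example value)
        .tBPTTBackwardLength(50)  // backward truncation length (example value)
        .build();
```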
Once the backpropagation type for this pass is determined to be BackpropType.Standard, the following statement is executed.
//First: do a feed-forward through the network
//Note: we don't need a full forward pass all the way through the output layer here,
//but we do need to compute the input to the output layer (so that backprop can run)
List<INDArray> activations = feedForwardToLayer(layers.length - 2, true);
Next, we step into the function body:
/** Compute the activations from the input to the specified layer, using the currently set input for the network.<br>
 * To compute activations for all layers, use feedForward(...) methods<br>
 * Note: output list includes the original input. So list.get(0) is always the original input, and
 * list.get(i+1) is the activations of the ith layer.
 * @param layerNum Index of the last layer to calculate activations for. Layers are zero-indexed.
 *                 feedForwardToLayer(i,input) will return the activations for layers 0..i (inclusive)
 * @param train true for training, false for test (i.e., false if using network after training)
 * @return list of activations.
 */
public List<INDArray> feedForwardToLayer(int layerNum, boolean train) {
    INDArray currInput = input;
    List<INDArray> activations = new ArrayList<>();
    activations.add(currInput);

    for (int i = 0; i <= layerNum; i++) {
        currInput = activationFromPrevLayer(i, currInput, train);
        //applies drop connect to the activation
        activations.add(currInput);
    }
    return activations;
}
This function computes the activations of every layer up to the given index (here, everything except the output layer) and collects them into a List of INDArrays; note that the list also contains the original input at index 0. We then step into activationFromPrevLayer(i, currInput, train) to follow the network's forward-pass computation.
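The indexing convention (list.get(0) is the original input, list.get(i+1) is layer i's activation) can be reproduced with a minimal stand-alone sketch. The double[] arrays and the toy "doubling" layers below are stand-ins for INDArray and real DL4J layers, not the library's actual types:

```java
import java.util.ArrayList;
import java.util.List;

public class FeedForwardSketch {
    // Toy stand-in for activationFromPrevLayer: every layer simply doubles its input.
    static double[] activationFromPrevLayer(int layerIndex, double[] input) {
        double[] out = new double[input.length];
        for (int j = 0; j < input.length; j++) {
            out[j] = 2.0 * input[j];
        }
        return out;
    }

    // Mirrors feedForwardToLayer: the returned list starts with the original
    // input, followed by the activation of each layer 0..layerNum in order.
    static List<double[]> feedForwardToLayer(double[] input, int layerNum) {
        double[] currInput = input;
        List<double[]> activations = new ArrayList<>();
        activations.add(currInput);
        for (int i = 0; i <= layerNum; i++) {
            currInput = activationFromPrevLayer(i, currInput);
            activations.add(currInput);
        }
        return activations;
    }

    public static void main(String[] args) {
        List<double[]> acts = feedForwardToLayer(new double[] {1.0, 3.0}, 2);
        System.out.println(acts.size());    // 4: the input plus 3 layer activations
        System.out.println(acts.get(0)[0]); // 1.0: the original input is preserved at index 0
        System.out.println(acts.get(3)[0]); // 8.0: doubled three times by layers 0..2
    }
}
```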
/**
 * Calculate activation from previous layer including pre processing where necessary
 *
 * @param curr  the current layer
 * @param input the input
 * @return the activation from the previous layer
 */
public INDArray activationFromPrevLayer(int curr, INDArray input, boolean training) {
    if (getLayerWiseConfigurations().getInputPreProcess(curr) != null)
        input = getLayerWiseConfigurations().getInputPreProcess(curr).preProcess(input, getInputMiniBatchSize());

    INDArray ret = layers[curr].activate(input, training);
    return ret;
}
The output of the previous layer is used as the input to the current layer; if an input preprocessor is configured for the layer, it is applied first. The method then calls the current layer's activate() method to compute the result. The call chain looks like this:
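The "optional preprocessing, then activate" chain can be sketched with plain functions; the add-one preprocessor and the doubling layer below are made-up stand-ins, not DL4J components:

```java
import java.util.function.UnaryOperator;

public class PrevLayerActivationSketch {
    // Mirrors activationFromPrevLayer: apply the layer's input preprocessor
    // (when one is configured) before the layer's own activation.
    static double[] activationFromPrevLayer(UnaryOperator<double[]> preProcessor,
                                            UnaryOperator<double[]> layerActivate,
                                            double[] input) {
        if (preProcessor != null) {
            input = preProcessor.apply(input);
        }
        return layerActivate.apply(input);
    }

    public static void main(String[] args) {
        UnaryOperator<double[]> addOne = in -> {     // stand-in preprocessor
            double[] out = new double[in.length];
            for (int i = 0; i < in.length; i++) out[i] = in[i] + 1.0;
            return out;
        };
        UnaryOperator<double[]> doubleAll = in -> {  // stand-in layer activation
            double[] out = new double[in.length];
            for (int i = 0; i < in.length; i++) out[i] = 2.0 * in[i];
            return out;
        };
        // With the preprocessor: (1 + 1) * 2 = 4; without it: 1 * 2 = 2
        System.out.println(activationFromPrevLayer(addOne, doubleAll, new double[] {1.0})[0]);
        System.out.println(activationFromPrevLayer(null, doubleAll, new double[] {1.0})[0]);
    }
}
```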
@Override
public INDArray activate(INDArray input, boolean training) {
    setInput(input);
    return activate(training);
}

@Override
public INDArray activate(boolean training) {
    INDArray z = preOutput(training);
    //INDArray ret = Nd4j.getExecutioner().execAndReturn(Nd4j.getOpFactory().createTransform(
    //                conf.getLayer().getActivationFunction(), z, conf.getExtraArgs() ));
    INDArray ret = conf().getLayer().getActivationFn().getActivation(z, training);

    if (maskArray != null) {
        ret.muliColumnVector(maskArray);
    }

    return ret;
}
The preOutput() call here is the heart of the network's forward pass: it computes the pre-activation z (for a dense layer, essentially W·x + b), to which the activation function is then applied.
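As a rough sketch of what preOutput() and activate() compute for a single dense layer, using plain arrays instead of INDArray (the weight values and the choice of sigmoid are illustrative assumptions, not DL4J internals):

```java
public class DenseActivateSketch {
    // preOutput: z = W·x + b for one dense layer.
    static double[] preOutput(double[][] W, double[] b, double[] x) {
        double[] z = new double[W.length];
        for (int i = 0; i < W.length; i++) {
            double sum = b[i];
            for (int j = 0; j < x.length; j++) {
                sum += W[i][j] * x[j];
            }
            z[i] = sum;
        }
        return z;
    }

    // activate: apply the activation function (sigmoid here) element-wise to z,
    // mirroring getActivationFn().getActivation(z, training).
    static double[] activate(double[][] W, double[] b, double[] x) {
        double[] z = preOutput(W, b, x);
        double[] out = new double[z.length];
        for (int i = 0; i < z.length; i++) {
            out[i] = 1.0 / (1.0 + Math.exp(-z[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] W = {{1.0, -1.0}};
        double[] b = {0.0};
        double[] x = {2.0, 2.0};
        // z = 1*2 + (-1)*2 + 0 = 0, and sigmoid(0) = 0.5
        System.out.println(activate(W, b, x)[0]);
    }
}
```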