【GCN】Implementing GCN with TensorFlow 2.0 (Keras) Model Subclassing

Much of the data we work with is actually a graph. Graphs are everywhere in everyday life: social networks, knowledge graphs, protein structures, and so on.

The Concept of a Graph

A graph is customarily written as $G = (V, E)$, where $V$ is the set of nodes and $E$ the set of edges; we denote the number of nodes by $N$. A graph $G$ has three particularly important matrices:

  • Adjacency matrix $A$: encodes the connections between nodes; here we assume it is a 0-1 matrix;
  • Degree matrix $D$: the degree of a node is the number of nodes it connects to; $D$ is a diagonal matrix with $D_{ii} = \sum_j A_{ij}$;
  • Feature matrix $X$: holds the node features, $X \in \mathbb{R}^{N \times F}$, where $F$ is the feature dimension.

The mathematical notation is rather abstract, so here is a concrete example:

(Figure: an example graph and its adjacency matrix)
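
Since the original figures are images, here is an equivalent toy example of our own: a 3-node path graph, its adjacency matrix, and the resulting degree matrix.

$$
A = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix},
\qquad
D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}
$$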

Graph Convolution

We can add a self-connection to every node in the graph; in implementation this simply means modifying the adjacency matrix:

$$\tilde{A} = A + I_N$$
We can then normalize the adjacency matrix so that every row of $\tilde{A}$ sums to 1; in implementation we multiply by the inverse of the degree matrix: $\tilde{D}^{-1}\tilde{A}$, where the degree matrix is recomputed after updating $A$. This gives:

$$H^{(k+1)} = f(H^{(k)}, A) = \sigma(\tilde{D}^{-1}\tilde{A}H^{(k)}W^{(k)})$$
Compared with the plain summation rule, this aggregation is really an average over the neighborhood node features, since:

$$
\begin{aligned}
(\tilde{D}^{-1}\tilde{A}H)_i &= (\tilde{D}^{-1}\tilde{A})_i H \\
&= \Big(\sum_k \tilde{D}^{-1}_{ik}\tilde{A}_i\Big) H \\
&= (\tilde{D}^{-1}_{ii}\tilde{A}_i) H \\
&= \tilde{D}^{-1}_{ii}\sum_j \tilde{A}_{ij}H_j \\
&= \sum_j \frac{1}{\tilde{D}_{ii}}\tilde{A}_{ij}H_j
\end{aligned}
$$

Since $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$, this aggregation simply averages the features of the neighborhood nodes. The averaging acts as a normalization and avoids the scale problems caused by plain summation.
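
As a quick sanity check, here is a small NumPy sketch (a toy example of our own) of the row-normalized propagation matrix $\tilde{D}^{-1}\tilde{A}$; every row of the result sums to 1:

import numpy as np

# Toy 3-node path graph
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])

A_tilde = A + np.eye(3)                  # add self-loops: A~ = A + I_N
D_tilde = np.diag(A_tilde.sum(axis=1))   # degree matrix recomputed from A~
P = np.linalg.inv(D_tilde) @ A_tilde     # D~^{-1} A~
print(P.sum(axis=1))                     # -> [1. 1. 1.]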

Going one step further, we can use symmetric normalization for the aggregation; this is exactly the graph convolution proposed in the paper [1]:

$$H^{(k+1)} = f(H^{(k)}, A) = \sigma(\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}H^{(k)}W^{(k)})$$
This new aggregation is no longer a simple average over neighbor features, since:

$$
\begin{aligned}
(\tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}H)_i &= (\tilde{D}^{-\frac{1}{2}}\tilde{A})_i\,\tilde{D}^{-\frac{1}{2}}H \\
&= \Big(\sum_k \tilde{D}^{-\frac{1}{2}}_{ik}\tilde{A}_i\Big)\tilde{D}^{-\frac{1}{2}}H \\
&= \tilde{D}^{-\frac{1}{2}}_{ii}\sum_j \tilde{A}_{ij}\sum_k \tilde{D}^{-\frac{1}{2}}_{jk}H_j \\
&= \tilde{D}^{-\frac{1}{2}}_{ii}\sum_j \tilde{A}_{ij}\tilde{D}^{-\frac{1}{2}}_{jj}H_j \\
&= \sum_j \frac{1}{\sqrt{\tilde{D}_{ii}\tilde{D}_{jj}}}\tilde{A}_{ij}H_j
\end{aligned}
$$
This aggregation considers not only the degree of node $i$ but also the degree of each neighbor $j$: when a neighbor $j$ has a large degree, its feature contribution is suppressed accordingly.
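
Continuing the NumPy sketch above, the symmetric renormalization can be written as a small helper (the function name normalize_adj is our own; dense for clarity, whereas a real implementation would use sparse matrices):

def normalize_adj(A):
    """Return D~^{-1/2} A~ D~^{-1/2}, where A~ = A + I_N."""
    A_tilde = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = np.power(A_tilde.sum(axis=1), -0.5)  # D~^{-1/2} diagonal
    D_inv_sqrt = np.diag(d_inv_sqrt)
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt          # symmetric normalization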

This graph convolution is in fact a first-order approximation of spectral graph convolution based on Fourier theory. The full mathematical derivation is fairly involved, so we do not expand on it here; see the paper for details.

Implementing GCN in TensorFlow 2.0

In general, simple applications can be written directly with the functional API, while complex network definitions and training loops are better expressed through class inheritance, which gives cleaner code logic and better encapsulation. The program below is adapted from a GitHub repo to better match my own model-subclassing habits; the core parts are described briefly.

Custom layers

We define two layers, SparseDrop and GraphConv, used respectively for dropout on sparse inputs and for the graph convolution itself (a short usage sketch follows each class definition):

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, optimizers
from tensorflow.keras.layers import Activation, Dropout

# Helpers from the original repo's utils: sparse_dropout, dot,
# masked_softmax_cross_entropy, masked_accuracy, and the `args` config object.

class SparseDrop(layers.Layer):
    """
    Sparse dropout layer: applies dropout to dense or sparse inputs.
    """

    def __init__(self, num_features_nonzero,
                 dropout=0.,
                 is_sparse_inputs=False, **kwargs):
        super(SparseDrop, self).__init__(**kwargs)

        self.dropout = dropout                            # dropout rate
        self.is_sparse_inputs = is_sparse_inputs          # inputs are tf.SparseTensor
        self.num_features_nonzero = num_features_nonzero  # needed by sparse_dropout

    def call(self, inputs, training=None):
        x = inputs

        # Apply dropout unless training is explicitly False
        if training is not False and self.is_sparse_inputs:
            x = sparse_dropout(x, self.dropout, self.num_features_nonzero)
        elif training is not False:
            x = tf.nn.dropout(x, self.dropout)

        return x
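
A quick dense-input check (a toy example of our own; the sparse path additionally needs the repo's sparse_dropout helper):

# Dropout is applied unless training is explicitly False
drop = SparseDrop(num_features_nonzero=None, dropout=0.5)
x = tf.ones([4, 3])
print(drop(x, training=True))   # roughly half the entries zeroed (rest rescaled)
print(drop(x, training=False))  # returned unchanged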

class GraphConv(layers.Layer):
    """
    Graph convolution layer.
    """
    def __init__(self, input_dim, output_dim,
                 is_sparse_inputs=False,
                 bias=False,
                 featureless=False, **kwargs):
        super(GraphConv, self).__init__(**kwargs)

        self.is_sparse_inputs = is_sparse_inputs
        self.featureless = featureless
        self.use_bias = bias

        # One weight matrix per support matrix (a single support here)
        self.weights_ = []
        for i in range(1):
            w = self.add_weight('weight' + str(i), [input_dim, output_dim])
            self.weights_.append(w)
        if self.use_bias:
            self.bias = self.add_weight('bias', [output_dim])

    def call(self, inputs, training=None):
        x, support_ = inputs  # node features and a list of (sparse) support matrices

        # convolve: support @ (x @ W) for each support matrix
        supports = list()
        for i in range(len(support_)):
            if not self.featureless:  # if node features x are provided
                pre_sup = dot(x, self.weights_[i], sparse=self.is_sparse_inputs)
            else:
                pre_sup = self.weights_[i]

            support = dot(support_[i], pre_sup, sparse=True)
            supports.append(support)

        output = tf.add_n(supports)

        # bias
        if self.use_bias:
            output += self.bias

        return output
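
A minimal shape check on dense dummy data (our own example; it assumes the repo's dot helper dispatches to tf.sparse.sparse_dense_matmul when sparse=True):

# Toy usage: 4 nodes, 8 input features, identity support matrix
x = tf.random.normal([4, 8])
support = [tf.sparse.from_dense(tf.eye(4))]
layer = GraphConv(input_dim=8, output_dim=2)
print(layer((x, support)).shape)  # -> (4, 2)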

Custom GCN model

class GCN(keras.Model):

    def __init__(self, input_dim, output_dim, num_features_nonzero, **kwargs):
        super(GCN, self).__init__(**kwargs)

        self.input_dim = input_dim  # e.g. 1433 for Cora
        self.output_dim = output_dim

        print('input dim:', input_dim)
        print('output dim:', output_dim)
        print('num_features_nonzero:', num_features_nonzero)

        # First block: sparse dropout -> graph convolution -> ReLU
        self.d1 = SparseDrop(num_features_nonzero=num_features_nonzero,
                             dropout=args.dropout,
                             is_sparse_inputs=True)
        self.c1 = GraphConv(input_dim=self.input_dim,  # 1433
                            output_dim=args.hidden1,  # 16
                            is_sparse_inputs=True)
        self.a1 = Activation(activation='relu')

        # Second block: dropout -> graph convolution -> softmax
        self.d2 = Dropout(args.dropout)
        self.c2 = GraphConv(input_dim=args.hidden1,  # 16
                            output_dim=self.output_dim)  # 7
        self.a2 = Activation(activation='softmax')

        for p in self.trainable_variables:
            print(p.name, p.shape, type(p))

    def call(self, inputs, training=None):
        """
        Forward pass. `inputs` packs features, labels, mask and supports
        so that loss and accuracy can be computed inside the model.
        """
        x, label, mask, support = inputs

        # `training` must be propagated so that dropout is disabled at
        # validation/test time
        h = self.d1(x, training=training)
        h = self.c1((h, support), training=training)
        h = self.a1(h)

        h = self.d2(h, training=training)
        h = self.c2((h, support), training=training)
        outputs = self.a2(h)  # softmax over classes (a2, not a1)

        self.outputs = outputs
        self.loss, self.acc = self.evaluate((label, mask), outputs)

        return outputs

    def evaluate(self, inputs, outputs):
        # Note: this shadows keras.Model.evaluate, which is fine for a
        # custom training loop
        label, mask = inputs

        # Weight decay loss (applied to the first layer only, as in the paper)
        loss = tf.zeros([])
        for var in self.c1.trainable_variables:
            loss += args.weight_decay * tf.nn.l2_loss(var)

        # Cross entropy error
        loss += masked_softmax_cross_entropy(outputs, label, mask)
        acc = masked_accuracy(outputs, label, mask)

        return loss, acc

    def predict(self, inputs, training=None):
        # call() has already applied the softmax and cached the result
        return self.outputs

Training loop

# Create model
model = GCN(input_dim=features[2][1],                # e.g. 1433 for Cora
            output_dim=y_train.shape[1],             # e.g. 7 classes
            num_features_nonzero=features[1].shape)  # number of non-zero feature entries

optimizer = optimizers.Adam(learning_rate=1e-2)
# A custom training loop is used below instead of model.compile()/fit()

for epoch in range(args.epochs):

    # Training step
    with tf.GradientTape() as tape:
        outputs = model((features, train_label, train_mask, support))
        loss, acc = model.evaluate((train_label, train_mask), outputs)

    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

    # Validation step
    outputs = model((features, val_label, val_mask, support), training=False)
    val_loss, val_acc = model.evaluate((val_label, val_mask), outputs)

    if epoch % 20 == 0:
        print('Epoch %-3d' % epoch,
              'loss=%-6.4f' % float(loss),
              'acc=%-6.4f' % float(acc),
              'val_loss=%-6.4f' % float(val_loss),
              'val_acc=%-6.4f' % float(val_acc))


# Test step
outputs = model((features, test_label, test_mask, support), training=False)
test_loss, test_acc = model.evaluate((test_label, test_mask), outputs)
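
Finally, it is worth printing the test metrics; a one-line addition of our own:

print('test loss=%.4f' % float(test_loss), 'test acc=%.4f' % float(test_acc))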

In the end, "model subclassing" just means writing a class of your own that inherits from the Model class. Such a subclass implements the following methods (in practice only the first two are required; compute_output_shape() is optional):

__init__()
call()
compute_output_shape()

A custom layer likewise implements three methods:

build(input_shape): create the layer's weights once the input shape is known
call(x): define the forward computation
compute_output_shape(input_shape): describe how the layer transforms the input shape
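
As a minimal illustration of the layer pattern (a sketch of our own, not from the repo):

class Scale(layers.Layer):
    """Toy custom layer: multiplies inputs by a learned scalar."""

    def build(self, input_shape):
        # create the weight lazily, once the input shape is known
        self.alpha = self.add_weight('alpha', shape=[], initializer='ones')

    def call(self, x):
        return self.alpha * x

    def compute_output_shape(self, input_shape):
        return input_shape  # shape is unchanged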

In short, use the right API for the job. Model subclassing is more flexible, but the price is higher complexity and more room for user error. Prefer the functional API whenever possible.
