GCN (1stChebNet) node classification code

Latest recommended article published on 2024-05-20 16:02:21

monster.YC



Tags: deep learning, gcn, tensorflow

Article link: https://blog.csdn.net/weixin_43450885/article/details/106412229


0. GCN (1stChebNet) node classification

This code was released by the authors of the paper "Semi-Supervised Classification with Graph Convolutional Networks"; the repository is at https://github.com/tkipf/gcn.

1. Data

"""
    ind.dataset_str.x => the feature vectors of the training instances as scipy.sparse.csr.csr_matrix object;
    ind.dataset_str.tx => the feature vectors of the test instances as scipy.sparse.csr.csr_matrix object;
    ind.dataset_str.allx => the feature vectors of both labeled and unlabeled training instances
        (a superset of ind.dataset_str.x) as scipy.sparse.csr.csr_matrix object;
    ind.dataset_str.y => the one-hot labels of the labeled training instances as numpy.ndarray object;
    ind.dataset_str.ty => the one-hot labels of the test instances as numpy.ndarray object;
    ind.dataset_str.ally => the labels for instances in ind.dataset_str.allx as numpy.ndarray object;
    ind.dataset_str.graph => a dict in the format {index: [index_of_neighbor_nodes]} as collections.defaultdict
        object;
    ind.dataset_str.test.index => the indices of test instances in graph, for the inductive setting as list object.
    """

2. Data loading:

In \gcn-master\gcn\utils.py, def load_data(dataset_str) loads the data:
With the Cora dataset, the pickle module deserializes the files into the variables x, y, tx, ty, allx, ally, graph:

"""
    ind.cora.x => x.shape:(140,1433)
    ind.cora.tx => tx.shape:(1000,1433)
    ind.cora.allx => allx.shape:(1708,1433)
    ind.cora.y => y.shape:(140,7)
    ind.cora.ty => ty.shape:(1000,7)
    ind.cora.ally => ally.shape:(1708,7)
    ind.cora.graph => graph:{key: list}, keys run from 0 to 2707 and are the node ids; each list holds the ids of the nodes connected to that node
    ind.cora.test.index => test_idx_reorder.shape:(1000,), the ids of the test nodes
    """

Concatenate allx with tx and ally with ty to obtain the features and labels of every node in the Cora dataset:

features = sp.vstack((allx, tx)).tolil()  # features.shape: (2708, 1433), node features
labels = np.vstack((ally, ty))  # labels.shape: (2708, 7), node labels
adj = nx.adjacency_matrix(nx.from_dict_of_lists(graph))  # adj.shape: (2708, 2708), adjacency matrix
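
The concatenation and the adjacency construction can be tried on a toy graph. The following is only a sketch: it uses dense NumPy arrays in place of the scipy.sparse and networkx calls above, the helper adjacency_from_dict_of_lists is a hypothetical name, and it assumes the node ids in graph are exactly 0..N-1.

```python
import numpy as np

def adjacency_from_dict_of_lists(graph):
    """Dense, undirected adjacency matrix from {node: [neighbors]}.

    Hypothetical stand-in for nx.adjacency_matrix(nx.from_dict_of_lists(graph));
    assumes the keys are the integers 0..N-1.
    """
    n = len(graph)
    adj = np.zeros((n, n), dtype=np.int64)
    for node, neighbors in graph.items():
        for neighbor in neighbors:
            adj[node, neighbor] = 1
            adj[neighbor, node] = 1  # treat every edge as undirected
    return adj

# Toy data: 2 "allx" nodes and 1 "tx" node, 4 features, 2 classes.
allx = np.array([[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0]])
tx = np.array([[0.0, 0.0, 1.0, 1.0]])
ally = np.array([[1, 0], [0, 1]])
ty = np.array([[1, 0]])
graph = {0: [1], 1: [0, 2], 2: [1]}

features = np.vstack((allx, tx))  # shape (3, 4)
labels = np.vstack((ally, ty))    # shape (3, 2)
adj = adjacency_from_dict_of_lists(graph)  # shape (3, 3)
```

Row i of features, row i of labels, and row i of adj all describe the same node, which is why the stacking order (allx first, then tx) matters.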

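Generate the train/validation/test masks and split the labels accordingly:
Nodes [0,140) form the training set, nodes [140,640) the validation set, and nodes [1708,2707] the test set.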

idx_test = test_idx_range.tolist()
idx_train = range(len(y))
idx_val = range(len(y), len(y)+500)

train_mask = sample_mask(idx_train, labels.shape[0])
val_mask = sample_mask(idx_val, labels.shape[0])
test_mask = sample_mask(idx_test, labels.shape[0])

y_train = np.zeros(labels.shape)
y_val = np.zeros(labels.shape)
y_test = np.zeros(labels.shape)
# label rows of nodes outside the train/validation/test set stay zero
y_train[train_mask, :] = labels[train_mask, :]
y_val[val_mask, :] = labels[val_mask, :]
y_test[test_mask, :] = labels[test_mask, :]
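
The masking step above can be sketched on a small example. The sample_mask below is a minimal reimplementation written for this illustration (a boolean vector that is True at the given indices); the node counts and index ranges are made up and are not the Cora split.

```python
import numpy as np

def sample_mask(idx, length):
    """Boolean mask of size `length` that is True at the positions in `idx`."""
    mask = np.zeros(length, dtype=bool)
    mask[list(idx)] = True
    return mask

# Illustrative sizes only: 10 nodes, 3 classes.
num_nodes, num_classes = 10, 3
labels = np.eye(num_classes)[np.arange(num_nodes) % num_classes]  # one-hot rows

train_mask = sample_mask(range(0, 2), num_nodes)
val_mask = sample_mask(range(2, 5), num_nodes)
test_mask = sample_mask([7, 8, 9], num_nodes)

# Keep labels only for the nodes selected by each mask; other rows stay zero.
y_train = np.zeros(labels.shape)
y_train[train_mask, :] = labels[train_mask, :]
y_val = np.zeros(labels.shape)
y_val[val_mask, :] = labels[val_mask, :]
y_test = np.zeros(labels.shape)
y_test[test_mask, :] = labels[test_mask, :]
```

Every y_* array keeps the full (num_nodes, num_classes) shape, so the model can always be run on the whole graph while the mask decides which nodes contribute to the loss.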

3. Normalizing the adjacency matrix

In \gcn-master\gcn\utils.py, preprocess_adj(adj) normalizes the adjacency matrix $adj$:
This corresponds to the $\tilde D^{-\frac{1}{2}}\tilde A\tilde D^{-\frac{1}{2}}$ term in the paper, where $\tilde A = A+I_{N}$ and $\tilde D_{ii}=\sum_{j}\tilde A_{ij}$.

def normalize_adj(adj):
    """Symmetrically normalize adjacency matrix."""
    adj = sp.coo_matrix(adj)
    rowsum = np.array(adj.sum(1))
    d_inv_sqrt = np.power(rowsum, -0.5).flatten()
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.
    d_mat_inv_sqrt = sp.diags(d_inv_sqrt)
    return adj.dot(d_mat_inv_sqrt).transpose().dot(d_mat_inv_sqrt).tocoo()

def preprocess_adj(adj):
    """Preprocessing of adjacency matrix for simple GCN model and conversion to tuple representation."""
    adj_normalized = normalize_adj(adj + sp.eye(adj.shape[0]))
    return sparse_to_tuple(adj_normalized)
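
As a worked example of the normalization, the sketch below applies the same steps to a three-node path graph 0 - 1 - 2. It uses dense NumPy arrays instead of scipy.sparse, and normalize_adj_dense is a hypothetical name introduced only for this illustration.

```python
import numpy as np

def normalize_adj_dense(adj):
    """Dense version of D^{-1/2} A D^{-1/2} (illustrative, not the sparse original)."""
    rowsum = adj.sum(axis=1)
    with np.errstate(divide='ignore'):
        d_inv_sqrt = np.power(rowsum, -0.5)
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.0  # isolated nodes: avoid inf
    d_mat_inv_sqrt = np.diag(d_inv_sqrt)
    return d_mat_inv_sqrt @ adj @ d_mat_inv_sqrt

# Path graph 0 - 1 - 2
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])

# Add self-loops first (A~ = A + I), then normalize.
adj_tilde = adj + np.eye(3)
adj_hat = normalize_adj_dense(adj_tilde)
```

With self-loops the degrees are 2, 3, 2, so for instance the entry for edge (0, 1) becomes $1/\sqrt{2 \cdot 3}$ and the diagonal entry of node 1 becomes $1/3$.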

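4. 1stChebNet (graph convolution layer definition)

The graph convolution layer is defined in the GraphConvolution(Layer) class in \gcn-master\gcn\layers.py. When GraphConvolution is initialized, it creates the layer's learnable weights self.vars['weights_' + str(i)] and bias self.vars['bias']. The weights correspond to the parameter $\Theta \in \mathbb{R}^{C \times F}$ in the paper.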

with tf.variable_scope(self.name + '_vars'):
    for i in range(len(self.support)):
        self.vars['weights_' + str(i)] = glorot([input_dim, output_dim],
                                                name='weights_' + str(i))
    if self.bias:
        self.vars['bias'] = zeros([output_dim], name='bias')
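
To give a feel for what the two initializers produce, here is a NumPy sketch. It assumes that glorot follows the uniform scheme of Glorot and Bengio (values drawn from $[-r, r]$ with $r=\sqrt{6/(\text{input\_dim}+\text{output\_dim})}$) and that zeros returns an all-zero vector; glorot_uniform is a hypothetical name, and the output size 16 is chosen only for illustration.

```python
import numpy as np

def glorot_uniform(shape, rng):
    """Uniform Glorot initialization (assumed behavior of the glorot helper)."""
    init_range = np.sqrt(6.0 / (shape[0] + shape[1]))
    return rng.uniform(-init_range, init_range, size=shape)

rng = np.random.default_rng(0)
input_dim, output_dim = 1433, 16  # 1433 = Cora feature size; 16 is illustrative

weights = glorot_uniform((input_dim, output_dim), rng)  # plays the role of Theta
bias = np.zeros(output_dim)                             # all-zero bias vector
```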

Calling a GraphConvolution instance invokes def __call__(self, inputs), which the subclass GraphConvolution inherits from its parent class Layer. __call__ in turn calls def _call(self, inputs); because GraphConvolution overrides _call, execution lands in the subclass's own def _call(self, inputs).
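
This dispatch can be reproduced in a few lines of plain Python. The classes below are stripped-down stand-ins that only share their names with the originals; the string results exist purely to show which _call ran.

```python
class Layer(object):
    """Simplified parent class: __call__ delegates to _call."""

    def _call(self, inputs):
        # Default behavior: pass the inputs through unchanged.
        return inputs

    def __call__(self, inputs):
        # self._call resolves to the subclass override if one exists.
        return self._call(inputs)


class GraphConvolution(Layer):
    """Simplified subclass: overrides only _call, inherits __call__."""

    def _call(self, inputs):
        return "convolved({})".format(inputs)


layer = GraphConvolution()
result = layer("x")         # inherited __call__ -> overridden _call
base_result = Layer()("x")  # parent __call__ -> parent _call
```

Here result is the string built by the subclass, while base_result is the unchanged input, because method lookup on self always starts at the instance's own class.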
In def _call(self, inputs), the part commented "# convolve" implements the graph convolution formula from the paper:
$Z=(\tilde D^{-\frac{1}{2}}\tilde A\tilde D^{-\frac{1}{2}})X\Theta$

Here pre_sup = $X\Theta$, self.support[i] = $\tilde D^{-\frac{1}{2}}\tilde A\tilde D^{-\frac{1}{2}}$, and support = $(\tilde D^{-\frac{1}{2}}\tilde A\tilde D^{-\frac{1}{2}})X\Theta$.

# convolve
supports = list()
for i in range(len(self.support)):
    if not self.featureless:
        pre_sup = dot(x, self.vars['weights_' + str(i)],
                      sparse=self.sparse_inputs)
    else:
        pre_sup = self.vars['weights_' + str(i)]
    support = dot(self.support[i], pre_sup, sparse=True)
    supports.append(support)
output = tf.add_n(supports)
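
The same loop can be written with dense NumPy arrays to check the arithmetic by hand. This is a sketch only: graph_convolution is a hypothetical function name, the sparse and featureless branches are left out, and the two-node graph is made up for the example.

```python
import numpy as np

def graph_convolution(x, supports, weights):
    """Dense sketch of the '# convolve' loop: sum_i support_i @ (x @ weight_i)."""
    outputs = []
    for support_i, weight_i in zip(supports, weights):
        pre_sup = x @ weight_i               # X Theta
        outputs.append(support_i @ pre_sup)  # A_hat (X Theta)
    return np.sum(outputs, axis=0)           # counterpart of tf.add_n

# Two connected nodes: A + I is all ones, both degrees are 2,
# so every entry of the normalized matrix is 1/2.
a_hat = np.array([[0.5, 0.5],
                  [0.5, 0.5]])
x = np.array([[1.0, 2.0, 3.0],
              [3.0, 2.0, 1.0]])   # N=2 nodes, C=3 input features
theta = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])    # C=3 -> F=2 output features

z = graph_convolution(x, [a_hat], [theta])
```

By hand, $X\Theta$ has rows (4, 5) and (4, 3); averaging them with a_hat gives (4, 4) for both nodes, which is what z contains. With a single support matrix, as in the GCN model, the sum has only one term.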

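5. Graph convolution model

The GCN(Model) class in \gcn-master\gcn\models.py defines the network used in the experiments. The network structure is stored in the attribute self.layers, a list that is filled in the method def _build(self). The structure consists of two graph convolution layers. Note the act argument: from GraphConvolution in section 4 we know that act is applied to the result of the convolution. The output of the first layer passes through relu() (act=tf.nn.relu), while the output of the second layer is returned unchanged (act=lambda x: x).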

def _build(self):
    self.layers.append(GraphConvolution(input_dim=self.input_dim,
                                        output_dim=FLAGS.hidden1,
                                        placeholders=self.placeholders,
                                        act=tf.nn.relu,
                                        dropout=True,
                                        sparse_inputs=True,
                                        logging=self.logging))

    self.layers.append(GraphConvolution(input_dim=FLAGS.hidden1,
                                        output_dim=self.output_dim,
                                        placeholders=self.placeholders,
                                        act=lambda x: x,
                                        dropout=True,
                                        logging=self.logging))

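6. Forward and backward propagation

\gcn-master\gcn\models.py defines the Model(object) class, the parent class of GCN(Model). Initializing the subclass GCN(Model) calls def build(self) from the parent class Model(object), which constructs the computation for the forward pass and the weight update.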

    def build(self):
        """ Wrapper for _build() """
        # Create the sequence of two graph convolution layers, similar to a Sequential model in keras.
        with tf.variable_scope(self.name):
            self._build()

        # Feed the input through the two graph convolution layers to get self.outputs.
        self.activations.append(self.inputs)
        for layer in self.layers:
            hidden = layer(self.activations[-1])
            self.activations.append(hidden)
        self.outputs = self.activations[-1]

        # Store model variables for easy access
        variables = tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES, scope=self.name)
        self.vars = {var.name: var for var in variables}

        # Compute loss and accuracy
        self._loss()
        self._accuracy()
        # Optimize the weights
        self.opt_op = self.optimizer.minimize(self.loss)

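In the paper's notation, the structure is:
$\text{self.outputs}=f(\text{self.inputs},A)=\hat A\,\mathrm{ReLU}(\hat A\,(\text{self.inputs})\,W^{(0)})\,W^{(1)}$, where $\hat A=\tilde D^{-\frac{1}{2}}\tilde A\tilde D^{-\frac{1}{2}}$.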
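
The two-layer formula can be checked numerically with a short NumPy sketch. It is a simplification under stated assumptions: dropout and bias are omitted, the arrays are dense, gcn_forward is a hypothetical name, and the sizes and random weights are illustrative only.

```python
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def gcn_forward(a_hat, x, w0, w1):
    """A_hat ReLU(A_hat X W0) W1, returning unnormalized scores (no softmax)."""
    hidden = relu(a_hat @ x @ w0)  # first layer, act = relu
    return a_hat @ hidden @ w1     # second layer, act = identity

# Normalized adjacency (with self-loops) of the path graph 0 - 1 - 2.
s = 1.0 / np.sqrt(6.0)
a_hat = np.array([[0.5, s, 0.0],
                  [s, 1.0 / 3.0, s],
                  [0.0, s, 0.5]])

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))   # 3 nodes, 4 input features
w0 = rng.normal(size=(4, 5))  # input -> hidden
w1 = rng.normal(size=(5, 2))  # hidden -> classes

outputs = gcn_forward(a_hat, x, w0, w1)  # shape (3, 2): one score row per node
```

Because a_hat appears in both layers, the output row of a node depends on features up to two hops away.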

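No softmax is applied to self.outputs at this point; the softmax is applied when the cross-entropy loss is computed. The loss first applies $\mathrm{softmax}(\cdot)$ to self.outputs and then computes the cross-entropy, similar to how cross-entropy is computed in pytorch, which is why the network definition does not include $\mathrm{softmax}(\cdot)$. The cross-entropy loss is defined in the function masked_softmax_cross_entropy(preds, labels, mask) in \gcn-master\gcn\metrics.py.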

def masked_softmax_cross_entropy(preds, labels, mask):
    """Softmax cross-entropy loss with masking."""
    loss = tf.nn.softmax_cross_entropy_with_logits(logits=preds, labels=labels)
    mask = tf.cast(mask, dtype=tf.float32)
    mask /= tf.reduce_mean(mask)
    loss *= mask
    return tf.reduce_mean(loss)
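
The effect of dividing the mask by its mean is easiest to see numerically. The NumPy sketch below mirrors the TensorFlow function step by step; softmax_cross_entropy is a hand-written stand-in for tf.nn.softmax_cross_entropy_with_logits, and the logits, labels, and mask are made-up values.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    """Per-row softmax cross-entropy computed from raw logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -(labels * log_probs).sum(axis=1)

def masked_softmax_cross_entropy(logits, labels, mask):
    """NumPy mirror of the masked loss above."""
    loss = softmax_cross_entropy(logits, labels)
    mask = mask.astype(np.float64)
    mask = mask / mask.mean()  # rescale so the final mean is over masked nodes
    return (loss * mask).mean()

logits = np.array([[2.0, 0.5, -1.0],
                   [0.1, 0.2, 0.3],
                   [-1.0, 3.0, 0.0],
                   [0.0, 0.0, 0.0]])
labels = np.eye(3)[[0, 2, 1, 0]]             # one-hot labels for 4 nodes
mask = np.array([True, False, True, False])  # only nodes 0 and 2 count

masked_loss = masked_softmax_cross_entropy(logits, labels, mask)
direct_mean = softmax_cross_entropy(logits, labels)[mask].mean()
```

With N nodes of which M are masked in, mask.mean() equals M/N, so the final mean over all N nodes reduces to the average loss over the M selected nodes; masked_loss and direct_mean therefore agree.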