GCN
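The GCN layer formula is $H^{(l+1)} = \sigma\left(\widetilde{D}^{-1/2} \widetilde{A} \widetilde{D}^{-1/2} H^{(l)} W^{(l)}\right)$. If we remove $\widetilde{D}^{-1/2} \widetilde{A} \widetilde{D}^{-1/2}$ from it, what remains is exactly the formula of a fully connected layer.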
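Fully connected layer:
In a fully connected layer, every unit of one layer is connected to every unit of the next. In the figure below, each unit of the green Hidden layer is connected to all units of the Input layer, and in the same way each unit of the Output layer is connected to all units of the Hidden layer.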
$xW = Y$
$(1 \times 3) \cdot (3 \times 4) = (1 \times 4)$
$(1 \times 4) \cdot (4 \times 2) = (1 \times 2)$
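The shape bookkeeping above can be checked with a minimal NumPy sketch. The layer sizes 3, 4 and 2 come from the equations; the random values and the names `W1` and `W2` are only placeholders for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

x = rng.normal(size=(1, 3))    # one sample with 3 input features
W1 = rng.normal(size=(3, 4))   # weights Input (3 units) -> Hidden (4 units)
W2 = rng.normal(size=(4, 2))   # weights Hidden (4 units) -> Output (2 units)

hidden = x @ W1                # (1x3)(3x4) gives (1x4)
output = hidden @ W2           # (1x4)(4x2) gives (1x2)

print(hidden.shape, output.shape)
```

Every entry of `W1` links one input unit to one hidden unit, which is what "fully connected" means here.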
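In the figure below, $A$ is the adjacency matrix, $I_N$ is the identity matrix, and $\widetilde{A} = A + I_N$ is their sum (that is, the adjacency matrix with self-connections added).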
$\widetilde{D}_{ii} = \sum_j \widetilde{A}_{ij}$
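$\widetilde{D}$ is obtained by summing $\widetilde{A}$ along each row and placing the sums on the diagonal (that is, it is the degree matrix of the graph with self-connections added).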
A worked example is shown in the figure below:
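The same computation can be sketched in NumPy. The 3-node graph used here is a hypothetical example (node 1 linked to nodes 2 and 3), not necessarily the graph in the figure.

```python
import numpy as np

# Hypothetical undirected graph: edges 1-2 and 1-3 (rows/columns are nodes 1..3).
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)

I_N = np.eye(A.shape[0])                  # identity matrix
A_tilde = A + I_N                         # adjacency with self-connections
D_tilde = np.diag(A_tilde.sum(axis=1))    # D_ii = sum_j A_tilde_ij

print(A_tilde)
print(np.diag(D_tilde))                   # degrees including the self-loop: 3, 2, 2
```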
So what does this formula actually mean?
Since $\widetilde{D}^{-1/2}$ only performs normalization, we can set it aside for now.
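It is easy to see that multiplying the adjacency matrix by the feature matrix aggregates the features of each node's neighbors. For example, node 1 aggregates the features of nodes 2 and 3, the nodes it is connected to.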
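Comparing multiplication by $A$ with multiplication by $\widetilde{A}$ below, we can conclude that once self-connections are added, each node aggregates not only its neighbors' features but also its own.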
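A small numeric comparison makes the difference concrete. The graph and the feature values are made up for illustration; `X` is an assumed name for the node feature matrix.

```python
import numpy as np

# Hypothetical graph: node 1 is linked to nodes 2 and 3.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
A_tilde = A + np.eye(3)

# One row of features per node (illustrative values).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

without_self = A @ X        # row 1 = X[2] + X[3]          -> [5, 50]
with_self = A_tilde @ X     # row 1 = X[1] + X[2] + X[3]   -> [6, 60]

print(without_self)
print(with_self)
```

With $A$ alone, node 1's own features are lost after aggregation; with $\widetilde{A}$ they are part of the sum.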
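As the figure below shows, a graph goes into the GCN and a graph comes out. The structure of the graph is unchanged, but the node features go from the original $C$ dimensions to $F$ dimensions.
For a ten-class classification task, $F$ should be 10, and a softmax normalization is then applied.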
$\hat{A}$ is the normalized version of $\widetilde{A}$, that is, $\hat{A} = \widetilde{D}^{-1/2} \widetilde{A} \widetilde{D}^{-1/2}$.
The adjacency matrix $A$ selects which nodes are combined, while $W$ selects which features of those nodes are used.
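Putting the pieces together, one GCN layer followed by softmax can be sketched as below. This is a minimal NumPy illustration under assumed sizes (3 nodes, $C = 4$, $F = 2$) with random features and weights; it is not the pygcn implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical graph: node 1 is linked to nodes 2 and 3.
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
N, C, F = 3, 4, 2                     # nodes, input dims, output dims (classes)
X = rng.normal(size=(N, C))           # node features (random placeholders)
W = rng.normal(size=(C, F))           # layer weights (random here, learned in practice)

A_tilde = A + np.eye(N)               # add self-connections
d = A_tilde.sum(axis=1)               # diagonal of D_tilde
d_inv_sqrt = d ** -0.5
# A_hat = D_tilde^{-1/2} A_tilde D_tilde^{-1/2}
A_hat = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

Z = A_hat @ X @ W                     # A_hat picks the nodes, W mixes the features

# Row-wise softmax turns the F scores of each node into class probabilities.
Z_shifted = Z - Z.max(axis=1, keepdims=True)
probs = np.exp(Z_shifted) / np.exp(Z_shifted).sum(axis=1, keepdims=True)

print(Z.shape)             # the graph keeps its N nodes, features go from C to F
print(probs.sum(axis=1))   # each row sums to 1
```

Note that the `normalize` helper in the code listing further down row-normalizes the matrix (dividing each row by its sum) rather than applying the symmetric form used in this sketch.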
Code:
The code of train.py is as follows:
# 1
from __future__ import division  # "/" performs true division and returns a float
from __future__ import print_function  # print is a function and needs parentheses
# 2
import time
import argparse
import numpy as np
import torch
import torch.nn.functional as F
import torch.optim as optim
from pygcn.utils import load_data, accuracy
from pygcn.models import GCN
# Training settings
parser = argparse.ArgumentParser()
parser.add_argument('--no-cuda', action='store_true', default=False,
                    help='Disables CUDA training.')
parser.add_argument('--fastmode', action='store_true', default=False,
                    help='Validate during training pass.')
parser.add_argument('--seed', type=int, default=42, help='Random seed.')
parser.add_argument('--epochs', type=int, default=200,
                    help='Number of epochs to train.')
parser.add_argument('--lr', type=float, default=0.01,
                    help='Initial learning rate.')
parser.add_argument('--weight_decay', type=float, default=5e-4,
                    help='Weight decay (L2 loss on parameters).')
parser.add_argument('--hidden', type=int, default=16,
                    help='Number of hidden units.')
parser.add_argument('--dropout', type=float, default=0.5,
                    help='Dropout rate (1 - keep probability).')
args = parser.parse_args()
args.cuda = not args.no_cuda and torch.cuda.is_available()
# 3
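np.random.seed(args.seed)  # fix NumPy's seed: the same seed yields the same random numbers
torch.manual_seed(args.seed)  # fix PyTorch's CPU seed in the same way
if args.cuda:
    torch.cuda.manual_seed(args.seed)  # fix the seed of the current GPU as well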
# 4
# Load data
# In an IDE, Ctrl+click on load_data jumps to its definition.
adj, features, labels, idx_train, idx_val, idx_test = load_data()
# Model and optimizer
model = GCN(nfeat=features.shape[1],
            nhid=args.hidden,
            nclass=labels.max().item() + 1,
            dropout=args.dropout)
optimizer = optim.Adam(model.parameters(),
                       lr=args.lr, weight_decay=args.weight_decay)
if args.cuda:
    model.cuda()
    features = features.cuda()
    adj = adj.cuda()
    labels = labels.cuda()
    idx_train = idx_train.cuda()
    idx_val = idx_val.cuda()
    idx_test = idx_test.cuda()
def train(epoch):
    t = time.time()
    model.train()
    optimizer.zero_grad()
    output = model(features, adj)
    loss_train = F.nll_loss(output[idx_train], labels[idx_train])
    acc_train = accuracy(output[idx_train], labels[idx_train])
    loss_train.backward()
    optimizer.step()
    if not args.fastmode:
        # Evaluate validation set performance separately,
        # deactivates dropout during validation run.
        model.eval()
        output = model(features, adj)
    loss_val = F.nll_loss(output[idx_val], labels[idx_val])
    acc_val = accuracy(output[idx_val], labels[idx_val])
    print('Epoch: {:04d}'.format(epoch+1),
          'loss_train: {:.4f}'.format(loss_train.item()),
          'acc_train: {:.4f}'.format(acc_train.item()),
          'loss_val: {:.4f}'.format(loss_val.item()),
          'acc_val: {:.4f}'.format(acc_val.item()),
          'time: {:.4f}s'.format(time.time() - t))
def test():
    model.eval()
    output = model(features, adj)
    loss_test = F.nll_loss(output[idx_test], labels[idx_test])
    acc_test = accuracy(output[idx_test], labels[idx_test])
    print("Test set results:",
          "loss= {:.4f}".format(loss_test.item()),
          "accuracy= {:.4f}".format(acc_test.item()))
# Train model
t_total = time.time()
for epoch in range(args.epochs):
    train(epoch)
print("Optimization Finished!")
print("Total time elapsed: {:.4f}s".format(time.time() - t_total))
# Testing
test()
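1. When a newer Python version introduces a feature that is incompatible with the current version, that is, the feature is not yet part of the language standard in the version you are running, you have to import it from the __future__ module in order to use it. Under Python 3 these two imports have no effect, because true division and the print function are already the default.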
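2. argparse is Python's module for parsing command-line options and arguments. It makes it easy to write user-friendly command-line interfaces and lets us expose the model's hyperparameters as arguments.
For example: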
# Import the libraries
import argparse
import torch  # needed for torch.cuda.is_available() below
# 1. Create the command-line parser object
parser = argparse.ArgumentParser(description='Demo of argparse')
# 2. Add the command-line arguments
parser.add_argument('--epochs', type=int, default=30)
parser.add_argument('--batch', type=int, default=4)
parser.add_argument('--no-cuda', action='store_true', default=False,
                    help='Disables CUDA training.')
# 3. Parse the command line into a structured object
args = parser.parse_args()
print(args)
epochs = args.epochs
batch = args.batch
print('show {} {}'.format(epochs, batch))
# 4. Check whether a GPU is available and enable it if so
args.cuda = not args.no_cuda and torch.cuda.is_available()
if args.cuda:
    print('\nGPU is on!')
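1. First we import the argparse package and use its ArgumentParser class to create an object, parser.
2. We declare the arguments we need with add_argument().
3. Finally, parse_args() reads the argument values from the command line into args, which we then print.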
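To try the three steps without a real command line, parse_args() also accepts an explicit list of strings. The sketch below reuses two of the arguments from the example; the variable names `defaults` and `custom` are made up for illustration.

```python
import argparse

parser = argparse.ArgumentParser(description='Demo of argparse')
parser.add_argument('--epochs', type=int, default=30)
parser.add_argument('--no-cuda', action='store_true', default=False)

# An empty list means "no options given", so every argument keeps its default.
defaults = parser.parse_args([])
# Equivalent to running: python demo.py --epochs 5 --no-cuda
custom = parser.parse_args(['--epochs', '5', '--no-cuda'])

print(defaults.epochs, defaults.no_cuda)
print(custom.epochs, custom.no_cuda)
```

Note that the option `--no-cuda` is stored under the attribute name `no_cuda`, since a hyphen is not valid in a Python identifier.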
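3. Note that calling torch.cuda.manual_seed() once fixes the whole sequence of random numbers, so every run of the script reproduces it, but successive random draws within one run still differ from each other. If you want a later draw to return the same values as an earlier one, call torch.cuda.manual_seed() again right before that draw.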
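The seeding behavior can be illustrated with NumPy's generator, which train.py also seeds. This sketch uses np.random.seed rather than torch.cuda.manual_seed so that it runs without a GPU, on the assumption that the PyTorch seed functions follow the same pattern.

```python
import numpy as np

np.random.seed(42)
first = np.random.rand(3)
second = np.random.rand(3)   # continues the sequence, so it differs from first

np.random.seed(42)           # re-seed before drawing again
repeat = np.random.rand(3)   # identical to first

print(np.array_equal(first, second))
print(np.array_equal(first, repeat))
```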
4. Loading the data
The code of utils.py is as follows:
import numpy as np
import scipy.sparse as sp
import torch
def encode_onehot(labels):
    # Sort the classes so the class-to-index mapping is the same on every run
    classes = sorted(set(labels))
    classes_dict = {c: np.identity(len(classes))[i, :] for i, c in
                    enumerate(classes)}
    labels_onehot = np.array(list(map(classes_dict.get, labels)),
                             dtype=np.int32)
    return labels_onehot
def load_data(path="../data/cora/", dataset="cora"):
    """Load citation network dataset (cora only for now)"""
    print('Loading {} dataset...'.format(dataset))
    # Read the whole file; each row is [node id, features..., label]
    idx_features_labels = np.genfromtxt("{}{}.content".format(path, dataset),
                                        dtype=np.dtype(str))
    # Store the features in compressed sparse row format to save memory
    features = sp.csr_matrix(idx_features_labels[:, 1:-1], dtype=np.float32)
    # Encode the labels as one-hot vectors
    labels = encode_onehot(idx_features_labels[:, -1])
    # build graph
    idx = np.array(idx_features_labels[:, 0], dtype=np.int32)
    idx_map = {j: i for i, j in enumerate(idx)}
    edges_unordered = np.genfromtxt("{}{}.cites".format(path, dataset),
                                    dtype=np.int32)
    edges = np.array(list(map(idx_map.get, edges_unordered.flatten())),
                     dtype=np.int32).reshape(edges_unordered.shape)
    adj = sp.coo_matrix((np.ones(edges.shape[0]), (edges[:, 0], edges[:, 1])),
                        shape=(labels.shape[0], labels.shape[0]),
                        dtype=np.float32)
    # build symmetric adjacency matrix
    adj = adj + adj.T.multiply(adj.T > adj) - adj.multiply(adj.T > adj)
    features = normalize(features)
    adj = normalize(adj + sp.eye(adj.shape[0]))
    idx_train = range(140)
    idx_val = range(200, 500)
    idx_test = range(500, 1500)
    features = torch.FloatTensor(np.array(features.todense()))
    labels = torch.LongTensor(np.where(labels)[1])
    adj = sparse_mx_to_torch_sparse_tensor(adj)
    idx_train = torch.LongTensor(idx_train)
    idx_val = torch.LongTensor(idx_val)
    idx_test = torch.LongTensor(idx_test)
    return adj, features, labels, idx_train, idx_val, idx_test
def normalize(mx):
    """Row-normalize sparse matrix"""
    rowsum = np.array(mx.sum(1))
    r_inv = np.power(rowsum, -1).flatten()
    r_inv[np.isinf(r_inv)] = 0.
    r_mat_inv = sp.diags(r_inv)
    mx = r_mat_inv.dot(mx)
    return mx
def accuracy(output, labels):
    preds = output.max(1)[1].type_as(labels)
    correct = preds.eq(labels).double()
    correct = correct.sum()
    return correct / len(labels)
def sparse_mx_to_torch_sparse_tensor(sparse_mx):
    """Convert a scipy sparse matrix to a torch sparse tensor."""
    sparse_mx = sparse_mx.tocoo().astype(np.float32)
    indices = torch.from_numpy(
        np.vstack((sparse_mx.row, sparse_mx.col)).astype(np.int64))
    values = torch.from_numpy(sparse_mx.data)
    shape = torch.Size(sparse_mx.shape)
    return torch.sparse.FloatTensor(indices, values, shape)
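np.genfromtxt() loads data from a text file and handles missing values as specified.
format() (the str.format method) substitutes values into the placeholders of a string, converting them to text.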
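These two calls, together with the id remapping in load_data, can be tried on a tiny in-memory file. The three rows and the citation pairs below are invented stand-ins for cora.content and cora.cites, not real data.

```python
import io
import numpy as np

# Hypothetical stand-in for cora.content: <paper id> <features...> <label>
content = io.StringIO(
    "31 0 1 1 Theory\n"
    "7 1 0 0 Neural_Networks\n"
    "52 0 0 1 Theory\n"
)

# str.format fills the {} placeholders in order.
path, dataset = "../data/cora/", "cora"
filename = "{}{}.content".format(path, dataset)

# genfromtxt splits on whitespace; dtype=str keeps every field as text.
rows = np.genfromtxt(content, dtype=np.dtype(str))
ids = rows[:, 0].astype(np.int32)
features = rows[:, 1:-1].astype(np.float32)
labels = rows[:, -1]

# Map arbitrary paper ids to consecutive row indices 0..N-1, as load_data does.
idx_map = {int(j): i for i, j in enumerate(ids)}
cites = np.array([[31, 7], [52, 31]])            # invented citation pairs
edges = np.array([idx_map[int(p)] for p in cites.flatten()]).reshape(cites.shape)

print(filename)
print(rows.shape, features.shape)
print(edges)
```

The remapping is needed because the paper ids in the file are arbitrary numbers, while the adjacency matrix is indexed by row position.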