+-------------------+---------------------------------------+
| Parameter         | Value                                 |
+===================+=======================================+
| Active connection | 10                                    |
+-------------------+---------------------------------------+
| Batch size        | 16                                    |
+-------------------+---------------------------------------+
| Epochs            | 4                                     |
+-------------------+---------------------------------------+
| Feat path         | facedata/CASIA.feas.npy               |
+-------------------+---------------------------------------+
| K at hop          | [200, 10]                             |
+-------------------+---------------------------------------+
| Knn graph path    | facedata/knn.graph.CASIA.kdtree.npy   |
+-------------------+---------------------------------------+
| Label path        | facedata/CASIA.labels.npy             |
+-------------------+---------------------------------------+
| Logs dir          | D:\jupyter\gcn_clustering-master\logs |
+-------------------+---------------------------------------+
| Lr                | 0.010                                 |
+-------------------+---------------------------------------+
| Momentum          | 0.900                                 |
+-------------------+---------------------------------------+
| Print freq        | 200                                   |
+-------------------+---------------------------------------+
| Seed              | 1                                     |
+-------------------+---------------------------------------+
| Weight decay      | 0.000                                 |
+-------------------+---------------------------------------+
| Workers           | 16                                    |
+-------------------+---------------------------------------+
(On my ordinary PC, workers still had to be turned down to 0.)
As before, the data is split up front using the Dataset and DataLoader classes.
Loading CASIA.feas.npy gives features of shape (454590, 512), knn.graph.CASIA.kdtree.npy gives a knn_graph of shape (454590, 201), and CASIA.labels.npy gives 454590 labels.
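Loading those three arrays is a plain np.load; the only wrinkle is size, since (454590, 512) float32 is roughly 0.9 GB. A minimal sketch (the mmap_mode idea is my own suggestion, not the repo's code, and a tiny temp file stands in for the real facedata files):

```python
import os
import tempfile

import numpy as np

# Stand-in for facedata/CASIA.feas.npy — the real array is (454590, 512).
tmp_path = os.path.join(tempfile.mkdtemp(), "feas.npy")
np.save(tmp_path, np.zeros((10, 512), dtype=np.float32))

# mmap_mode='r' keeps the array on disk and pages it in lazily,
# which can spare RAM on an ordinary PC.
features = np.load(tmp_path, mmap_mode="r")
print(features.shape)  # (10, 512)
```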
At the start of every epoch, adjust_lr runs first to adapt the learning rate: from epoch 1 on it gets multiplied by 0.1 each epoch (so the learning rate keeps shrinking).
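A minimal sketch of what I take adjust_lr to be doing (function and argument names are my guesses, not the repo's exact code): the base learning rate is shrunk by another factor of 10 each epoch.

```python
def adjust_lr(param_groups, base_lr, epoch):
    # epoch 0 keeps base_lr; each later epoch multiplies by another 0.1
    lr = base_lr * (0.1 ** epoch)
    for group in param_groups:
        group["lr"] = lr
    return lr

groups = [{"lr": 0.01}]
adjust_lr(groups, 0.01, 2)
print(groups[0]["lr"])  # roughly 1e-4
```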
Right as training starts, a few performance metrics are initialized via the AverageMeter class.
Iterating over the batches the DataLoader splits out, each batch holds feat [16, 2201, 512], adj [16, 2201, 2201], cid [16, 1], h1id [16, 200], and gtmat [16, 200] (the first four seem to have originally been packed together in one tuple).
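Where 2201 presumably comes from: with K at hop = [200, 10], each pivot's subgraph holds the pivot itself, its 200 one-hop neighbors, and 10 two-hop neighbors for each of those (my reading of the shapes, not verified against the code):

```python
k_at_hop = [200, 10]
# pivot + one-hop neighbors + two-hop neighbors
subgraph_size = 1 + k_at_hop[0] + k_at_hop[0] * k_at_hop[1]
print(subgraph_size)  # 2201
```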
Inputs:
feat->x [16, 2201, 512]
adj ->A [16, 2201, 2201]
h1id->one_hop_idcs [16, 200]
gcn(
#view x to [35216, 512] (16 × 2201 = 35216)
(bn0): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=True)
#view back to [16, 2201, 512]
(conv1): GraphConv(
(agg): MeanAggregator()
#[16,2201,2201]x[16,2201,512]->[16,2201,512]agg_feats
#concatenate the two [16,2201,512] tensors (x before and after aggregation) into cat_feats [16,2201,1024]
#通过einsum,[16,2201,1024]x[1024,512]->[16,2201,512]
#add the bias, then ReLU
)
(conv2): GraphConv(
(agg): MeanAggregator()
)
(conv3): GraphConv(
(agg): MeanAggregator()
#[16,2201,1024]x[1024,256]->[16,2201,256]
)
(conv4): GraphConv(
(agg): MeanAggregator()
#[16,2201,512]x[512,256]->[16,2201,256]
)
#initialize an all-zero edge_feat of shape [16,200,256]
for b in range(B): #16
edge_feat[b,:,:] = x[b, one_hop_idcs[b]]
#then view to [3200,256]
(classifier): Sequential(
(0): Linear(in_features=256, out_features=256, bias=True)
(1): PReLU(num_parameters=256)
(2): Linear(in_features=256, out_features=2, bias=True)
)
#giving [3200,2]
)
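The shape bookkeeping in the comments above can be replayed with a tiny NumPy stand-in for one GraphConv plus the edge_feat gather (my reconstruction of the mean-aggregate / concatenate / einsum steps, with small dimensions in place of [16, 2201, 512]):

```python
import numpy as np

rng = np.random.default_rng(1)
B, N, D_in, D_out = 2, 5, 8, 4           # stand-ins for 16, 2201, 512, 512
x = rng.standard_normal((B, N, D_in))
A = rng.random((B, N, N))
A = A / A.sum(axis=2, keepdims=True)     # row-normalize: MeanAggregator's effect
W = rng.standard_normal((2 * D_in, D_out))
bias = np.zeros(D_out)

agg_feats = np.einsum("bnm,bmd->bnd", A, x)           # [B,N,D_in]
cat_feats = np.concatenate([x, agg_feats], axis=2)    # [B,N,2*D_in]
out = np.einsum("bnd,df->bnf", cat_feats, W) + bias   # [B,N,D_out]
out = np.maximum(out, 0.0)                            # ReLU

# gather the one-hop rows into edge_feat, like the loop in the printout
k1 = 3                                                # stand-in for 200
one_hop_idcs = np.tile(np.arange(1, 1 + k1), (B, 1))  # made-up indices
edge_feat = np.zeros((B, k1, D_out))
for b in range(B):
    edge_feat[b] = out[b, one_hop_idcs[b]]
edge_feat = edge_feat.reshape(B * k1, D_out)          # like [3200, 256]
print(out.shape, edge_feat.shape)  # (2, 5, 4) (6, 4)
```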
SGD (
Parameter Group 0
dampening: 0
lr: 0.01
momentum: 0.9
nesterov: False
weight_decay: 0.0001
)
The [16, 200] gtmat is viewed as 3200 labels and the cross-entropy against pred is computed.
Then the accuracy is calculated (only now did I notice that gtmat holds nothing but 0s and 1s; how did this become binary classification?).
That gives an average accuracy of 0.5884, precision of 0.07675753228120516, and recall of 0.781021897810219.
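How those three numbers fall out of the [3200, 2] predictions and the flattened gtmat, on toy data (my own tensors, not the repo's code):

```python
import numpy as np

logits = np.array([[2.0, 1.0],           # toy stand-in for the [3200, 2] output
                   [0.5, 1.5],
                   [3.0, 0.1],
                   [0.2, 2.2]])
labels = np.array([0, 1, 1, 1])          # flattened gtmat: 1 = "same cluster"

pred = logits.argmax(axis=1)             # predicted class per edge
tp = int(((pred == 1) & (labels == 1)).sum())
fp = int(((pred == 1) & (labels == 0)).sum())
fn = int(((pred == 0) & (labels == 1)).sum())

acc = float((pred == labels).mean())
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(acc, precision, round(recall, 4))  # 0.75 1.0 0.6667
```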
Then come opt.zero_grad(), loss.backward(), opt.step() (so the variables computed before this point carried no gradients yet? Put that way, running backprop like this actually seems more reasonable?)
————————————————————
(I took another look at the binary-classification question: it is set up right at the start, in the Dataset class. In any case, this is not the node clustering I had imagined, so it probably won't be of much help to me.)
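For the record, my guess at how the Dataset class produces those 0/1 labels (logic reconstructed from the shapes, not copied from the repo): each one-hop neighbor of a pivot gets label 1 if it shares the pivot's identity and 0 otherwise, which is exactly what turns clustering into per-edge binary classification.

```python
import numpy as np

labels = np.array([7, 7, 3, 7, 3])    # toy identity label per face
knn_graph = np.array([[0, 1, 3, 2]])  # row: pivot itself, then nearest neighbors

pivot = knn_graph[0, 0]
one_hop = knn_graph[0, 1:]            # the k1 neighbors (200 in the real run)
gtmat_row = (labels[one_hop] == labels[pivot]).astype(np.int64)
print(gtmat_row)  # [1 1 0]
```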