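From the paper: Central Similarity Quantization for Efficient Image and Video Retrieval (CVPR 2020)
Hadamard matrices can be used as hash targets (hash centers)
In this tutorial, we show how to use a Hadamard matrix to generate hash targets (hash centers) for image and video datasets.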
from scipy.special import comb, perm  # comb counts combinations
from itertools import combinations
from scipy.linalg import hadamard  # builds a Hadamard matrix directly
import torch
import numpy as np
Build a Hadamard matrix. Each row (or column) can serve as the hash target for one class of images.
d = 64  # length of the hash codes and hash centers; d must be a power of 2
ha_d = hadamard(d)  # d x d Hadamard matrix
ha_2d = np.concatenate((ha_d, -ha_d), 0)  # 2*d candidate targets of d bits each, used when num_class > d
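To illustrate why Hadamard rows work as hash centers, the sketch below rebuilds the matrix with Sylvester's doubling rule (the construction that `scipy.linalg.hadamard` documents) and checks that distinct rows are orthogonal. The helper name `sylvester_hadamard` is illustrative only and is not part of the original notebook.

```python
import numpy as np

def sylvester_hadamard(n):
    """Build an n x n Hadamard matrix by repeated doubling (n must be a power of 2)."""
    if n < 1 or (n & (n - 1)) != 0:
        raise ValueError("n must be a power of 2")
    H = np.array([[1]])
    while H.shape[0] < n:
        # Doubling step: [[H, H], [H, -H]]
        H = np.vstack((np.hstack((H, H)), np.hstack((H, -H))))
    return H

d = 64
H = sylvester_hadamard(d)

# Distinct rows are orthogonal, so H @ H.T is d times the identity.
assert np.array_equal(H @ H.T, d * np.eye(d, dtype=int))

# Two orthogonal +/-1 rows disagree in exactly d/2 positions.
print(int(np.sum(H[0] != H[1])))  # 32
```

Because any two distinct rows differ in exactly d/2 bits, every pair of centers taken from `ha_d` is equally far apart in Hamming space.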
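Next, we give examples for a single-label dataset and a multi-label dataset.
1.1 Generate hash targets for a single-label dataset: ImageNet
In our experiments we sample 100 classes from ImageNet, so we generate 100 hash targets, one per class. The procedure for generating hash targets is the same for the single-label datasets ImageNet, UCF101, and HMDB51.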
num_class = 100
if num_class <= d:
    hash_targets = torch.from_numpy(ha_d[0:num_class]).float()
else:
    # more classes than rows in ha_d: also draw from the negated rows
    hash_targets = torch.from_numpy(ha_2d[0:num_class]).float()
print('hash centers shape: {}'.format(hash_targets.shape))
out: hash centers shape: torch.Size([100, 64])
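Note that slicing `ha_2d[0:num_class]` returns at most 2*d rows, so the selection above yields fewer targets than classes once `num_class` exceeds 2*d. A minimal sketch of the same selection with an explicit guard is shown below; `pick_hash_targets` is a hypothetical helper, and NumPy arrays are used instead of torch tensors to keep the example small.

```python
import numpy as np

def pick_hash_targets(ha_d, num_class):
    """Return num_class rows taken from ha_d, then from -ha_d if more are needed."""
    d = ha_d.shape[0]
    if num_class > 2 * d:
        raise ValueError(f"{num_class} classes exceed the {2 * d} available rows")
    ha_2d = np.concatenate((ha_d, -ha_d), axis=0)
    return ha_2d[:num_class]

# Small demonstration with a 4 x 4 Hadamard matrix written out by hand.
ha_4 = np.array([[1, 1, 1, 1],
                 [1, -1, 1, -1],
                 [1, 1, -1, -1],
                 [1, -1, -1, 1]])
targets = pick_hash_targets(ha_4, 6)
print(targets.shape)  # (6, 4)
```

When the guard fires, the random generation described in section 2 is one alternative.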
# Save the hash targets as training targets
file_name = str(d) + '_imagenet' + '_' + str(num_class) + '_class.pkl'
file_dir = 'data/imagenet/hash_centers/' + file_name
with open(file_dir, "wb") as f:  # the file is closed automatically
    torch.save(hash_targets, f)
# Test the Hamming distance between every pair of hash targets
num_class = 100
pairs = list(combinations(range(num_class), 2))  # build the pair list once
com_num = int(comb(num_class, 2))
c = np.zeros(com_num)
for i, (i_1, i_2) in enumerate(pairs):
    # number of bits in which the two targets differ
    c[i] = torch.sum(hash_targets[i_1] != hash_targets[i_2]).item()
# distance between any two hash targets
c
out: array([32., 32., 32., ..., 32., 32., 32.])
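The pairwise loop above can also be written as one matrix product. For two ±1 vectors of length d, the dot product equals d minus twice the Hamming distance, so the distance is (d - dot) / 2. The sketch below applies this identity with NumPy on a small 4-bit example; `pairwise_hamming` is an illustrative name, not a function from the original notebook.

```python
import numpy as np

def pairwise_hamming(codes):
    """Hamming distances between all rows of a +/-1 code matrix."""
    codes = np.asarray(codes, dtype=np.int64)
    d = codes.shape[1]
    # For +/-1 vectors: dot = agreements - disagreements = d - 2 * hamming
    return (d - codes @ codes.T) // 2

h2 = np.array([[1, 1], [1, -1]])
ha_4 = np.kron(h2, h2)                      # 4 x 4 Hadamard matrix
codes = np.concatenate((ha_4, -ha_4))[:6]   # 6 targets of 4 bits each
dist = pairwise_hamming(codes)

# Unique pairs (i < j), in the same order as itertools.combinations
i, j = np.triu_indices(len(codes), k=1)
print(dist[i, j])
```

In this small example every pair is at distance 2 (d/2), except pairs (0, 4) and (1, 5), which are exact negations of each other and sit at distance 4 (d). The same holds at d = 64 with 100 classes: rows i and i + 64 are negations, so those pairs are 64 bits apart while all others are 32 bits apart.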
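1.2 Generate hash targets for a multi-label dataset: NUS_WIDE
We use 21 classes of NUS_WIDE in our experiments. We first generate 21 hash targets, one per class, and then compute the centroid for each multi-label sample. The procedure for generating hash centers is the same for the multi-label datasets COCO and NUS_WIDE.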
num_class = 21
if num_class <= d:
    hash_targets = torch.from_numpy(ha_d[0:num_class]).float()
else:
    # more classes than rows in ha_d: also draw from the negated rows
    hash_targets = torch.from_numpy(ha_2d[0:num_class]).float()
print('hash centers shape: {}'.format(hash_targets.shape))
out: hash centers shape: torch.Size([21, 64])
# Save the hash targets as training targets
file_name = str(d) + '_nus_wide' + '_' + str(num_class) + '_class.pkl'
file_dir = 'data/nus_wide/hash_centers/' + file_name
with open(file_dir, "wb") as f:  # the file is closed automatically
    torch.save(hash_targets, f)
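Generate multi-label centers for NUS_WIDE
A simple example: suppose an image in NUS_WIDE has the label [1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1]. The image then contains three kinds of objects, belonging to the first, second, and last classes. We compute the centroid of the three corresponding centers.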
label = torch.tensor([1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1])
three_centers = hash_targets[label==1]
three_centers
out: tensor([[ 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
1., 1., 1., 1., 1., 1., 1., 1.],
[ 1., -1., 1., -1., 1., -1., 1., -1., 1., -1., 1., -1., 1., -1.,
1., -1., 1., -1., 1., -1., 1., -1., 1., -1., 1., -1., 1., -1.,
1., -1., 1., -1., 1., -1., 1., -1., 1., -1., 1., -1., 1., -1.,
1., -1., 1., -1., 1., -1., 1., -1., 1., -1., 1., -1., 1., -1.,
1., -1., 1., -1., 1., -1., 1., -1.],
[ 1., 1., 1., 1., -1., -1., -1., -1., 1., 1., 1., 1., -1., -1.,
-1., -1., -1., -1., -1., -1., 1., 1., 1., 1., -1., -1., -1., -1.,
1., 1., 1., 1., 1., 1., 1., 1., -1., -1., -1., -1., 1., 1.,
1., 1., -1., -1., -1., -1., -1., -1., -1., -1., 1., 1., 1., 1.,
-1., -1., -1., -1., 1., 1., 1., 1.]])
centroid = three_centers.mean(dim=0)
centroid[centroid>0]=1.0
centroid[centroid<0]=-1.0
centroid
out: tensor([ 1., 1., 1., 1., 1., -1., 1., -1., 1., 1., 1., 1., 1., -1.,
1., -1., 1., -1., 1., -1., 1., 1., 1., 1., 1., -1., 1., -1.,
1., 1., 1., 1., 1., 1., 1., 1., 1., -1., 1., -1., 1., 1.,
1., 1., 1., -1., 1., -1., 1., -1., 1., -1., 1., 1., 1., 1.,
1., -1., 1., -1., 1., 1., 1., 1.])
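With three selected centers the per-bit mean is never zero, but with an even number of labels a bit can tie, and the thresholding above would leave that bit at 0. The sketch below shows one way to handle this, under the assumption that tied bits should be filled with a random ±1. The function `multilabel_center` and its `rng` parameter are hypothetical, and NumPy is used instead of torch for brevity.

```python
import numpy as np

def multilabel_center(hash_targets, label, rng):
    """Bitwise majority vote over the centers selected by a multi-hot label."""
    selected = hash_targets[np.asarray(label) == 1]
    if selected.shape[0] == 0:
        raise ValueError("label selects no class")
    votes = selected.sum(axis=0)
    center = np.sign(votes)
    ties = votes == 0
    # Assumption: a tied bit is filled with a random +/-1 rather than left at 0
    center[ties] = rng.choice([-1, 1], size=int(ties.sum()))
    return center

hash_targets = np.array([[1, 1, 1, 1],
                         [1, -1, 1, -1],
                         [1, 1, -1, -1]])
rng = np.random.default_rng(0)

# Two labels: bits 1 and 3 tie and are filled randomly.
center = multilabel_center(hash_targets, [1, 1, 0], rng)
print(center)
```

Passing the generator explicitly keeps the tie-breaking reproducible when a fixed seed is used.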
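2. Generate hash centers by sampling from a Bernoulli distribution
We only give an example of generating hash centers for ImageNet at 64 bits. Other datasets are handled the same way, and the centroid computation is similar to the procedure above.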
# random generation
import torch
import random
import numpy as np
import csv
from scipy.special import comb, perm  # comb counts combinations
from itertools import combinations
num_class = 100
hash_bit = 64
a = list(range(hash_bit))  # bit positions, from which half are sampled per center
pairs = list(combinations(range(num_class), 2))  # all pairs of classes, built once
com_num = int(comb(num_class, 2))
for j in range(10000):
    hash_targets = torch.zeros([num_class, hash_bit])
    for i in range(num_class):
        ones = torch.ones(hash_bit)
        sa = random.sample(a, round(hash_bit / 2))
        ones[sa] = -1  # flip a random half of the bits to -1
        hash_targets[i] = ones
    c = np.zeros(com_num)
    for i, (i_1, i_2) in enumerate(pairs):
        # number of bits in which the two centers differ
        c[i] = torch.sum(hash_targets[i_1] != hash_targets[i_2]).item()
    print(min(c))
    print(max(c))
    print(np.mean(c))
    # keep the hash centers far apart in Hamming space; 20 can be lowered to 18 for faster convergence
    if min(c) >= 20 and np.mean(c) >= 32:
        print(min(c))
        print("stop! we find suitable hash centers")
        break
out: 18.0
48.0
32.02383838383838
18.0
46.0
32.00525252525252
16.0
48.0
32.00646464646464
20.0
46.0
31.986666666666668
20.0
46.0
31.933333333333334
18.0
44.0
31.93010101010101
18.0
46.0
32.01979797979798
16.0
46.0
31.90707070707071
18.0
46.0
31.90141414141414
18.0
46.0
32.05777777777778
18.0
46.0
31.97939393939394
18.0
46.0
32.02383838383838
18.0
46.0
32.08888888888889
18.0
46.0
31.988282828282827
20.0
46.0
32.034747474747476
20.0
stop! we find suitable hash centers
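The loop above sets exactly half of each center's bits to -1. For comparison, the sketch below draws every bit independently from Bernoulli(0.5) and checks the minimum pairwise distance in one vectorized step. It is an illustration under stated assumptions rather than the notebook's procedure: the function name `sample_hash_centers`, its parameters, and the small problem size are all hypothetical, and only the minimum-distance condition is checked.

```python
import numpy as np

def sample_hash_centers(num_class, hash_bit, min_dist, max_tries, rng):
    """Draw +/-1 centers until every pair is at least min_dist apart, or give up."""
    i, j = np.triu_indices(num_class, k=1)  # unique pairs (i < j)
    for _ in range(max_tries):
        # Independent Bernoulli(0.5) bits, mapped from {0, 1} to {-1, +1}
        centers = 2 * rng.integers(0, 2, size=(num_class, hash_bit)) - 1
        # Hamming distance of +/-1 vectors from their dot products
        dist = (hash_bit - centers @ centers.T) // 2
        if dist[i, j].min() >= min_dist:
            return centers
    return None  # no suitable set found within max_tries

rng = np.random.default_rng(0)
centers = sample_hash_centers(num_class=10, hash_bit=64, min_dist=16,
                              max_tries=1000, rng=rng)
print(None if centers is None else centers.shape)
```

Unlike the balanced sampling above, independent bits do not force every center to have equal numbers of +1 and -1, so pairwise distances can be odd as well as even.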
# Save the hash targets as training targets
file_name = str(hash_bit) + '_imagenet' + '_' + str(num_class) + '_class_random.pkl'
file_dir = 'data/imagenet/hash_centers/' + file_name
with open(file_dir, "wb") as f:  # the file is closed automatically
    torch.save(hash_targets, f)