An Unsupervised Face Clustering Method (SOTA Results)

Learning to Cluster Faces by Infomap

Introduction

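This project clusters faces with Infomap, an unsupervised method (code on GitHub). On the public datasets MS-Celeb-1M, YouTube-Faces, and DeepFashion, it matches or outperforms current mainstream approaches, including supervised ones such as GCN-based face clustering.
faiss accelerates the construction of the neighbor edges, which speeds up clustering: a single batch of one million samples takes only a few minutes. The tables below report accuracy and performance.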
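The edge-construction step described above can be sketched as follows. This is a minimal illustration, not the project's code: `build_knn_edges` and `cluster_with_infomap` are hypothetical names, the neighbor search is a pure-Python brute-force stand-in for the faiss search, and using cosine similarity of L2-normalized features as the edge weight is an assumption. The Infomap call assumes the 1.x Python API of the `infomap` package.

```python
import math


def l2_normalize(vec):
    """Scale a non-zero vector to unit length so dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def build_knn_edges(features, k, min_sim):
    """Return {(i, j): similarity} with i < j.

    For every node, keep its k most similar neighbors and drop any
    neighbor whose similarity is below min_sim. Brute force, O(n^2):
    a stand-in for the faiss search (assumption).
    """
    feats = [l2_normalize(f) for f in features]
    edges = {}
    for i, fi in enumerate(feats):
        sims = [
            (sum(a * b for a, b in zip(fi, fj)), j)
            for j, fj in enumerate(feats)
            if j != i
        ]
        sims.sort(reverse=True)  # most similar first
        for sim, j in sims[:k]:
            if sim >= min_sim:
                edges[(min(i, j), max(i, j))] = sim
    return edges


def cluster_with_infomap(edges):
    """Run Infomap on weighted undirected edges; return {node_id: module_id}.

    Assumes the `infomap` 1.x Python package. Nodes without any
    surviving edge never reach Infomap and are absent from the result.
    """
    from infomap import Infomap  # imported lazily; optional dependency

    im = Infomap("--two-level --silent")
    for (i, j), sim in edges.items():
        im.add_link(i, j, sim)
    im.run()
    return {node.node_id: node.module_id for node in im.tree if node.is_leaf}


# Toy data: two tight pairs of 2-D features.
demo_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
demo_edges = build_knn_edges(demo_features, k=3, min_sim=0.58)
print(sorted(demo_edges))
```

In this sketch k and min_sim shape only the graph; Infomap itself receives nothing but the weighted edges.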

Infomap Introduction

About Infomap

About GCN Method

  1. L-GCN: Linkage-based Face Clustering via Graph Convolution Network, CVPR 2019
  2. GCN-D: Learning to Cluster Faces on an Affinity Graph, CVPR 2019 (Oral)
  3. GCN-V+GCN-E: Learning to Cluster Faces via Confidence and Connectivity Estimation, CVPR 2020

Requirements

  • Python >= 3.6
  • sklearn
  • infomap
  • numpy
  • faiss-gpu (or faiss-cpu)

Datasets

MS-Celeb-1M: part1_test (584K), YouTube-Faces, DeepFashion
download

Run

python face-cluster-by-infomap

Results on part1_test (584K)

| Method | Precision | Recall | F-score |
| --- | --- | --- | --- |
| Chinese Whispers (k=80, th=0.6, iters=20) | 55.49 | 52.46 | 53.93 |
| Approx Rank Order (k=80, th=0) | 99.77 | 7.2 | 13.42 |
| MiniBatchKmeans (ncluster=5000, bs=100) | 45.48 | 80.98 | 58.25 |
| KNN DBSCAN (k=80, th=0.7, eps=0.25, min=1) | 95.25 | 52.79 | 67.93 |
| FastHAC (dist=0.72, single) | 92.07 | 57.28 | 70.63 |
| DaskSpectral (ncluster=8573, affinity='rbf') | 78.75 | 66.59 | 72.16 |
| CDP (single model, th=0.7) | 80.19 | 70.47 | 75.02 |
| L-GCN (k_at_hop=[200, 10], active_conn=10, step=0.6, maxsz=300) | 74.38 | 83.51 | 78.68 |
| GCN-D (2 prpsls) | 95.41 | 67.77 | 79.25 |
| GCN-D (5 prpsls) | 94.62 | 72.59 | 82.15 |
| GCN-D (8 prpsls) | 94.23 | 79.69 | 86.35 |
| GCN-D (20 prpsls) | 94.54 | 81.62 | 87.61 |
| GCN-D + GCN-S (2 prpsls) | 99.07 | 67.22 | 80.1 |
| GCN-D + GCN-S (5 prpsls) | 98.84 | 72.01 | 83.31 |
| GCN-D + GCN-S (8 prpsls) | 97.93 | 78.98 | 87.44 |
| GCN-D + GCN-S (20 prpsls) | 97.91 | 80.86 | 88.57 |
| GCN-V | 92.45 | 82.42 | 87.14 |
| GCN-V + GCN-E | 92.56 | 83.74 | 87.93 |
| Infomap (ours) (k=50, min_sim=0.58) | 95.50 | 92.51 | 93.98 |
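As a quick sanity check on the table, the F-score column appears to be the harmonic mean of Precision and Recall. The source does not state this, so treat it as an assumption, although the reported numbers are consistent with it. A minimal sketch, with `f_score` as an illustrative helper name:

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall of the Infomap row in the table above.
print(round(f_score(95.50, 92.51), 2))  # 93.98
```

Small differences in the last digit for other rows (for example Approx Rank Order) are expected, since the table's precision and recall are themselves rounded.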

[Figure: ./image/evaluate.png]

Results on YouTube-Faces

| Method | Pairwise F-score | BCubed F-score | NMI |
| --- | --- | --- | --- |
| Chinese Whispers (k=160, th=0.75, iters=20) | 72.9 | 70.55 | 93.25 |
| Approx Rank Order (k=200, th=0) | 76.45 | 75.45 | 94.34 |
| Kmeans (ncluster=1436) | 67.86 | 75.77 | 93.99 |
| KNN DBSCAN (k=160, th=0., eps=0.3, min=1) | 91.35 | 89.34 | 97.52 |
| FastHAC (dist=0.72, single) | 93.07 | 87.98 | 97.19 |
| GCN-D (4 prpsls) | 94.44 | 91.33 | 97.97 |
| Infomap (ours) (k=400, min_sim=0.56) | 92.82 | 91.78 | 98.04 |
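For readers unfamiliar with the first two metrics, the sketch below computes pairwise and BCubed F-scores from ground-truth and predicted labels. It follows the standard textbook definitions, which are assumed (not confirmed by the source) to be the ones used for the table; the function names are illustrative. NMI is omitted here; scikit-learn provides it as `sklearn.metrics.normalized_mutual_info_score`.

```python
from collections import Counter


def _harmonic(p, r):
    """Harmonic mean, defined as 0 when both inputs are 0."""
    return 2 * p * r / (p + r) if p + r else 0.0


def pairwise_fscore(true_labels, pred_labels):
    """Precision/recall/F over all sample pairs placed in the same cluster."""

    def n_pairs(counts):
        return sum(n * (n - 1) // 2 for n in counts)

    # Pairs that share both the true identity and the predicted cluster.
    both = n_pairs(Counter(zip(true_labels, pred_labels)).values())
    pred_pairs = n_pairs(Counter(pred_labels).values())
    true_pairs = n_pairs(Counter(true_labels).values())
    precision = both / pred_pairs if pred_pairs else 0.0
    recall = both / true_pairs if true_pairs else 0.0
    return precision, recall, _harmonic(precision, recall)


def bcubed_fscore(true_labels, pred_labels):
    """Per-sample precision/recall averaged over all samples."""
    n = len(true_labels)
    true_size = Counter(true_labels)
    pred_size = Counter(pred_labels)
    joint = Counter(zip(true_labels, pred_labels))
    # Each of the c samples in a (true, pred) cell has precision c / |pred cluster|.
    precision = sum(c * c / pred_size[p] for (t, p), c in joint.items()) / n
    recall = sum(c * c / true_size[t] for (t, p), c in joint.items()) / n
    return precision, recall, _harmonic(precision, recall)


truth = [0, 0, 0, 1, 1]
pred = [0, 0, 1, 1, 1]
print(pairwise_fscore(truth, pred))
print(bcubed_fscore(truth, pred))
```

Pairwise scores weight large clusters more heavily because the number of pairs grows quadratically with cluster size, while BCubed averages per sample; this is why the two columns can rank methods differently.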

Results on DeepFashion

| Method | Pairwise F-score | BCubed F-score | NMI |
| --- | --- | --- | --- |
| Chinese Whispers (k=5, th=0.7, iters=20) | 31.22 | 53.25 | 89.8 |
| Approx Rank Order (k=10, th=0) | 25.04 | 52.77 | 88.71 |
| Kmeans (ncluster=3991) | 32.02 | 53.3 | 88.91 |
| KNN DBSCAN (k=4, th=0., eps=0.1, min=2) | 25.07 | 53.23 | 90.75 |
| FastHAC (dist=0.4, single) | 22.54 | 48.77 | 90.44 |
| Meanshift (bandwidth=0.5) | 31.61 | 56.73 | 89.29 |
| Spectral (ncluster=3991, affinity='rbf') | 29.6 | 47.12 | 86.95 |
| DaskSpectral (ncluster=3991, affinity='rbf') | 24.25 | 44.11 | 86.21 |
| CDP (single model, k=2, th=0.5, maxsz=200) | 28.28 | 57.83 | 90.93 |
| L-GCN (k_at_hop=[5, 5], active_conn=5, step=0.5, maxsz=50) | 30.7 | 60.13 | 90.67 |
| GCN-D (2 prpsls) | 29.14 | 59.09 | 89.48 |
| GCN-D (8 prpsls) | 32.52 | 57.52 | 89.54 |
| GCN-D (20 prpsls) | 33.25 | 56.83 | 89.36 |
| GCN-V | 33.59 | 59.41 | 90.88 |
| GCN-V + GCN-E | 38.47 | 60.06 | 90.5 |
| Infomap (ours) (k=400, min_sim=0.88) | 38.67 | 60.48 | 90.97 |

Time Consumption and GPU Memory (k=50, min_sim=0.58)

| Nodes | Edges | TimeCount (s) | GPU Memory (MiB) |
| --- | --- | --- | --- |
| 500000 | 16535263 | 160 | 2745 |
| 1000000 | 30206572 | 400 | 3235 |

Comments

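  • A larger k increases both TimeCount and GPU memory usage.
  • k is not an Infomap parameter; it is used only when faiss builds the kNN graph.
  • Doubling the amount of data roughly quadruples the kNN construction time, because the search is essentially an n*n vector search.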
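The quadratic growth in the last comment can be illustrated with a back-of-envelope calculation. This sketch assumes an exhaustive search that compares every query vector with every database vector; `brute_force_comparisons` and `growth_factor` are illustrative names, not part of the project.

```python
def brute_force_comparisons(n):
    """Vector comparisons when each of n queries is checked against all n vectors."""
    return n * n


def growth_factor(n_old, n_new):
    """How much the exhaustive search work grows when the data grows from n_old to n_new."""
    return brute_force_comparisons(n_new) / brute_force_comparisons(n_old)


print(growth_factor(500_000, 1_000_000))  # 4.0

# End-to-end TimeCount ratio from the table above, for comparison.
print(400 / 160)  # 2.5
```

The reported end-to-end TimeCount grows by only 2.5x between the two rows, which suggests the 4x figure describes the exhaustive search step in isolation rather than the whole pipeline.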
