PQ: Source Code Walkthrough

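This article walks through two encoding-based search algorithms, PQ (Product Quantization) and IVFPQ (Inverted File Product Quantization), stage by stage: loading data, setting parameters, training, encoding, and searching. The PQ part covers data loading, k-means training, encoding, and search. The IVFPQ part follows the same flow and additionally covers computing the coarse cluster centers and encoding residuals. Both are efficient approximate nearest neighbor search algorithms.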

PQ

Load data and set parameter

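The first step is to load the data and set the parameters.
When dataset is set to 'random', the data is generated synthetically, so no dataset files are needed.
The code below loads the data: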

% Generate or load an evaluation set (query+learn+base)
if strcmp (dataset, 'random')
  % synthetic dataset
    d = 16;

    % Generate random vectors, uniform in [0, 1) (they are not normalized to unit norm)
    ntrain = 10000;
    nbase = 1000000; 
    nquery = 1000;
    % Randomly generate d-by-n matrices (one vector per column)
    vtrain = single (rand (d, ntrain));
    vbase = single (rand (d, nbase));
    vquery = single (rand (d, nquery)); 

    % Compute the ground-truth
    t0 = cputime;
    [ids_gnd, dis_gnd] = yael_nn (vbase, vquery, 1);
    tgnd = cputime - t0;

else

  switch dataset
   case 'siftsmall'
    basedir = './siftsmall/' ;               % modify this directory to fit your configuration
    fbase = [basedir 'siftsmall_base.fvecs'];
    fquery = [basedir 'siftsmall_query.fvecs'];
    ftrain = [basedir 'siftsmall_learn.fvecs'];
    fgroundtruth = [basedir 'siftsmall_groundtruth.ivecs'];

   case 'sift'
    basedir = './sift/' ;                    % modify this directory to fit your configuration
    fbase = [basedir 'sift_base.fvecs'];
    fquery = [basedir 'sift_query.fvecs'];
    ftrain = [basedir 'sift_learn.fvecs'];
    fgroundtruth = [basedir 'sift_groundtruth.ivecs'];

   case 'gist'
    basedir = './gist/' ;                    % modify this directory to fit your configuration
    fbase = [basedir 'gist_base.fvecs'];
    fquery = [basedir 'gist_query.fvecs'];
    ftrain = [basedir 'gist_learn.fvecs'];
    fgroundtruth = [basedir 'gist_groundtruth.ivecs'];

  end

  % Read the vectors
  vtrain = fvecs_read (ftrain);
  vbase  = fvecs_read (fbase);
  vquery = fvecs_read (fquery);

  ntrain = size (vtrain, 2);
  nbase = size (vbase, 2);
  nquery = size (vquery, 2);
  d = size (vtrain, 1);

  % Load the groundtruth
  ids = ivecs_read (fgroundtruth);
  ids_gnd = ids (1, :) + 1;  % matlab indices start at 1
end
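The fvecs_read calls in the listing above read files in the .fvecs layout, in which each vector is stored as a 4-byte integer dimension followed by that many 4-byte floats. The following is a minimal sketch of that layout, assuming little-endian files. It writes a tiny temporary file and reads it back, and it is an illustration only, not the actual fvecs_read implementation. The names fname, raw, and vread are hypothetical.

```matlab
% Illustrative sketch of the .fvecs layout (assumption: little-endian file).
% Each record is: int32 dimension d, followed by d float32 components.

% Write a tiny temporary file holding two vectors of dimension 3.
fname = [tempname() '.fvecs'];
v = single([1 2 3; 4 5 6]).';          % 3-by-2, one vector per column
fid = fopen(fname, 'w', 'ieee-le');
for j = 1:size(v, 2)
  fwrite(fid, size(v, 1), 'int32');    % dimension header of this record
  fwrite(fid, v(:, j), 'float32');     % vector components
end
fclose(fid);

% Read it back: get d from the first header, then load every record as a
% column of d + 1 four-byte values and drop the header row.
fid = fopen(fname, 'r', 'ieee-le');
d = fread(fid, 1, 'int32');
frewind(fid);
raw = fread(fid, [d + 1, Inf], 'float32=>single');
fclose(fid);
vread = raw(2:end, :);                 % d-by-n, same layout as vbase/vquery
delete(fname);
```

The result has one vector per column, which matches how the listing takes the vector count from size (vtrain, 2) and the dimension from size (vtrain, 1).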

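In the random-data branch, yael_nn is called to compute the ground truth, that is, the exact nearest neighbor in vbase of each query vector.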
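To make the ground truth concrete, here is a minimal brute-force sketch of exact nearest neighbor search under squared Euclidean distance. It is a hedged illustration of what the result represents, not a claim about how yael_nn is implemented. The problem size is reduced, and the planted query and the variable names delta, d2, ids, and dis are assumptions made for this example.

```matlab
% Brute-force exact nearest neighbor search (illustrative sketch only).
d = 16; nbase = 1000; nquery = 5; k = 1;
vbase  = single(rand(d, nbase));       % one vector per column
vquery = single(rand(d, nquery));
vquery(:, 1) = vbase(:, 42);           % planted query: its neighbor must be 42

ids = zeros(k, nquery);                % indices of the k nearest base vectors
dis = zeros(k, nquery, 'single');      % their squared Euclidean distances
for j = 1:nquery
  % Squared distance from query j to every base vector.
  delta = bsxfun(@minus, vbase, vquery(:, j));
  d2 = sum(delta .^ 2, 1);
  % Keep the k smallest distances and the matching 1-based indices.
  [sorted, order] = sort(d2, 'ascend');
  ids(:, j) = order(1:k).';
  dis(:, j) = sorted(1:k).';
end
```

Each query costs a full pass over the base set. That is affordable for building a reference answer once, and it is the cost that approximate methods such as PQ aim to avoid at query time.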

Next, set the parameters used during search:

k = 100; % number of elements to be returned