Installation
- pip install pyflann
- If you are using Python 3 and the installation fails, convert the installed package sources with 2to3:
- sudo 2to3 -w [pyflann directory in dist-packages]
Usage
Build a FLANN index and query it
# Python 3
import pyflann
import pickle
import numpy as np
# Generate random training and test data
train_n = 100000
test_n = 500
feature_number = 500
train_data = np.random.rand(train_n, feature_number)
test_data = np.random.rand(test_n, feature_number)
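# Set the distance type
pyflann.set_distance_type("euclidean")
# Create the FLANN object
flann = pyflann.FLANN()
# Build the index (hierarchical k-means tree); the returned params hold the search settings
# Note: according to the FLANN manual [1], target_precision only takes effect with algorithm='autotuned'
branching = 10
params = flann.build_index(train_data, algorithm='kmeans', target_precision=0.9, branching=branching, log_level='info')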
# Query the index
top_k_results = 20
# sims: 500 x 20; row i holds the indices of the top-k nearest training points for test sample i
# dists: 500 x 20; entry (i, k) is the distance between test sample i and its k-th nearest neighbour
sims, dists = flann.nn_index(test_data, top_k_results, checks=params['checks'])
# Save the search parameters and the index
with open('params.pk', 'wb') as f:
    pickle.dump(params, f)
flann.save_index(b'flann_index')
# Or
# flann_filename = 'flann_index'
# flann.save_index(bytes(flann_filename, encoding='utf8'))
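Because FLANN returns approximate neighbours, it can be useful to compare a small sample of results against an exact brute-force search. The following is a minimal NumPy-only sketch, not part of pyflann: `exact_knn` and `recall_at_k` are hypothetical helper names, and the data sizes are reduced so the brute-force step stays cheap.

```python
import numpy as np

def exact_knn(train, queries, k):
    """Exact k nearest neighbours by brute force (squared Euclidean distance)."""
    # ||q - t||^2 = ||q||^2 - 2 q.t + ||t||^2, computed for all pairs at once
    d2 = (
        (queries ** 2).sum(axis=1)[:, None]
        - 2.0 * queries @ train.T
        + (train ** 2).sum(axis=1)[None, :]
    )
    idx = np.argsort(d2, axis=1)[:, :k]
    return idx, np.take_along_axis(d2, idx, axis=1)

def recall_at_k(approx_idx, exact_idx):
    """Fraction of exact neighbours that also appear in the approximate result."""
    hits = sum(
        len(set(a.tolist()) & set(e.tolist()))
        for a, e in zip(approx_idx, exact_idx)
    )
    return hits / exact_idx.size

rng = np.random.default_rng(0)
train = rng.random((2000, 32))
queries = rng.random((50, 32))
exact_idx, exact_d2 = exact_knn(train, queries, k=20)

# Sanity check of the helper itself; in practice pass the `sims` array
# returned by flann.nn_index as the first argument instead of exact_idx.
print(recall_at_k(exact_idx, exact_idx))
```

A recall well below the precision you were aiming for suggests raising `checks` at query time, at the cost of slower searches.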
Load a previously built FLANN index and use it
import pickle
import pyflann
import numpy as np
# Load the same train_data that the index was built from
train_data = np.load(...)
# Load the FLANN index from file
pyflann.set_distance_type("euclidean")
flann = pyflann.FLANN()
with open('params.pk', 'rb') as f:
    params = pickle.load(f)
flann.load_index(b'flann_index', train_data)
# Or
# flann_filename = 'flann_index'
# flann.load_index(bytes(flann_filename, encoding='utf8'), train_data)
# Query the loaded index
newData = np.random.rand(200,500)
topk = 20
sims, dists = flann.nn_index(newData, topk, checks=params['checks'])
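Since `load_index` takes the original training array as an argument, that array has to be stored alongside the index file. One way to do this, sketched here with NumPy's own `.npy` format, is shown below; the file name `train_data.npy` and the temporary directory are assumptions made for illustration, not something the original workflow prescribes.

```python
import os
import tempfile
import numpy as np

train_data = np.random.rand(1000, 50)

# Hypothetical file name; keep it next to flann_index and params.pk
path = os.path.join(tempfile.mkdtemp(), "train_data.npy")
np.save(path, train_data)

restored = np.load(path)
# The restored array must be identical to the one the index was built from
print(restored.shape, np.array_equal(restored, train_data))
```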
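Common errors
1. ImportError: Cannot load dynamic library. Did you compile FLANN?
Cause: the pyflann package installed by pip is missing its lib folder.
Fix: locate the pyflann install directory (see the commands below), download the lib folder from 【2】, and place it in that directory.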
python3 -m pip show pyflann
# or
python3 -m pip show flann
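The same check can be done from Python. The sketch below uses the standard library's `importlib.util.find_spec` to locate the package directory without importing it, then reports whether a `lib` subfolder exists; `find_package_dir` is a hypothetical helper name, and the assumption that the folder is called `lib` comes from the error description above.

```python
import importlib.util
import os

def find_package_dir(name):
    """Return the install directory of a package, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return None
    return list(spec.submodule_search_locations)[0]

pkg_dir = find_package_dir("pyflann")
if pkg_dir is None:
    print("pyflann is not installed in this environment")
else:
    lib_dir = os.path.join(pkg_dir, "lib")
    print("pyflann directory:", pkg_dir)
    print("lib folder present:", os.path.isdir(lib_dir))
```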
REFERENCE:
【1】https://www.cs.ubc.ca/research/flann/uploads/FLANN/flann_manual-1.8.4.pdf