Using NGT

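PyCharm Professional: see the references on the crack steps and the latest crack patch download. For remote execution, see the reference on configuring a server in PyCharm. The workflow is to open a local file and run it on the server; opening a file that lives on the server and running it directly raises an error. See also the reference on uninstalling PyCharm on Ubuntu.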

Running "make install" during installation fails with:

CMake Error at lib/NGT/cmake_install.cmake:41 (file):
file INSTALL cannot copy file
"/home/che/NGT-1.7.7/build/lib/NGT/libngt.so.1.7.7" to
"/usr/local/lib/libngt.so.1.7.7".
Call Stack (most recent call first):
lib/cmake_install.cmake:42 (include)
cmake_install.cmake:42 (include)

Makefile:73: recipe for target 'install' failed
make: *** [install] Error 1

Running "sudo make install" instead resolves this (see reference). To uninstall a package that was installed with "make install", run (see reference):

sudo xargs rm < install_manifest.txt
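
Before deleting anything, it can help to preview what the manifest lists. The following is a minimal Python sketch, not part of the original post; the helper name list_installed_files is hypothetical. It only reads install_manifest.txt (the file CMake writes into the build directory on install) and reports which of the listed files still exist. It removes nothing.

```python
from pathlib import Path


def list_installed_files(manifest_path):
    """Split the paths in a CMake install manifest into (existing, missing).

    Hypothetical helper for a dry run: nothing is deleted here.
    """
    existing, missing = [], []
    for line in Path(manifest_path).read_text().splitlines():
        path = line.strip()
        if not path:
            continue  # skip blank lines
        if Path(path).exists():
            existing.append(path)
        else:
            missing.append(path)
    return existing, missing
```

Calling list_installed_files("install_manifest.txt") from the build directory returns the files that the xargs command above would remove, plus any entries that are already gone.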

ngtpy test (installed via pip)

Processed 100000 objects. time= 86.0712 (sec)
Processed 200000 objects. time= 151.8 (sec)
Processed 300000 objects. time= 190.317 (sec)
Processed 400000 objects. time= 222.733 (sec)
Processed 500000 objects. time= 246.079 (sec)
Processed 600000 objects. time= 259.662 (sec)
Processed 700000 objects. time= 296.847 (sec)
Processed 800000 objects. time= 304.327 (sec)
Processed 900000 objects. time= 320.594 (sec)
Processed 1000000 objects. time= 340.045 (sec)
batch query took 81.59044170379639 s
recall: 0.99675
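
The recall values in these logs are easier to read with the metric written out. The post does not show how recall was computed, so the sketch below assumes the common definition: for each query, the fraction of the exact top-k neighbor IDs that also appear in the approximate top-k, averaged over all queries. The function name recall_at_k and its arguments are illustrative, not taken from the original script.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Mean fraction of the exact top-k neighbors found by the approximate search.

    approx_ids, exact_ids: one list of neighbor IDs per query, best first.
    Assumed definition; the original post does not state its formula.
    """
    if len(approx_ids) != len(exact_ids):
        raise ValueError("one result list per query is required")
    if not exact_ids:
        raise ValueError("at least one query is required")
    hits = 0
    for approx, exact in zip(approx_ids, exact_ids):
        # Order inside the top-k does not matter, only membership
        hits += len(set(approx[:k]) & set(exact[:k]))
    return hits / (k * len(exact_ids))
```

Here exact_ids would come from an exhaustive search over the same data, and approx_ids from the index under test.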

Threading issue during insertion

ngtpy test (built from source)

Processed 100000 objects. time= 40.9467 (sec)
Processed 200000 objects. time= 76.988 (sec)
Processed 300000 objects. time= 98.5284 (sec)
Processed 400000 objects. time= 114.624 (sec)
Processed 500000 objects. time= 127.351 (sec)
Processed 600000 objects. time= 138.594 (sec)
Processed 700000 objects. time= 154.436 (sec)
Processed 800000 objects. time= 157.033 (sec)
Processed 900000 objects. time= 167.58 (sec)
Processed 1000000 objects. time= 173.845 (sec)

recall: 0.9967400000000001

NGT-ONNG test (problematic)

LeafNode::splitObjects: Anyway, continue...
Processed 700000 objects. time= 3019.6 (sec)
Processed 800000 objects. time= 3481.62 (sec)
Processed 900000 objects. time= 3956.48 (sec)
Processed 1000000 objects. time= 4434.9 (sec)
ngt::reconstructGraph: Extract the graph data.
Reconstruction time=1.215e-06:1.75e-07:6.7e-08
original edge size=10
reverse edge size=120
ngt::Graph reconstruction time=8.0721e-05 (sec) 
GraphReconstructor::adjustPaths: graph preparing time=4.5e-08 (sec)
GraphReconstructor::adjustPaths extracting removed edge candidates time=0.026959 (sec)
ngt::Path adjustment time=0.0271246 (sec) 
adjustSearchEdgeSize::Extract queries for GT...
adjustSearchEdgeSize::create GT...
adjustRateSearchEdgeSize::Base: rate=20
adjustBaseSearchEdgeSize::explore for the mergin 0.2, 4...
Building anng
Processed 100000 objects. time= 25.7082 (sec)
Processed 200000 objects. time= 49.1525 (sec)
Processed 300000 objects. time= 65.6256 (sec)
Processed 400000 objects. time= 77.6595 (sec)
Processed 500000 objects. time= 86.5774 (sec)
Processed 600000 objects. time= 92.2477 (sec)
Processed 700000 objects. time= 102.419 (sec)
Processed 800000 objects. time= 103.627 (sec)
Processed 900000 objects. time= 109.25 (sec)
Processed 1000000 objects. time= 109.926 (sec)
anng build took 834.0635671615601 s

onng build took 691.6314151287079 s
Loading onng took 6.280822038650513 s
batch query took 59.293705701828 s
recall: 0.99747

----------------------------------------------
(epsilon=0.025)
Loading onng took 4.964902877807617 s
batch query took 2.2762231826782227 s
recall: 0.7401800000000001
Loading onng index
Loading onng took 4.6635212898254395 s
batch query took 1.3801333904266357 s
recall: 0.62438
Loading onng index
Loading onng took 4.672779560089111 s
batch query took 50.05926203727722 s
recall: 0.9985599999999999
edge_size_for_creation=100,edge_size_for_search=0
num_of_outgoings = 10, num_of_incomings = 120

anng build took 11936.804404497147 s
onng build took 2468.350889444351 s
Loading onng took 10.152814149856567 s
(epsilon=0.01)
batch query took 5.177316665649414 s
recall: 0.85801
-----------------------------------------
(epsilon=0.1)
batch query took 62.492873668670654 s
recall: 0.99984
-----------------------------------------
(epsilon=0.05)
batch query took 10.901301860809326 s
recall: 0.9917400000000001
-----------------------------------------
(epsilon=0.025)
batch query took 4.264456748962402 s
recall: 0.93662
edge_size_for_creation=100,edge_size_for_search=125
num_of_outgoings = 10, num_of_incomings = 120
(epsilon=0.025)
anng build took 4018.5171251296997 s
onng build took 2036.1997334957123 s
Loading onng took 9.730204582214355 s
batch query took 9.041321992874146 s
recall: 0.9575100000000001
------------------------------------------------------
(epsilon=0.02)
Loading onng took 7.361858606338501 s
batch query took 4.730037212371826 s
recall: 0.93257
------------------------------------------------------
(epsilon=0.016)
Loading onng took 7.09028959274292 s
batch query took 4.464436292648315 s
recall: 0.90762
------------------------------------------------------
(epsilon=0.015)
Loading onng took 6.952389478683472 s
batch query took 4.210986852645874 s
recall: 0.89791
------------------------------------------------------
(epsilon=0.013)
Loading onng took 7.093137502670288 s
batch query took 2.88352632522583 s
recall: 0.8871899999999999
------------------------------------------------------
linear_search
Loading onng took 7.016313076019287 s
batch query took 650.9210848808289 s
recall: 0.9998699999999999
edge_size_for_creation=100,edge_size_for_search=115
num_of_outgoings = 10, num_of_incomings = 120
anng build took 3497.133960723877 s
onng build took 2132.821494102478 s
----------------------------------------------------
(epsilon=0.016)----onngt3c0.016
Loading onng took 9.852200508117676 s
batch query took 6.195500135421753 s
recall: 0.90347
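
The logs above vary epsilon and report the batch query time for each setting, but the query loop itself is not shown in the post. A minimal sketch of such a loop follows, assuming the index object exposes ngtpy's search(query, size=..., epsilon=...) and returns (object_id, distance) pairs; the function name batch_query and the default k are illustrative, not the author's values.

```python
import time


def batch_query(index, queries, k=10, epsilon=0.1):
    """Search every query in turn; return (neighbor ID lists, elapsed seconds).

    index: any object with a search(query, size=..., epsilon=...) method
    that yields (object_id, distance) pairs, as ngtpy.Index is assumed to.
    """
    start = time.time()
    results = []
    for query in queries:
        found = index.search(query, size=k, epsilon=epsilon)
        results.append([object_id for object_id, _distance in found])
    return results, time.time() - start
```

With a real index this might be called as batch_query(ngtpy.Index(b"ngt-log/t6/onng"), queries, k, 0.016) and repeated for each epsilon of interest. A smaller epsilon narrows the explored range, which matches the pattern in the logs: shorter query time and lower recall.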

ONNG incremental insertion test

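An index was built on a base of 100 vectors, then 100 vectors were added at a time until 10,000 vectors had been inserted; total time 2.3858931064605713 s.
avg time: 0.02427 s, min_time: 0.00776, max_time: 0.0478. The per-batch time curve increases overall.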
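
The per-batch timing above can be collected with a loop like the following. This is a sketch rather than the author's script: it assumes the index offers ngtpy's batch_insert(objects, num_threads=...), and the names insert_in_batches and summarize are hypothetical.

```python
import time


def insert_in_batches(index, vectors, batch_size=100, num_threads=24):
    """Insert vectors batch by batch and return each batch's time in seconds.

    vectors: sliceable collection of vectors (for ngtpy, a NumPy array of
    shape (n, dim) is assumed).
    """
    times = []
    for start in range(0, len(vectors), batch_size):
        t = time.time()
        index.batch_insert(vectors[start:start + batch_size], num_threads=num_threads)
        times.append(time.time() - t)
    return times


def summarize(times):
    """Total, average, minimum and maximum of the per-batch times."""
    return {
        "total": sum(times),
        "avg": sum(times) / len(times),
        "min": min(times),
        "max": max(times),
    }
```

Plotting the returned list against the batch number would show the increasing curve described above.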

Parameters used when adding the whole dataset at once:

import time
import ngtpy

# x: array of vectors with shape (n, dim), loaded elsewhere
print('Building anng')
t = time.time()
ngtpy.create(b"ngt-log/t6/anng", dim, edge_size_for_creation=100, edge_size_for_search=125)  # edge_size_for_search=125
index = ngtpy.Index(b"ngt-log/t6/anng")
index.batch_insert(x[:100], num_threads=24)
print('anng build took {0} s'.format(time.time() - t))
index.save()

print('Building onng')
t = time.time()
# Convert the ANNG into an ONNG by adjusting outgoing/incoming edge counts
optimizer = ngtpy.Optimizer()
optimizer.set(num_of_outgoings=10, num_of_incomings=120)
optimizer.execute(b"ngt-log/t6/anng", b"ngt-log/t6/onng")
print('onng build took {0} s'.format(time.time() - t))

print('Loading onng index')
t = time.time()
index = ngtpy.Index(b"ngt-log/t6/onng")
print('Loading onng took {0} s'.format(time.time() - t))

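Building the ONNG from an initial 100 vectors did not succeed (it took too long). With 700 vectors the build succeeded, but searches returned only 84 neighbors. The parameters were: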

print('Building anng')
t = time.time()
ngtpy.create(b"ngt-log/t6/anng", dim, edge_size_for_creation=40, edge_size_for_search=50)  # edge_size_for_search=125
index = ngtpy.Index(b"ngt-log/t6/anng")
index.batch_insert(x[:700], num_threads=24)
print('anng build took {0} s'.format(time.time() - t))
index.save()

print('Building onng')
t = time.time()
optimizer = ngtpy.Optimizer()
optimizer.set(num_of_outgoings=4, num_of_incomings=42)
optimizer.execute(b"ngt-log/t6/anng", b"ngt-log/t6/onng")
print('onng build took {0} s'.format(time.time() - t))
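
t7 test:
Insert 100,000 vectors first, then add 200 at a time, up to 1,000,000 in total
onng build took 397.9275736808777 s
all add took 6948.093120574951 s

Loading onng took 6.954091310501099 s
batch query took 3.543898820877075 s
recall: 0.88956

t8 test:
Insert 100,000 vectors to build the ANNG, build the ONNG, then add 200 at a time up to 1,000,000 in total, and finally rebuild the ONNG
onng build took 623.2219877243042 s
all add took 6754.238663673401 s
ONNG re-optimization took 1927.9305436611176 s

Loading onng took 7.097617864608765 s
batch query took 3.0127980709075928 s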
recall: 0.88956

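Surprisingly, the t8 test showed no improvement at all: rebuilding the ONNG after all insertions had no effect. Incremental and bulk insertion give similar quality (recall differs by 0.018), but incremental insertion takes longer and produces a larger index than bulk insertion.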

ONNG removal test

remove takes an object ID. During removal the following message may appear:

removeEdgesReliably : internal error : cannot find an edge. ID=3 d=4.38431 in 26

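The removal actually succeeds; this message is not a failure. The index is not saved automatically after remove, so it has to be saved manually. Removing 10 objects took 0.0016667842864990234 s. The code is as follows: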

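print('Removing objects')
delete_start = time.time()
for i in range(0, 3):
    print(index.get_object(i))  # show the vector before removing it
    index.remove(i)             # remove takes the object ID
print('remove took {0} s'.format(time.time() - delete_start))
index.save()  # remove does not persist the index; save explicitly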

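GitHub gists may only be reachable from mainland China through a proxy; alternatively, edit the hosts file to reach the site, as follows (see reference):

Open hosts in C:\Windows\System32\drivers\etc and add: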
192.30.253.118 gist.github.com

References: memory 1, memory 2, CPU, CPU computation

Dec/feature_query$ mprof plot mprofile_20190909152840.dat
