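Similarities and differences between k-NN and retrieval tasks
I recently came across k-NN and noticed that the algorithm closely resembles a retrieval task. Taking k-NN for image classification as an example, the correspondence is as follows: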
k-NN | Retrieval |
---|---|
Image | Query set |
Label | Database (the set being searched) |
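What the two have in common:
- Training phase: images (query samples) and labels (retrieved samples) always come in pairs.
- Test phase: each query sample (image) must have its distance computed against every sample in the database (the training-set images).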
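The difference: at test time, a retrieval task returns N distinct results, whereas the N results k-NN returns may repeat, since several nearest neighbors can share the same label. That is why k-NN needs an extra voting step.
Takeaway: could retrieval tasks also make use of this voting idea?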
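The correspondence described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the function names `retrieve` and `knn_predict`, the toy data, and the choice of Euclidean distance are all assumptions made for the example. The retrieval step returns N distinct database indices, and k-NN reuses that step and then votes over the labels paired with those indices.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Euclidean distance between two equal-length vectors (assumed metric).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def retrieve(query, database, n):
    # Retrieval view: rank every database item by distance to the query
    # and return the indices of the n closest ones (all distinct).
    order = sorted(range(len(database)),
                   key=lambda i: euclidean(query, database[i]))
    return order[:n]


def knn_predict(query, train_x, train_y, n):
    # k-NN view: run the same retrieval, then vote over the labels
    # paired with the retrieved training samples.
    neighbours = retrieve(query, train_x, n)
    votes = Counter(train_y[i] for i in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 2-D "image features" paired with labels.
train_x = [(0.4, 0.4), (0.7, 0.6), (0.2, 0.5), (1.0, 1.0), (0.0, 0.1)]
train_y = ["cat", "dog", "cat", "dog", "cat"]
query = (0.5, 0.5)

top3 = retrieve(query, train_x, 3)                # distinct indices
label = knn_predict(query, train_x, train_y, 3)   # labels may repeat, so vote
print(top3, [train_y[i] for i in top3], label)
```

The retrieved indices never repeat, but the labels attached to them can, and that repetition is exactly what the vote resolves.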
Converting Jupyter notebooks to PDF
The converted PDF is convenient to print, which makes this script quite handy.
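import argparse
import os
import subprocess

try:
    # PyPDF2 is optional; without it the per-notebook PDFs are left unmerged.
    try:
        # Newer PyPDF2 releases name the class PdfMerger.
        from PyPDF2 import PdfMerger as PdfFileMerger
    except ImportError:
        # Older PyPDF2 releases only provide PdfFileMerger.
        from PyPDF2 import PdfFileMerger
    MERGE = True
except ImportError:
    print("Could not find PyPDF2. Leaving pdf files unmerged.")
    MERGE = False
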
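def main(files, pdf_name):
    # Base nbconvert command; one notebook path is added per call below.
    os_args = [
        "jupyter",
        "nbconvert",
        "--log-level",
        "CRITICAL",
        "--to",
        "pdf",
    ]
    for f in files:
        # Convert one notebook per call so earlier notebooks are not
        # converted again on every iteration.
        subprocess.run(os_args + [f])
        print("Created PDF {}.".format(f))
    if MERGE:
        # Each PDF is expected next to its notebook, with a .pdf extension.
        # splitext also handles paths that contain extra dots.
        pdfs = [os.path.splitext(f)[0] + ".pdf" for f in files]
        merger = PdfFileMerger()
        for pdf in pdfs:
            merger.append(pdf)
        merger.write(pdf_name)
        merger.close()
        # Remove the intermediate per-notebook PDFs once merged.
        for pdf in pdfs:
            os.remove(pdf)
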
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # We pass in an explicit notebook arg so that we can provide an ordered list
    # and produce an ordered PDF.
    parser.add_argument("--notebooks", type=str, nargs="+", required=True)
    parser.add_argument("--pdf_filename", type=str, required=True)
    args = parser.parse_args()
    main(args.notebooks, args.pdf_filename)