Tracing the call path of a single LIRE image search

Latest recommended article published 2021-02-13 11:28:51

SoftWare2589

Category: Images. Tags: lire, image search, source code

Article link: https://blog.csdn.net/SoftWare2589/article/details/37763757


Let's start with a test method.

	@Test
	public void searchSimilar() throws Exception {
		IndexReader ir = IndexReader.open(FSDirectory.open(new File(INDEX_PATH)));// open the index
		ImageSearcher is = ImageSearcherFactory.createDefaultSearcher();// create an image searcher
		FileInputStream fis = new FileInputStream(SEARCH_FILE);// the query image
		BufferedImage bi = ImageIO.read(fis);
		ImageSearchHits ish = is.search(bi, ir);// search for images similar to the query image
		for (int i = 0; i < Math.min(10, ish.length()); i++) {// print the top 10 hits (sorted by score)
			System.out.println(ish.score(i) + ": " + ish.doc(i).getFieldable(DocumentBuilder.FIELD_NAME_IDENTIFIER).stringValue());
		}
		System.out.println("separator*****************");
		Document d = ish.doc(0);// the best-matching hit
		ish = is.search(d, ir);// search again, using that hit as the query
		for (int i = 0; i < Math.min(4, ish.length()); i++) {
			System.out.println(ish.score(i) + ": " + ish.doc(i).getFieldable(DocumentBuilder.FIELD_NAME_IDENTIFIER).stringValue());
		}
	}

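Hovering over search in is.search(bi, ir) shows that the call resolves to search(BufferedImage arg0, IndexReader arg1). When I tried to jump to its implementation, though, the IDE offered many candidates, as shown in the screenshot.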

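Which one is actually used? Look at the line of the test that creates the searcher:

ImageSearcher is = ImageSearcherFactory.createDefaultSearcher();// create an image searcher

It calls createDefaultSearcher(). In the LIRE source code that method looks like this: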

    /**
     * Returns a new default ImageSearcher with a predefined number of maximum
     * hits defined in the {@link ImageSearcherFactory#NUM_MAX_HITS} based on the {@link net.semanticmetadata.lire.imageanalysis.CEDD} feature
     *
     * @return the searcher instance
     */
    public static ImageSearcher createDefaultSearcher() {
        return new GenericFastImageSearcher(NUM_MAX_HITS, CEDD.class, DocumentBuilder.FIELD_NAME_CEDD);
    }
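When an interface method has many implementations, a quick way to see which one a factory hands back is to print the runtime class of the returned object. The sketch below shows the idea with hypothetical stand-in types (Searcher, FastSearcher, SlowSearcher and this createDefaultSearcher are made up for illustration, they are not the LIRE classes).

```java
public class RuntimeTypeDemo {
    // Hypothetical stand-in for an interface with several implementations.
    public interface Searcher {
        String search(String query);
    }

    public static class FastSearcher implements Searcher {
        public String search(String query) { return "fast:" + query; }
    }

    public static class SlowSearcher implements Searcher {
        public String search(String query) { return "slow:" + query; }
    }

    // Hypothetical factory: the declared return type is the interface,
    // but the concrete class is chosen here.
    public static Searcher createDefaultSearcher() {
        return new FastSearcher();
    }

    // Name of the concrete class behind an interface reference.
    public static String implementationName(Searcher s) {
        return s.getClass().getSimpleName();
    }

    public static void main(String[] args) {
        Searcher s = createDefaultSearcher();
        System.out.println(implementationName(s));
        System.out.println(s.search("cat"));
    }
}
```

Applied to the test above, the equivalent would presumably be printing is.getClass().getName() right after the searcher is created.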

So createDefaultSearcher() returns a GenericFastImageSearcher. Jumping to that class and finding the matching search method gives the following code:

    public ImageSearchHits search(BufferedImage image, IndexReader reader) throws IOException {
        logger.finer("Starting extraction.");
        LireFeature lireFeature = null;
        SimpleImageSearchHits searchHits = null;
        try {
            lireFeature = (LireFeature) descriptorClass.newInstance();
            // Scaling image is especially with the correlogram features very important!
            BufferedImage bimg = image;
            if (Math.max(image.getHeight(), image.getWidth()) > GenericDocumentBuilder.MAX_IMAGE_DIMENSION) {
                bimg = ImageUtils.scaleImage(image, GenericDocumentBuilder.MAX_IMAGE_DIMENSION);
            }
            lireFeature.extract(bimg);
            logger.fine("Extraction from image finished");

            float maxDistance = findSimilar(reader, lireFeature);
            searchHits = new SimpleImageSearchHits(this.docs, maxDistance);
        } catch (InstantiationException e) {
            logger.log(Level.SEVERE, "Error instantiating class for generic image searcher: " + e.getMessage());
        } catch (IllegalAccessException e) {
            logger.log(Level.SEVERE, "Error instantiating class for generic image searcher: " + e.getMessage());
        }
        return searchHits;
    }

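The if in the code above handles oversized input: when the longer side of the image exceeds GenericDocumentBuilder.MAX_IMAGE_DIMENSION (1024 by default), the image is scaled down.

Next, the extract method extracts the feature values of the image.

Then findSimilar looks for similar images in the index.

After that, a new SimpleImageSearchHits is created to hold the results.

Finally, that result object is returned.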
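The scaling step described above can be sketched with plain Java 2D. This is a minimal illustration and not the code of LIRE's ImageUtils.scaleImage; the class name ScaleDemo, the method scaleToFit and the bilinear interpolation hint are assumptions made for the example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

public class ScaleDemo {
    // Assumed limit, mirroring the default of 1024 mentioned in the text.
    static final int MAX_DIMENSION = 1024;

    // Returns the image unchanged if its longer side fits within maxDim,
    // otherwise a copy scaled down so that the longer side equals maxDim.
    public static BufferedImage scaleToFit(BufferedImage image, int maxDim) {
        int longer = Math.max(image.getWidth(), image.getHeight());
        if (longer <= maxDim) {
            return image;
        }
        double factor = (double) maxDim / longer;
        int w = Math.max(1, (int) Math.round(image.getWidth() * factor));
        int h = Math.max(1, (int) Math.round(image.getHeight() * factor));
        BufferedImage scaled = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = scaled.createGraphics();
        g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                RenderingHints.VALUE_INTERPOLATION_BILINEAR);
        g.drawImage(image, 0, 0, w, h, null);
        g.dispose();
        return scaled;
    }

    public static void main(String[] args) {
        BufferedImage big = new BufferedImage(2048, 1024, BufferedImage.TYPE_INT_RGB);
        BufferedImage small = scaleToFit(big, MAX_DIMENSION);
        System.out.println(small.getWidth() + "x" + small.getHeight());
    }
}
```

Scaling by the longer side keeps the aspect ratio, so a 2048x1024 input comes out as 1024x512.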

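As for the extract method, the one I looked at is CEDD's. It is a series of fairly complex algorithms; if you want the details, search for the CEDD algorithm.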

Below is the code of the findSimilar() method:

    /**
     * @param reader
     * @param lireFeature
     * @return the maximum distance found for normalizing.
     * @throws java.io.IOException
     */
    protected float findSimilar(IndexReader reader, LireFeature lireFeature) throws IOException {
        float maxDistance = -1f, overallMaxDistance = -1f;
        boolean hasDeletions = reader.hasDeletions();

        // clear result set ...
        docs.clear();

        int docs = reader.numDocs();
        for (int i = 0; i < docs; i++) {
            // bugfix by Roman Kern
            if (hasDeletions && reader.isDeleted(i)) {
                continue;
            }

            Document d = reader.document(i);
            float distance = getDistance(d, lireFeature);
            assert (distance >= 0);
            // calculate the overall max distance to normalize score afterwards
            if (overallMaxDistance < distance) {
                overallMaxDistance = distance;
            }
            // if it is the first document:
            if (maxDistance < 0) {
                maxDistance = distance;
            }
            // if the array is not full yet:
            if (this.docs.size() < maxHits) {
                this.docs.add(new SimpleResult(distance, d));
                if (distance > maxDistance) maxDistance = distance;
            } else if (distance < maxDistance) {
                // if it is nearer to the sample than at least on of the current set:
                // remove the last one ...
                this.docs.remove(this.docs.last());
                // add the new one ...
                this.docs.add(new SimpleResult(distance, d));
                // and set our new distance border ...
                maxDistance = this.docs.last().getDistance();
            }
        }
        return maxDistance;
    }

According to the Lucene API documentation, hasDeletions() "Returns true if any documents have been deleted", which is why deleted documents are skipped inside the loop.
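The core of findSimilar is a bounded, sorted result set: keep at most maxHits entries and replace the current worst one whenever a closer document shows up. A minimal self-contained sketch of that pattern follows; Hit, TopKDemo and nearest are hypothetical names, and the tie-break on document id is my own addition so that equal distances are not collapsed by the TreeSet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class TopKDemo {
    // Hypothetical result type: a distance plus a document id.
    static final class Hit implements Comparable<Hit> {
        final float distance;
        final int docId;

        Hit(float distance, int docId) {
            this.distance = distance;
            this.docId = docId;
        }

        // Order by distance first, then by id, so ties stay distinct entries.
        public int compareTo(Hit other) {
            int c = Float.compare(distance, other.distance);
            return c != 0 ? c : Integer.compare(docId, other.docId);
        }
    }

    // Returns the ids of the maxHits smallest distances, nearest first.
    public static List<Integer> nearest(float[] distances, int maxHits) {
        List<Integer> ids = new ArrayList<Integer>();
        if (maxHits <= 0) {
            return ids;
        }
        TreeSet<Hit> best = new TreeSet<Hit>();
        for (int i = 0; i < distances.length; i++) {
            if (best.size() < maxHits) {
                // The set is not full yet: always keep the candidate.
                best.add(new Hit(distances[i], i));
            } else if (distances[i] < best.last().distance) {
                // Closer than the current worst hit: swap it in.
                best.pollLast();
                best.add(new Hit(distances[i], i));
            }
        }
        for (Hit h : best) {
            ids.add(h.docId);
        }
        return ids;
    }

    public static void main(String[] args) {
        float[] d = {0.9f, 0.1f, 0.5f, 0.3f, 0.7f};
        System.out.println(nearest(d, 3));
    }
}
```

Every document in the index is still visited once, so the cost grows linearly with the index size; the sorted set only keeps the bookkeeping of the best hits cheap.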

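Inside the loop, getDistance() is called to obtain the distance between the query feature and the feature stored in each document. Its code is: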

    /**
     * Main similarity method called for each and every document in the index.
     *
     * @param document
     * @param lireFeature
     * @return the distance between the given feature and the feature stored in the document.
     */
    protected float getDistance(Document document, LireFeature lireFeature) {
        tempBinaryValue = document.getBinaryValue(fieldName);
        if (tempBinaryValue != null && tempBinaryValue.length > 0) {
            cachedInstance.setByteArrayRepresentation(tempBinaryValue);
            return lireFeature.getDistance(cachedInstance);
        } else {
            logger.warning("No feature stored in this document! (" + descriptorClass.getName() + ")");
        }
        return 0f;
    }

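This in turn calls another getDistance(), this time on the feature object. From here you can jump straight to that method in the CEDD class, which computes the value.

Back in findSimilar(), these distances are compared to decide which documents stay in the result set.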
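I did not dig into the CEDD distance itself. CEDD is commonly described as being compared with the Tanimoto coefficient, so the sketch below assumes that measure and turns it into a distance on a 0 to 100 scale. The scaling, the handling of all-zero vectors and the names TanimotoDemo and distance are assumptions for illustration, not a copy of the LIRE code.

```java
public class TanimotoDemo {
    // Tanimoto coefficient T(a, b) = a.b / (a.a + b.b - a.b),
    // converted to a distance as 100 - 100 * T (assumed scaling).
    public static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        double ab = 0, aa = 0, bb = 0;
        for (int i = 0; i < a.length; i++) {
            ab += a[i] * b[i];
            aa += a[i] * a[i];
            bb += b[i] * b[i];
        }
        double denom = aa + bb - ab;
        if (denom == 0) {
            // Both vectors are all zeros: treat them as identical (assumption).
            return 0;
        }
        return 100.0 - 100.0 * ab / denom;
    }

    public static void main(String[] args) {
        System.out.println(distance(new double[]{1, 2, 3}, new double[]{1, 2, 3}));
        System.out.println(distance(new double[]{1, 0}, new double[]{0, 1}));
    }
}
```

With this definition identical histograms are at distance 0 and histograms with no overlap are at distance 100.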

Back in search(), the result is wrapped using the SimpleImageSearchHits constructor, whose code is:

    public SimpleImageSearchHits(Collection<SimpleResult> results, float maxDistance) {
        this.results = new ArrayList<SimpleResult>(results.size());
        this.results.addAll(results);
        // this step normalizes and inverts the distance ...
        // although its now a score or similarity like measure its further called distance
        for (Iterator<SimpleResult> iterator = this.results.iterator(); iterator.hasNext(); ) {
            SimpleResult result = iterator.next();
            result.setDistance(1f - result.getDistance() / maxDistance);
        }
    }
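The loop in this constructor turns each distance into a score with 1 - distance / maxDistance. A small standalone sketch of that arithmetic (ScoreDemo and toScores are hypothetical names):

```java
public class ScoreDemo {
    // Converts distances to scores via 1 - d / maxDistance,
    // the same formula the constructor above applies to each result.
    public static float[] toScores(float[] distances, float maxDistance) {
        float[] scores = new float[distances.length];
        for (int i = 0; i < distances.length; i++) {
            scores[i] = 1f - distances[i] / maxDistance;
        }
        return scores;
    }

    public static void main(String[] args) {
        float[] scores = toScores(new float[]{0f, 2f, 4f}, 4f);
        for (float s : scores) {
            System.out.println(s);
        }
    }
}
```

Two things follow from the formula. A hit whose distance equals maxDistance gets a score of 0, and since findSimilar returns the largest distance among the retained hits, that appears to be the last hit in the list. Also, if maxDistance is 0 (every retained hit identical to the query), the float division 0/0 produces NaN rather than 1.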

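That completes one search call. There is a lot here I did not fully understand; I mainly wanted to walk through the whole flow first. Pointers from anyone who knows this code better are welcome.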