The Lucene Index Creation Process

chenqiang_99

Article link: https://blog.csdn.net/chenqiang_99/article/details/48506789

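This article takes a close look at how Lucene creates an index, covering the usage scenario, creating an IndexWriter, building a Document, and the detailed steps of updating a Document. It focuses on the roles of key classes such as DocumentsWriterPerThreadPool and ThreadState, and on why updating a Document in Lucene means deleting it first and then adding it again.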

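This document walks through the overall flow by which Lucene writes business data to disk. It does not cover how each Field in a Document is stored (that part is covered in a separate wiki page).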

1. Lucene's Indexing API

 
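    import java.nio.file.FileSystems;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.NIOFSDirectory;

    // Open the directory that holds the index files.
    Directory dire = NIOFSDirectory.open(FileSystems.getDefault().getPath(indexDirectory));

    IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
    iwc.setRAMBufferSizeMB(64); // flush to disk once the RAM buffer reaches 64 MB

    // indexWriter is a field of the enclosing class.
    indexWriter = new IndexWriter(dire, iwc);

    // createDocument(...) is the application's own helper that builds the Document.
    Document doc = createDocument(artiste, skuId);
    indexWriter.addDocument(doc);

    indexWriter.commit();
    indexWriter.close();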

2. Creating the IndexWriter

NIOFSDirectory.open()

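open() is a static method inherited from FSDirectory, so the call does not necessarily return an NIOFSDirectory. On a 64-bit JRE it returns an MMapDirectory, which reads index files through memory mapping and writes them with ordinary file output.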
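To make the memory-mapping idea concrete, here is a minimal sketch that uses only the JDK and no Lucene classes. It assumes the pattern that MMapDirectory's documentation describes, namely ordinary file output for writing and a memory mapping for reading. The class name MmapReadSketch and its method are hypothetical, chosen only for this illustration, and this is not Lucene's actual implementation.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Toy illustration only (hypothetical class, not Lucene code). */
final class MmapReadSketch {

    /** Writes the text with ordinary file I/O, then reads it back through a read-only memory mapping. */
    static String writeThenMmapRead(Path file, String text) throws IOException {
        // Write path: plain file output, no mapping involved.
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));

        // Read path: map the whole file into memory and decode directly from the mapped buffer.
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer mapped =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            return StandardCharsets.UTF_8.decode(mapped).toString();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mmap-sketch", ".bin");
        try {
            System.out.println(writeThenMmapRead(tmp, "hello segment"));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The point of the split is that reads are served from the operating system's page cache through the mapping, while writes remain simple sequential appends.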

 
IndexWriterConfig

    // properties
    this.analyzer = analyzer;
    ramBufferSizeMB = IndexWriterConfig.DEFAULT_RAM_BUFFER_SIZE_MB; // by default, a flush to disk is triggered once buffered data exceeds 16 MB
    maxBufferedDocs = IndexWriterConfig.DEFAULT_MAX_BUFFERED_DOCS; // by default, flushing is triggered by RAM usage rather than by document count
    maxBufferedDeleteTerms = IndexWriterConfig.DEFAULT_MAX_BUFFERED_DELETE_TERMS;
    mergedSegmentWarmer = null;
    delPolicy = new KeepOnlyLastCommitDeletionPolicy(); // deletion policy: keep only the most recent commit
    commit = null;
    useCompoundFile = IndexWriterConfig.DEFAULT_USE_COMPOUND_FILE_SYSTEM;
    openMode = OpenMode.CREATE_OR_APPEND; // mode in which the IndexWriter opens the index
    similarity = IndexSearcher.getDefaultSimilarity(); // similarity used for scoring; usually set when the Searcher is initialized, since scoring happens mainly at query time
    mergeScheduler = new ConcurrentMergeScheduler(); // each segment merge is handed to its own thread
    writeLockTimeout = IndexWriterConfig.WRITE_LOCK_TIMEOUT; // how long to wait for the write lock before timing out
    indexingChain = DocumentsWriterPerThread.defaultIndexingChain;
    codec = Codec.getDefault();
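The interaction between ramBufferSizeMB and maxBufferedDocs can be sketched as a small toy model. It assumes that either limit triggers a flush when reached and that a value of -1 (the DISABLE_AUTO_FLUSH constant in IndexWriterConfig) turns a limit off. The class FlushTriggerSketch and its method are hypothetical and only mirror the idea; Lucene's real flush policy involves more state than this.

```java
/** Toy model of the flush trigger (hypothetical class, not Lucene code). */
final class FlushTriggerSketch {

    /** Mirrors the meaning of IndexWriterConfig.DISABLE_AUTO_FLUSH: this limit is switched off. */
    static final int DISABLE_AUTO_FLUSH = -1;

    private final double ramBufferSizeMB;
    private final int maxBufferedDocs;

    FlushTriggerSketch(double ramBufferSizeMB, int maxBufferedDocs) {
        this.ramBufferSizeMB = ramBufferSizeMB;
        this.maxBufferedDocs = maxBufferedDocs;
    }

    /** Returns true when either enabled limit has been reached. */
    boolean shouldFlush(long bufferedBytes, int bufferedDocs) {
        boolean ramReached = ramBufferSizeMB != DISABLE_AUTO_FLUSH
                && bufferedBytes >= (long) (ramBufferSizeMB * 1024 * 1024);
        boolean docsReached = maxBufferedDocs != DISABLE_AUTO_FLUSH
                && bufferedDocs >= maxBufferedDocs;
        return ramReached || docsReached;
    }

    public static void main(String[] args) {
        // Defaults described above: 16 MB RAM limit, document-count limit disabled.
        FlushTriggerSketch defaults = new FlushTriggerSketch(16.0, DISABLE_AUTO_FLUSH);
        System.out.println(defaults.shouldFlush(16L * 1024 * 1024, 10));
        System.out.println(defaults.shouldFlush(1024, 1_000_000));
    }
}
```

With the default settings only the amount of buffered data matters, which is why raising the limit with setRAMBufferSizeMB(64), as in section 1, leads to fewer and larger flushes.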