A Walkthrough of the Solr Index-Creation Source Code

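First, an overview of the classes involved in creating an index

      Source classes:
          1. CommonsHttpSolrServer (subclass of SolrServer)
          2. SolrServer (abstract class)
          3. SolrRequest (base class)
          4. AbstractUpdateRequest (abstract class, subclass of SolrRequest)
          5. UpdateRequest (subclass of AbstractUpdateRequest)
          6. SolrInputDocument (holds the names and values of the fields to index; logically this one comes first)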


Index-creation code:


  1. Query the database (or any other document source) for the data to index
    private void updateBook(String sql, String url, String idColumn,
            String timeColumn, BufferedWriter dataFile) throws Exception {
        long start = System.currentTimeMillis();
        SolrUtil solrUtil = new SolrUtil(url); // initialize the Solr client
        SolrDocument doc = SqlSh.getSolrMaxDoc(solrUtil, idColumn, timeColumn);
        if (doc == null) {
            CommonLogger.getLogger().error("solr no data.");
            return;
        }
        int maxId = Integer.parseInt(doc.get(idColumn).toString());
        long maxTime = Long.parseLong(doc.get(timeColumn).toString()) * 1000;
        Date maxDate = new Date(maxTime);

        DateFormat dateFormat2 = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");
        // fetch the rows that need to be indexed from the database
        ResultSet rs = stmt_m.executeQuery(String.format(sql,
                dateFormat2.format(maxDate)));
        try {
            // collect the column names that become the index field names
            initColumeMeta(rs.getMetaData());

            // parse the rows and index them
            parseRs(rs, solrUtil);
        } finally {
            rs.close(); // close the result set even if indexing fails
        }

        // optimize the index
        solrUtil.server.optimize();

        CommonLogger.getLogger().info(
                "update book time:" + (System.currentTimeMillis() - start)
                        / 1000 + "s");
    }


  2. Now the parseRs method called in the code above
    // A simple method that parses the rows and writes them to the index

    private void parseRs(ResultSet rs, SolrUtil solrUtil) throws Exception {
        Collection<SolrInputDocument> docs = new ArrayList<SolrInputDocument>();
        try {
            while (rs.next()) {
                SolrInputDocument doc = new SolrInputDocument();
                for (int i = 0; i < ToolMain.columnNames.length; i++) {
                    // addField sets one field; an extra boost argument can be passed to weight it
                    doc.addField(
                        ToolMain.columnNames[i],
                        getColumnValue(
                            rs.getObject(ToolMain.columnNames[i]),
                            ToolMain.columnTypes[i]));
                }
                docs.add(doc);
                if (docs.size() >= 1000) {
                    solrUtil.addDocList(docs); // both the add and the commit happen in here
                    docs.clear();
                }
            }
            // send the final, partially filled batch
            if (docs.size() > 0) {
                solrUtil.addDocList(docs);
            }
        } finally {
            docs.clear();
        }
    }
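
The batching pattern in parseRs (flush every 1000 documents, then flush whatever is left after the loop) can be looked at separately from SolrJ. The following is only a minimal sketch under that assumption: `BatchBuffer` and `BatchBufferDemo` are hypothetical names that do not appear in the original code, and the sink stands in for a call such as `solrUtil.addDocList(docs)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Collects items and hands them to a sink in fixed-size batches (hypothetical helper). */
final class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // stands in for docs -> solrUtil.addDocList(docs)
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    /** Buffers one item and flushes as soon as a full batch is available. */
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Sends whatever is buffered; call once after the loop to cover the remainder. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending)); // copy, so the sink may keep the list
        pending.clear();
    }
}

final class BatchBufferDemo {
    public static void main(String[] args) {
        List<Integer> batchSizes = new ArrayList<>();
        BatchBuffer<String> buffer =
                new BatchBuffer<>(1000, batch -> batchSizes.add(batch.size()));
        for (int i = 0; i < 2500; i++) {
            buffer.add("row-" + i);
        }
        buffer.flush(); // the last 500 rows would be lost without this call
        System.out.println(batchSizes);
    }
}
```

Keeping the remainder flush in one place makes it harder to forget the last partial batch, which is the part of the original loop that is easiest to get wrong.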

    3. Next, the SolrUtil class, which is mainly a wrapper around CommonsHttpSolrServer
    import java.util.Collection;
    
    import log.CommonLogger;
    
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;
    
    public class SolrUtil {
    	public CommonsHttpSolrServer server = null;
    
    	public String url = ""; // URL of the Solr service
    	public  String shards = "";
    
    	public SolrUtil(String url) {
    		this.url = url;
    		initSolr();
    	}
    	public SolrUtil(String url,String shards) {
    		this.url = url;
    		this.shards=shards;
    		initSolr();
    	}
    	// initialize the server
    	private void initSolr() {
    		try {
    			server = new CommonsHttpSolrServer(url);
    			server.setSoTimeout(60*1000);
    			server.setConnectionTimeout(60*1000);
    			server.setDefaultMaxConnectionsPerHost(1000);
    			server.setMaxTotalConnections(1000);
    			server.setFollowRedirects(false);
    			server.setAllowCompression(true);
    		} catch (Exception e) {
    			e.printStackTrace();
    			System.exit(-1);
    		}
    	}
    	// wraps add and commit
    	public void addDocList(Collection<SolrInputDocument> docs) {
    		try {
    			server.add(docs);
    			server.commit();
    			docs.clear(); // release the batch
    		} catch (Exception e) {
    			CommonLogger.getLogger().error("addDocList error.", e);
    		}
    	}
    	
    	public void deleteDocByQuery(String query) throws Exception { 
    		try {
    			server.deleteByQuery(query);
    			server.commit();
    		} catch (Exception e) {
    			CommonLogger.getLogger().error("deleteDocByQuery error.", e);
    			throw e;
    		}
    	}
    }

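    4. Now for the Solr source code behind index creation

        What the source does boils down to building a request and returning a response.

        1. What the SolrInputDocument class used in the code above does

        public class SolrInputDocument implements Map<String,SolrInputField>, Iterable<SolrInputField>, Serializable   // implements the Map and Iterable interfaces and their methods; the class that does the real work is SolrInputField

        public class SolrInputField implements Iterable<Object>, Serializable // holds only three things: a String key (the field name), an Object value, and a boost, float boost = 1.0f, which defaults to 1.0f (set it if you want to weight the field)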
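
To make the "map of field name to field object" idea concrete, here is a small sketch that uses only the JDK. `SketchField` and `SketchDocument` are hypothetical stand-ins, not the SolrJ classes, and the way repeated boosts are combined (multiplied) is an assumption of this sketch rather than a statement about the real implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified stand-in for SolrInputField: a name, its value(s), and a boost. */
final class SketchField {
    final String name;
    final List<Object> values = new ArrayList<>();
    float boost = 1.0f; // default weight, as in the description above

    SketchField(String name) {
        this.name = name;
    }
}

/** Simplified stand-in for SolrInputDocument: field name -> field, in insertion order. */
final class SketchDocument {
    private final Map<String, SketchField> fields = new LinkedHashMap<>();

    /** Adds a value with the default boost. */
    void addField(String name, Object value) {
        addField(name, value, 1.0f);
    }

    /** Adds a value; values added under the same name accumulate in one field. */
    void addField(String name, Object value, float boost) {
        SketchField field = fields.get(name);
        if (field == null) {
            field = new SketchField(name);
            fields.put(name, field);
        }
        field.values.add(value);
        field.boost *= boost; // assumption of this sketch: boosts on one field multiply
    }

    SketchField get(String name) {
        return fields.get(name);
    }

    List<String> fieldNames() {
        return new ArrayList<>(fields.keySet());
    }
}

final class SketchDocumentDemo {
    public static void main(String[] args) {
        SketchDocument doc = new SketchDocument();
        doc.addField("id", 42);
        doc.addField("title", "Solr in practice", 2.0f); // weighted field
        doc.addField("tag", "search");
        doc.addField("tag", "lucene"); // second value for the same field
        for (String name : doc.fieldNames()) {
            SketchField f = doc.get(name);
            System.out.println(f.name + " " + f.values + " boost=" + f.boost);
        }
    }
}
```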
    

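    Next, what CommonsHttpSolrServer does when these calls run (this is what addDocList in SolrUtil ends up invoking)

        2. The add-documents method (inherited from SolrServer)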

    public UpdateResponse add(Collection<SolrInputDocument> docs) throws SolrServerException, IOException {
        UpdateRequest req = new UpdateRequest(); // create a request
        req.add(docs);            // UpdateRequest.add buffers the documents to index
        return req.process(this); // the key call: it sends the request and returns the response
    }

    // Now the add method of UpdateRequest
    private List<SolrInputDocument> documents = null;
    public UpdateRequest add( final Collection<SolrInputDocument> docs )
    {
        if( documents == null ) {
            documents = new ArrayList<SolrInputDocument>( docs.size()+1 );
        }
        documents.addAll( docs );
        return this;
    }

     3. The commit method, which is the one defined in the SolrServer class
     public UpdateResponse commit( boolean waitFlush, boolean waitSearcher ) throws SolrServerException, IOException {
            // setAction only assigns values to a ModifiableSolrParams object, which is consumed
            // later in the request method of CommonsHttpSolrServer.
            // Committing the index also goes through process().
            return new UpdateRequest().setAction( UpdateRequest.ACTION.COMMIT, waitFlush, waitSearcher ).process( this );
      }

    4. Optimizing the index
       public UpdateResponse optimize(boolean waitFlush, boolean waitSearcher, int maxSegments ) throws SolrServerException, IOException {
                // Also goes through process(). setAction only assigns values to a ModifiableSolrParams object,
                // which is consumed later in the request() method of CommonsHttpSolrServer. The client just sends
                // these parameters; the merging and compacting of index segments is done by the Solr server.
                return new UpdateRequest().setAction( UpdateRequest.ACTION.OPTIMIZE, waitFlush, waitSearcher, maxSegments ).process( this );
       }
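
The parameters that setAction stores on the request are not sent as they are. In the request method shown later in this post they are wrapped twice with DefaultSolrParams, where the first argument wins and the second acts as the fallback: first under the parser's `wt` and `version`, then under any invariant parameters. The sketch below imitates that precedence with plain maps. `ParamLayers` is a hypothetical helper and the parameter values are purely illustrative.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ParamLayers {
    /** Returns a merged copy in which entries of 'primary' override those of 'defaults'. */
    static Map<String, String> layer(Map<String, String> primary, Map<String, String> defaults) {
        Map<String, String> merged = new LinkedHashMap<>(defaults);
        merged.putAll(primary);
        return merged;
    }

    public static void main(String[] args) {
        // Parameters set on the request, for example by a commit action (illustrative values).
        Map<String, String> requestParams = new LinkedHashMap<>();
        requestParams.put("commit", "true");
        requestParams.put("wt", "json"); // overridden below by the parser's writer type

        // What the response parser insists on (illustrative values).
        Map<String, String> parserParams = new LinkedHashMap<>();
        parserParams.put("wt", "javabin");
        parserParams.put("version", "2");

        // Optional invariant parameters configured on the client (hypothetical entry).
        Map<String, String> invariantParams = new LinkedHashMap<>();
        invariantParams.put("tenant", "books");

        // Same order as in request(): parser params over request params, then invariants on top.
        Map<String, String> effective = layer(parserParams, requestParams);
        effective = layer(invariantParams, effective);
        System.out.println(effective);
    }
}
```

The practical consequence is that a caller cannot change the response format by putting `wt` on an update request; the parser configured on the client decides it.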

    5. Since everything above ends in process(), let's look at it
         @Override
         public UpdateResponse process( SolrServer server ) throws SolrServerException, IOException
         {
               long startTime = System.currentTimeMillis();
               UpdateResponse res = new UpdateResponse();
               res.setResponse( server.request( this ) ); // the most important line: it calls the request method of CommonsHttpSolrServer
               res.setElapsedTime( System.currentTimeMillis()-startTime );
               return res;
         }
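
The call chain described so far (add, commit and optimize each build a request, and process() hands that request to the server's request method) can be imitated without SolrJ. This is a simplified sketch: every class name here (`MiniUpdateRequest`, `MiniResponse`, `MiniServer`, `RecordingServer`) is hypothetical, and the server only records what it receives instead of talking HTTP.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for an update request: a path, some parameters, and buffered documents. */
final class MiniUpdateRequest {
    final String path = "/update";
    final Map<String, String> params = new LinkedHashMap<>();
    final List<String> documents = new ArrayList<>();

    MiniUpdateRequest add(List<String> docs) {
        documents.addAll(docs);
        return this;
    }

    MiniUpdateRequest setAction(String action) {
        params.put(action, "true"); // the action is only recorded as a parameter
        return this;
    }

    /** Same shape as process(): time the call and delegate to server.request(this). */
    MiniResponse process(MiniServer server) {
        long start = System.currentTimeMillis();
        String body = server.request(this);
        return new MiniResponse(body, System.currentTimeMillis() - start);
    }
}

final class MiniResponse {
    final String body;
    final long elapsedMillis;

    MiniResponse(String body, long elapsedMillis) {
        this.body = body;
        this.elapsedMillis = elapsedMillis;
    }
}

/** Stand-in for SolrServer: convenience methods that all funnel into request(). */
abstract class MiniServer {
    abstract String request(MiniUpdateRequest request);

    MiniResponse add(List<String> docs) {
        return new MiniUpdateRequest().add(docs).process(this);
    }

    MiniResponse commit() {
        return new MiniUpdateRequest().setAction("commit").process(this);
    }

    MiniResponse optimize() {
        return new MiniUpdateRequest().setAction("optimize").process(this);
    }
}

/** Records each request instead of sending it over HTTP. */
final class RecordingServer extends MiniServer {
    final List<String> log = new ArrayList<>();

    @Override
    String request(MiniUpdateRequest request) {
        log.add(request.path + " params=" + request.params
                + " docs=" + request.documents.size());
        return "ok";
    }
}

final class FunnelDemo {
    public static void main(String[] args) {
        RecordingServer server = new RecordingServer();
        server.add(Arrays.asList("doc-1", "doc-2"));
        server.commit();
        server.optimize();
        for (String line : server.log) {
            System.out.println(line);
        }
    }
}
```

All three operations reach the same `request` method and differ only in the documents and parameters they carry, which is why the real request method is the one worth reading closely.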

    6. The final stop is the request method of CommonsHttpSolrServer, the SolrServer subclass; let's see how it works
    public NamedList<Object> request(final SolrRequest request, ResponseParser processor) throws SolrServerException, IOException {
        
        HttpMethod method = null;
        InputStream is = null;
        SolrParams params = request.getParams();
        Collection<ContentStream> streams = requestWriter.getContentStreams(request);
        String path = requestWriter.getPath(request);
        
        // index updates arrive here with the path /update; /select is the query path (and the fallback)
        if( path == null || !path.startsWith( "/" ) ) {
          path = "/select";
        }
        
        ResponseParser parser = request.getResponseParser();
        if( parser == null ) {
          parser = _parser;
        }
        
        // The parser 'wt=' and 'version=' params are used instead of the original params
        ModifiableSolrParams wparams = new ModifiableSolrParams();
        wparams.set( CommonParams.WT, parser.getWriterType() );
        wparams.set( CommonParams.VERSION, parser.getVersion());
        if( params == null ) {
          params = wparams;
        }
        else {
          params = new DefaultSolrParams( wparams, params );
        }
        
        if( _invariantParams != null ) {
          params = new DefaultSolrParams( _invariantParams, params );
        }
    
        int tries = _maxRetries + 1;
        try {
          while( tries-- > 0 ) {
            // Note: since we aren't do intermittent time keeping
            // ourselves, the potential non-timeout latency could be as
            // much as tries-times (plus scheduling effects) the given
            // timeAllowed.
            try { // per the Solr source, an UpdateRequest sets its method to POST automatically
              if( SolrRequest.METHOD.GET == request.getMethod() ) {
                if( streams != null ) {
                  throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "GET can't send streams!" );
                }
                method = new GetMethod( _baseURL + path + ClientUtils.toQueryString( params, false ) );
              }
              else if( SolrRequest.METHOD.POST == request.getMethod() ) { // so for indexing we go straight to this branch
    
                String url = _baseURL + path;
                boolean isMultipart = ( streams != null && streams.size() > 1 );
    
                if (streams == null || isMultipart) {
                  PostMethod post = new PostMethod(url); // build the POST: request headers, body, parameters, and so on
                  post.getParams().setContentCharset("UTF-8");
                  if (!this.useMultiPartPost && !isMultipart) {
                    post.addRequestHeader("Content-Type",
                        "application/x-www-form-urlencoded; charset=UTF-8");
                  }
    
                  List<Part> parts = new LinkedList<Part>();
                  Iterator<String> iter = params.getParameterNamesIterator();
                  while (iter.hasNext()) {
                    String p = iter.next();
                    String[] vals = params.getParams(p);
                    if (vals != null) {
                      for (String v : vals) {
                        if (this.useMultiPartPost || isMultipart) {
                          parts.add(new StringPart(p, v, "UTF-8"));
                        } else {
                          post.addParameter(p, v);
                        }
                      }
                    }
                  }
    
                  if (isMultipart) {
                    int i = 0;
                    for (ContentStream content : streams) {
                      final ContentStream c = content;
    
                      String charSet = null;
                      PartSource source = new PartSource() {
                        public long getLength() {
                          return c.getSize();
                        }
                        public String getFileName() {
                          return c.getName();
                        }
                        public InputStream createInputStream() throws IOException {
                          return c.getStream();
                        }
                      };
                    
                      parts.add(new FilePart(c.getName(), source, 
                                             c.getContentType(), charSet));
                    }
                  }
                  if (parts.size() > 0) {
                    post.setRequestEntity(new MultipartRequestEntity(parts
                        .toArray(new Part[parts.size()]), post.getParams()));
                  }
    
                  method = post;
                }
                // It is has one stream, it is the post body, put the params in the URL
                else {
                  String pstr = ClientUtils.toQueryString(params, false);
                  PostMethod post = new PostMethod(url + pstr);
    
                  // Single stream as body
                  // Using a loop just to get the first one
                  final ContentStream[] contentStream = new ContentStream[1];
                  for (ContentStream content : streams) {
                    contentStream[0] = content;
                    break;
                  }
                  if (contentStream[0] instanceof RequestWriter.LazyContentStream) {
                    post.setRequestEntity(new RequestEntity() {
                      public long getContentLength() {
                        return -1;
                      }
    
                      public String getContentType() {
                        return contentStream[0].getContentType();
                      }
    
                      public boolean isRepeatable() {
                        return false;
                      }
    
                      public void writeRequest(OutputStream outputStream) throws IOException {
                        ((RequestWriter.LazyContentStream) contentStream[0]).writeTo(outputStream);
                      }
                    }
                    );
    
                  } else {
                    is = contentStream[0].getStream();
                    post.setRequestEntity(new InputStreamRequestEntity(is, contentStream[0].getContentType()));
                  }
                  method = post;
                }
              }
              else {
                throw new SolrServerException("Unsupported method: "+request.getMethod() );
              }
            }
            catch( NoHttpResponseException r ) {
              // This is generally safe to retry on
              method.releaseConnection();
              method = null;
              if(is != null) {
                is.close();
              }
              // If out of tries then just rethrow (as normal error).
              if( ( tries < 1 ) ) {
                throw r;
              }
              //log.warn( "Caught: " + r + ". Retrying..." );
            }
          }
        }
        catch( IOException ex ) {
          throw new SolrServerException("error reading streams", ex );
        }
    
        method.setFollowRedirects( _followRedirects );
        method.addRequestHeader( "User-Agent", AGENT );
        if( _allowCompression ) {
          method.setRequestHeader( new Header( "Accept-Encoding", "gzip,deflate" ) );
        }
    
        try {
          // Execute the method.
          //System.out.println( "EXECUTE:"+method.getURI() );
          // execute the request, check the status code, then build and return the response
          int statusCode = _httpClient.executeMethod(method);
          if (statusCode != HttpStatus.SC_OK) {
            StringBuilder msg = new StringBuilder();
            msg.append( method.getStatusLine().getReasonPhrase() );
            msg.append( "\n\n" );
            msg.append( method.getStatusText() );
            msg.append( "\n\n" );
            msg.append( "request: "+method.getURI() );
            throw new SolrException(statusCode, java.net.URLDecoder.decode(msg.toString(), "UTF-8") );
          }
    
          // Read the contents
          String charset = "UTF-8";
          if( method instanceof HttpMethodBase ) {
            charset = ((HttpMethodBase)method).getResponseCharSet();
          }
          InputStream respBody = method.getResponseBodyAsStream();
          // Jakarta Commons HTTPClient doesn't handle any
          // compression natively.  Handle gzip or deflate
          // here if applicable.
          if( _allowCompression ) {
            Header contentEncodingHeader = method.getResponseHeader( "Content-Encoding" );
            if( contentEncodingHeader != null ) {
              String contentEncoding = contentEncodingHeader.getValue();
              if( contentEncoding.contains( "gzip" ) ) {
                //log.debug( "wrapping response in GZIPInputStream" );
                respBody = new GZIPInputStream( respBody );
              }
              else if( contentEncoding.contains( "deflate" ) ) {
                //log.debug( "wrapping response in InflaterInputStream" );
                respBody = new InflaterInputStream(respBody);
              }
            }
            else {
              Header contentTypeHeader = method.getResponseHeader( "Content-Type" );
              if( contentTypeHeader != null ) {
                String contentType = contentTypeHeader.getValue();
                if( contentType != null ) {
                  if( contentType.startsWith( "application/x-gzip-compressed" ) ) {
                    //log.debug( "wrapping response in GZIPInputStream" );
                    respBody = new GZIPInputStream( respBody );
                  }
                  else if ( contentType.startsWith("application/x-deflate") ) {
                    //log.debug( "wrapping response in InflaterInputStream" );
                    respBody = new InflaterInputStream(respBody);
                  }
                }
              }
            }
          }
          return processor.processResponse(respBody, charset);
        }
        catch (HttpException e) {
          throw new SolrServerException( e );
        }
        catch (IOException e) {
          throw new SolrServerException( e );
        }
        finally {
          method.releaseConnection();
          if(is != null) {
            is.close();
          }
        }
      }
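
The tail of the request method wraps the response body in a GZIPInputStream or an InflaterInputStream, depending on the Content-Encoding header, because the HTTP client used there does not decompress on its own. The sketch below shows just that decision using JDK classes. `ResponseDecoding` and its methods are hypothetical names, and the Content-Type fallback from the original method is left out to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

final class ResponseDecoding {
    /** Wraps the body according to the Content-Encoding value (null means uncompressed). */
    static InputStream decode(InputStream body, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return body;
        }
        if (contentEncoding.contains("gzip")) {
            return new GZIPInputStream(body);
        }
        if (contentEncoding.contains("deflate")) {
            return new InflaterInputStream(body);
        }
        return body;
    }

    /** Reads the whole stream as UTF-8 text. */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Compresses text the way a server might for the given encoding (demo helper). */
    static byte[] encode(String text, String contentEncoding) throws IOException {
        byte[] raw = text.getBytes(StandardCharsets.UTF_8);
        if (contentEncoding == null) {
            return raw;
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (OutputStream out = contentEncoding.contains("gzip")
                ? new GZIPOutputStream(bytes)
                : new DeflaterOutputStream(bytes)) {
            out.write(raw);
        }
        return bytes.toByteArray();
    }

    /** Encodes and decodes again; checked exceptions are wrapped for easier calling. */
    static String roundTrip(String text, String contentEncoding) {
        try {
            byte[] wire = encode(text, contentEncoding);
            try (InputStream in = decode(new ByteArrayInputStream(wire), contentEncoding)) {
                return readAll(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String body = "<response><int name=\"status\">0</int></response>";
        System.out.println(roundTrip(body, "gzip").equals(body));
        System.out.println(roundTrip(body, "deflate").equals(body));
        System.out.println(roundTrip(body, null).equals(body));
    }
}
```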

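    Summary in words:

              1. Query the database, read files, or get the data some other way, and store it in SolrInputDocument objects however suits you. SolrInputDocument keeps its fields in an internal map (the objects actually stored are SolrInputField instances).

              2. Initialize CommonsHttpSolrServer with the service URL (the address of the Solr server), the timeouts, the maximum number of connections, and so on (see the SolrUtil class).

              3. The add/commit/optimize methods of SolrServer all end up calling the process method in AbstractUpdateRequest, which in turn calls the request method of CommonsHttpSolrServer to send the HTTP request and parse the response.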
