[Code] SolrCloud / Solr 4.0 startup steps

This page shows the major steps in the Solr 4.0 startup process.

SolrDispatchFilter.init(FilterConfig config) initializes the CoreContainer first.

public void init(FilterConfig config) throws ServletException {
    ...........
    CoreContainer.Initializer init = createInitializer();
    ...........
    this.cores = init.initialize();
    ..........
}
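
The createInitializer() call above is a small factory method; in Solr 4.0 it essentially just constructs the nested CoreContainer.Initializer, which does the real work. Shown here as a sketch rather than a verbatim copy of the source:

// Sketch of SolrDispatchFilter.createInitializer(): it simply hands back a
// CoreContainer.Initializer, whose initialize() method is shown next.
protected CoreContainer.Initializer createInitializer() {
    return new CoreContainer.Initializer();
}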

Then CoreContainer.Initializer.initialize() calls CoreContainer.load():

public CoreContainer initialize() throws IOException,
    ParserConfigurationException, SAXException {
  CoreContainer cores = null;
  String solrHome = SolrResourceLoader.locateSolrHome();
  File fconf = new File(solrHome, containerConfigFilename == null ? "solr.xml"
      : containerConfigFilename);
  cores = new CoreContainer(solrHome);

  if (fconf.exists()) {
    cores.load(solrHome, fconf);
  } else {
    log.info("no solr.xml file found - using default");
    cores.load(solrHome, new InputSource(new ByteArrayInputStream(DEF_SOLR_XML.getBytes("UTF-8"))));
    cores.configFile = fconf;
  }

  containerConfigFilename = cores.getConfigFile().getName();

  return cores;
}
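
For reference, the solrHome value above comes from SolrResourceLoader.locateSolrHome(). A simplified sketch of the lookup order it uses (JNDI entry first, then the solr.solr.home system property, then a built-in default); this illustrates the behaviour, it is not the exact Solr source:

// Simplified sketch of SolrResourceLoader.locateSolrHome()'s lookup order.
class SolrHomeLookupSketch {
  static String locateSolrHome() {
    String home = null;
    try {
      javax.naming.Context c = new javax.naming.InitialContext();
      home = (String) c.lookup("java:comp/env/solr/home"); // 1. JNDI entry
    } catch (javax.naming.NamingException e) {
      // no JNDI entry configured - fall through
    }
    if (home == null) {
      home = System.getProperty("solr.solr.home");         // 2. system property
    }
    if (home == null) {
      home = "solr/";                                      // 3. built-in default
    }
    return home;
  }
}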
  

CoreContainer.load(solrHome, fconf) calls CoreContainer.load(String dir, InputSource cfgis). This function is the most important part of Solr 4.0's startup: many members of CoreContainer are initialized here, including the Overseer, ZkController, CoreAdminHandler and CollectionsHandler. Now let's go into this function.

..........
initZooKeeper(zkHost, zkClientTimeout); // this call initializes the zkController
..........
coreAdminHandler = new CoreAdminHandler(this);
..........
NodeList nodes = (NodeList) cfg.evaluate("solr/cores/core", XPathConstants.NODESET); // get the core config info from solr.xml

for (int i = 0; i < nodes.getLength(); i++) {
  Node node = nodes.item(i);
  .........
  .........
  CoreDescriptor p = new CoreDescriptor(this, name, DOMUtil.getAttr(node, "instanceDir", null));
  .........
  .........

  SolrCore core = create(p); // each core is created and initialized here; all the important features are set up
  register(name, core, false);
  .........
  .........
}
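
For context, the solr/cores/core XPath above walks the legacy solr.xml layout. A minimal example of the kind of file it matches (attributes trimmed to the ones referenced here; an illustration, not a complete solr.xml):

<solr persistent="true">
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="collection1" />
  </cores>
</solr>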

At this point the core has been created but not yet registered. CoreContainer.register(String name, SolrCore core, boolean returnPrevNotClosed) registers the core with the ZkController; the register(name, core, false) call above does that job. At the same time, register(name, core, false) publishes the core's status to the Overseer. It calls ZkController.register(String coreName, final CoreDescriptor desc, boolean recoverReloadedCores) to update this core's cloud state, which includes joining the leader-election queue and so on.

public String register(String coreName, final CoreDescriptor desc, boolean recoverReloadedCores) throws Exception {
  ........
  ........
  joinElection(desc);
  ........
  ........
  if (!core.isReloaded() && ulog != null) { // recover from the transaction log if the core was not reloaded
    Future<UpdateLog.RecoveryInfo> recoveryFuture = core.getUpdateHandler()
        .getUpdateLog().recoverFromLog();
    .......
  }
  ..........
  boolean didRecovery = checkRecovery(coreName, desc, recoverReloadedCores, isLeader, cloudDesc,
      collection, coreZkNodeName, shardId, leaderProps, core, cc);
  if (!didRecovery) {
    publish(desc, ZkStateReader.ACTIVE);
  }
  ..........

  zkStateReader.updateCloudState(true);
  return shardId;
}
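
For orientation, publish(desc, ZkStateReader.ACTIVE) above does not write the cluster state directly; it puts a small state message onto a ZooKeeper-backed queue that the Overseer consumes and folds into clusterstate.json. The sketch below only illustrates that idea; the queue type and the field names are hypothetical, not Solr's exact message format:

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of "publishing" a core state change to the Overseer:
// build a small message and offer it to a distributed queue that the Overseer polls.
// The DistributedQueue interface and the field names here are illustrative only.
class StatePublishSketch {
  interface DistributedQueue { void offer(byte[] data) throws Exception; }

  static void publishActive(DistributedQueue overseerQueue, String coreName, String nodeName)
      throws Exception {
    Map<String, String> msg = new HashMap<String, String>();
    msg.put("operation", "state");  // tells the Overseer this is a core state change
    msg.put("state", "active");     // the new state, i.e. ZkStateReader.ACTIVE
    msg.put("core", coreName);
    msg.put("node_name", nodeName);
    overseerQueue.offer(msg.toString().getBytes("UTF-8")); // real Solr serializes a ZkNodeProps
  }
}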

1. ZkController.joinElection(desc) decides whether this core is the leader. If it is, runIamLeaderProcess() is called; otherwise a watcher is set on the candidate directly ahead of it in the election queue. ZkController.joinElection(desc) calls LeaderElector.joinElection(context) as follows:

public int joinElection(ElectionContext context) throws KeeperException, InterruptedException, IOException {
  ......
  int seq = getSeq(leaderSeqPath);
  checkIfIamLeader(seq, context, false);
  .......
}

Then LeaderElector.checkIfIamLeader(seq, context, false):

/**
 * Check if the candidate with the given n_* sequence number is the leader.
 * If it is, set the leaderId on the leader zk node. If it is not, start
 * watching the candidate that is in line before this one - if it goes down, check
 * if this candidate is the leader again.
 */

private void checkIfIamLeader(final int seq, final ElectionContext context, boolean replacement) throws KeeperException,
    InterruptedException, IOException {
  // get all other numbers...
  final String holdElectionPath = context.electionPath + ELECTION_NODE;
  List<String> seqs = zkClient.getChildren(holdElectionPath, null, true);

  sortSeqs(seqs);
  List<Integer> intSeqs = getSeqs(seqs);
  if (seq <= intSeqs.get(0)) {
    runIamLeaderProcess(context, replacement);
  } else {
    // I am not the leader - watch the node below me
    int i = 1;
    for (; i < intSeqs.size(); i++) {
      int s = intSeqs.get(i);
      if (seq < s) {
        // we found who we come before - watch the guy in front
        break;
      }
    }
    int index = i - 2;
    if (index < 0) {
      log.warn("Our node is no longer in line to be leader");
      return;
    }
    try {
      zkClient.getData(holdElectionPath + "/" + seqs.get(index),
          new Watcher() {

            @Override
            public void process(WatchedEvent event) {
              // am I the next leader?
              try {
                checkIfIamLeader(seq, context, true);
              } catch (InterruptedException e) {
                // Restore the interrupted status
                Thread.currentThread().interrupt();
                log.warn("", e);
              } catch (IOException e) {
                log.warn("", e);
              } catch (Exception e) {
                log.warn("", e);
              }
            }

          }, null, true);
    } catch (KeeperException.SessionExpiredException e) {
      throw e;
    } catch (KeeperException e) {
      // we couldn't set our watch - the node before us may already be down?
      // we need to check if we are the leader again
      checkIfIamLeader(seq, context, true);
    }
  }
}
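
To make the election rule above concrete: every candidate registers a sequential election node, the lowest sequence number becomes leader, and everyone else watches the candidate directly ahead of it. A small self-contained illustration of just that decision rule (plain Java, no ZooKeeper involved; the sequence numbers are made up):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy illustration of the rule used in checkIfIamLeader: the lowest sequence
// number leads, everyone else watches the candidate directly ahead of it.
class ElectionRuleDemo {
  public static void main(String[] args) {
    List<Integer> seqs = new ArrayList<Integer>(Arrays.asList(5, 1, 3)); // live candidates
    Collections.sort(seqs);                                              // -> [1, 3, 5]
    int mySeq = 5;                                                       // this candidate's number
    if (mySeq <= seqs.get(0)) {
      System.out.println("I am the leader");
    } else {
      int i = 1;
      for (; i < seqs.size(); i++) {
        if (mySeq < seqs.get(i)) break;      // first number larger than mine
      }
      // the entry two positions back in this scan is the candidate directly ahead of me
      System.out.println("watch the candidate with seq " + seqs.get(i - 2)); // prints 3
    }
  }
}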

2. core.getUpdateHandler().getUpdateLog().recoverFromLog() gets the UpdateLog from DirectUpdateHandler2 and calls its recoverFromLog() method. This call starts a new thread that replays the update log stored on the local machine; recoverFromLog() mainly recovers from the local transaction log. UpdateLog.recoverFromLog() is shown below:

public Future<RecoveryInfo> recoverFromLog() {
  recoveryInfo = new RecoveryInfo();

  List<TransactionLog> recoverLogs = new ArrayList<TransactionLog>(1);
  for (TransactionLog ll : newestLogsOnStartup) {
    if (!ll.try_incref()) continue;

    try {
      if (ll.endsWithCommit()) {
        ll.decref();
        continue;
      }
    } catch (IOException e) {
      log.error("Error inspecting tlog " + ll);
      ll.decref();
      continue;
    }

    recoverLogs.add(ll);
  }

  if (recoverLogs.isEmpty()) return null;

  ExecutorCompletionService<RecoveryInfo> cs = new ExecutorCompletionService<RecoveryInfo>(recoveryExecutor);
  LogReplayer replayer = new LogReplayer(recoverLogs, false);

  versionInfo.blockUpdates();
  try {
    state = State.REPLAYING;
  } finally {
    versionInfo.unblockUpdates();
  }

  // At this point, we are guaranteed that any new updates coming in will see the state as "replaying"

  return cs.submit(replayer, recoveryInfo);
}
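
Note that cs.submit(replayer, recoveryInfo) returns a Future immediately and the replay runs on recoveryExecutor in the background; the caller in ZkController.register() only holds that Future (recoveryFuture above) and can wait on it if needed. A minimal standalone example of the submit-with-result pattern (generic Java, not Solr code):

import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Minimal demo of ExecutorCompletionService.submit(Runnable, result): the task runs
// asynchronously and the supplied result object is returned when the Future completes.
class SubmitWithResultDemo {
  public static void main(String[] args) throws Exception {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    ExecutorCompletionService<String> cs = new ExecutorCompletionService<String>(executor);

    Future<String> future = cs.submit(new Runnable() {
      public void run() {
        System.out.println("replaying...");  // stands in for the tlog replay work
      }
    }, "replay finished");                   // result handed back once run() returns

    System.out.println(future.get());        // blocks until the task is done
    executor.shutdown();
  }
}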

3. ZkController.checkRecovery(coreName, desc, recoverReloadedCores, isLeader, cloudDesc, collection, coreZkNodeName, shardId, leaderProps, core, cc) handles distributed recovery. It is skipped when this core is the leader; otherwise recovery is performed. The function starts a new thread, RecoveryStrategy, which does the actual work (a simplified sketch of the flow follows the list):

  1. On the first attempt, try to recover via PeerSync.sync(), which pulls recent updates from the leader's update log. If that fails, go to step 2.
  2. Do full replication recovery: RecoveryStrategy.replicate(String nodeName, SolrCore core, ZkNodeProps leaderprops, String baseUrl) calls ReplicationHandler.doFetch() to fetch index files from the leader and recover from them.
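
A hypothetical, simplified sketch of this two-step flow (the Step interface and method names are illustrative; the real RecoveryStrategy is considerably more involved, with retries and with buffering and replaying of updates that arrive during recovery):

// Hypothetical sketch of the recovery order described above: try PeerSync against the
// leader first, and fall back to full index replication from the leader if that fails.
// The Step abstraction is illustrative only; it is not part of Solr.
class RecoveryFlowSketch {
  interface Step { boolean run() throws Exception; }

  static void recover(Step peerSyncWithLeader, Step replicateFromLeader) throws Exception {
    if (peerSyncWithLeader.run()) {  // step 1: sync recent updates from the leader's update log
      return;                        // close enough to the leader - recovery done
    }
    replicateFromLeader.run();       // step 2: fetch the full index files from the leader
  }
}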