Hard commits, soft commits and transaction logs (SolrCloud)

The mantra

Repeat after me: “Hard commits are about durability, soft commits are about visibility.” Hard and soft commits are related concepts, but they serve different purposes. Concealed in this simple statement are many details; we’ll try to illuminate some of them. First, some definitions:

  • Transaction log (tlog): A file where the raw documents are written for recovery purposes. In SolrCloud, each node has its own tlog. On update, the entire document gets written to the tlog. For Atomic Updates, it’s still the entire document, including data read from the old version of the document; in other words, the document written to the tlog is not the “delta” for atomic updates. Tlogs are critical for consistency; they are used to bring an index up to date if the JVM is stopped before segments get closed.
    • NOTE: The transaction log will be replayed on server restart if the server was not gracefully shut down! So if your tlog is huge (and we’ve seen it in gigabyte ranges) then restarting your server can be very, very slow. As in hours.
  • Hard commit: This is governed by the <autoCommit> option in solrconfig.xml or by explicit calls from a client (SolrJ, or HTTP via the browser, cURL or similar). Hard commits close the current index segment and open a new segment in your index.
    • openSearcher: A boolean sub-property of <autoCommit> that governs whether the newly-committed data is made visible to subsequent searches.
  • Soft commit: A less-expensive operation than a hard commit with openSearcher=true that also makes documents visible to search.
  • fsynch: Low-level I/O-speak. When an fsynch call returns, the data has been physically written to a file on disk. This is different from a Java-level write, whose return only guarantees that the new data has been handed over to the operating system, which will change the bits on the disk in its own good time (usually within a few milliseconds, say 10-50 ms). The difference matters only in a narrow window: if the JVM crashes, the op system will still change the bits on disk; but if the op system crashes after Java “writes” the file and before the I/O subsystem gets around to actually changing the bits on disk, the data can be lost. This is usually not something you need to be concerned about; it’s important only when you need to be absolutely sure that no data ever gets lost.
  • flush: The Java operation that hands the data over to the op system. Upon return, the bits on disk may not have been changed yet, but the op system will still write the data to disk even if the JVM crashes; only an op-system crash in that window can lose it.
    • Note that, especially in SolrCloud where there is more than one replica per shard, losing data requires that multiple nodes go down at the same time in such a manner that none of them manages to complete the write to disk, which is very unlikely.

Important: Soft commits are “less expensive”, but they still aren’t free. You should make the soft commit interval as long as is reasonable for best performance!
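For reference, these knobs live in the <updateHandler> section of solrconfig.xml. A minimal sketch (the maxTime values here are placeholders for illustration, not recommendations — those come later):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: durability. Closes the current segment and rolls the
       tlog; with openSearcher=false the docs are not yet searchable. -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: visibility. Makes documents searchable; keep this
       interval as long as your application can tolerate. -->
  <autoSoftCommit>
    <maxTime>60000</maxTime>
  </autoSoftCommit>
</updateHandler>
```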

Transaction Logs

Transaction logs are integral to the data guarantees of Solr 4, and also a place where people get into trouble, so let’s talk about them a bit. The indexing flow in SolrCloud is as follows:

  1. Incoming documents are received by a node and forwarded to the proper leader.
  2. From the leader they’re sent to all replicas for the relevant shard. 
  3. The replicas respond to their leader.
  4. The leader responds to the originating node.
  5. After all the leaders have responded, the originating node replies to the client. At this point, all documents have been flushed to the tlog for all the nodes in the cluster!
  6. If the JVM crashes, the documents are still safely written to disk. If the op system crashes, they may not be.
    1. If the JVM crashes (or, say, is killed with a -9), then on restart, the tlog is replayed.
    2. You can alter the configuration in solrconfig.xml to fsynch rather than flush before return, but this is rarely necessary. With leaders and replicas, the chance that all of the replicas suffer a hardware crash at the same time, losing data for all of them, is small. Some use-cases cannot tolerate even this tiny chance, however, and may choose to pay the price of decreased throughput.

Note: tlogs are “rolled over” automatically on hard commit (openSearcher true or false). The old one is closed and a new one is opened. Enough closed tlogs are kept around to contain at least 100 documents; older tlogs are deleted. Say you are indexing in batches of 25 documents and hard committing after each batch (not that you should commit that often, just an example). You will have 5 tlogs at any given time: the oldest four (closed) contain 25 documents each, totaling 100, plus the current open tlog. When the current tlog is closed, the oldest tlog will be deleted and a new one opened.

Note particularly that there is no attempt on Solr’s part to only put 100 documents in any particular tlog. Tlogs are only rolled over when you tell Solr to, i.e. issue a hard commit. So in bulk-loading situations where you are loading, say, 1,000 docs/second and you don’t do a hard commit for an hour, your single tlog will contain 3,600,000 documents. And an un-graceful shutdown may cause it to be entirely replayed before the Solr node is open for business.
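The retention arithmetic above can be sketched as a toy simulation. This is illustrative logic only, not Solr’s actual implementation; the function name and policy details are my own:

```python
def simulate_tlogs(batch_size, num_batches, keep_docs=100):
    """Toy model of Solr's tlog retention: each hard commit closes the
    current tlog, then the oldest closed tlogs are deleted as long as
    the remaining closed ones still hold at least keep_docs documents."""
    closed = []  # document counts of closed tlogs, oldest first
    for _ in range(num_batches):
        closed.append(batch_size)     # hard commit rolls the tlog over
        while sum(closed[1:]) >= keep_docs:
            closed.pop(0)             # the rest still cover keep_docs
    return len(closed) + 1            # plus the newly opened, empty tlog

# Batches of 25 docs with a hard commit after each batch: four closed
# tlogs hold the last 100 docs, plus the current open tlog -> 5 files.
print(simulate_tlogs(25, 20))   # -> 5

# One giant batch per commit: a single closed tlog already covers 100
# docs, so only it and the current open tlog remain -> 2 files.
print(simulate_tlogs(1000, 10))  # -> 2
```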

If you have very large tlogs, this is a Bad Thing and you should change your hard commit settings! This is especially trappy for people coming from the 3.x days, when hard commits were avoided because they were always expensive; there was no openSearcher=false option.

Soft commit

Soft commits are about visibility, hard commits are about durability. 

The thing to understand most about soft commits is that they make documents visible, but at some cost. In particular, the “top level” caches, which include what you configure in solrconfig.xml (filterCache, queryResultCache, etc.), will be invalidated! The FieldValueCache is also invalidated, so facet queries will have to wait until the cache is refreshed. With very frequent soft commits, it’s often the case that your top-level caches are little used and may, in some cases, be eliminated.

However, “segment level caches”, which include function queries, sorting caches, etc are “per segment”, so will not be invalidated on soft commit.

So what does all this mean?

Consider a soft commit. On execution you have the following:

  • The tlog has NOT been truncated. It will continue to grow.
  • The documents WILL be visible.
  • Some caches will have to be reloaded.
  • Your top-level caches will be invalidated.

Note, I haven’t said a thing about index segments! That’s for hard commits.

And again, soft commits are “less expensive” than hard commits (openSearcher=true), but they are not free. The motto of the Lunar Colony in a science fiction novel (“The Moon Is a Harsh Mistress” by Robert Heinlein) was TANSTAAFL, There Ain’t No Such Thing As A Free Lunch. Soft commits are there to support Near Real Time, and they do. But they do have cost, so use the longest soft commit interval you can for best performance.

Hard commit

Hard commits are about durability, soft commits are about visibility.

There are really two flavors here, openSearcher=true and openSearcher=false. First we’ll talk about what happens in both cases.

Whether openSearcher=true or openSearcher=false, the following consequences are most important:

  • The tlog is truncated: A new tlog is started. Old tlogs will be deleted if there are more than 100 documents in newer tlogs.
  • The current index segment is closed and flushed. 
  • Background segment merges may be initiated.

The above happens on all hard commits. That leaves the openSearcher setting:

  • openSearcher=true: The Solr/Lucene searchers are re-opened and all caches are invalidated. Autowarming is done, etc. This used to be the only way you could see newly-added documents.
  • openSearcher=false: Nothing further happens other than the three points above. To search the docs, a soft commit (or a hard commit with openSearcher=true) is necessary.

Recovery

I’ve talked above about durability, so let’s expand on that a bit. When a machine crashes or the JVM quits unexpectedly, here’s the state of your cluster:

  • The last update call that returned successfully has all the documents written to all the tlogs in the cluster. The default is that the tlog has been flushed, but not fsync’d. As discussed above, you can override this default behavior but it is not recommended.
  • On restart of the affected machine, it contacts the leader and either
    • Replays the documents from its own tlog if < 100 new updates have been received by the leader, or
    • Does an old-style replication from the leader to catch up.

Recovery can take some time. This is one of the hidden “gotchas” people run into as they work with SolrCloud. They are experimenting, so they’re bouncing servers up and down all over the place, killing Solr with ‘kill -9’, etc. On the one hand, this is great, since it exercises the whole SolrCloud recovery process. On the other hand, it’s a highly artificial experience. If you have nodes disappearing many times a day, you have bigger problems that should be fixed; Solr taking some time to recover on startup is the least of them!

Recommendations

I always shudder at this, because any recommendation will be wrong in some cases. My first recommendation would be to not overthink the problem. Some very smart people have tried to make the entire process robust. Try the simple things first and only tweak things as necessary. In particular, look at the size of your transaction logs and adjust your hard commit intervals to keep these “reasonably sized”. Remember that the penalty is mostly the replay-time involved if you restart after a JVM crash. Is 15 seconds tolerable? Why go smaller then?

We’ve seen situations in which the hard commit interval is much shorter than the soft commit interval; see the bulk indexing bit below.

These are places to start:

Heavy (bulk) indexing

The assumption here is that you’re interested in getting lots of data to the index as quickly as possible for search sometime in the future. I’m thinking original loads of a data source etc.

  • Set your soft commit interval quite long. As in 10 minutes. Soft commit is about visibility, and my assumption here is that bulk indexing isn’t about near real time searching, so don’t do the extra work of opening any kind of searcher.
  • Set your hard commit intervals to 15 seconds, openSearcher=false. Again the assumption is that you’re going to be just blasting data at Solr. The worst case here is that you restart your system and have to replay 15 seconds or so of data from your tlog. If your system is bouncing up and down more often than that, fix the reason for that first.
  • Only after you’ve tried the simple things should you consider refinements, they’re usually only required in unusual circumstances. But they include:
    • Turning off the tlog completely for the bulk-load operation
    • Indexing offline with some kind of map-reduce process
    • Only having a leader per shard, no replicas for the load, then turning on replicas later and letting them do old-style replication to catch up. Note that this is automatic, if the node discovers it is “too far” out of sync with the leader, it initiates an old-style replication. After it has caught up, it’ll get documents as they’re indexed to the leader and keep its own tlog.
    • etc.
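Putting the first two recommendations into solrconfig.xml terms, a bulk-load configuration might look like this sketch:

```xml
<autoCommit>
  <maxTime>15000</maxTime>            <!-- hard commit every 15 seconds -->
  <openSearcher>false</openSearcher>  <!-- no searcher churn during the load -->
</autoCommit>
<autoSoftCommit>
  <maxTime>600000</maxTime>           <!-- soft commit every 10 minutes -->
</autoSoftCommit>
```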

Index-heavy, Query-light

By this I mean, say, searching log files. This is the case where you have a lot of data coming at the system pretty much all the time. But the query load is quite light, often to troubleshoot or analyze usage.

  • Set your soft commit interval quite long, up to the maximum latency you can stand for documents to be visible. This could be just a couple of minutes or much longer. Maybe even hours with the capability of issuing a hard commit (openSearcher=true) or soft commit on demand.
  • Set your hard commit to 15 seconds, openSearcher=false

Index-light, Query-light or heavy

This is a relatively static index that sometimes gets a small burst of indexing. Say every 5-10 minutes (or longer) you do an update.

  • Unless NRT functionality is required, I’d omit soft commits in this situation and do hard commits every 5-10 minutes with openSearcher=true. This is a situation in which, if you’re indexing with a single external indexing process, it might make sense to have the client issue the hard commit.
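If you prefer to keep this in autocommit settings rather than client-issued commits, a sketch might be (a maxTime of -1 disables soft autocommits):

```xml
<autoCommit>
  <maxTime>300000</maxTime>          <!-- hard commit every 5 minutes -->
  <openSearcher>true</openSearcher>  <!-- make the new docs visible -->
</autoCommit>
<autoSoftCommit>
  <maxTime>-1</maxTime>              <!-- no soft commits needed -->
</autoSoftCommit>
```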

Index-heavy, Query-heavy

This is the Near Real Time (NRT) case, and is really the trickiest of the lot. This one will require experimentation, but here’s where I’d start

  • Set your soft commit interval to as long as you can stand. Don’t listen to your product manager who says “we need no more than 1 second latency”. Really. Push back hard and see if the user is best served or will even notice. Soft commits and NRT are pretty amazing, but they’re not free.
  • Set your hard commit interval to 15 seconds.

 

SolrJ and HTTP and client indexing

Generally, all the options available automatically are also available via SolrJ or HTTP. The HTTP commands are documented in the Solr reference guide; the SolrJ commands are in the Javadocs for the SolrServer class.

Late edit (Jun, 2014) 
Be very careful committing from the client! In fact, don’t do it.
By and large, do not issue commits from any client indexing to Solr, it’s almost always a mistake. And especially in those cases where you have multiple clients indexing at once, it is A Bad Thing. 

What happens is commits come in unpredictably close to each other, generating work as above. You’ll possibly see warnings in your log about “too many warming searchers”. Or you’ll see a zillion small segments. Or… 

Let your autocommit settings (both soft and hard) in solrconfig.xml handle the commit frequency. If you absolutely must control the visibility, say you want to search docs right after the indexing run happens and you can’t afford to wait for your autocommit settings to kick in, commit once at the end.
In fact, I would only do that if I had only one indexing client. Otherwise, I’d wait until they were all finished and submit a “manual” commit, something like:
http://host:port/solr/collection/update?commit=true
should do it, cURL it in, send it from a browser, etc.

And remember, optimizing an index is rarely necessary!

Happy Indexing!
