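This article is very helpful for understanding what the openSearcher and softCommit parameters mean.
A new flush/openSearcher parameter combination is expected to be implemented in Solr 5.0 to reduce the confusion.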
[SOLR-3539]
I think the current NRT options when doing a commit, particularly "openSearcher=true|false" and "softCommit=true|false", are confusing; we should rethink them before they get into the user API in 4.0.
All comments:
![](https://i-blog.csdnimg.cn/blog_migrate/76f7b283229bb1f3e65df96396df450f.png)
This is something that started to concern me while trying to update the tutorial. I'm having a hard time articulating my concerns to myself, so this will largely be stream of consciousness...
Both of these params seem defined more in terms of what they don't do than what they actually do – softCommit in particular – and while they aren't too terrible to explain individually, it's very hard to clearly articulate how they interplay with each other.
- openSearcher
  - true - opens a new searcher against this commit point
  - false - does not open a new searcher against this commit point
- softCommit
  - true - a new searcher is opened against the commit point, but no data is flushed to disk.
  - false - the commit point is flushed to disk.
Certain combinations of these params seem redundant (openSearcher=true&softCommit=true) while others not only make no sense, but are directly contradictory (openSearcher=false&softCommit=true)...
| | softCommit=true | softCommit=false |
|---|---|---|
| openSearcher=true | openSearcher is redundant | OK |
| openSearcher=false | contradictory (openSearcher is currently ignored) | OK |
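The combinations listed above can be restated as a small truth-table model. This is only a sketch of the semantics as described in this comment, not Solr code; the names `CommitEffect` and `current_commit_effect` are made up for illustration.

```python
from typing import NamedTuple


class CommitEffect(NamedTuple):
    """What a commit request ends up doing (illustrative model only)."""
    opens_searcher: bool
    flushes_to_disk: bool


def current_commit_effect(open_searcher: bool, soft_commit: bool) -> CommitEffect:
    """Model the current param interplay as described in the comment above."""
    if soft_commit:
        # A soft commit opens a searcher and does not flush;
        # openSearcher is redundant (true) or ignored (false).
        return CommitEffect(opens_searcher=True, flushes_to_disk=False)
    # A hard commit flushes; openSearcher alone decides visibility.
    return CommitEffect(opens_searcher=open_searcher, flushes_to_disk=True)


# Print all four combinations from the table.
for open_searcher in (True, False):
    for soft_commit in (True, False):
        effect = current_commit_effect(open_searcher, soft_commit)
        print(f"openSearcher={open_searcher} softCommit={soft_commit} -> {effect}")
```

Note how both rows of the softCommit=true column collapse to the same effect, which is the redundancy and contradiction being pointed out.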
From a vocabulary standpoint, they also seem confusing to understand. Consider a new user, starting with the 4x example which contains the following...
    <autoCommit>
      <maxTime>15000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
Documents this user adds will automatically get flushed to disk, but won't be visible in search results until the user takes some explicit action. The user, upon reading some docs or asking on the list, will become aware that he needs to open a new searcher, and will be guided to "do a commit" (or maybe a commit explicitly adding openSearcher=true). But this is actually overkill for what the user needs, because it will also flush any pending docs to disk. All the user really needs to "open a new searcher" is to do an explicit commit with softCommit=true.
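As a rough illustration of the explicit soft commit mentioned above, the request could be built like this. The host, port, and `/solr/update` path are assumptions about a local example install, and `commit_url` is a hypothetical helper; the sketch only constructs the URL and does not send anything.

```python
from urllib.parse import urlencode

# Assumption: a local example Solr with its update handler at this path.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def commit_url(soft_commit: bool) -> str:
    """Build an explicit commit request URL with the softCommit param set."""
    params = {
        "commit": "true",
        "softCommit": "true" if soft_commit else "false",
    }
    # urlencode keeps the dict's insertion order.
    return f"{SOLR_UPDATE_URL}?{urlencode(params)}"


# The request that only makes pending documents visible, without a flush.
print(commit_url(soft_commit=True))
```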
I would like to suggest that we throw out the "softCommit" param and replace it with a "flush" (or "flushToDisk" or "persist") param, which is solely concerned with the persistence of the commit, and completely disjoint from "searcher" opening, which would be controlled entirely with the "openSearcher" param.
- openSearcher
  - true - opens a new searcher against this commit point
  - false - does not open a new searcher against this commit point
- flush
  - true - flushes this commit point to stable storage
  - false - does not flush this commit point to stable storage
Making the interaction much easier to understand...
| | flush=true | flush=false |
|---|---|---|
| openSearcher=true | OK | OK |
| openSearcher=false | OK | No-Op |
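A minimal sketch of the proposed, independent params might look like the following. `ProposedCommit`, `describe`, and `from_legacy` are hypothetical names; `from_legacy` assumes that an existing softCommit=true request would translate to flush=false with a searcher opened, which is one reading of the proposal rather than a settled design.

```python
from typing import NamedTuple


class ProposedCommit(NamedTuple):
    """The two independent options from the proposal (illustrative only)."""
    open_searcher: bool
    flush: bool


def describe(commit: ProposedCommit) -> str:
    """Describe what a commit does when the two params are fully independent."""
    actions = []
    if commit.flush:
        actions.append("flush to stable storage")
    if commit.open_searcher:
        actions.append("open new searcher")
    # Neither option set: nothing to do, matching the No-Op cell.
    return " + ".join(actions) if actions else "no-op"


def from_legacy(open_searcher: bool, soft_commit: bool) -> ProposedCommit:
    """Assumed translation of the existing params into the proposed ones."""
    if soft_commit:
        # softCommit=true opened a searcher and skipped the flush.
        return ProposedCommit(open_searcher=True, flush=False)
    return ProposedCommit(open_searcher=open_searcher, flush=True)


for open_searcher in (True, False):
    for flush in (True, False):
        commit = ProposedCommit(open_searcher, flush)
        print(commit, "->", describe(commit))
```

Every cell now has a distinct, predictable meaning, and the only degenerate case is the explicit no-op.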
I've mainly been thinking about this from a user perspective the last few days, so I haven't had a chance to figure out how much this would impact the internals related to softCommit right now. I suspect there are a lot of places that would need to be tweaked, but hopefully most of them would just involve flipping logic (softCommit=true -> flush=false). The biggest challenges I can think of are:
- how to deal with the autocommit options in solrconfig.xml. In 3x we supported a single <autoCommit/> block. On the 4x branch we support one <autoCommit/> block and one <autoSoftCommit/> block – should we continue to do that? Would <autoSoftCommit/> just implicitly specify flush=false? Or should we try to generalize to support N <autoCommit/> blocks where <openSearcher/> and <flush/> are config options for all of them?
- event listeners – it looks like the SolrEventListener API had a postSoftCommit() method added to it, but it doesn't seem to be configurable in any way – I think this is just for tests, but if it's intentionally being exposed we would need to revamp it ... off the cuff I would suggest removing postSoftCommit() and changing the postCommit() method to take in some new structure specifying the options on the commit.
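The listener suggestion could be sketched roughly as below. This is a Python stand-in, not the Java SolrEventListener API: `CommitOptions`, `RecordingListener`, and `post_commit` are hypothetical names chosen to show a single callback receiving the commit's options instead of separate soft and hard commit callbacks.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CommitOptions:
    """Hypothetical structure describing the options used for a commit."""
    open_searcher: bool
    flush: bool


class RecordingListener:
    """Hypothetical listener with one callback for every kind of commit."""

    def __init__(self) -> None:
        self.seen: List[CommitOptions] = []

    def post_commit(self, options: CommitOptions) -> None:
        # The listener inspects the options rather than relying on
        # which method was invoked.
        self.seen.append(options)


listener = RecordingListener()
listener.post_commit(CommitOptions(open_searcher=True, flush=False))
listener.post_commit(CommitOptions(open_searcher=False, flush=True))
print(listener.seen)
```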
Thoughts?
![](https://i-blog.csdnimg.cn/blog_migrate/76f7b283229bb1f3e65df96396df450f.png)
Bulk fixing the version info for 4.0-ALPHA and 4.0; all affected issues have "hoss20120711-bulk-40-change" in a comment.
![](https://i-blog.csdnimg.cn/blog_migrate/76f7b283229bb1f3e65df96396df450f.png)
Anyone else have thoughts around this?
One performance concern of mine revolves around "commit" - the vast majority of people used it for visibility of documents, not for persistence at a specific time.
I'm warming to the idea of a "flush" param instead of softCommit, and it seems like perhaps it should default to "false" for 4.0
![](https://i-blog.csdnimg.cn/blog_migrate/76f7b283229bb1f3e65df96396df450f.png)
rmuir20120906-bulk-40-change
![](https://i-blog.csdnimg.cn/blog_migrate/76f7b283229bb1f3e65df96396df450f.png)
I agree we could clean this up.
I worry about flush since it used to mean something else.
![](https://i-blog.csdnimg.cn/blog_migrate/76f7b283229bb1f3e65df96396df450f.png)
> I worry about flush since it used to mean something else.
"persist" ?
![](https://i-blog.csdnimg.cn/blog_migrate/76f7b283229bb1f3e65df96396df450f.png)
Unassigned issues -> 4.1
![](https://i-blog.csdnimg.cn/blog_migrate/76f7b283229bb1f3e65df96396df450f.png)
Bulk move 4.4 issues to 4.5 and 5.0