Apache Solr vs. Elasticsearch (ES) Comparison

惹不起的程咬金

http://solr-vs-elasticsearch.com/

Apache Solr vs Elasticsearch

The Feature Smackdown

API

Feature	Solr 6.2.1	ElasticSearch 5.0
Format	XML, CSV, JSON	JSON
HTTP REST API
Binary API	SolrJ	TransportClient, Thrift (through a plugin)
JMX support		ES specific stats are exposed through the REST API
Official client libraries	Java	Java, Groovy, PHP, Ruby, Perl, Python, .NET, JavaScript (see the official list of clients)
Community client libraries	PHP, Ruby, Perl, Scala, Python, .NET, JavaScript, Go, Erlang, Clojure	Clojure, Cold Fusion, Erlang, Go, Groovy, Haskell, Java, JavaScript, .NET, OCaml, Perl, PHP, Python, R, Ruby, Scala, Smalltalk, Vert.x (see the complete list)
3rd-party product integration (open-source)	Drupal, Magento, Django, ColdFusion, WordPress, OpenCMS, Plone, Typo3, eZ Publish, Symfony2, Riak (via Yokozuna)	Drupal, Django, Symfony2, WordPress, CouchBase
3rd-party product integration (commercial)	DataStax Enterprise Search, Cloudera Search, Hortonworks Data Platform, MapR	SearchBlox, Hortonworks Data Platform, MapR, etc. (see the complete list)
Output	JSON, XML, PHP, Python, Ruby, CSV, Velocity, XSLT, native Java	JSON, XML/HTML (via plugin)
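As a rough illustration of the Output row: Solr picks its response writer per request through the `wt` parameter. The sketch below only builds the request URL and sends nothing. The host, port, and collection name `mycollection` are assumptions made for illustration and do not come from the comparison.

```python
from urllib.parse import urlencode

# Hypothetical local Solr instance and collection name (assumptions).
SOLR_SELECT = "http://localhost:8983/solr/mycollection/select"


def solr_select_url(query: str, writer: str = "json") -> str:
    """Build a Solr /select URL; `wt` names the response writer (json, xml, csv, ...)."""
    return SOLR_SELECT + "?" + urlencode({"q": query, "wt": writer})


if __name__ == "__main__":
    # Same query, three of the output formats listed in the table.
    for writer in ("json", "xml", "csv"):
        print(solr_select_url("*:*", writer))
```

Elasticsearch, by contrast, answers in JSON by default, which is why its Output cell is so much shorter.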

Infrastructure

Feature	Solr 6.2.1	ElasticSearch 5.0
Master-slave replication	Only in non-SolrCloud. In SolrCloud, behaves identically to ES.	Not an issue because shards are replicated across nodes.
Integrated snapshot and restore	Filesystem	Filesystem, AWS Cloud Plugin for S3 repositories, HDFS Plugin for Hadoop environments, Azure Cloud Plugin for Azure storage repositories

Indexing

Feature	Solr 6.2.1	ElasticSearch 5.0
Data Import	DataImportHandler - JDBC, CSV, XML, Tika, URL, Flat File	[DEPRECATED in 2.x] Rivers modules - ActiveMQ, Amazon SQS, CouchDB, Dropbox, DynamoDB, FileSystem, Git, GitHub, Hazelcast, JDBC, JMS, Kafka, LDAP, MongoDB, neo4j, OAI, RabbitMQ, Redis, RSS, Sofa, Solr, St9, Subversion, Twitter, Wikipedia
ID field for updates and deduplication
DocValues
Partial Doc Updates	with stored fields	with _source field
Custom Analyzers and Tokenizers
Per-field analyzer chain
Per-doc/query analyzer chain
Index-time synonyms		Supports Solr and Wordnet synonym format
Query-time synonyms	especially via hon-lucene-synonyms	Technically, yes, but practically no because multi-word/phrase query-time synonyms are not supported. See ES docs and hon-lucene-synonyms blog for nuances.
Multiple indexes
Near-Realtime Search/Indexing
Complex documents
Schemaless	4.4+
Multiple document types per schema	One set of fields per schema, one schema per core
Online schema changes	Schemaless mode or via dynamic fields.	Only backward-compatible changes.
Apache Tika integration
Dynamic fields
Field copying		via multi-fields
Hash-based deduplication		Murmur plugin or ER plugin
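The "Partial Doc Updates" row can be sketched as follows: Solr wraps each changed field in an atomic-update modifier such as `set`, while Elasticsearch merges a partial `doc` into the stored `_source`. The code only builds the request bodies. The unique key name `id` and the example field `price` are assumptions for illustration.

```python
import json


def solr_atomic_update(doc_id: str, changes: dict) -> str:
    """Body for Solr's /update handler: each changed field uses the "set" modifier.

    Assumes the schema's unique key field is called "id".
    """
    doc = {"id": doc_id}
    doc.update({field: {"set": value} for field, value in changes.items()})
    return json.dumps([doc])


def es_partial_update(changes: dict) -> str:
    """Body for the Elasticsearch _update endpoint: fields are merged into _source."""
    return json.dumps({"doc": changes})


if __name__ == "__main__":
    print(solr_atomic_update("1", {"price": 99}))
    print(es_partial_update({"price": 99}))
```

In both systems the engine rebuilds the whole document internally, which is why Solr needs the other fields to be stored and Elasticsearch needs `_source` to be enabled.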

Searching

Feature	Solr 6.2.1	ElasticSearch 5.0
Lucene Query parsing
Structured Query DSL	Need to programmatically create queries if going beyond Lucene query syntax.
Span queries	via SOLR-2703
Spatial/geo search
Multi-point spatial search
Faceting		Top N term accuracy can be controlled with shard_size
Advanced Faceting	New JSON faceting API as of Solr 5.x	blog post
Geo-distance Faceting
Pivot Facets
More Like This
Boosting by functions
Boosting using scripting languages
Push Queries	JIRA issue	Percolation. Distributed percolation supported in 1.0
Field collapsing/Results grouping
Query Re-Ranking		via Rescoring or a plugin
Index-based Spellcheck		Phrase Suggester
Wordlist-based Spellcheck
Autocomplete
Query elevation		workaround
Intra-index joins	via parent-child query	via has_child and top_children queries
Inter-index joins	Joined index has to be single-shard and replicated across all nodes.
Resultset Scrolling	New to 4.7.0	via scan search type
Filter queries		also supports filtering by native scripts
Filter execution order	local params and cache property
Alternative QueryParsers	DisMax, eDisMax	query_string, dis_max, match, multi_match etc
Negative boosting	but awkward. Involves positively boosting the inverse set of negatively-boosted documents.
Search across multiple indexes	it can search across multiple compatible collections
Result highlighting
Custom Similarity
Searcher warming on index reload		Warmers API
Term Vectors API
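To make the "Lucene Query parsing" and "Structured Query DSL" rows more concrete, here is a minimal sketch of one search intent expressed both ways: as Solr request parameters in Lucene syntax with a filter query, and as an Elasticsearch `bool` query in the JSON DSL. The field names `title` and `status` are hypothetical, and the two requests are only roughly equivalent because analysis and scoring depend on each schema.

```python
import json
from urllib.parse import urlencode


def solr_params(text: str, status: str) -> str:
    """Solr: main query in Lucene syntax (q) plus a filter query (fq)."""
    return urlencode({"q": f"title:{text}", "fq": f"status:{status}"})


def es_body(text: str, status: str) -> str:
    """Elasticsearch: a scored match clause plus a non-scoring term filter."""
    return json.dumps({
        "query": {
            "bool": {
                "must": [{"match": {"title": text}}],
                "filter": [{"term": {"status": status}}],
            }
        }
    })


if __name__ == "__main__":
    print(solr_params("solr", "published"))
    print(es_body("solr", "published"))
```

The Solr version stays a flat string, so anything beyond Lucene syntax has to be assembled programmatically, whereas the DSL nests clauses as plain JSON.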

Customizability

Feature	Solr 6.2.1	ElasticSearch 5.0
Pluggable API endpoints
Pluggable search workflow	via SearchComponents
Pluggable update workflow	via UpdateRequestProcessor
Pluggable Analyzers/Tokenizers
Pluggable QueryParsers
Pluggable Field Types
Pluggable Function queries
Pluggable scoring scripts
Pluggable hashing
Pluggable webapps		[site plugins DEPRECATED in 5.x] blog post
Automated plugin installation		Installable from GitHub, maven, sonatype or elasticsearch.org

Distributed

Feature	Solr 6.2.1	ElasticSearch 5.0
Self-contained cluster	Depends on separate ZooKeeper server	Only Elasticsearch nodes
Automatic node discovery	ZooKeeper	internal Zen Discovery or ZooKeeper
Partition tolerance	The partition without a ZooKeeper quorum will stop accepting indexing requests or cluster state changes, while the partition with a quorum continues to function.	Partitioned clusters can diverge unless discovery.zen.minimum_master_nodes is set to at least N/2+1, where N is the size of the cluster. If configured correctly, the partition without a quorum will stop operating, while the other continues to work.
Automatic failover	If all nodes storing a shard and its replicas fail, client requests will fail, unless requests are made with the shards.tolerant=true parameter, in which case partial results are returned from the available shards.
Automatic leader election
Shard replication
Sharding
Automatic shard rebalancing		Can be machine, rack, availability zone, and/or data center aware. Arbitrary tags can be assigned to nodes, and it can be configured not to place a shard and its replicas on nodes with the same tags.
Change # of shards	Shards can be added (when using implicit routing) or split (when using compositeId). Cannot be lowered. Replicas can be increased anytime.	Each index has 5 shards by default. Number of primary shards cannot be changed once the index is created. Replicas can be increased anytime.
Shard splitting
Relocate shards and replicas	can be done by creating a replica of the shard on the desired node and then removing the shard from the source node	can move shards and replicas to any node in the cluster on demand
Control shard routing	shards or _route_ parameter	routing parameter
Pluggable shard/replica assignment	Rule-based replica assignment	Probabilistic shard balancing with Tempest plugin
Consistency	Indexing requests are synchronous with replication. An indexing request won't return until all replicas respond. No check for downed replicas. They will catch up when they recover. When new replicas are added, they won't start accepting and responding to requests until they are finished replicating the index.	Replication between nodes is synchronous by default, thus ES is consistent by default, but it can be set to asynchronous on a per document indexing basis. Index writes can be configured to fail if there are not sufficient active shard replicas. The default is quorum, but all or one are also available.
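The N/2+1 rule from the "Partition tolerance" row can be worked through in a few lines. This is a sketch of the arithmetic only, using integer division; the function names are made up for illustration. The table counts N as the size of the cluster, and the code follows that wording.

```python
def minimum_master_nodes(cluster_size: int) -> int:
    """Quorum size N/2 + 1 (integer division) for a cluster of N nodes."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def has_quorum(visible_nodes: int, cluster_size: int) -> bool:
    """True if a partition that sees `visible_nodes` nodes may keep operating."""
    return visible_nodes >= minimum_master_nodes(cluster_size)


if __name__ == "__main__":
    # A 5-node cluster split 3/2: only the 3-node side keeps a quorum.
    print(minimum_master_nodes(5), has_quorum(3, 5), has_quorum(2, 5))
```

Because two disjoint partitions can never both reach N/2+1 nodes, at most one side of a split can continue, which is what keeps the two halves from diverging.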

Misc

Feature	Solr 6.2.1	ElasticSearch 5.0
Web Admin interface	bundled with Solr	Marvel or Kibana apps
Visualisation	Banana (Port of Kibana)	Kibana
Hosting providers	WebSolr, Searchify, Hosted-Solr, IndexDepot, OpenSolr, gotosolr	Found, ObjectRocket, bonsai.io, Indexisto, qbox.io, IndexDepot, Compose.io, Sematext Logsene
