mysql 递归查询太慢_Neo4j在执行递归查询时比MySQL慢

最新推荐文章于 2022-06-24 14:14:55 发布

Indian Rose

最新推荐文章于 2022-06-24 14:14:55 发布

阅读量391

点赞数

文章标签： mysql 递归查询太慢

本文链接：https://blog.csdn.net/weixin_31618799/article/details/113227509

版权

本文比较了Neo4j 3.1和MySQL在执行递归查询时的性能。实验中，两个数据库填充了相同的数据，每个客户有100个关系。在查询一个客户的所有四级朋友时，MySQL用时60秒，而Neo4j用时240秒。通过优化Neo4j查询，时间降低到40秒，但MySQL经过优化后的查询仅需24秒。文章探讨了是否能进一步调整Neo4j以提高性能。

摘要由CSDN通过智能技术生成

I would like to compare Neo4j(ver. 3.1) and MySQL in performing recursive queries. Therefore I created two tables in MySQL database - Customer and CustomerFriend.

Second table consists of CustomerID and FriendID columns, both of them point to CustomerID column in Customer table. In Neo4j were created corresponding entities:

Customer nodes and FRIEND_OF relations (c:Customer)-[f:FRIEND_OF]->(cc:Customer). Databases are filled with the same data:

100000 Customers, each Customer has 100 relations.

Executed below queries:

MySQL (60s)

SELECT distinct cf4.FriendID FROM customerfriend cf1

join customerfriend cf2 on cf1.FriendID = cf2.CustomerID

join customerfriend cf3 on cf2.FriendID = cf3.CustomerID

join customerfriend cf4 on cf3.FriendID = cf4.CustomerID

where cf1.CustomerID =99;

Neo4j (240s)

match (c:Customer{CustomerID:99})-[:FRIEND_OF*4]->(cc:Customer)

return distinct cc.CustomerID;

Queries are run from simple Java app, which just connect to database (using available connectors), run queries, and measure execution times.

Measured times clearly indicate that Neo4j is slower in performing above queries than MySQL (MySQL 60s, Neo4j 240s). I have tested above queries for 50 relations per Customer and I achieved same results (MySQL 7s faster than Neo4j 17s ).

I read some articles about performing recursive queries in Neo4j which indicate that Neo4j should manage better for this type of queries than MySQL. That's why I have started wondering if I'm doing something wrong or

execution times are proper (??).

I'm wondering if in Neo4j exists any possibilities to tune system performance. In case of MySQL I set up innodb_buffer_pool_size to 3g which affected better query performance(shorter execution time).

--------------------------------EDIT---------------------------

I have considered below suggestions to rewrite my Noe4j query to new form:

match (c:Customer{CustomerID:99})-[:FRIEND_OF]->(c1)-[:FRIEND_OF]->(c2)

with distinct c2

match (c2)-[:FRIEND_OF]->(c3)

with distinct c3

match (c3)-[:FRIEND_OF]->(cc:Customer)

with distinct cc

return cc.CustomerID;

And achieved better query time: 40s

In case of MySQL I have figured out way to optimise previous query, similar to idea of Neo4j query optimisation:

select distinct FriendID as depth4

from customerfriend

where CustomerID in

(select distinct FriendID as depth3

from customerfriend

where CustomerID in

(select distinct FriendID as depth2

from customerfriend

where CustomerID in

(select distinct FriendID as depth

from customerfriend

where CustomerID =99

)));

And execution of this query took 24s

Neo4j still worse than MySQL...

解决方案

You can make a small modification to make neo4j about 50% faster, or for even more speed, use the bitset dance shown on the bottom of this blog post => https://maxdemarzi.com/2013/12/31/the-power-of-open-source-software/

Update:

I went ahead and built a custom procedure for you.

It takes 2.9 seconds on my laptop with 10002045 relationships.

Second Update: