For testing purposes, I am using a Windows 7 laptop with the following specs:
I am also using neo4j-enterprise-3.0.2, and I connect a PyCharm Python program to the database over a Bolt connection. I create many nodes and relationships. I noticed that the creation of nodes and relationships slows down considerably after a while, and that after a certain point there is hardly any progress at all.
I checked the following: I use a uniqueness constraint on the nodes and properties, so that nodes can be looked up in the database efficiently.
I noticed that my RAM usage keeps increasing while all these transactions take place. I tried different settings for dbms.memory.pagecache.size in the Neo4j configuration file (the default, 2g, 3g, 10g); with all of them my RAM usage grows from about 4 GB (with no Python code running) to 7 GB and above, which is when node creation becomes very slow. When I stop the program, RAM usage drops again.
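Note that the page cache is not the only memory consumer: the JVM heap, which holds open transaction state, is sized separately. A sketch of capping both (in Neo4j 3.0 the heap is typically configured in conf/neo4j-wrapper.conf rather than neo4j.conf; the values below are illustrative, not recommendations):

```
# conf/neo4j.conf — memory-mapped page cache for the store files
dbms.memory.pagecache.size=2g

# conf/neo4j-wrapper.conf — JVM heap, which holds transaction state (values in MB)
dbms.memory.heap.initial_size=2048
dbms.memory.heap.max_size=2048
```

If RAM grows well past pagecache + heap, the growth is likely on the client side or in OS-level caching rather than inside the JVM.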
This is what the health monitor shows me:
Q: Why does the creation of nodes and relationships slow down so much? Is it because of the size of the graph (the dataset seems far too small for that)? Is it related to the Bolt connection and the database transactions? Is it related to the growing memory usage? And how can I prevent it?
I created a simple example to illustrate the problem:

from neo4j.v1 import GraphDatabase
# Bolt driver
driver = GraphDatabase.driver("bolt://localhost")
session = driver.session()
# start with an empty database
stmtDel = "MATCH (n) OPTIONAL MATCH (n)-[r]-() DELETE n,r"
session.run(stmtDel)
# add uniqueness constraint
stmt = 'CREATE CONSTRAINT ON (d:Day) ASSERT d.name IS UNIQUE'
session.run(stmt)
# Create many nodes - run either option 1 or option 2
# Option 1: Create the nodes one by one. This is slow to execute but keeps RAM flat (no increase)
# for i in range(1, 80001):
#     stmt1 = 'CREATE (d:Day) SET d.name = {name}'
#     params = {"name": str(i)}
#     print(i)
#     session.run(stmt1, params)
# Option 2: Increase the speed by submitting many statements per transaction - e.g. 1000
# This is very fast but blows up the RAM in no time, even with the page cache limited to 2GB
tx = session.begin_transaction()
for i in range(1, 80001):
    stmt1 = 'CREATE (d:Day) SET d.name = {name}'
    params = {"name": str(i)}  # renamed from `dict`, which shadows the built-in
    tx.run(stmt1, params)  # add the statement to the current transaction
    print(i)
    if divmod(i, 1000)[1] == 0:  # every one thousand statements, commit the block and start a new transaction
        tx.commit()
        tx.close()  # original code had `tx.close` without parentheses, which does nothing
        tx = session.begin_transaction()  # it seems that reusing the session keeps growing the RAM
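One way around both problems (per-statement round trips in option 1, per-statement state piling up in option 2) is to send each batch as a single parameterised UNWIND statement, so a transaction contains one Cypher call instead of a thousand. A minimal sketch, assuming the same local Bolt setup as above; `chunked` and `load_days` are hypothetical helper names, and the batch size of 1000 is an assumption:

```python
def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def load_days(session, n=80000, batch_size=1000):
    # One round trip per batch; the parameterised query keeps the plan cached.
    stmt = 'UNWIND {names} AS name CREATE (d:Day) SET d.name = name'
    names = [str(i) for i in range(1, n + 1)]
    for batch in chunked(names, batch_size):
        session.run(stmt, {"names": batch})

# Usage (assumes a running Neo4j 3.0 instance reachable over Bolt):
# from neo4j.v1 import GraphDatabase
# driver = GraphDatabase.driver("bolt://localhost")
# load_days(driver.session())
```

Each `session.run` here is its own auto-committed transaction, so no long-lived transaction state accumulates between batches.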