py2neo create relationships: py2neo / Neo4j SystemError when batch-creating nodes and relationships

I am attempting to batch-create nodes and relationships, but the batch creation fails. The traceback is at the end of the post.

Note that the code works with a smaller subset of nodes. It fails once the number of relationships becomes very large, and it is unclear at what limit this occurs.

I am wondering whether I need to increase ulimit above 40,000 open files.

I read somewhere that people were running into Xstream issues with the REST API while doing batch creates. It is unclear whether the problem lies with py2neo, with the Neo4j server tuning/configuration, or with Python itself.

Any guidance would be greatly appreciated.

One cluster within the data set ends up with around 625,525 relationships among 700+ nodes.

Total relationships will be 1M+. Hardware: Apple MacBook Pro Retina (x86_64) running Ubuntu 13.04, SSD, 8 GB memory.

Neo4j: auto_indexing and auto_relationships both set to ON

Nodes: clustered/grouped via Python pandas DataFrame.groupby()

Nodes: 3 properties each

Relationship properties: 1 -> IN and OUT relationships created

ulimit: set to 40,000 open files
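Since ulimit is set per shell, it can help to confirm the limit that the Python process itself inherits. The following is a minimal sketch using the standard-library resource module (Unix only); the function name open_file_limits is made up for illustration and is not part of the original script.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits seen by this process.

    Hypothetical helper for illustration; Unix only.
    """
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = open_file_limits()
    # The soft limit is the one that is actually enforced; it may differ
    # from the value set with ulimit in another shell.
    print("open files: soft=%s hard=%s" % (soft, hard))
```

If the soft limit printed here is lower than expected, the script was probably started from a shell or IDE that did not pick up the raised ulimit.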

Code

Operating System: Ubuntu 13.04

Python version: 2.7.5

py2neo Version: 1.5.1

Java version: 1.7.0_25-b15

Neo4j version: Community Edition 1.9.2

Traceback

Traceback (most recent call last):
  File "/home/alienone/Programming/Python/OSINT/MANDIANTAPT/spitball.py", line 63, in <module>
    main()
  File "/home/alienone/Programming/Python/OSINT/MANDIANTAPT/spitball.py", line 59, in main
    graph_db.create(*sorted_nodes)
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/neo4j.py", line 420, in create
    return batch.submit()
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/neo4j.py", line 2123, in submit
    for response in self._submit()
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/neo4j.py", line 2092, in _submit
    for id, request in enumerate(self.requests)
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/rest.py", line 428, in _send
    return self._client().send(request)
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/rest.py", line 365, in send
    return Response(request.graph_db, rs.status, request.uri, rs.getheader("Location", None), rs_body)
  File "/home/alienone/.pythonbrew/pythons/Python-2.7.5/lib/python2.7/site-packages/py2neo/rest.py", line 279, in __init__
    raise SystemError(body)
SystemError: None

Process finished with exit code 1

Solution

I had a similar issue. One way to deal with it is to call batch.submit() on chunks of your data rather than on the whole data set. This is slower, of course, but splitting one million nodes into chunks of 5000 is still faster than adding every node separately.

I use a small helper class to do this; note that all my nodes are indexed: https://gist.github.com/anonymous/6293739
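The chunking idea can be sketched independently of py2neo. This is not the helper class from the gist, only a minimal illustration. The names chunked and submit_in_chunks are made up for this example, and submit_chunk stands for whatever callable actually sends one batch to the server.

```python
from itertools import islice


def chunked(items, chunk_size):
    """Yield successive lists of at most chunk_size items from any iterable."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def submit_in_chunks(items, submit_chunk, chunk_size=5000):
    """Call submit_chunk once per chunk and collect everything it returns.

    submit_chunk is an assumption: any callable that takes a list of
    items, sends them as one batch, and returns a list of results.
    """
    results = []
    for chunk in chunked(items, chunk_size):
        results.extend(submit_chunk(chunk))
    return results
```

With the script from the question, submit_chunk could be something like lambda chunk: graph_db.create(*chunk), so that each call submits at most 5000 entities. One caveat, assuming relationships refer to nodes created in the same batch: nodes and the relationships that depend on them must then end up in the same chunk, or the nodes must be created first and the relationships built from the returned node objects.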
