Neo4j:LOAD CSV –处理CSV文档中的隐藏数组

最近有人问我如何使用Neo4j的LOAD CSV工具在CSV文件的列中处理值的“数组”,尽管我最初认为这不可能,因为每个单元格都被视为字符串, 迈克尔向我展示了一种方法我认为这很整洁。
假设我们有一个代表人和他们的朋友的CSV文件。 它可能看起来像这样:



name,friends
"Mark","Michael,Peter"
"Michael","Peter,Kenny"
"Kenny","Anders,Michael"

我们想要的是具有以下节点:

  • 标记
  • 麦可
  • 彼得
  • 肯尼
  • 安德斯

与以下朋友的关系:

  • 马克->迈克尔
  • 马可->彼得
  • 迈克尔->彼得
  • 迈克尔->肯尼
  • 肯尼->安德斯
  • 肯尼->迈克尔

我们将从加载CSV文件并返回每一行开始:

$ load csv with headers from "file:/Users/markneedham/Desktop/friends.csv" AS row RETURN row;
+------------------------------------------------+
| row                                            |
+------------------------------------------------+
| {name -> "Mark", friends -> "Michael,Peter"}   |
| {name -> "Michael", friends -> "Peter,Kenny"}  |
| {name -> "Kenny", friends -> "Anders,Michael"} |
+------------------------------------------------+
3 rows

正如预期的那样,“朋友”列被视为字符串,这意味着我们可以使用split函数获取要与之成为朋友的人的数组:

$ load csv with headers from "file:/Users/markneedham/Desktop/friends.csv" AS row RETURN row, split(row.friends, ",") AS friends;
+-----------------------------------------------------------------------+
| row                                            | friends              |
+-----------------------------------------------------------------------+
| {name -> "Mark", friends -> "Michael,Peter"}   | ["Michael","Peter"]  |
| {name -> "Michael", friends -> "Peter,Kenny"}  | ["Peter","Kenny"]    |
| {name -> "Kenny", friends -> "Anders,Michael"} | ["Anders","Michael"] |
+-----------------------------------------------------------------------+
3 rows

现在我们已经将它们作为数组了,我们可以使用UNWIND来获取要创建的朋友对:

$ load csv with headers from "file:/Users/markneedham/Desktop/friends.csv" AS row WITH row, split(row.friends, ",") AS friends UNWIND friends AS friend RETURN row.name, friend;
+-----------------------+
| row.name  | friend    |
+-----------------------+
| "Mark"    | "Michael" |
| "Mark"    | "Peter"   |
| "Michael" | "Peter"   |
| "Michael" | "Kenny"   |
| "Kenny"   | "Anders"  |
| "Kenny"   | "Michael" |
+-----------------------+
6 rows

现在,我们将介绍一些MERGE语句来创建适当的节点和关系:

$ load csv with headers from "file:/Users/markneedham/Desktop/friends.csv" AS row WITH row, split(row.friends, ",") AS friends UNWIND friends AS friend  MERGE (p1:Person {name: row.name}) MERGE (p2:Person {name: friend}) MERGE (p1)-[:FRIENDS_WITH]->(p2);
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 5
Relationships created: 6
Properties set: 5
Labels added: 5
373 ms

现在,如果我们查询数据库以获取所有节点和关系…

$ match (p1:Person)-[r]->(p2) RETURN p1,r, p2;
+------------------------------------------------------------------------+
| p1                      | r                  | p2                      |
+------------------------------------------------------------------------+
| Node[0]{name:"Mark"}    | :FRIENDS_WITH[0]{} | Node[1]{name:"Michael"} |
| Node[0]{name:"Mark"}    | :FRIENDS_WITH[1]{} | Node[2]{name:"Peter"}   |
| Node[1]{name:"Michael"} | :FRIENDS_WITH[2]{} | Node[2]{name:"Peter"}   |
| Node[1]{name:"Michael"} | :FRIENDS_WITH[3]{} | Node[3]{name:"Kenny"}   |
| Node[3]{name:"Kenny"}   | :FRIENDS_WITH[4]{} | Node[4]{name:"Anders"}  |
| Node[3]{name:"Kenny"}   | :FRIENDS_WITH[5]{} | Node[1]{name:"Michael"} |
+------------------------------------------------------------------------+
6 rows

…您会看到我们拥有一切。 如果不是用逗号分隔的人员列表,我们在单元格中有一个文字数组…

name,friends
"Mark", "[Michael,Peter]"
"Michael", "[Peter,Kenny]"
"Kenny", "[Anders,Michael]"

…我们需要调整查询的一部分,以提取我们的朋友以剥离第一个和最后一个字符:

$ load csv with headers from "file:/Users/markneedham/Desktop/friendsa.csv" AS row RETURN row, split(substring(row.friends, 1, length(row.friends) -2), ",") AS friends;
+-------------------------------------------------------------------------+
| row                                              | friends              |
+-------------------------------------------------------------------------+
| {name -> "Mark", friends -> "[Michael,Peter]"}   | ["Michael","Peter"]  |
| {name -> "Michael", friends -> "[Peter,Kenny]"}  | ["Peter","Kenny"]    |
| {name -> "Kenny", friends -> "[Anders,Michael]"} | ["Anders","Michael"] |
+-------------------------------------------------------------------------+
3 rows

然后,如果我们将整个查询放在一起,我们将得到以下结果:

$ load csv with headers from "file:/Users/markneedham/Desktop/friendsa.csv" AS row WITH row, split(substring(row.friends, 1, length(row.friends) -2), ",") AS friends UNWIND friends AS friend  MERGE (p1:Person {name: row.name}) MERGE (p2:Person {name: friend}) MERGE (p1)-[:FRIENDS_WITH]->(p2);;
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 5
Relationships created: 6
Properties set: 5
Labels added: 5

翻译自: https://www.javacodegeeks.com/2014/07/neo4j-load-csv-processing-hidden-arrays-in-your-csv-documents.html

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值