图数据库Neo4j实现人脉推荐——二度人脉

35 篇文章 2 订阅
13 篇文章 0 订阅

“吾尝终日而思矣,不如须臾之所学也;吾尝跂而望矣,不如登高之博见也。登高而招,臂非加长也,而见者远;顺风而呼,声非加疾也,而闻者彰。假舆马者,非利足也,而致千里;假舟楫者,非能水也,而绝江河。君子生非异也,善假于物也。”


业务需求:

通过现有系统“好友关系”和“用户通讯录”数据,实现人脉推荐——二度人脉….六度人脉

技术实现分析:

  1. 关系数据库(深度关联表,算死人)
  2. 图数据库(天然图关系,选择Neo4j)

业务实体模型及规模 :

  1. 实体——顶点:用户(数百万)、手机号码(数亿)
  2. 关系——边:好友双向关系(数亿)、手机属于某个注册用户(数百万)、用户拥有手机号码(数亿)

    事例(Neo4j中只存储关系相关数据):

CREATE (p1:Person {uid:122622,identityType:1,personIUCode:'0101'})
CREATE (p2:Person {uid:122623,identityType:1,personIUCode:'0201'})
CREATE (p3:Person {uid:122624,identityType:2,personIUCode:'0301'})

CREATE (m1:Mobile {num:'1580084xxxx'})
CREATE (m2:Mobile {num:'1680084xxxx'})
CREATE (m3:Mobile {num:'1780084xxxx'})


CREATE
  /**p1分别和p2、p3互为双向好友*/ 
  (p1)-[:Fellow]->(p2),
  (p2)-[:Fellow]->(p1),

  (p1)-[:Fellow]->(p3),
  (p3)-[:Fellow]->(p1),

  /**用户p1通讯录中有m1、m2、m3三个联系人*/
  (p1)-[:Has]->(m1),
  (p1)-[:Has]->(m2),
  (p1)-[:Has]->(m3),

  (p2)-[:Has]->(m3),

  /**用户p1使用手机m1注册*/
  (m1)-[:Belong]->(p1)

实施方案&计划:

第一数据:

    1>找好服务器资源&安装配置好Neo4j数据库(2464G内存,Neo4j配置20G)
    2>从mysql、mongodb中导出指定格式的用户、好友关系、通讯录CSV文件数据(导出所需时间?)
    3>导入CSV数据进入Neo4j数据库(导入时间?)

第二研发:

    1.开发、部署好Neo4j独立应用服务GraphServer
        >服务1——addUser:新增用户Node
        >服务2——updateUserIUCode:修改Neo4j中用户行业
        >服务3——addFriend:新增用户好友关系
        >服务4——delFriend:删除好友关系
        >服务5——getFriendOfFriend:好友的好友
        >服务6——synContactList2Neo4j:同步通讯录
    2.业务APP服务端应用接入GraphServer
        >修改1:用户保存profile时,调用GraphServer.addUser()
        >修改2:用户修改行业时,调用GraphServer.updateUserIUCode()
        >修改3:用户同意添加好友时,调用GraphServer.addFriend()
        >修改4:用户删除好友时,调用GraphServer.delFriend(),同时删除Reids缓存中推荐的此UID
        >修改5:增加人脉推荐(优先查询Redis缓存,如有随机获取10;如没或少于10则调用GraphServer.getFriendOfFriend()获取数据且更新保存到Redis)

导入数据:

  • 用户数据(Node):
[root@docker-vm yidejun]# head uidNew.csv 
uid:ID(user-ID),identityType,personIUCode,:LABEL
100002,5,"0309",Person
100004,1,"0105",Person
100025,4,"0111",Person
100001,4,"0101",Person
100055,7,"0101",Person
100058,3,"0521",Person
100065,0,"0116",Person
100069,9,"0116",Person
100400,0,"9921",Person
  • 手机数据(Node):
[root@docker-vm yidejun]# head mobile3_20161124.csv
num:ID(mobile-ID),:LABEL
"1309957XXXX",M
"1319509XXXX",M
"1323959XXXX",M
"1340950XXXX",M
"1340951XXXX",M
"1346950XXXX",M
"1346951XXXX",M
  • 好友关系数据(Relationship):
[root@docker-vm yidejun]# head dejun.csv
:START_ID(user-ID),:END_ID(user-ID),:TYPE
111360,109378,F
111360,106615,F
125952,119734,F
125952,119737,F
125952,119735,F
125952,109378,F
125952,118472,F
126208,126003,F
126208,110944,F
  • 手机归属数据(Relationship):
[root@docker-vm yidejun]# head mobile-belong-user.csv
:START_ID(mobile-ID),:END_ID(user-ID),:TYPE
"1391792XXXX",100002,Belong
"1399999XXXX",100004,Belong
"1860215XXXX",100025,Belong
"1502134XXXX",100001,Belong
"1331010XXXX",100055,Belong
  • 通讯录被拥有数据(Relationship):
[root@docker-vm yidejun]# head data4.csv
:START_ID(user-ID),:END_ID(mobile-ID),:TYPE
10000001,"1309957XXXX",H
10000001,"1319509XXXX",H
10000001,"1323959XXXX",H
10000001,"1340950XXXX",H
10000001,"1340951XXXX",H
10000002,"1346950XXXX",H
10000002,"1346951XXXX",H
  • 导入脚本:
./bin/neo4j-import --into /usr/local/neo4j-community-3.0.7/data/databases/my.db --nodes /home/yidejun/uidNew.csv --nodes /home/yidejun/mobile3_20161124.csv  --relationships /home/yidejun/dejun.csv --relationships /home/yidejun/mobile-belong-user.csv --relationships /home/yidejun/data4.csv --skip-bad-relationships=true --stacktrace --bad-tolerance=50000 --skip-duplicate-nodes=true

这里写图片描述
这里写图片描述

  • 导入效果(导入后启动Neo4j服务器,浏览器访问):
    这里写图片描述

数据验证 :

  • 节点总数量
start n=node(*) return count(*)
  • 创建节点(保持uid唯一)
MERGE (n:Person {uid:'122'}) RETURN n;
start n=node:node_auto_index(uid='122') with count(*) as exists
where exists=0 create (n:Person {uid='122'}) return n;
  • 创建节点&关系
MERGE (one:Person {uid:'1222'}),(two:Person {uid:'1000'}) with one,two CREATE one-[:F]->two;
  • 删除关系(双向——不管什么关系)
Match (x:Person{uid:'1222'})-[r:F]-(y:Person{uid:'1223'}) delete  r;
  • 计算无关系的节点数
MATCH (player) WHERE NOT (player)-[:F]-() RETURN count(*)
match (u:Person {uid:'122622'}) set u.personIUCode='0205' return u
  • 我的通讯录联系人在系统中注册,但和我还不是好友(我有他的手机号码)
match (n:Person {uid:1222})-[:H]->(m)-[:Belong]->(b) where not (n)-[:F]-(b)   return b.uid
  • 拥有我的号码,但不是好友(他有我的手机号码)
match (n:Person {uid:1222})<-[:Belong]-(m)<-[:H]-(b) where not (n)-[:F]-(b)   return b.uid
  • 可能认识(我们都有同一个手机号码)
match (n:Person {uid:122})-[:H]->(m)<-[:H]-(b) with m,b where not (n)-[:F]-(b)   return DISTINCT  b
  • 有N个共同好友
MATCH (a:Person {uid:{1}})-[:F]->(friend)<-[:F]-(b:Person) RETURN b.uid AS uid,COUNT(*) AS comCount,4 AS type ORDER BY comCount DESC SKIP {2} LIMIT {3}
  • 有N个共同好友列表
match (a:Person {uid:'122622'})-[:F]->(friend)<-[:F]-(b:Person {uid:'100004'}) return friend
  • 好友的好友中交集最多的
Match (user:Person {uid:{1}}) match (user)-[:F]->(friend:Person {personIUCode:{2}})-[:F]->(fof:Person {personIUCode:{2}}) with user,fof OPTIONAL Match (user)-[r]-(fof) with fof,r where r is null  RETURN fof.uid as uid, COUNT(*) as count ORDER BY COUNT(*) DESC, fof.uid limit 20
  • 最短路径问题
Match(a:Person {uid:122}) Match(b:Person {uid:100})  p= shortpath((a) -[:F*1..2]->(b)) return a,b,p
MATCH (n:Person { uid:122 })-[:F*1..2]-(neighbors) RETURN n, collect(DISTINCT neighbors)
MATCH p=allShortestPaths((a:Person { uid: 1222})-[:F*..4]-(b:Person { uid: 123})) RETURN p

踩过的坑 :

  • 服务器句柄数设置
Starting Neo4j.
WARNING: Max 1024 open files allowed, minimum of 40000 recommended. See the Neo4j manual.
  • 导入数据——明确指定相关ID
    这里写图片描述

  • 导入数据——重要参数设置,否则莫名毫无提示的停止导入
    –skip-bad-relationships=true
    –stacktrace
    –bad-tolerance=50000
    –skip-duplicate-nodes=true

  • 版本兼容之OPTION MATCH 正确使用,参考https://dzone.com/articles/new-neo4j-optional
  • LOAD CSV超级慢
LOAD CSV WITH HEADERS FROM "file:/home/yidejun/account.csv" AS line RETURN COUNT(*);

LOAD CSV WITH HEADERS FROM "file:/home/yidejun/account.csv" AS line RETURN line LIMIT 5;

LOAD CSV WITH HEADERS FROM "file:/home/yidejun/account.csv" AS line FIELDTERMINATOR '\t' RETURN line.account LIMIT 5;

LOAD CSV WITH HEADERS FROM "file:/home/yidejun/account.csv" AS line FIELDTERMINATOR '\t' RETURN line.uid LIMIT 5;
USING PERIODIC COMMIT 100 LOAD CSV WITH HEADERS FROM "file:/home/yidejun/account.csv" AS line FIELDTERMINATOR '\t' WITH line CREATE (account:Account {account:line.account}) WITH line,account  MATCH (u:Person {uid:line.uid}) WITH account,u CREATE (account)-[:Belong]->(u);

主要参考:

JAVA代码:

评论 7
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值