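Important: the secretpad example on GitHub uses SecretFlow 1.3.0, not the latest 1.5.0, and the API changed slightly between 1.3 and 1.5. I was stuck here for an hour before I traced the bug to its cause: the two tutorial videos use different SecretFlow versions, so the arguments of some API calls differ slightly. Check the official SecretFlow documentation for both versions:
SecretFlow | SecretFlow v1.3.0b0 | 隐语 SecretFlow
SecretFlow | SecretFlow v1.5.0.dev240312 | 隐语 SecretFlow
Compare the exact API signatures of the two versions there instead of blindly copying a tutorial.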
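Because the mismatch is easy to miss, it can help to check which version is actually installed before following any tutorial. The following is a minimal sketch, assuming the package is installed under the distribution name secretflow; the helper names major_minor and installed_version are made up for this illustration.

```python
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from strings such as '1.3.0b0' or '1.5.0.dev240312'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def installed_version(dist_name: str = "secretflow") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


TUTORIAL_VERSION = "1.3.0"  # the version this walkthrough targets

found = installed_version()
if found is None:
    print("secretflow is not installed in this environment")
elif major_minor(found) != major_minor(TUTORIAL_VERSION):
    print(f"installed {found}, but this walkthrough targets {TUTORIAL_VERSION}")
else:
    print(f"installed {found} matches {TUTORIAL_VERSION}")
```

Comparing only major and minor is a deliberate simplification: the differences described here are between the 1.3 and 1.5 lines, not between patch releases.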
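First, configure the PSI nodes alice and bob. Note that this setup uses Docker's bridge network mode and fixes the subnet and each node's IP address directly in the Docker YAML file. In my testing this works without any problems. The SecretFlow version I use here is 1.3.0.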
services:
  alice:
    image: 'secretflow/secretnote:unstable-amd64'
    platform: linux/amd64
    environment:
      - SELF_PARTY=alice
      - ALL_PARTIES=alice,bob
    ports:
      - 8090:8888
    entrypoint: /root/scripts/start.sh
    volumes:
      - /root/scripts
    networks:
      mynetwork:
        ipv4_address: 192.168.0.10
  bob:
    image: 'secretflow/secretnote:unstable-amd64'
    platform: linux/amd64
    environment:
      - SELF_PARTY=bob
      - ALL_PARTIES=alice,bob
    ports:
      - 8092:8888
    volumes:
      - /root/scripts
    entrypoint: /root/scripts/start.sh
    networks:
      mynetwork:
        ipv4_address: 192.168.0.20
networks:
  mynetwork:
    driver: bridge
    ipam:
      config:
        - subnet: 192.168.0.0/24
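Here I use the subnet 192.168.0.0/24, with alice at 192.168.0.10 and bob at 192.168.0.20. Fixing the IPs up front saves you from running ip addr inside the containers later and getting the network segments mixed up.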
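A quick way to double-check an address plan like this one before starting the containers is Python's standard ipaddress module. This is only a sketch; the function name check_static_ips is hypothetical, and the addresses are the ones from the YAML file above.

```python
import ipaddress

SUBNET = ipaddress.ip_network("192.168.0.0/24")
NODE_IPS = {"alice": "192.168.0.10", "bob": "192.168.0.20"}


def check_static_ips(subnet, node_ips):
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []
    seen = {}
    for name, ip_text in node_ips.items():
        ip = ipaddress.ip_address(ip_text)
        if ip not in subnet:
            problems.append(f"{name}: {ip} is outside {subnet}")
        elif ip in (subnet.network_address, subnet.broadcast_address):
            problems.append(f"{name}: {ip} is a reserved address")
        if ip in seen:
            problems.append(f"{name}: {ip} is already used by {seen[ip]}")
        seen[ip] = name
    return problems


print(check_static_ips(SUBNET, NODE_IPS))
```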
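The code below follows Zhang Lei's PSI example, but is written for 1.3.0 rather than the 1.5.0 he uses. In my testing it works without any problems.
Find an unused port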
import socket
from contextlib import closing
from typing import cast


def tcp_port() -> int:
    # Binding to port 0 lets the OS pick a currently unused port.
    with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as sock:
        sock.bind(("", 0))
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        return cast(int, sock.getsockname()[1])


print(tcp_port())
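Each node in this walkthrough ends up needing two ports, one for the address passed to sf.init and one for the SPU node. Below is a sketch of a variant that returns several distinct ports in one call by keeping every socket open until all of them are allocated. The name unused_tcp_ports is hypothetical. As with the function above, a port is only known to be free at the moment of the call, so another process could still grab it before SecretFlow binds it.

```python
import socket
from contextlib import ExitStack, closing
from typing import List


def unused_tcp_ports(count: int) -> List[int]:
    """Return `count` distinct ports that were free at the time of the call."""
    with ExitStack() as stack:
        ports = []
        for _ in range(count):
            sock = stack.enter_context(
                closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
            )
            sock.bind(("", 0))  # port 0 lets the OS pick a free port
            ports.append(sock.getsockname()[1])
        # Every socket is still open here, so the OS cannot have handed out
        # the same port twice within this call.
        return ports


ray_fed_port, spu_port = unused_tcp_ports(2)
print(ray_fed_port, spu_port)
```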
Alice's node
import secretflow as sf

cluster_config = {
    'parties': {
        'alice': {
            'address': '192.168.0.10:47323',
            'listen_addr': '0.0.0.0:47323'
        },
        'bob': {
            'address': '192.168.0.20:35285',
            'listen_addr': '0.0.0.0:35285'
        },
    },
    'self_party': 'alice'
}

# Shut down any previous session before initializing again.
sf.shutdown()
sf.init(address='local', cluster_config=cluster_config)
Bob's node
import secretflow as sf

cluster_config = {
    'parties': {
        'alice': {
            'address': '192.168.0.10:47323',
            'listen_addr': '0.0.0.0:47323'
        },
        'bob': {
            'address': '192.168.0.20:35285',
            'listen_addr': '0.0.0.0:35285'
        },
    },
    'self_party': 'bob'
}

# Shut down any previous session before initializing again.
sf.shutdown()
sf.init(address='local', cluster_config=cluster_config)
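The two configurations above are identical except for self_party, so a typo in one copy is easy to introduce and hard to spot. One option is to generate both from a single address table. This is a sketch in plain Python with no SecretFlow import; PARTY_ADDRESSES and build_cluster_config are hypothetical names, and the resulting dict has the same shape as the cluster_config used above.

```python
from typing import Dict

# Hypothetical single source of truth for the addresses used in this post.
PARTY_ADDRESSES: Dict[str, str] = {
    "alice": "192.168.0.10:47323",
    "bob": "192.168.0.20:35285",
}


def build_cluster_config(self_party: str, addresses: Dict[str, str]) -> dict:
    """Build a dict shaped like the cluster_config above for one party."""
    if self_party not in addresses:
        raise ValueError(f"unknown party: {self_party}")
    parties = {}
    for name, address in addresses.items():
        port = address.rsplit(":", 1)[1]
        # Listen on all interfaces, on the same port that peers connect to.
        parties[name] = {"address": address, "listen_addr": f"0.0.0.0:{port}"}
    return {"parties": parties, "self_party": self_party}


alice_config = build_cluster_config("alice", PARTY_ADDRESSES)
bob_config = build_cluster_config("bob", PARTY_ADDRESSES)
print(alice_config["parties"] == bob_config["parties"])
```

On each node, the matching dict would then be passed as cluster_config to sf.init, exactly as in the hand-written version.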
SPU initialization
import spu

cluster_def = {
    'nodes': [{
        'party': 'alice',
        'address': '192.168.0.10:49873'
    }, {
        'party': 'bob',
        'address': '192.168.0.20:34189'
    }],
    'runtime_config': {
        'protocol': spu.spu_pb2.SEMI2K,
        'field': spu.spu_pb2.FM128,
    }
}

# Note: from here on, the name spu refers to the SPU device, not the spu module.
spu = sf.SPU(
    cluster_def,
    link_desc={
        'connect_retry_times': 60,
        'connect_retry_interval_ms': 1000,
    }
)
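Note that the SPU nodes above use different ports from the addresses passed to sf.init. Reusing a host:port pair would presumably fail, since that port is already bound by then. A small sketch for catching such a collision before starting anything; the helper name duplicated is hypothetical, and the addresses are copied from the configurations above.

```python
from collections import Counter
from typing import Iterable, List

ray_fed_addresses = ["192.168.0.10:47323", "192.168.0.20:35285"]
spu_addresses = ["192.168.0.10:49873", "192.168.0.20:34189"]


def duplicated(addresses: Iterable[str]) -> List[str]:
    """Return every host:port string that appears more than once."""
    counts = Counter(addresses)
    return sorted(addr for addr, n in counts.items() if n > 1)


print(duplicated(ray_fed_addresses + spu_addresses))
```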
Alice's data:
import pandas as pd
from pathlib import Path

alice_df = pd.DataFrame({
    'name': [2, 3, 100, 101],
    'age': [15, 13, 20, 21],
})
alice_df.to_csv(f"{str(Path.home())}/alice_input.csv", index=False)
Bob's data:
import pandas as pd
from pathlib import Path

bob_df = pd.DataFrame({
    'name': [2, 3, 4, 5],
    'sex': [1, 0, 0, 1],
})
bob_df.to_csv(f"{str(Path.home())}/bob_input.csv", index=False)
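Private set intersection (PSI)
Note that psi_csv takes different parameters in 1.3.0 and in 1.5.0: in one version the path arguments are a mapping from device to str, in the other a mapping from str to str. In addition, 1.3.0 has no psi API at all, only psi_v2. The code here is the form I ran on 1.3.0, with the paths keyed by PYU devices.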
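from pathlib import Path

alice, bob = sf.PYU("alice"), sf.PYU("bob")
current_dir = Path.home()

# Paths are keyed by PYU device objects.
input_path = {
    alice: f"{current_dir}/alice_input.csv",
    bob: f"{current_dir}/bob_input.csv",
}
output_path = {
    alice: f"{current_dir}/alice_output.csv",
    bob: f"{current_dir}/bob_output.csv",
}

# The join key must be a column present in both CSV files. The files written
# above share the column "name" (they have no "uid" column), so "name" is used.
spu.psi_csv("name", input_path, output_path, "alice")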
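To sanity-check the result, the expected intersection can be computed in the clear from the toy data above and compared with the output file. This is only a sketch for this demo, since real PSI inputs would never be combined in one place like this. The helper read_column is hypothetical, and the column name "name" is taken from the CSV files written earlier.

```python
import csv
from pathlib import Path
from typing import List

# Key columns copied from the two DataFrames above.
alice_keys = [2, 3, 100, 101]
bob_keys = [2, 3, 4, 5]

expected = sorted(set(alice_keys) & set(bob_keys))
print(expected)


def read_column(path: Path, column: str) -> List[int]:
    """Read one integer column from a CSV file that has a header row."""
    with open(path, newline="") as f:
        return [int(row[column]) for row in csv.DictReader(f)]


# Only compare when the PSI step has already produced an output file.
output_file = Path.home() / "alice_output.csv"
if output_file.exists():
    print(sorted(read_column(output_file, "name")) == expected)
```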