FATE: II.1.2 Hetero-NN Quick Start: A Binary Classification Task

Preface

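In this tutorial, you will learn how to use Hetero NN. Note that Hetero NN has been upgraded to work in a way similar to Homo NN, so models and datasets can be highly customized with the PyTorch backend. Customization for Hetero NN is covered in dedicated later chapters.

In addition, Hetero NN has improved several interfaces, such as the interactive layer interface, which makes its usage logic clearer.

This chapter provides a basic binary classification example using Hetero-NN. The workflow matches that of other FATE algorithms: you use the Reader and DataTransform interfaces provided by FATE to load tabular data, then feed the data into the algorithm component. The component then trains with the top/bottom models, optimizer, and loss function you define. Usage in this version is largely the same as in older versions of FATE.

If you want to understand the principles behind the Hetero-NN algorithm, see "Heterogeneous Neural Network".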

Uploading Tabular Data

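First, we upload the data to FATE. The pipeline can upload data directly. Here we upload two files: breast_hetero_guest.csv for the guest and breast_hetero_host.csv for the host. Note that this tutorial uses the standalone version; if you are using the cluster version, you need to upload the corresponding data on each machine.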

from pipeline.backend.pipeline import PipeLine  # pipeline class

# there are two parties: the guest, whose data includes the labels,
#                        and the host, whose data has no labels
# the dataset is vertically split

dense_data_guest = {"name": "breast_hetero_guest", "namespace": "experiment"}
dense_data_host = {"name": "breast_hetero_host", "namespace": "experiment"}

guest = 9999
host = 10000

pipeline_upload = PipeLine().set_initiator(role='guest', party_id=guest).set_roles(guest=guest, host=host)

partition = 4

# register the guest table, then the host table, for upload
pipeline_upload.add_upload_data(file="./examples/data/breast_hetero_guest.csv",
                                table_name=dense_data_guest["name"],             # table name
                                namespace=dense_data_guest["namespace"],         # namespace
                                head=1, partition=partition)               # data info

pipeline_upload.add_upload_data(file="./examples/data/breast_hetero_host.csv",
                                table_name=dense_data_host["name"],
                                namespace=dense_data_host["namespace"],
                                head=1, partition=partition)      # data info

pipeline_upload.upload(drop=1)

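The breast dataset is a binary classification dataset with 30 features. It is vertically split: the guest holds 10 features and the labels, while the host holds 20 features.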
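To make the vertical split concrete, here is a minimal sketch that builds a toy dataset and divides its columns between two parties. It is only an illustration: the data is random, and the function name `make_vertical_split` and the field names `id`, `y`, and `x` are assumptions made for this example, not part of FATE.

```python
import random


def make_vertical_split(n_samples=5, n_guest_feats=10, n_host_feats=20, seed=0):
    """Build a toy dataset and split its columns between two parties."""
    rng = random.Random(seed)
    guest_rows, host_rows = [], []
    for i in range(n_samples):
        sample_id = f"id_{i}"
        features = [rng.random() for _ in range(n_guest_feats + n_host_feats)]
        label = rng.randint(0, 1)
        # The guest keeps the sample ID, the label, and the first 10 features.
        guest_rows.append({"id": sample_id, "y": label, "x": features[:n_guest_feats]})
        # The host keeps the same sample ID and the other 20 features, but no label.
        host_rows.append({"id": sample_id, "x": features[n_guest_feats:]})
    return guest_rows, host_rows


guest_rows, host_rows = make_vertical_split()
print(len(guest_rows[0]["x"]), len(host_rows[0]["x"]))  # 10 20
```

Both parties describe the same samples, keyed by the same ID, but each sees only its own columns, and only the guest sees the label.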

Inspecting the breast_hetero_guest data
import pandas as pd
df = pd.read_csv('../examples/data/breast_hetero_guest.csv')  # adjust the file path to your own environment
df
Inspecting the breast_hetero_host data
import pandas as pd
df = pd.read_csv('../examples/data/breast_hetero_host.csv')  # adjust the file path to your own environment
df

Writing and Running the Pipeline Script

Once the upload is complete, we can start writing the Pipeline script that submits the FATE job.

import torch as t
from torch import nn
from pipeline.backend.pipeline import PipeLine  # pipeline Class
from pipeline import fate_torch_hook
from pipeline.component import HeteroNN, Reader, DataTransform, Intersection  # Hetero NN Component, Data IO component, PSI component
from pipeline.interface import Data, Model # data, model for defining the work flow
fate_torch_hook

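Be sure to execute the following fate_torch_hook function. It modifies some torch classes so that the torch layers, Sequential models, optimizers, and loss functions you define in the script can be parsed and submitted by the Pipeline.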

from pipeline import fate_torch_hook
t = fate_torch_hook(t)
guest = 9999
host = 10000
pipeline = PipeLine().set_initiator(role='guest', party_id=guest).set_roles(guest=guest, host=host)

guest_train_data = {"name": "breast_hetero_guest", "namespace": "experiment"}
host_train_data = {"name": "breast_hetero_host", "namespace": "experiment"}

# read uploaded dataset
reader_0 = Reader(name="reader_0")
reader_0.get_party_instance(role='guest', party_id=guest).component_param(table=guest_train_data)
reader_0.get_party_instance(role='host', party_id=host).component_param(table=host_train_data)
# The transform component converts the uploaded data to the FATE standard format
data_transform_0 = DataTransform(name="data_transform_0")
data_transform_0.get_party_instance(role='guest', party_id=guest).component_param(with_label=True)
data_transform_0.get_party_instance(role='host', party_id=host).component_param(with_label=False)
# intersection (PSI): keep only the samples whose IDs both parties share
intersection_0 = Intersection(name="intersection_0")
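The Intersection component aligns the two parties on the sample IDs they have in common before training. The sketch below shows only the outcome of that alignment using a plain set intersection. It is not the private set intersection protocol itself, which is designed so that neither party learns the other's non-shared IDs. The function name `intersect_ids` and the sample IDs are hypothetical.

```python
def intersect_ids(guest_ids, host_ids):
    """Return the IDs present on both sides, sorted (plaintext stand-in for PSI)."""
    return sorted(set(guest_ids) & set(host_ids))


guest_ids = ["u1", "u2", "u3", "u5"]
host_ids = ["u2", "u3", "u4", "u5"]

shared = intersect_ids(guest_ids, host_ids)
print(shared)  # ['u2', 'u3', 'u5']
```

Only the rows for the shared IDs are passed on to the Hetero-NN component, so the guest's labels and features line up with the host's features for every training sample.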
The Hetero-NN Component

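Here we initialize the Hetero-NN component. We use get_party_instance to obtain the guest component and the host component separately. Because the two parties have different model architectures, we must specify the model parameters for each party through its own component.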

hetero_nn_0 = HeteroNN(name="hetero_nn_0", epochs=2,
                       interactive_layer_lr=0.01, batch_size=-1, validation_freqs=1, task_type='classification', seed=114514)
guest_nn_0 = hetero_nn_0.get_party_instance(role='guest', party_id=guest)
host_nn_0 = hetero_nn_0.get_party_instance(role='host', party_id=host)
Defining the Guest and Host Models
# Guest Bottom, Top Model
guest_bottom = t.nn.Sequential(
    nn.Linear(10, 2),
    nn.ReLU()
)
guest_top = t.nn.Sequential(
    nn.Linear(2, 1),
    nn.Sigmoid()
)

# Host Bottom Model
host_bottom = t.nn.Sequential(
    nn.Linear(20, 2),
    nn.ReLU()
)

# After fate_torch_hook is applied, the nn module provides InteractiveLayer; print it to view its structure
interactive_layer = t.nn.InteractiveLayer(out_dim=2, guest_dim=2, host_dim=2, host_num=1)
print(interactive_layer)

guest_nn_0.add_top_model(guest_top)
guest_nn_0.add_bottom_model(guest_bottom)
host_nn_0.add_bottom_model(host_bottom)

optimizer = t.optim.Adam(lr=0.01) # Note: after fate_torch_hook, the optimizer can be initialized without model parameters
loss = t.nn.BCELoss()

hetero_nn_0.set_interactive_layer(interactive_layer)
hetero_nn_0.compile(optimizer=optimizer, loss=loss)

InteractiveLayer(
  (activation): ReLU()
  (guest_model): Linear(in_features=2, out_features=2, bias=True)
  (host_model): ModuleList(
    (0): Linear(in_features=2, out_features=2, bias=True)
  )
  (act_seq): Sequential(
    (0): ReLU()
  )
)
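To see how the pieces defined above fit together, the following sketch traces one sample through layers of the same shapes in plain Python: guest bottom (10 to 2), host bottom (20 to 2), an interactive layer with one 2-to-2 projection per party, and the guest top model (2 to 1) with a sigmoid, scored with binary cross-entropy. This is a plaintext illustration only. The rule used to merge the two projections (sum, then ReLU) is an assumption inferred from the printed structure, any privacy protection applied by the real component is omitted, and all helper names are hypothetical.

```python
import math
import random

rng = random.Random(0)


def linear(in_dim, out_dim):
    """Create a dense layer as (weights, bias) with random weights and zero bias."""
    weights = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def apply_linear(layer, x):
    """Compute W @ x + b for one input vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(vec):
    return [max(0.0, v) for v in vec]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


# Same layer shapes as in the tutorial.
guest_bottom = linear(10, 2)
host_bottom = linear(20, 2)
inter_guest = linear(2, 2)  # corresponds to guest_model in the printed layer
inter_host = linear(2, 2)   # corresponds to host_model[0] in the printed layer
guest_top = linear(2, 1)


def forward(x_guest, x_host):
    """Plaintext forward pass: bottom models -> interactive layer -> top model."""
    g = relu(apply_linear(guest_bottom, x_guest))
    h = relu(apply_linear(host_bottom, x_host))
    # Assumed merge rule: add the two projections, then apply ReLU.
    summed = [a + b for a, b in zip(apply_linear(inter_guest, g),
                                    apply_linear(inter_host, h))]
    merged = relu(summed)
    return sigmoid(apply_linear(guest_top, merged)[0])


def bce(prob, label, eps=1e-12):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    prob = min(max(prob, eps), 1 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))


x_guest = [rng.random() for _ in range(10)]
x_host = [rng.random() for _ in range(20)]
prob = forward(x_guest, x_host)
loss = bce(prob, 1)
print(prob, loss)
```

The sigmoid at the end of the top model keeps the output in (0, 1), which is the input range BCELoss expects. The guest_dim and host_dim arguments of the interactive layer match the output sizes of the two bottom models, and out_dim matches the input size of the top model.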
