[Repost] Introducing TF-Encrypted

TF-Encrypted (TFE) is an open-source framework for machine learning based on Secure Multi-party Computation (MPC). The advantage of TFE is that it’s built on top of TensorFlow, allowing people who are not cryptography experts to quickly experiment with MPC machine learning, while leveraging all the advantages of TensorFlow’s optimizations, including graph compilation and distributed orchestration.

We have developed several projects based on TFE (e.g., the solution that won the iDASH2019 competition), and are actively contributing to TFE. In this blog we will describe a proof-of-concept use case, and give a walkthrough on how to do it with TFE.

MPC machine learning training in five minutes with TF-Encrypted

The use case: Collaborative fraud detection

Suppose Alice is a bank and Bob is a government. Alice and Bob know many individuals in common, and each party knows different information about them (e.g., Alice knows their credit card bills, while Bob knows their tax information), but only Bob knows whether each individual has a fraud history (denoted by label = 1 or 0). Now Bob wants to build a fraud detection model with Alice’s help. Alice is willing to collaborate, but she considers her part of the user information sensitive and is not willing to share it directly.

This problem can be summarized as how to do secure collaborative machine learning training on a vertically split dataset. Here’s a simple walkthrough showing how it can be done efficiently using TFE.

Suppose the dataset contains 7000 samples with 32 features, 16 of which are held by Alice, while the other 16 (and the label) are held by Bob. A randomly generated example dataset can be found here: aliceTrainFile.csv and bobTrainFileWithLabel.csv.
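
To make the vertical split concrete, here is a minimal sketch that generates a random dataset of the shape described above and splits it by columns. The column layout (Alice holds the first 16 features, Bob holds the remaining 16 followed by the label) and the function names are assumptions for illustration only; the actual layout of aliceTrainFile.csv and bobTrainFileWithLabel.csv may differ.

```python
import random

NUM_SAMPLES = 7000
NUM_FEATURES = 32
ALICE_FEATURES = 16  # the remaining 16 features and the label go to Bob


def make_dataset(num_samples=NUM_SAMPLES, num_features=NUM_FEATURES, seed=0):
    """Generate random feature rows and random 0/1 labels (illustrative only)."""
    rng = random.Random(seed)
    features = [[rng.uniform(-1.0, 1.0) for _ in range(num_features)]
                for _ in range(num_samples)]
    labels = [rng.randint(0, 1) for _ in range(num_samples)]
    return features, labels


def vertical_split(features, labels, alice_features=ALICE_FEATURES):
    """Split by columns: each party keeps every row but only its own columns."""
    alice_rows = [row[:alice_features] for row in features]
    bob_rows = [row[alice_features:] + [label]
                for row, label in zip(features, labels)]
    return alice_rows, bob_rows


features, labels = make_dataset()
alice_rows, bob_rows = vertical_split(features, labels)
print(len(alice_rows), len(alice_rows[0]))  # rows and columns held by Alice
print(len(bob_rows), len(bob_rows[0]))      # rows and columns held by Bob, label included
```

Both parties hold the same 7000 rows in the same order; only the columns differ, which is what "vertically split" means here.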

Step 1. Prepare three machines and set up environments

Check that python3 and pip3 are correctly installed, then install TensorFlow and TFE on the three machines.

# python3 --version
Python 3.6.9
# pip3 --version
pip 9.0.1 from /usr/lib/python3/dist-packages (python 3.6)
# pip3 install tensorflow==1.13.2
# pip3 install tf-encrypted

Step 2. Edit the following file config.json

Replace machine:port with your own IP:Port. Make sure the three machines are able to access each other via IP:Port.

{
    "alice": "machine1:port1",
    "bob": "machine2:port2",
    "crypto-producer": "machine3:port3"
}
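
Before starting the three processes, it can help to sanity-check the file. The following is a minimal sketch using only the Python standard library; it is not part of TFE, the helper names are hypothetical, and the addresses in the example are placeholders.

```python
import json

EXPECTED_PLAYERS = ("alice", "bob", "crypto-producer")


def parse_endpoint(endpoint):
    """Split 'host:port' into (host, port) and validate the port range."""
    host, sep, port_text = endpoint.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {endpoint!r}")
    port = int(port_text)  # raises ValueError for non-numeric ports
    if not 0 < port < 65536:
        raise ValueError(f"port out of range in {endpoint!r}")
    return host, port


def check_config(config_text):
    """Parse the JSON text and return {player: (host, port)} for all three players."""
    config = json.loads(config_text)
    missing = [name for name in EXPECTED_PLAYERS if name not in config]
    if missing:
        raise ValueError(f"missing players: {missing}")
    return {name: parse_endpoint(config[name]) for name in EXPECTED_PLAYERS}


# Placeholder addresses; replace with your own IP:Port values.
example = ('{"alice": "10.0.0.1:4440", "bob": "10.0.0.2:4441", '
           '"crypto-producer": "10.0.0.3:4442"}')
print(check_config(example))
```

A check like this only catches malformed entries; it does not verify that the machines can actually reach each other.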

Step 3. Write TFE training code

We provide an example for logistic regression: common.py, training_alice.py, training_bob.py and training_server.py.
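
For orientation, the following is a plaintext sketch of what logistic regression training computes on a vertically split dataset. It is not the code in the files above and uses no TFE API: in the MPC setting the same arithmetic would be carried out on secret-shared values. The toy data, function names, and hyperparameters are assumptions for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, alice_row, bob_row):
    # The logit is the sum of one partial dot product per party.
    split = len(alice_row)
    alice_part = sum(w * x for w, x in zip(weights[:split], alice_row))
    bob_part = sum(w * x for w, x in zip(weights[split:], bob_row))
    return sigmoid(alice_part + bob_part + bias)


def log_loss(weights, bias, alice_rows, bob_rows, labels):
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for a, b, y in zip(alice_rows, bob_rows, labels):
        p = predict(weights, bias, a, b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(labels)


def train(alice_rows, bob_rows, labels, epochs=50, learning_rate=0.5):
    """Full-batch gradient descent on the mean log-loss."""
    num_features = len(alice_rows[0]) + len(bob_rows[0])
    weights, bias = [0.0] * num_features, 0.0
    n = len(labels)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * num_features, 0.0
        for a, b, y in zip(alice_rows, bob_rows, labels):
            error = predict(weights, bias, a, b) - y
            for j, x in enumerate(a + b):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Toy data: two features per party, label depends on one feature from each side.
rng = random.Random(0)
alice_rows = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(200)]
bob_rows = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(200)]
labels = [1 if a[0] + b[0] > 0 else 0 for a, b in zip(alice_rows, bob_rows)]

weights, bias = train(alice_rows, bob_rows, labels)
print(weights)
```

Because the logit splits into one partial sum per party, neither side needs the other's raw columns to contribute to it, which is what makes the vertically split setting a natural fit for MPC.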

Step 4. Copy the files to the same directory

Copy config.json, common.py, training_alice.py and aliceTrainFile.csv to machine1.

Copy config.json, training_bob.py and bobTrainFileWithLabel.csv to machine2.

Copy config.json and training_server.py to machine3.

Step 5. Run!

Run the following commands, one on each machine: training_bob.py on machine2, training_server.py on machine3, and training_alice.py on machine1. The trained logistic regression model (the weights for each feature) will be printed on machine1.

python3 training_bob.py
python3 training_server.py
python3 training_alice.py

Extra notes

About crypto-producer

Currently a third-party crypto-producer is needed for complex tasks such as generating Beaver triples, which means TFE is a three-party (honest-majority) computation framework.

We are making progress on eliminating the crypto-producer for the pure two-party case.
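
To illustrate why Beaver triples are useful, here is a minimal sketch of the textbook two-party multiplication on additive secret shares, with the triple supplied by a function standing in for the crypto-producer. This is a generic illustration, not TFE's actual protocol code; the ring size and all function names are assumptions.

```python
import secrets

MODULUS = 2**64  # ring size chosen for illustration


def share(value):
    """Split value into two additive shares that sum to value mod MODULUS."""
    share0 = secrets.randbelow(MODULUS)
    share1 = (value - share0) % MODULUS
    return share0, share1


def reconstruct(share0, share1):
    return (share0 + share1) % MODULUS


def make_triple():
    """Stand-in for the crypto-producer: sample a, b and share a, b, c = a*b."""
    a = secrets.randbelow(MODULUS)
    b = secrets.randbelow(MODULUS)
    return share(a), share(b), share((a * b) % MODULUS)


def beaver_multiply(x_shares, y_shares, triple):
    """Compute shares of x*y from shares of x, y and one Beaver triple."""
    a_shares, b_shares, c_shares = triple
    # Each party masks its shares locally; the masked values d and e are opened.
    d = reconstruct(*[(x - a) % MODULUS for x, a in zip(x_shares, a_shares)])
    e = reconstruct(*[(y - b) % MODULUS for y, b in zip(y_shares, b_shares)])
    z_shares = []
    for party in range(2):
        z = (c_shares[party] + d * b_shares[party] + e * a_shares[party]) % MODULUS
        if party == 0:
            z = (z + d * e) % MODULUS  # public term, added by one party only
        z_shares.append(z)
    return tuple(z_shares)


product_shares = beaver_multiply(share(6), share(7), make_triple())
print(reconstruct(*product_shares))  # 42
```

The identity behind it is x*y = d*e + d*b + e*a + c with d = x - a and e = y - b. The opened values d and e reveal nothing about x and y because a and b are uniformly random and known to neither computing party, which is why a separate party generating the triples is convenient.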

Production usage

TFE is experimental software and must be hardened before being used in production environments.


Original post: https://alibaba-gemini-lab.github.io/docs/blog/tfe/
