Notes about <Oblivious Multi-Party Machine Learning on Trusted Processors>

kevin_darkelf

Category: machine_learning

Article link: https://blog.csdn.net/kevin_darkelf/article/details/79114679

Introduction

In our system, multiple parties
agree on a joint machine learning task to be executed on
their aggregate data, and on an SGX-enabled data center
to run the task. Although they do not trust one another,
they can each review the corresponding machine-learning
code, deploy the code into a processor-protected
memory region (called an enclave), upload their encrypted
data just for this task, perform remote attestation,
securely upload their encryption keys into the enclave,
run the machine learning code, and finally download the
encrypted machine learning model. The model may also
be kept within the enclave for secure evaluation by all
the parties, subject to their agreed access control policies.

Figure 1: Sample privacy-preserving multi-party machine
learning system. Multiple hospitals encrypt patient datasets,
each with a different key. The hospitals deploy an agreed-upon
machine learning algorithm in an enclave in a cloud data center
and share their data keys with the enclave. The enclave processes
the aggregate datasets and outputs an encrypted machine
learning model.

Preliminaries

SGX

Adversary Model

Security Guarantees

For each machine learning algorithm, we specify public parameters that are allowed to be disclosed (such as the input sizes and the number of iterations to perform), and we treat all other inputs as private.

We then say that an algorithm is data-oblivious
if an attacker that interacts with it and observes its interaction with memory, disk and network
learns nothing except possibly those public parameters.

We define this interaction as a trace execution t of I/O events, each recording an access type (read
or write), an address, and some contents, controlled by
the adversary for all read accesses.
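To make the trace definition concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the TracedMemory wrapper, the two update functions and the trace_of helper are hypothetical names, and the trace records only (access type, address), leaving out contents on the assumption that they are encrypted outside the enclave. It shows one update whose trace depends on a private bit and one whose trace does not.

```python
from typing import Callable, List, Tuple

Event = Tuple[str, int]  # (access type, address); contents left out in this sketch


class TracedMemory:
    """Hypothetical memory wrapper that logs every access an observer could see."""

    def __init__(self, size: int) -> None:
        self.cells = [0] * size
        self.trace: List[Event] = []

    def read(self, addr: int) -> int:
        self.trace.append(("read", addr))
        return self.cells[addr]

    def write(self, addr: int, value: int) -> None:
        self.trace.append(("write", addr))
        self.cells[addr] = value


def leaky_update(mem: TracedMemory, secret_bit: int) -> None:
    # The address written depends on the private bit, so the trace reveals it.
    mem.write(secret_bit, 1)


def oblivious_update(mem: TracedMemory, secret_bit: int) -> None:
    # Touch both addresses in a fixed order; only the stored values
    # depend on the private bit, never the sequence of accesses.
    for addr in (0, 1):
        old = mem.read(addr)
        hit = int(addr == secret_bit)
        mem.write(addr, hit * 1 + (1 - hit) * old)


def trace_of(update: Callable[[TracedMemory, int], None], secret_bit: int) -> List[Event]:
    mem = TracedMemory(2)
    update(mem, secret_bit)
    return mem.trace
```

Comparing trace_of(leaky_update, 0) with trace_of(leaky_update, 1) gives two different traces, while the two traces of oblivious_update are identical. This is only a model of the definition: Python itself gives no guarantee about what the interpreter does underneath.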

Data-Oblivious Primitives

Oblivious assignments and comparisons
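A rough sketch of what branch-free assignment and comparison look like, written with bit masks on 32-bit unsigned values. The names oselect, ogreater and omax are my own, chosen for illustration, and this Python version only mirrors the logic: real data-oblivious primitives have to be enforced at the instruction level, which an interpreter cannot promise.

```python
MASK32 = 0xFFFFFFFF  # values are assumed to be 32-bit unsigned integers


def ogreater(x: int, y: int) -> int:
    """Return 1 if x > y, else 0, without an if/else on the inputs."""
    # y - x is negative exactly when x > y; shifting by 32 keeps only the sign.
    return ((y - x) >> 32) & 1


def oselect(cond: int, x: int, y: int) -> int:
    """Return x if cond == 1, y if cond == 0, without an if/else on cond."""
    mask = -cond & MASK32  # all ones when cond == 1, all zeros when cond == 0
    return (x & mask) | (y & ~mask & MASK32)


def omax(x: int, y: int) -> int:
    """Maximum of two values built only from the two primitives above."""
    return oselect(ogreater(x, y), x, y)
```

For example, omax(3, 5) evaluates ogreater(3, 5) to 0 and then selects the second argument, with the same sequence of operations whichever value is larger.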

Oblivious array accesses
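The simplest way to hide which element is accessed is to touch every element and keep only the one that matches. The sketch below does that with arithmetic instead of branches. The function names are hypothetical, and the element-by-element scan costs linear time per access, so it is meant as an illustration of the idea rather than a practical implementation.

```python
from typing import List


def is_equal(a: int, b: int) -> int:
    """Return 1 if a == b, else 0, for non-negative a and b below 2**63."""
    diff = a ^ b
    # (diff | -diff) is negative exactly when diff != 0.
    return 1 - (((diff | -diff) >> 63) & 1)


def oblivious_read(arr: List[int], secret_index: int) -> int:
    """Read arr[secret_index] while visiting every position in the same order."""
    result = 0
    for i, value in enumerate(arr):
        hit = is_equal(i, secret_index)
        result = hit * value + (1 - hit) * result
    return result


def oblivious_write(arr: List[int], secret_index: int, new_value: int) -> None:
    """Write new_value at secret_index while rewriting every position."""
    for i in range(len(arr)):
        hit = is_equal(i, secret_index)
        arr[i] = hit * new_value + (1 - hit) * arr[i]
```

Both functions read (and, for the write, store to) all positions regardless of secret_index, so the sequence of addresses depends only on the array length, which is treated as a public parameter.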

Oblivious sorting
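Sorting can be made oblivious by running the input through a sorting network, where the sequence of compared positions is fixed by the input length alone. As an illustration I use an odd-even transposition network below, picked for brevity (it needs a quadratic number of comparisons) and not necessarily the network a real system would choose. The function names are hypothetical, and int(x > y) stands in for a proper oblivious comparison.

```python
from typing import List, Tuple


def network(n: int) -> List[Tuple[int, int]]:
    """Fixed compare-exchange schedule for n elements (odd-even transposition)."""
    pairs = []
    for rnd in range(n):
        for i in range(rnd % 2, n - 1, 2):
            pairs.append((i, i + 1))
    return pairs


def compare_exchange(a: List[int], i: int, j: int) -> None:
    """Put the smaller value at i and the larger at j, always writing both."""
    x, y = a[i], a[j]
    swap = int(x > y)  # placeholder for an oblivious comparison primitive
    a[i] = swap * y + (1 - swap) * x
    a[j] = swap * x + (1 - swap) * y


def oblivious_sort(values: List[int]) -> List[int]:
    """Sort a copy of values using only the fixed schedule from network()."""
    a = list(values)
    for i, j in network(len(a)):
        compare_exchange(a, i, j)
    return a
```

The schedule returned by network(n) is the same for every input of length n, so the positions that are read and written reveal nothing about the values being sorted.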

Machine Learning Algorithms

Protocols

We assume that each party agrees on the machine-learning
code, its public parameters, and the identities
of all other parties (based, for example, on their public
keys for signature). One of the parties sends this collection
of code and static data to the cloud data center,
where an (untrusted) code-loader allocates resources and
creates an enclave with that code and data.

Each party independently establishes a secure channel
with the enclave, authenticating themselves (e.g., using
signatures) and using remote attestation [2] to check the
integrity of the code and static data loaded into the enclave.
They may independently interact with the cloud
provider to confirm that this SGX processor is part of that
data center. Each party securely uploads its private data
to the enclave, using for instance AES-GCM for confidentiality
and integrity protection. Each party uses a
separate, locally-generated secret key to encrypt its own
input data set, and uses its secure channel to share that
key with the enclave. The agreed-upon machine learning
code may also be optionally encrypted but we expect that
in the common case this code will be public.
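The key-sharing step can be pictured with a very simplified model: a party hands over its data key only when the measurement reported for the enclave matches what it computed from the code it reviewed. This is a toy stand-in, not real SGX attestation. The measure function, the Party class and the use of a plain SHA-256 hash are all assumptions made for illustration, and the signature checks and secure channel described above are left out.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def measure(code: bytes, static_data: bytes) -> bytes:
    """Hypothetical measurement: a hash over the agreed code and static data."""
    h = hashlib.sha256()
    h.update(len(code).to_bytes(8, "big"))  # length prefix keeps the two fields apart
    h.update(code)
    h.update(static_data)
    return h.digest()


class Party:
    """One data owner with its own locally generated data key."""

    def __init__(self, name: str, reviewed_code: bytes, reviewed_data: bytes) -> None:
        self.name = name
        self.expected = measure(reviewed_code, reviewed_data)
        self.data_key = secrets.token_bytes(16)  # separate key per party

    def release_key(self, reported_measurement: bytes) -> Optional[bytes]:
        """Return the data key only if the enclave runs the reviewed code."""
        if hmac.compare_digest(self.expected, reported_measurement):
            return self.data_key
        return None
```

A party created with the agreed code returns its key for measure(code, data) and returns None for any other measurement, which is the behaviour the protocol relies on before any private data can be decrypted.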

After communicating with all parties, and getting the
keys for all the data sets, the enclave code runs the target
algorithm on the whole data set, and outputs a machine
learning model encrypted and integrity protected with a
fresh symmetric key. We note that denial-of-service attacks
are outside the threat model for this paper: the parties
or the data center may cause the computation to fail
before completion. Conversely, any attempt to tamper
with the enclave memory (including its code and data)
would be caught as it is read by the SGX processor, and
hence the job completion guarantees the integrity of the
whole run. Finally, the system needs to guarantee that all
parties get access to the output. To achieve this, the enclave
sends the encrypted output to every party over their
secure authenticated channel, and waits for each of them
to acknowledge its receipt and integrity. It then publishes
the output key, sending it to all parties, as well as to any
reliable third party (to ensure its fair availability).
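The last step (send the encrypted output, wait for every acknowledgement, then publish the key) can be sketched as a small state machine. The OutputRelease class and its method names are hypothetical, the encrypted model is treated as opaque bytes, and a SHA-256 digest stands in for the receipt-and-integrity acknowledgement that in the real protocol travels over each party's secure authenticated channel.

```python
import hashlib
import hmac
from typing import Iterable, Optional


class OutputRelease:
    """Hypothetical model of the enclave withholding the output key until all parties ack."""

    def __init__(self, parties: Iterable[str], encrypted_model: bytes, output_key: bytes) -> None:
        self.pending = set(parties)  # parties that have not acknowledged yet
        self.encrypted_model = encrypted_model
        self._output_key = output_key
        self._digest = hashlib.sha256(encrypted_model).digest()

    def acknowledge(self, party: str, received_digest: bytes) -> bool:
        """Record an ack if the party is expected and its copy is intact."""
        ok = party in self.pending and hmac.compare_digest(received_digest, self._digest)
        if ok:
            self.pending.discard(party)
        return ok

    def output_key(self) -> Optional[bytes]:
        """Return the key once every party has acknowledged, otherwise None."""
        return self._output_key if not self.pending else None
```

Until the last acknowledgement arrives, output_key() returns None, so no party can decrypt the model before all of them hold an intact copy of the ciphertext.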