PSI Paper Notes: Privacy-Preserving Set Operations


Notes on Privacy-Preserving Set Operations

  • Year: 2005
  • Authors: Lea Kissner and Dawn Song
  • Conference: CRYPTO

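1. Main Contributions

Using properties of polynomials, the authors build a framework for privacy-preserving operations on multisets held by multiple parties. The framework supports efficient and secure union, intersection, and element reduction operations.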

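2. Preliminaries

2.1 Adversary Models
  • Honest-But-Curious Adversaries: the adversary faithfully carries out the actions prescribed by the protocol, but tries to infer other parties' private information from the results of the computation.
  • Malicious Adversaries: the adversary may supply arbitrary inputs, refuse to participate in the protocol, or abort the protocol prematurely.
2.2 Additively Homomorphic Cryptosystem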

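The paper uses a semantically secure public-key cryptosystem that is additively homomorphic. Let $E_{pk}(\cdot)$ denote encryption under the public key $pk$. The cryptosystem supports the following operations without knowledge of the private key: (1) given the ciphertexts $E_{pk}(a)$ and $E_{pk}(b)$ of $a$ and $b$, one can compute an encryption of their sum, $E_{pk}(a+b) := E_{pk}(a) +_h E_{pk}(b)$; (2) given a ciphertext $E_{pk}(a)$ and a constant $c$, one can compute an encryption of their product, $E_{pk}(c \cdot a) := c \times_h E_{pk}(a)$. (3) The cryptosystem must also support $(n,n)$-threshold decryption: the private key is split into $n$ shares, and decryption is possible only when all shares are combined. In addition, the ciphertext produced by each operation must be re-randomized.

The Paillier cryptosystem currently meets these requirements.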
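The two homomorphic operations can be illustrated with a toy Paillier implementation. This is only a sketch under loud assumptions: the primes `1009` and `1013` are far too small for real use, there is a single key holder instead of $(n,n)$-threshold decryption, and the function names (`encrypt`, `add_h`, `mul_h`, `rerandomize`) are chosen here for illustration and do not come from the paper.

```python
import math
import random

# Toy parameters (assumption): real deployments use primes of 1024 bits or more.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)   # private key (no threshold sharing in this sketch)
MU = pow(PHI, -1, N)      # modular inverse, needs Python 3.8+

def encrypt(m):
    """E_pk(m) = (1 + N)^m * r^N mod N^2, using the generator g = N + 1."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (1 + (m % N) * N) * pow(r, N, N2) % N2

def decrypt(c):
    """m = L(c^PHI mod N^2) * MU mod N, where L(x) = (x - 1) // N."""
    return (pow(c, PHI, N2) - 1) // N * MU % N

def add_h(c1, c2):
    """E(a) +_h E(b): multiplying ciphertexts adds the plaintexts."""
    return c1 * c2 % N2

def mul_h(k, c):
    """k x_h E(a): raising a ciphertext to k scales the plaintext by k."""
    return pow(c, k % N, N2)

def rerandomize(c):
    """Fold in a fresh encryption of 0 so the result carries new randomness."""
    return add_h(c, encrypt(0))
```

For instance, `decrypt(add_h(encrypt(5), encrypt(7)))` recovers `12` and `decrypt(mul_h(3, encrypt(5)))` recovers `15`, without the party doing the arithmetic ever seeing `5` or `7`.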

3. Techniques

Problem setting:

  • Let the private input of player $i$ be the set $S_i$, with $|S_i| = k$ $(1 \le i \le n)$. The $j$-th element of set $i$ is written $(S_i)_j$. All sets are drawn from a common domain $P$, that is, $\forall i \in [n], \forall j \in [k]: (S_i)_j \in P$.
  • $R$ denotes the plaintext domain $Dom(E_{pk}(\cdot))$ (for Paillier, $R$ is $Z_N$). $R$ must be large enough that an element $a$ chosen at random from $R$ has only negligible probability of representing an element of $P$, written $a \in P$. For example, one can require that only values of the form $b = a \,||\, h(a)$ represent elements of $P$, where $h(\cdot)$ is a hash function.
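The encoding $b = a \,||\, h(a)$ can be sketched as follows. The concrete choices are assumptions made for illustration only: elements are 64-bit integers, $h$ is SHA-256 truncated to 64 bits, and the names `encode` and `represents_element` are hypothetical. With these sizes, $R$ would need to be larger than $2^{128}$.

```python
import hashlib

TAG_BITS = 64  # assumption: 64-bit elements followed by a 64-bit hash tag

def h(a):
    """Hypothetical h(.): first 8 bytes of SHA-256 over the 8-byte element."""
    digest = hashlib.sha256(a.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big")

def encode(a):
    """Return b = a || h(a) as a single 128-bit integer."""
    return (a << TAG_BITS) | h(a)

def represents_element(b):
    """True iff b has the form a || h(a), i.e. b represents an element of P."""
    a = b >> TAG_BITS
    tag = b & ((1 << TAG_BITS) - 1)
    return a < (1 << 64) and h(a) == tag
```

A value drawn at random from $R$ passes `represents_element` only if its low 64 bits happen to match the hash of its high bits, which is the negligible-probability event the requirement on $R$ is about.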
3.1 Polynomial Rings and Polynomial Representation of Sets

The polynomial ring $R[x]$ consists of all polynomials with coefficients from $R$. Let $f, g \in R[x]$:

  • $f(x) = \sum_{i=0}^{\deg(f)} f[i]x^i$, where $f[i]$ denotes the coefficient of $x^i$.
  • $f + g$ denotes the sum of the polynomials $f$ and $g$.
  • $f * g$ denotes the product of the polynomials $f$ and $g$.
  • $f^{(d)}$ denotes the $d$-th formal derivative of $f$. The first formal derivative of $f$ is $\sum_{i=0}^{\deg(f)-1}(i+1)f[i+1]x^i$.

Representing sets as polynomials

Given a multiset $S = \{S_j\}_{1 \le j \le k}$, construct the corresponding polynomial $f \in R[x]$, $f(x) = \prod_{1 \le j \le k}(x - S_j) = (x - S_1)\cdots(x - S_k)$. Conversely, given a polynomial $f \in R[x]$, the elements of the multiset $S$ it represents can be recovered: $a$ is an element of $S$ if (1) $f(a) = 0$ and (2) $a \in P$.
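As a small illustration of this representation, the sketch below builds $f$ from a multiset and tests membership by evaluation. It assumes a toy modulus `N` standing in for the Paillier plaintext ring $Z_N$, stores coefficients from lowest to highest degree, and uses helper names (`poly_from_multiset`, `poly_eval`) that are not from the paper.

```python
N = 1009 * 1013  # toy stand-in for the plaintext ring Z_N (assumption)

def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % N
    return out

def poly_from_multiset(S):
    """f(x) = (x - S_1) * ... * (x - S_k) over Z_N."""
    f = [1]
    for s in S:
        f = poly_mul(f, [(-s) % N, 1])
    return f

def poly_eval(f, x):
    """Evaluate f at x with Horner's rule."""
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % N
    return acc
```

For the multiset $\{2, 3, 3\}$ this gives $x^3 - 8x^2 + 21x - 18$, which evaluates to zero at 2 and 3 and to 12 at 5.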

4. Privacy-Preserving Set Operations

Let $f, g$ be the polynomial representations of the multisets $S, T$. These polynomials are then used to compute union, intersection, and element reduction.

4.1 Union

The union is written $S \cup T$. If an element $a$ appears $b_S \ge 0$ times in $S$ and $b_T \ge 0$ times in $T$, then it appears $b_S + b_T$ times in the union. The product $f * g$ is a polynomial representation of $S \cup T$, for the following reasons:

  • Every element of $S$ and $T$ is a root of the product: $(f(a) = 0) \wedge (g(b) = 0) \rightarrow ((f*g)(a) = 0) \wedge ((f*g)(b) = 0)$.
  • $f(a) = 0 \Leftrightarrow (x-a) \mid f$, so a repeated element satisfies $(f(a) = 0) \wedge (g(a) = 0) \rightarrow (x-a)^2 \mid (f*g)$.
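The multiplicity argument can be checked numerically. The sketch below multiplies the two representations and counts how often $(x - a)$ divides the product; the toy modulus `N` and the helpers `divide_by_linear` and `multiplicity` are assumptions introduced for illustration.

```python
N = 1009 * 1013  # toy stand-in for the plaintext ring Z_N (assumption)

def poly_mul(f, g):
    """Multiply coefficient lists (lowest degree first) over Z_N."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % N
    return out

def poly_from_multiset(S):
    f = [1]
    for s in S:
        f = poly_mul(f, [(-s) % N, 1])
    return f

def divide_by_linear(f, a):
    """Synthetic division of f by (x - a); returns (quotient, remainder)."""
    q = [0] * (len(f) - 1)
    carry = 0
    for i in range(len(f) - 1, 0, -1):
        carry = (f[i] + carry * a) % N
        q[i - 1] = carry
    return q, (f[0] + carry * a) % N

def multiplicity(f, a):
    """Largest m such that (x - a)^m divides f."""
    m = 0
    while len(f) > 1:
        q, rem = divide_by_linear(f, a)
        if rem != 0:
            break
        m, f = m + 1, q
    return m

# S = {2, 3}, T = {3, 5}: the union polynomial is f * g.
union_poly = poly_mul(poly_from_multiset([2, 3]), poly_from_multiset([3, 5]))
```

Here `multiplicity(union_poly, 3)` is 2 because 3 appears once in each input, matching $b_S + b_T$.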
4.2 Intersection

The intersection is written $S \cap T$. An element $a$ is in the result if $b_S > 0$ and $b_T > 0$, and it appears $\min\{b_S, b_T\}$ times in the result. Let $S, T$ be two multisets of equal size, with polynomial representations $f, g$ respectively. Then $f*r + g*s$ is a polynomial representation of $S \cap T$, where $r, s \leftarrow R^{\deg(f)}[x]$. Here $R^b[x]$ denotes the set of all polynomials of degree $0$ to $b$ whose coefficients are chosen uniformly at random from $R$. That is, with $\beta = \deg(f)$, $r = \sum_{i=0}^{\beta} r[i]x^i$ and $s = \sum_{i=0}^{\beta} s[i]x^i$, where $\forall_{0 \le i \le \beta}\ r[i] \leftarrow R$ and $\forall_{0 \le i \le \beta}\ s[i] \leftarrow R$. This result follows from the lemma below:

① Lemma 1: Let $\hat{f}, \hat{g}$ be polynomials in $R[x]$, where $R$ is a ring, $\deg(\hat{f}) = \deg(\hat{g}) = \alpha$, and $\gcd(\hat{f}, \hat{g}) = 1$. Let $r = \sum_{i=0}^{\beta} r[i]x^i$ and $s = \sum_{i=0}^{\beta} s[i]x^i$, where $\forall_{0 \le i \le \beta}\ r[i] \leftarrow R$, $\forall_{0 \le i \le \beta}\ s[i] \leftarrow R$, and $\beta \ge \alpha$. Let $\hat{u} = \hat{f}*r + \hat{g}*s = \sum_{i=0}^{\alpha+\beta} \hat{u}[i]x^i$. Then $\forall_{0 \le i \le \alpha+\beta}$, the coefficients $\hat{u}[i]$ are distributed uniformly and independently over $R$.

By this lemma, $f*r + g*s = \gcd(f,g)*u$, where $u$ is distributed uniformly in $R^{\gamma}[x]$, $\gamma = 2\deg(f) - |S \cap T|$. If $a$ appears $\ell_a$ times in $S \cap T$, then (1) $a$ is a root of $\gcd(f,g)$ and (2) $(x-a)^{\ell_a} \mid \gcd(f,g)$. Moreover, since $u$ is distributed uniformly in $R^{\gamma}[x]$, a root of $u$ belongs to $P$ only with negligible probability. Therefore $f*r + g*s$ is a polynomial representation of $S \cap T$.
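A plaintext sketch of the intersection construction follows. It assumes a toy modulus `N` in place of the Paillier plaintext ring and uses Python's `random` module for the masking polynomials, which is not cryptographically secure; the function names are illustrative only.

```python
import random

N = 1009 * 1013  # toy stand-in for the plaintext ring Z_N (assumption)

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % N
    return out

def poly_add(f, g):
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [(a + b) % N for a, b in zip(f, g)]

def poly_from_multiset(S):
    f = [1]
    for s in S:
        f = poly_mul(f, [(-s) % N, 1])
    return f

def poly_eval(f, x):
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % N
    return acc

def random_poly(degree):
    """Sample from R^degree[x]: degree + 1 uniform coefficients."""
    return [random.randrange(N) for _ in range(degree + 1)]

def intersection_poly(S, T):
    """Return f*r + g*s for equal-size multisets S and T."""
    f, g = poly_from_multiset(S), poly_from_multiset(T)
    r, s = random_poly(len(S)), random_poly(len(S))
    return poly_add(poly_mul(f, r), poly_mul(g, s))

u = intersection_poly([1, 2, 3], [2, 3, 4])
```

The common elements 2 and 3 are always roots of `u`. An element such as 1 or 4 is a root only if the random mask happens to cancel, which with a realistically large $R$ occurs with negligible probability.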

4.3 Element Reduction

Element reduction is written $Rd_d(S)$: an element $a$ that appears $b$ times in $S$ has its multiplicity reduced by $d$, so it appears $\max\{b-d, 0\}$ times in the resulting multiset. A polynomial representation of $Rd_d(S)$ is $f^{(d)}*F*r + f*s$, where $r, s \leftarrow R^{\deg(f)}[x]$ and $F$ is any polynomial of degree $d$ such that $\forall_{a \in P}\ F(a) \ne 0$.

② Lemma 2: Let $f \in R[x]$, where $R$ is a ring, and $d > 1$.

  • If $(x-a)^{d+1} \mid f$, then $(x-a) \mid f^{(d)}$.
  • If $(x-a) \mid f$ and $(x-a)^{d+1} \nmid f$, then $(x-a) \nmid f^{(d)}$.

By this lemma together with $\gcd(F, f) = 1$, an element $a$ appears $\ell_a$ times in $Rd_d(S)$ if and only if (1) $a$ is a root of $\gcd(f^{(d)}, f)$ and (2) $(x-a)^{\ell_a} \mid \gcd(f^{(d)}, f)$. Furthermore, by Lemma 1, $f^{(d)}*F*r + f*s = \gcd(f^{(d)}, f)*u$, where $u$ is distributed uniformly in $R^{\gamma}[x]$, $\gamma = 2k - |Rd_d(S)|$. Since the roots of $u$ lie in $P$ only with negligible probability, $f^{(d)}*F*r + f*s$ is a polynomial representation of $Rd_d(S)$.
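The reduction construction can be sketched in the clear as well. The assumptions here are a toy modulus `N`, non-cryptographic randomness, and the choice $F(x) = (x - c)^d$ with a hypothetical constant `c = 1000` that is assumed not to represent any element of $P$; none of these specifics come from the paper.

```python
import random

N = 1009 * 1013  # toy stand-in for the plaintext ring Z_N (assumption)

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % N
    return out

def poly_add(f, g):
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [(a + b) % N for a, b in zip(f, g)]

def poly_from_multiset(S):
    f = [1]
    for s in S:
        f = poly_mul(f, [(-s) % N, 1])
    return f

def poly_eval(f, x):
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % N
    return acc

def poly_derivative(f, d=1):
    """d-th formal derivative: each pass maps f[i] * x^i to i * f[i] * x^(i-1)."""
    for _ in range(d):
        f = [(i * c) % N for i, c in enumerate(f)][1:] or [0]
    return f

def reduction_poly(S, d, c=1000):
    """Return f^(d) * F * r + f * s with F(x) = (x - c)^d (c assumed outside P)."""
    f = poly_from_multiset(S)
    F = poly_from_multiset([c] * d)
    r = [random.randrange(N) for _ in range(len(S) + 1)]
    s = [random.randrange(N) for _ in range(len(S) + 1)]
    return poly_add(poly_mul(poly_mul(poly_derivative(f, d), F), r), poly_mul(f, s))

S = [2, 2, 2, 3]
second = poly_derivative(poly_from_multiset(S), 2)  # 12x^2 - 54x + 60
u = reduction_poly(S, 2)
```

With $S = \{2, 2, 2, 3\}$ and $d = 2$, the element 2 (multiplicity 3) survives as a root of `u`, while 3 (multiplicity 1) is a root of $f$ but not of $f^{(2)}$, so it drops out except with negligible probability.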

4.4 Operations with Encrypted Polynomials

This section covers operations on encrypted polynomials. Write $E_{pk}(f)$ for the polynomial $f$ with its coefficients encrypted under the additively homomorphic scheme: $E_{pk}(f[0]), \ldots, E_{pk}(f[\deg(f)])$. Let $f_1, f_2, g$ be polynomials in $R[x]$ with $f_1(x) = \sum_{i=0}^{\deg(f_1)} f_1[i]x^i$, $f_2(x) = \sum_{i=0}^{\deg(f_2)} f_2[i]x^i$, $g(x) = \sum_{i=0}^{\deg(g)} g[i]x^i$, and let $a, b \in R$. The following operations can be performed on encrypted polynomials without knowledge of the private key:

  • Sum of encrypted polynomials: given the encrypted coefficients of $f_1$ and $f_2$, compute the encryption of $g := f_1 + f_2$ via $E_{pk}(g[i]) := E_{pk}(f_1[i]) +_h E_{pk}(f_2[i])$ $(0 \le i \le \max\{\deg(f_1), \deg(f_2)\})$.
  • Product of an unencrypted polynomial and an encrypted polynomial: given a polynomial $f_2$ with unencrypted coefficients and a polynomial $f_1$ with encrypted coefficients, compute the encryption of their product $g := f_1 * f_2$ (written $f_2 *_h E_{pk}(f_1)$) via $E_{pk}(g[i]) := (f_2[0] \times_h E_{pk}(f_1[i])) +_h \cdots +_h (f_2[i] \times_h E_{pk}(f_1[0]))$ $(0 \le i \le \deg(f_1) + \deg(f_2))$.
  • Derivative of an encrypted polynomial: given a polynomial $f_1$ with encrypted coefficients, compute the encryption of $g := \frac{d}{dx}f_1$ coefficient by coefficient via $E_{pk}(g[i]) := (i+1) \times_h E_{pk}(f_1[i+1])$ $(0 \le i \le \deg(f_1) - 1)$.
  • Evaluation of an encrypted polynomial at an unencrypted point: given a polynomial $f_1$ with encrypted coefficients, compute the encryption of $a := f_1(b)$ via $E_{pk}(a) := (b^0 \times_h E_{pk}(f_1[0])) +_h \cdots +_h (b^{\deg(f_1)} \times_h E_{pk}(f_1[\deg(f_1)]))$.
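The four operations above can be sketched on top of a toy Paillier instance. Everything concrete here is an assumption for illustration: the tiny primes, the single private key (no threshold decryption), and the function names `sum_h`, `product_h`, `derivative_h` and `eval_h`. Polynomials are lists of coefficients, lowest degree first.

```python
import math
import random

# Toy Paillier (assumption: tiny primes, one key holder, no threshold sharing).
P, Q = 1009, 1013
N, N2 = P * Q, (P * Q) ** 2
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)  # Python 3.8+

def enc(m):
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            return (1 + (m % N) * N) * pow(r, N, N2) % N2

def dec(c):
    return (pow(c, PHI, N2) - 1) // N * MU % N

def add_h(c1, c2):   # E(a) +_h E(b)
    return c1 * c2 % N2

def mul_h(k, c):     # k x_h E(a)
    return pow(c, k % N, N2)

def enc_poly(f):
    return [enc(c) for c in f]

def dec_poly(F):
    return [dec(c) for c in F]

def sum_h(F1, F2):
    """E(f1 + f2); the shorter input is padded with fresh encryptions of 0."""
    size = max(len(F1), len(F2))
    F1 = F1 + [enc(0) for _ in range(size - len(F1))]
    F2 = F2 + [enc(0) for _ in range(size - len(F2))]
    return [add_h(a, b) for a, b in zip(F1, F2)]

def product_h(f2, F1):
    """E(f1 * f2) from plaintext f2 and encrypted f1 (f2 *_h E(f1))."""
    out = [enc(0) for _ in range(len(F1) + len(f2) - 1)]
    for i, c in enumerate(F1):
        for j, k in enumerate(f2):
            out[i + j] = add_h(out[i + j], mul_h(k, c))
    return out

def derivative_h(F1):
    """E(d/dx f1): coefficient i is (i + 1) x_h E(f1[i + 1])."""
    return [mul_h(i + 1, F1[i + 1]) for i in range(len(F1) - 1)]

def eval_h(F1, b):
    """E(f1(b)) for a plaintext point b."""
    acc = enc(0)
    for i, c in enumerate(F1):
        acc = add_h(acc, mul_h(pow(b, i, N), c))
    return acc
```

Starting each accumulator from a fresh `enc(0)` also means every output ciphertext carries new randomness, in line with the re-randomization requirement of Section 2.2.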

5. Application 1: Private Set-Intersection

Problem definition: there are $n$ parties, and party $i$ $(1 \le i \le n)$ holds an input $S_i$ with $|S_i| = k$. The Set-Intersection problem is defined as follows: all parties learn the intersection of the inputs, $S_1 \cap S_2 \cap \cdots \cap S_n$, and nothing else.
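To make the problem concrete, here is a plaintext simulation that extends the $f*r + g*s$ idea of Section 4.2 to $n$ parties by summing one masked polynomial per party. It is only a sketch and not the paper's protocol: all cryptography is omitted (in a real run the polynomials would be encrypted, combined homomorphically, and opened by threshold decryption), the modulus `N` is a toy value, and the names `combined_polynomial` and `recover_intersection` are hypothetical.

```python
import random

N = 1009 * 1013  # toy stand-in for the plaintext ring Z_N (assumption)

def poly_mul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % N
    return out

def poly_add(f, g):
    size = max(len(f), len(g))
    f = f + [0] * (size - len(f))
    g = g + [0] * (size - len(g))
    return [(a + b) % N for a, b in zip(f, g)]

def poly_from_set(S):
    f = [1]
    for s in S:
        f = poly_mul(f, [(-s) % N, 1])
    return f

def poly_eval(f, x):
    acc = 0
    for c in reversed(f):
        acc = (acc * x + c) % N
    return acc

def combined_polynomial(sets):
    """p = sum_i f_i * r_i, with a fresh random r_i of degree k per party."""
    k = len(sets[0])
    p = [0]
    for S in sets:
        r = [random.randrange(N) for _ in range(k + 1)]
        p = poly_add(p, poly_mul(poly_from_set(S), r))
    return p

def recover_intersection(my_set, p):
    """Each party tests only its own elements against p."""
    return {a for a in my_set if poly_eval(p, a) == 0}

sets = [[1, 2, 3], [2, 3, 4], [9, 3, 2]]
p = combined_polynomial(sets)
result = recover_intersection(sets[0], p)
```

Elements common to every input, here 2 and 3, are roots of every $f_i$ and therefore always roots of `p`. Any other element is a root only if the random masks cancel, which is negligible for a realistically sized $R$.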

5.1 Set-Intersection Protocol for honest-but-curious Adversary

[Figure: image-20230414153736876, the Set-Intersection protocol for honest-but-curious adversaries]

  • Step 4 of the protocol appears to be derived incorrectly; I believe it should be
    p = \matrix{\lambda_n=\sum…

6. Application 2: Private Over-Threshold Set-Union

Problem definition: there are $n$ users, and user $i$ $(1 \le i \le n)$ holds an input $S_i$ with $|S_i| = k$. The Over-Threshold Set-Union problem is defined as follows: all users learn which elements appear at least $t$ times in the union, and nothing else. For example, if element $a$ appears 15 times in the combined data set and the threshold is $t = 10$, then all participants learn that $a$ appeared 15 times. This computation is written $Rd_{t-1}(S_1 \cup \cdots \cup S_n)$.
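The functionality $Rd_{t-1}(S_1 \cup \cdots \cup S_n)$ can be written out directly on plaintext multisets, which makes the example above easy to check. This sketch shows only what is computed, not how it is computed privately; the function name `over_threshold_union` and the sample inputs are made up for illustration.

```python
from collections import Counter

def over_threshold_union(multisets, t):
    """Compute Rd_{t-1}(S_1 U ... U S_n) on plaintext multisets."""
    union = Counter()
    for S in multisets:
        union.update(S)  # multiset union adds multiplicities
    # Reduce every multiplicity by t - 1 and keep only what stays positive.
    return Counter({a: b - (t - 1) for a, b in union.items() if b >= t})

# Hypothetical inputs: "a" appears 15 times in total, "b" twice, "c" once.
inputs = [["a"] * 5 + ["b"], ["a"] * 5 + ["b"], ["a"] * 5 + ["c"]]
result = over_threshold_union(inputs, t=10)
```

With $t = 10$, the element `"a"` remains with multiplicity $15 - 9 = 6$, so every participant can add back $t - 1$ and conclude that it appeared 15 times, while `"b"` and `"c"` vanish from the result.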

[Figure: image-20230414171350361]
