GDPR compliance testing: using production data for testing in a post-GDPR world

For SQL Server DBAs, who are the shepherds of data in their organizations, the key GDPR questions center on whether data will need to be treated differently or safeguarded more carefully and, specifically, on whether production data can still be used in testing.

That question is the focus of this article. We’ll work through the details of the regulation, along with various authoritative articles on the subject, to answer it. Then we’ll look at ways to mitigate our findings and provide alternatives and workarounds where possible.

Can I use raw production data in testing and be compliant with GDPR?

For personal data (names, addresses, phone numbers and so on), the short answer is “no”, and we’ll go through the reasons why.

Lawful basis

To begin with, GDPR stipulates that data can only be processed if there is a lawful basis:

“Data can only be processed if there is at least one lawful basis to do so. The lawful bases for processing data are:
  1. the data subject has given consent to the processing of his or her personal data for one or more specific purposes.
  2. processing is necessary for the performance of a contract to which the data subject is party or in order to take steps at the request of the data subject prior to entering into a contract.
  3. processing is necessary for compliance with a legal obligation to which the controller is subject.
  4. processing is necessary in order to protect the vital interests of the data subject or of another natural person.
  5. processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.
  6. processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child.” 1

For our purposes, item #6 is the only relevant one; none of the other items, by their nature, would allow production data, whether personal or sensitive, to be used for testing.

So far we haven’t determined, with authority, that we can’t use raw production data for testing, but the window for allowing it has narrowed considerably: we must demonstrate “legitimate interests” to proceed (see the next section).

Legitimate basis

Initially, this seems promising (“Hey, testing is certainly a legitimate interest”), but there are already some hints of trouble.

Although testing certainly isn’t a nefarious activity in any sense, the qualification is already limited by the “interests or fundamental rights and freedoms of the data subject”.

GDPR offers helpful examples of legitimate further processing of data. Recital 47, for instance, mentions processing for direct marketing purposes or for preventing fraud. But this is specifically qualified by the interests, expectations and rights of the data subjects: “The interests and fundamental rights of the data subject could in particular override the interest of the data controller where personal data are processed in circumstances where data subjects do not reasonably expect further processing”.

Although this recital refers to marketing, not testing, the situation is analogous. Would you expect your data to be viewed, manipulated, tested or otherwise “processed” for the purposes of software development and quality control? Unlikely. Also, as no recital explicitly (or implicitly) allows for testing with personal data, we can’t rely on one.

So basically, unless the data subjects would have a reasonable expectation that their data would be used for such purposes, that use can’t be considered “legitimate” in the context of GDPR and therefore would not be allowed.

For the purposes of using raw production data in testing, this is essentially the end of the road. If any ambiguity or question remains, it should be put to rest by two additional considerations that make processing production data for testing an even worse idea in the context of GDPR: processing purpose and the right to object.

Processing purpose

This should come as no surprise to those who have followed, or been affected by, similar legislation. The finding is consistent with the UK Data Protection Act, a precursor to GDPR, which states:

“Personal data shall be obtained only for one or more specified and lawful purposes, and shall not be further processed in any manner incompatible with that purpose or those purposes.”

So unless the data was originally collected for testing purposes, using it in that manner would seem to violate both the spirit and the letter of the act and of its subsequent incarnation, GDPR.

Right to object

Furthermore, GDPR provides for the right to object. Even if your organization used personal data without consent on the grounds of “legitimate interests”, you would still have an obligation to inform the data subjects of the new processing and allow them to explicitly opt out of it. This would obviously be impractical for software or database testing.

Summary

So regardless of the safeguards and protections applied to it, production data can’t be processed for purposes other than those for which it was originally obtained without explicit permission from the data subject, which is an unrealistic scenario.

OK, I can’t use raw production data, but what if I obfuscate (aka pseudonymise) it?

We’ve determined that re-processing raw production data for a different purpose, in this case database testing, is a non-starter in the context of GDPR. Can anything be done to mitigate the requirement or provide workarounds?

GDPR explicitly encourages and recommends the obfuscation of data. In the context of GDPR, obfuscation is a basic requirement for re-processing data. Obfuscation doesn’t necessarily get you out from under the stipulations of GDPR, but it does relax some of the compliance and auditing requirements.

As to the question above, the short answer is “yes”, with the qualification that pseudonymised data still falls under GDPR, is considered “personal” data and must still be audited, secured and so on. In some cases, the cost of GDPR compliance might preclude the use of production data for testing, even though it would be allowable. 2

The basic goal of pseudonymization is to break up interrelated data, or reduce its “linkability”, so that it can’t be attributed back to the original data subject. So if a name, social security number and address were all required to uniquely identify an individual, and two of the three items were pseudonymised, there would be no way to definitively identify the data subject.
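One rough way to reason about “linkability” is to count how many rows are still uniquely identified by a given set of columns before and after masking. The Python sketch below is only an illustration under assumed names: the helper functions, the column names and the sample rows are all invented here, and a real assessment would have to consider other data sets an attacker could join against.

```python
from collections import Counter


def uniquely_identified(rows, columns):
    """Count rows whose combination of values in `columns` occurs exactly once."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1)


def mask_columns(rows, columns, placeholder="***"):
    """Return copies of the rows with the given columns overwritten by a placeholder."""
    return [
        {k: (placeholder if k in columns else v) for k, v in row.items()}
        for row in rows
    ]


# Invented sample rows (hypothetical values, not real people).
raw = [
    {"name": "Jane Doe", "ssn": "000-00-0001", "address": "1 Oak St"},
    {"name": "Jane Doe", "ssn": "000-00-0002", "address": "2 Elm Ave"},
    {"name": "John Roe", "ssn": "000-00-0003", "address": "3 Pine Rd"},
]
fields = ["name", "ssn", "address"]

before = uniquely_identified(raw, fields)
masked = mask_columns(raw, {"ssn", "address"})
after = uniquely_identified(masked, fields)
```

In this invented sample every raw row is unique, while after masking two of the three fields only the row with an unshared name is still unique. A check like this can help decide which fields need to be obfuscated for a particular data set.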

Pseudonymised data is reversible, which means it is still considered personal data from the perspective of GDPR and must be held to the same rigorous compliance standards as non-pseudonymised data.

Examples of pseudonymization include converting the data with an algorithm or process that is reversible, or replacing the data but storing the original values in a way that allows them to be retrieved. Another example is encrypting the data while still allowing it to be decrypted to its original state.
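The “replace the data but keep the originals retrievable” approach can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `Pseudonymizer` class, the `P-` token prefix and the in-memory lookup tables are assumptions made for the example. In practice the mapping would itself be personal data and would need to be stored and secured separately.

```python
import secrets


class Pseudonymizer:
    """Replace values with random tokens, keeping a lookup table for reversal."""

    def __init__(self):
        self._forward = {}  # real value -> token
        self._reverse = {}  # token -> real value

    def pseudonymize(self, value: str) -> str:
        # Reuse the token for a value already seen so joins stay consistent.
        if value not in self._forward:
            token = "P-" + secrets.token_hex(8)
            self._forward[value] = token
            self._reverse[token] = value
        return self._forward[value]

    def reidentify(self, token: str) -> str:
        # Anyone holding the lookup table can recover the original value,
        # which is why the result still counts as personal data.
        return self._reverse[token]


pseudonymizer = Pseudonymizer()
token = pseudonymizer.pseudonymize("Jane Doe")
same_token = pseudonymizer.pseudonymize("Jane Doe")
original = pseudonymizer.reidentify(token)
```

Because `reidentify` exists, the process is reversible by design, which is exactly the property that keeps pseudonymised data inside the scope of GDPR.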

Advantages and disadvantages

The advantages of pseudonymization are that:

  • Data is usually not altered or transformed so completely that it breaks the integrity of systems or becomes unreadable.
  • Pseudonymization generally takes less effort, because it is less intrusive and less comprehensive.
  • The process is reversible.

The disadvantage of pseudonymization is that, since it is reversible, the data is still considered personal data and falls under the same stringent data protection, auditing and compliance requirements as non-obfuscated data. These can be time-consuming and expensive to implement, and they expose additional teams, like QA, to liability.

Can I use production data for testing without having to worry about GDPR at all?

Yes, the data can be used if it is anonymized.

Anonymization is a more rigorous form of obfuscation that renders the processed data into a state from which it can never be re-identified, unlike pseudonymization, where data can be re-identified. With anonymization, all of the data elements are obfuscated; with pseudonymization, only enough data elements need to be obfuscated to de-link them and prevent identification of the data subject.

An example of anonymization would be to encrypt the data and then delete the encryption key so that the data can never be decrypted again.
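The “transform with a key, then throw the key away” idea can be approximated with Python’s standard library. Note the substitution: the standard library has no general-purpose encryption, so this sketch uses a keyed one-way hash (HMAC-SHA256) with a random key that is never stored, rather than encrypt-then-delete. The function name is hypothetical, and whether any given transformation counts as anonymization under GDPR is a legal judgment this sketch cannot settle.

```python
import hashlib
import hmac
import secrets


def anonymize_column(values):
    """Apply a keyed one-way hash to each value, using a key that is never stored."""
    # The key exists only for the duration of this call and is discarded on
    # return, so the mapping cannot be recomputed or reversed afterwards.
    key = secrets.token_bytes(32)
    return [
        hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        for value in values
    ]


hashed = anonymize_column(["Jane Doe", "John Roe", "Jane Doe"])
second_run = anonymize_column(["Jane Doe"])
```

Within one call, equal inputs produce equal outputs, so relationships between rows survive. A second call uses a fresh key and produces unrelated values. Repeated values remain visible as repeats, which may matter for small or skewed data sets.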

Advantages and disadvantages

The advantage of anonymization is that GDPR compliance requirements no longer apply to it.

The disadvantage of anonymization is that it can be costlier and more labor-intensive to apply, since all of the data must be processed, and with methods sophisticated enough that they can’t be reversed.

The challenge with this solution is that the cost of anonymizing all of your production data needs to be weighed against the cost of simply complying with GDPR and using lesser forms of obfuscation.

What are the common solutions for obfuscation?

Masking is the primary means of data obfuscation. It is the process of scrambling, blurring or replacing existing data with data of approximately the same length and format.
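A format-preserving mask along these lines might look like the following Python sketch. The `mask_value` function and the sample values are invented for illustration. The sketch replaces digits with random digits and letters with random letters of the same case, and leaves punctuation and spacing untouched, so the length and format survive.

```python
import random
import string


def mask_value(value: str, rng: random.Random) -> str:
    """Replace each letter or digit with a random one, keeping length and format."""
    masked = []
    for ch in value:
        if ch.isdigit():
            masked.append(rng.choice(string.digits))
        elif ch.isalpha():
            pool = string.ascii_uppercase if ch.isupper() else string.ascii_lowercase
            masked.append(rng.choice(pool))
        else:
            # Separators such as "-" or " " are kept so the format is unchanged.
            masked.append(ch)
    return "".join(masked)


rng = random.Random(42)  # seeded only to make test runs repeatable
masked_phone = mask_value("555-867-5309", rng)
masked_name = mask_value("Jane Doe", rng)
```

A mask like this keeps column widths, validation rules and UI layouts working in the test environment, which is the main reason to preserve length and format.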

Encryption can be thought of as just another obfuscation technique, but it generally protects data to a much higher degree and can’t be reversed without a particular “key”. Note that for encrypted data to be considered anonymized, it must be totally irreversible, so the key must be destroyed or otherwise made inaccessible.

Are there any alternatives to obfuscating production data?

Yes, another approach would be to forgo the use of production data entirely and use synthetic data instead. Randomly generated synthetic data, or “test data”, isn’t production data, so it wouldn’t need to be audited or secured to comply with GDPR.
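As a rough illustration of generating test data from scratch, the Python sketch below builds customer rows purely from small invented word lists and a seeded random number generator. The `make_customers` function, the column names and the word lists are all assumptions made for this example. No production values are read at any point.

```python
import random

# Small invented pools; nothing here is derived from production data.
FIRST_NAMES = ["Alex", "Sam", "Robin", "Casey", "Jordan"]
LAST_NAMES = ["Smith", "Jones", "Taylor", "Brown", "Lee"]
STREETS = ["Oak St", "Main St", "Elm Ave", "Pine Rd"]


def make_customers(count: int, seed: int = 0) -> list[dict]:
    """Generate `count` synthetic customer rows, repeatable for a given seed."""
    rng = random.Random(seed)
    rows = []
    for customer_id in range(1, count + 1):
        rows.append({
            "customer_id": customer_id,
            "name": f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}",
            "address": f"{rng.randint(1, 9999)} {rng.choice(STREETS)}",
            # 555-0100 to 555-0199 is a range commonly used for fictional numbers.
            "phone": f"555-01{rng.randint(0, 99):02d}",
        })
    return rows


customers = make_customers(3, seed=1)
```

Seeding the generator makes a test data set reproducible, which helps when a failing test needs to be rerun against the same rows.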

Synthetic data could be used as a means to implement both pseudonymization and anonymization.

Advantages and disadvantages

Creating test data would be equivalent, in time and expense, to masking production data.

If only some parts of personal production data were replaced with synthetically generated test data, to achieve pseudonymization, you would be able to use the result as test data, but only while adhering to GDPR compliance requirements.

If all personal data were replaced with synthetically generated test data, or an entirely new set of data were generated from scratch, you would achieve the benefits of full anonymization and could escape the rigors of GDPR compliance entirely.

Comparison

This illustration shows the four sources/types of data in the context of GDPR and obfuscation.

Raw production data can’t be used in testing.

Pseudonymised production data is partially obfuscated, using either masking or encryption. Although such data can be used in testing, GDPR compliance is still required.

Anonymized production data is fully obfuscated either by irreversible masking or encryption. GDPR compliance is not required.

Automatically generated synthetic test data can be used to partially or fully obfuscate personal data, but in most cases such data would be fully obfuscated. In this case, GDPR compliance is not required.

The above schematic shows the relative cost and compliance level of the four types of data used in testing.

  • Raw data is the least expensive because, by definition, it doesn’t require refactoring. But it is also the least compliant and, in the context of GDPR, totally non-compliant/illegal.
  • Pseudonymised data is more expensive than raw data, as it must be at least partially obfuscated. Through this re-processing it becomes more compliant with GDPR, and even fully compliant as long as the data is audited and secured.
  • Synthetic data involves similar effort and expense to pseudonymised data. But since the data is new, and totally different from the production data it replaces, it is no longer considered personal data and as such has no requirements for GDPR compliance.
  • Anonymized data is more expensive because all personal data, not just parts, must be obfuscated, and the obfuscation process must be irreversible. The reward for this effort and expense is that such data falls outside the GDPR requirements for auditing, security and so on.

Summary

Using raw personal data in testing was never a good idea, but GDPR essentially makes it illegal in affected countries, because such data can’t be re-processed without explicit opt-in.

But GDPR does allow obfuscated data to be re-processed, even for purposes for which it was never originally intended or gathered.

Obfuscation is an umbrella term that covers varying degrees of data transformation. It includes pseudonymization, which partially obfuscates data and is reversible, and anonymization, which fully obfuscates data and is irreversible.

Although pseudonymization allows data to be re-processed, and does relax some of the requirements of GDPR, it doesn’t change the classification of the data. The data must continue to be considered personal data and treated as such in the context of GDPR, including auditing and security.

Anonymization of data entirely frees you of the requirement to comply with GDPR in managing the obfuscated data.

Randomly generated synthetic test data can be used as an alternative to obfuscated production data. It can be used to pseudonymise or fully anonymize data, depending on how much production data is replaced.

1 EUR-Lex REGULATION (EU) 2016/679
2 Pseudonymization

Translated from: https://www.sqlshack.com/using-production-data-testing-post-gdpr-world/
