Record Linkage Privacy Preserving - One

Introduction

Traditional record linkage, also known as entity resolution or data matching, is the task of identifying records in disparate data sets that refer to the same real-world entity. It usually consists of two main steps. Blocking generates candidate pairs, keeping as many true matching pairs as possible while keeping the number of non-matching pairs as small as possible. Matching computes a distance for each candidate pair to measure how similar the two records are. Because different data custodians hold different types of records for the same real-world entity, those records usually exhibit variations, errors, misspellings, and typos, so it is challenging to balance precision against performance.
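The two steps can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the text above: the record fields (`id`, `given`, `surname`, `birth_year`), the blocking key (first letter of the surname plus birth year), the use of `difflib.SequenceMatcher` as the similarity measure, and the threshold of 0.85 are all hypothetical choices.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import product


def blocking_key(record):
    # Hypothetical blocking key: first letter of the surname plus birth year.
    return (record["surname"][:1].lower(), record["birth_year"])


def candidate_pairs(dataset_a, dataset_b):
    # Blocking: only records that share a blocking key are paired.
    blocks = defaultdict(lambda: ([], []))
    for rec in dataset_a:
        blocks[blocking_key(rec)][0].append(rec)
    for rec in dataset_b:
        blocks[blocking_key(rec)][1].append(rec)
    for left, right in blocks.values():
        yield from product(left, right)


def similarity(rec_a, rec_b):
    # Matching: string similarity in [0, 1] on the lower-cased full name.
    name_a = f'{rec_a["given"]} {rec_a["surname"]}'.lower()
    name_b = f'{rec_b["given"]} {rec_b["surname"]}'.lower()
    return SequenceMatcher(None, name_a, name_b).ratio()


def link(dataset_a, dataset_b, threshold=0.85):
    # Keep the candidate pairs whose similarity reaches the threshold.
    return [
        (a["id"], b["id"])
        for a, b in candidate_pairs(dataset_a, dataset_b)
        if similarity(a, b) >= threshold
    ]


dataset_a = [
    {"id": "a1", "given": "Jonathan", "surname": "Smith", "birth_year": 1980},
    {"id": "a2", "given": "Maria", "surname": "Garcia", "birth_year": 1975},
]
dataset_b = [
    {"id": "b1", "given": "Jonathon", "surname": "Smith", "birth_year": 1980},
    {"id": "b2", "given": "Mario", "surname": "Gomez", "birth_year": 1975},
    {"id": "b3", "given": "Peter", "surname": "Brown", "birth_year": 1990},
]

print(link(dataset_a, dataset_b))
```

With this toy data, blocking reduces the six possible pairs to two candidates, and matching keeps only the pair that differs by a single typo. A blocking key that is too strict would lose true matches whose key field itself contains an error, which is the trade-off described above.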

Privacy-preserving record linkage must not only perform record linkage but also ensure that sensitive information does not leak: no one should be able to mine the published data, such as the bit frequencies of its features or other attributes, to re-identify the original personal records. It therefore calls for anonymization or even cryptographic methods when transferring sensitive records. Because cryptographic methods are computationally expensive and time-consuming, developing new approaches that are practical to use remains a challenge.
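One simple way to avoid exchanging raw identifiers is for the custodians to exchange keyed hashes of normalized fields instead. The sketch below uses HMAC-SHA256 from the Python standard library; the field names, the normalization rule, and the idea of a secret key shared only between the custodians are assumptions made for illustration, not a method described in this article.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    # Assumed normalization: decompose accents, keep only letters and digits,
    # and lower-case the result.
    value = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in value if ch.isalnum()).lower()


def pseudonym(record: dict, secret_key: bytes) -> str:
    # Hypothetical field names; both custodians must use the same fields,
    # the same normalization, and the same secret key.
    fields = ("given", "surname", "birth_date")
    message = "|".join(normalize(str(record[f])) for f in fields)
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


key = b"secret shared by the two custodians"  # illustrative value only
rec_a = {"given": "Ann ", "surname": "O'Neil", "birth_date": "1980-01-02"}
rec_b = {"given": "ann", "surname": "ONEIL", "birth_date": "19800102"}
rec_c = {"given": "Anne", "surname": "O'Neil", "birth_date": "1980-01-02"}

print(pseudonym(rec_a, key) == pseudonym(rec_b, key))  # same after normalization
print(pseudonym(rec_a, key) == pseudonym(rec_c, key))  # one extra letter
```

This sketch also shows two limits. A single typo produces a completely different digest, so only exact matches survive. And because the mapping is deterministic, equal inputs always yield equal digests, so value frequencies remain visible in the published data.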

Hash Function

A hash function is any function that can be used to map data of arbitrary size
to data of a fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes.

A good hash function typically has several of the following properties.
- Determinism. For a given input value it must always generate the same hash value.
- Uniformity. A good hash function should map the expected inputs as evenly as possible over its output range. That is, every hash value in the output range should be generated with roughly the same probability.
- Defined range. It is often desirable that the output of a hash function have fixed size.
- Variable range. In many applications, the range of hash values may be different for each run of the program, or may change during a single run.
- Non-invertible. In cryptographic applications, hash functions are typically expected to be practically non-invertible, meaning that it is not realistic to reconstruct the input datum x from its hash value h(x) alone without spending great amounts of computing time (see also One-way function).
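Several of these properties can be observed directly with Python's standard `hashlib` module. The snippet below is a minimal sketch that uses SHA-256 as the example hash function; the helper names `sha256_hex` and `bucket`, the `key-{i}` test inputs, and the choice of 8 buckets are illustrative assumptions.

```python
import hashlib
from collections import Counter


def sha256_hex(data: bytes) -> str:
    # Determinism and defined range: the same input always gives the same
    # digest, and the digest is always 64 hexadecimal characters long.
    return hashlib.sha256(data).hexdigest()


def bucket(data: bytes, num_buckets: int) -> int:
    # Variable range: reduce the fixed-size digest to a caller-chosen range
    # by taking the first 8 bytes as an integer modulo num_buckets.
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


print(sha256_hex(b"abc") == sha256_hex(b"abc"))
print(len(sha256_hex(b"")), len(sha256_hex(b"x" * 10_000)))

# Uniformity: 8000 distinct inputs should spread roughly evenly over 8 buckets.
counts = Counter(bucket(f"key-{i}".encode(), 8) for i in range(8000))
print(sorted(counts.items()))
```

Non-invertibility cannot be shown by running code: it is the expectation that no practical procedure recovers x from h(x) alone. When the set of possible inputs is small, such as names or birth dates, an attacker can still hash every candidate and compare the results, so an unkeyed hash alone does not protect such fields.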
