ProvChain: A Blockchain-based Data Provenance Architecture in Cloud Environment (Paper Translation and a Few Notes)


Cloud data provenance is metadata that records the history of the creation and operations performed on a cloud data object.
Secure data provenance is crucial for data accountability, forensics and privacy. (Note: "forensics" here means digital forensic investigation, that is, evidence collection, not the debating sense of the word.)
In this paper, we propose a decentralized and trusted cloud data provenance architecture using blockchain technology.
Blockchain-based data provenance can provide tamper-proof records, enable the transparency of data accountability in the cloud, and help to enhance the privacy and availability of the provenance data.
We make use of the cloud storage scenario and choose the cloud file as a data unit to detect user operations for collecting provenance data.
We design and implement ProvChain, an architecture to collect and verify cloud data provenance, by embedding the provenance data into blockchain transactions.
ProvChain operates mainly in three phases: (1) provenance data collection, (2) provenance data storage, and (3) provenance data validation.
Results from performance evaluation demonstrate that ProvChain provides security features including tamper-proof provenance, user privacy and reliability with low overhead for the cloud storage applications.
(Note: "overhead" here means the additional cost incurred.)
Keywords: Data provenance, Blockchain, Cloud Computing, Privacy, Reliability, Blockchain Cloud.


Cloud computing is widely adopted in commercial and military environments to support data storage, on-demand computing and dynamic provisioning. Cloud computing environments are distributed and heterogeneous, with a diversity of software and hardware components provided by different vendors, which can introduce vulnerabilities and incompatibilities. The security assurance of intra-cloud and inter-cloud data management and transfer therefore arises as a key issue. Cloud auditing can only be effective if all operations on the data can be tracked reliably. Provenance is a process that determines the history of a data product, starting from its original sources [1]. Assured provenance data can help detect access violations within the cloud computing infrastructure. However, developing assured data provenance remains a critical issue for cloud storage applications. Besides, provenance data may contain sensitive information about the original data and the data owners. Hence, there is a need not only to secure the cloud data but also to ensure the integrity and trustworthiness of the provenance data. State-of-the-art cloud-based provenance services are vulnerable to accidental corruption or malicious forgery of provenance data [2].
Blockchain technology has attracted interest because it provides a shared, distributed and fault-tolerant database in which every participant in the network can help nullify adversaries by harnessing the computational capabilities of the honest nodes, and in which the information exchanged is resilient to manipulation. A blockchain network is a distributed public ledger where every transaction is witnessed and verified by network nodes. Blockchain’s decentralized architecture can be leveraged to develop an assured data provenance capability for cloud computing environments. In a decentralized architecture, every node participates in the network to provide services, thereby providing better efficiency. Availability is also ensured because of blockchain’s distributed characteristics. Since a centralized authority is frequently used in cloud services, there is a need to safeguard personal data while maintaining privacy. With a blockchain-based cloud data provenance service, all data operations are transparently and permanently recorded. Thus, trust between users and cloud service providers can easily be established. Furthermore, maintaining provenance can help improve the trust of cloud users in cyber-threat information sharing [3] [4], enabling proactive cyber defense at a reduced security investment [5] [6].

In this paper, we present ProvChain, a blockchain-based data provenance architecture that provides assurance of data operations in a cloud storage application while enhancing privacy and availability at the same time. ProvChain records the operation history as provenance data, which is hashed into Merkle tree nodes [7]. A list of hashes of provenance data constitutes a Merkle tree, and the tree root node is anchored to a blockchain transaction. A list of blockchain transactions is used to form a block, and the block needs to be confirmed by a set of nodes in order to be included in the blockchain. An attempt to modify a provenance data record requires an adversary to locate the transaction and the block. Blockchain’s underlying cryptographic theory allows a block record to be modified only if the adversary can present a longer chain of blocks than the rest of the miners’ blockchain, which is quite difficult to achieve. By leveraging the global-scale computing power of the blockchain network, blockchain-based data provenance can provide integrity and trustworthiness. In our architecture, we keep the hashed identity of users in order to protect their privacy from the rest of the nodes in the blockchain network.
The rest of the paper is organized as follows. Section II provides an overview of the state-of-the-art data provenance efforts and blockchain technology. Section III describes the design of ProvChain, our blockchain-based data provenance architecture. The detailed implementation is given in Section IV. Performance evaluation of ProvChain is presented in Section V. Finally, we conclude in Section VI.
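
The anchoring idea in the overview above (hash each provenance record, combine the hashes into a Merkle tree, and place only the root in a blockchain transaction) can be illustrated with a minimal Python sketch. This is not ProvChain's actual implementation: the record fields, the use of SHA-256, the JSON serialization, and the rule of duplicating the last hash on odd-sized levels are all assumptions made for illustration, and the function names are hypothetical.

```python
import hashlib
import json
from typing import List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def hash_user_id(user_id: str) -> str:
    """Store only a hash of the user identity (assumed scheme: plain SHA-256)."""
    return sha256_hex(user_id.encode("utf-8"))


def hash_record(record: dict) -> str:
    """Hash one provenance record using a canonical JSON encoding (an assumption)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return sha256_hex(canonical.encode("utf-8"))


def merkle_root(leaf_hashes: List[str]) -> str:
    """Fold a list of leaf hashes into a single Merkle root."""
    if not leaf_hashes:
        raise ValueError("at least one leaf hash is required")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed convention: duplicate the last hash on odd-sized levels.
            level.append(level[-1])
        level = [
            sha256_hex(bytes.fromhex(level[i]) + bytes.fromhex(level[i + 1]))
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical provenance records for operations on cloud files.
records = [
    {"user": hash_user_id("alice"), "file": "report.txt", "action": "create"},
    {"user": hash_user_id("bob"), "file": "report.txt", "action": "read"},
    {"user": hash_user_id("alice"), "file": "report.txt", "action": "update"},
]
leaves = [hash_record(r) for r in records]
root = merkle_root(leaves)  # the value that would be anchored in a transaction

# Changing any record changes the root, so a mismatch with the anchored root
# reveals the modification.
tampered = [dict(r) for r in records]
tampered[1]["action"] = "delete"
tampered_root = merkle_root([hash_record(r) for r in tampered])
```

Because only the root is anchored, a verifier can recompute it from the stored records and compare it with the value in the transaction. The records themselves never need to be placed on the chain in clear text.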


A. Data provenance
Data provenance is critical for cloud computing system administrators who need to debug break-ins to the system or network. Cloud computing environments are typically characterized by data transfers between diverse system and network components. These data exchanges could take place within a data center or across federated data centers. The data does not usually follow the same path, due to multiple copies of the data and the diversity of paths taken to ensure resilience. This design makes it more difficult for administrators to accurately identify the origin of an attack, which software and/or hardware components caused the attack, and the impacts of the attack. Security violations need to be identified at a fine granularity, and provenance can assist. Current state-of-the-art provenance systems in the cloud support the above tasks through logging and auditing technologies. These technologies are not effective in cloud computing systems, which are complex in nature due to several layers of interoperating software and hardware components spread across geographical and organizational boundaries. Identifying the origin, cause and impact of security violations in cloud infrastructures requires collecting forensics and logs from diverse and disparate sources, which is an insurmountable task. At the same time, logs only provide a sequential history of actions related to every application. Provenance data, by contrast, provides the history of the origins of all changes to a data object, the list of components that have either forwarded or processed the object, and the users who have viewed and/or modified the object; it therefore has enhanced requirements for assurance.
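
To make the contrast with per-application logs concrete, the following Python sketch models provenance as a history attached to one data object, from which the origin, the handling components, and the users who touched the object can be read off directly. The class names, field names, and action labels are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ProvenanceEvent:
    actor: str       # a user or a system/network component
    action: str      # e.g. "create", "forward", "process", "view", "modify"
    timestamp: int   # logical time of the event


@dataclass
class ProvenanceRecord:
    object_id: str
    events: List[ProvenanceEvent] = field(default_factory=list)

    def add(self, actor: str, action: str, timestamp: int) -> None:
        """Append one event to this object's history."""
        self.events.append(ProvenanceEvent(actor, action, timestamp))

    def origin(self) -> ProvenanceEvent:
        """Return the earliest event, i.e. where the object came from."""
        return min(self.events, key=lambda e: e.timestamp)

    def actors(self, *actions: str) -> List[str]:
        """Return distinct actors that performed any of the given actions, in time order."""
        seen: List[str] = []
        for e in sorted(self.events, key=lambda e: e.timestamp):
            if e.action in actions and e.actor not in seen:
                seen.append(e.actor)
        return seen


# Hypothetical history of a single cloud file.
record = ProvenanceRecord("file-42")
record.add("user-a", "create", 1)
record.add("gateway-1", "forward", 2)
record.add("user-b", "view", 3)
record.add("user-a", "modify", 4)

components = record.actors("forward", "process")  # who handled the object
users = record.actors("view", "modify")           # who viewed or changed it
```

Organizing the history around the data object, rather than around each application's log, is what lets these questions be answered without correlating logs from many sources.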



