1 Part 1 - ByteTide Package Loader & Merkle Tree

To get started, you will need to be able to parse a .bpkg file and load it. To assist you with writing your code and complying with the test program, you are advised to complete the pkgchk.c file in the src directory.
1.1 The Package and its File Format (.bpkg)
In this program, a file is composed of several anomalous data chunks. The chunks are organised in a specific way such that when they are combined, the entire contents of the file can be constructed and presented to the user. A package defines the necessary information and resources required to construct the contents of a file. Each package represents a unique file, identified by the string ident.
The package file format is a text format that will need to be parsed by your program. It has the following fields; please refer to the hash and chunk parts of the glossary. To clarify, the package file can be modelled as a binary tree, and the term h refers to the height of that tree.
• ident, hexadecimal string (1024 characters max), the identifier used within the network to identify the same packages.
• filename, string (256 characters max), used to help save and locate the file to update when data is sent to it.
• size, uint32_t, specifies the size in bytes.
• nhashes, uint32_t, specifies the number of hashes that are pre-computed from the original file. There must be exactly 2^(h-1) - 1 hashes, which correspond to the hashes of all non-leaf nodes.
• hashes, string[2^(h-1) - 1] (64 characters for each string), the hash values themselves; their count must match the preceding nhashes field.
• nchunks, uint32_t, specifies the number of chunks. The number of chunks must be 2^(h-1).
• chunks, struct[2^(h-1)], each chunk has the fields hash, offset and size.
– hash, string (64 characters), the hash value of the data block
– offset, uint32_t, is the offset within the file
– size, uint32_t, is the size of the chunk in bytes
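The constraints on nhashes and nchunks can be checked as soon as the two counts have been read. The following is a minimal sketch, assuming a perfect binary tree (not the complete-tree variation mentioned in the Errata); the function names are hypothetical and not part of the scaffold.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if n == 2^k for some k >= 0. */
bool is_power_of_two(uint32_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

/* For a perfect tree of height h there are 2^(h-1) chunks (leaves)
 * and 2^(h-1) - 1 pre-computed hashes (non-leaf nodes), so the two
 * counts must satisfy nhashes == nchunks - 1. */
bool bpkg_counts_valid(uint32_t nhashes, uint32_t nchunks) {
    return is_power_of_two(nchunks) && nhashes == nchunks - 1;
}

/* Height h such that nchunks == 2^(h-1).
 * Only meaningful when bpkg_counts_valid() returned true. */
uint32_t bpkg_tree_height(uint32_t nchunks) {
    uint32_t h = 1;
    while (nchunks > 1) {
        nchunks >>= 1;
        h++;
    }
    return h;
}
```

For example, a package with 4 chunks must list 3 hashes and describes a tree of height 3.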
The format below gives an outline to the structure of a .bpkg file. Refer to the resources folder in the scaffold for a real example.
ident:
filename:
size:
nhashes:
hashes:
“hash value”

nchunks: <number of chunks, these are all leaf nodes>
chunks:
“hash value”,offset,size
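One way to read a single entry of the chunks field is with sscanf. This is only a sketch: it assumes each entry sits on its own line as hash,offset,size with optional leading whitespace, and the struct and function names are hypothetical. Check the real example in the resources folder before relying on this layout.

```c
#include <ctype.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

#define HASH_HEX_LEN 64

/* Hypothetical in-memory form of one chunk entry. */
struct chunk_entry {
    char hash[HASH_HEX_LEN + 1]; /* 64 hex characters plus a terminating NUL */
    uint32_t offset;
    uint32_t size;
};

/* Parses "hash,offset,size". Returns 0 on success, -1 on malformed input. */
int parse_chunk_line(const char* line, struct chunk_entry* out) {
    /* The leading space skips indentation; %64[^,] reads at most
     * 64 characters up to (not including) the first comma. */
    int fields = sscanf(line, " %64[^,],%" SCNu32 ",%" SCNu32,
                        out->hash, &out->offset, &out->size);
    if (fields != 3 || strlen(out->hash) != HASH_HEX_LEN) {
        return -1;
    }
    /* Reject anything that is not a hexadecimal digit. */
    for (size_t i = 0; i < HASH_HEX_LEN; i++) {
        if (!isxdigit((unsigned char)out->hash[i])) {
            return -1;
        }
    }
    return 0;
}
```

A hash longer than 64 characters fails here because the comma is not found where sscanf expects it.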

1.2 Package Loading
The focus of this task is to load the .bpkg file and also store the details into a merkle tree. Please refer to Section 1.3 for information on a merkle tree.
Read and load .bpkg files that comply with the format outlined in Section 1.1.
Once the .bpkg has been loaded successfully, it is advisable that your program also knows whether the data file exists and is able to construct a file of the size outlined in the package. Refer to the pkgchk.c:bpkg_file_check function.
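The existence check and file construction could look roughly like the sketch below. The function name, the return values and the choice to fill the new file with zero bytes are all assumptions made for illustration; the behaviour the scaffold expects from bpkg_file_check may differ.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical status codes, not taken from the scaffold. */
enum file_status { FILE_ERROR = -1, FILE_CREATED = 0, FILE_EXISTS = 1 };

/* If `path` can be opened for reading, report that it exists.
 * Otherwise create it and fill it with `size` zero bytes.
 * Note: a failed fopen may also mean "no permission", which this
 * sketch does not distinguish from "does not exist". */
enum file_status ensure_data_file(const char* path, uint32_t size) {
    FILE* f = fopen(path, "rb");
    if (f != NULL) {
        fclose(f);
        return FILE_EXISTS;
    }

    f = fopen(path, "wb");
    if (f == NULL) {
        return FILE_ERROR;
    }

    static const unsigned char zeros[4096]; /* zero-initialised */
    uint32_t remaining = size;
    while (remaining > 0) {
        size_t n = remaining < sizeof zeros ? remaining : sizeof zeros;
        if (fwrite(zeros, 1, n, f) != n) {
            fclose(f);
            return FILE_ERROR;
        }
        remaining -= (uint32_t)n;
    }
    return fclose(f) == 0 ? FILE_CREATED : FILE_ERROR;
}
```

Writing the zeros explicitly avoids depending on platform-specific behaviour when seeking past the end of a new file.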
Implement a merkle tree. Use the data from a .bpkg to construct the merkle tree.
Refer to pkgchk.c:bpkg_get_all_hashes and pkgchk.c:bpkg_get_all_chunk_hashes_from_hash functions, as you should be able to satisfy these operations after implementing a merkle tree without any IO on the data file.
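Once the tree described in Section 1.3 is in memory, a query such as "all chunk hashes under the node with this hash" needs only tree operations. The sketch below uses a trimmed-down, hypothetical node struct and hypothetical function names; it is not the signature of the scaffold functions, and it stores hashes as NUL-terminated strings for convenience.

```c
#include <stddef.h>
#include <string.h>

#define HEX_LEN 64

/* Hypothetical, reduced node: only the fields this query needs. */
struct mt_node {
    struct mt_node* left;
    struct mt_node* right;
    int is_leaf;
    char expected_hash[HEX_LEN + 1]; /* NUL-terminated hex string */
};

/* Depth-first search for the node whose expected hash equals `hash`. */
const struct mt_node* mt_find(const struct mt_node* node, const char* hash) {
    if (node == NULL) {
        return NULL;
    }
    if (strncmp(node->expected_hash, hash, HEX_LEN) == 0) {
        return node;
    }
    const struct mt_node* hit = mt_find(node->left, hash);
    return hit != NULL ? hit : mt_find(node->right, hash);
}

/* Stores pointers to the hashes of all leaves below `node` in `out`,
 * left to right, writing at most `cap` entries. Returns the count. */
size_t mt_collect_leaves(const struct mt_node* node,
                         const char** out, size_t cap) {
    if (node == NULL || cap == 0) {
        return 0;
    }
    if (node->is_leaf) {
        out[0] = node->expected_hash;
        return 1;
    }
    size_t n = mt_collect_leaves(node->left, out, cap);
    n += mt_collect_leaves(node->right, out + n, cap - n);
    return n;
}
```

Calling mt_collect_leaves on the root yields every chunk hash; calling it on the result of mt_find yields only the chunks covered by that node.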
Compute the merkle tree hashes, ensure that combined hashes match the parents' hashes when computed, and find the minimum completed hashes. Refer to the pkgchk.c:bpkg_get_completed_chunks and pkgchk.c:bpkg_get_min_completed_hashes functions. You will need to perform validation on the chunks and discover portions of the file.
The above verifies chunks against package files and the data’s integrity.
1.3 Merkle Tree
Binary Tree: A merkle tree is a variation on a binary tree. A binary tree is a tree data structure in which each node is composed of the following.
• It holds a value/data
• Usually implemented to hold a key as well (Key-Value/Map Data Structure)
• Connected to two other nodes that are referred to as children. These are referred to as left and right nodes.
A common structure within C for a binary tree node is as follows.
struct bt_node {
    void* key;
    void* value;
    struct bt_node* left;
    struct bt_node* right;
};
The above node holds a key that makes it searchable, with the rule that the key must be unique. It also holds a value, which can be assigned to arbitrary data.
Please Note: When building a tree with a key field that allows you to perform an efficient tree search, you will need to ensure that your tree uses an appropriate comparison function for the job. Hint: if your tree is going to be multi-purpose, consider giving your tree a function pointer to compare keys.
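The hint about a comparison function pointer can be sketched as follows. The node struct is the one shown earlier; the typedef and function names are hypothetical, and the search assumes the tree is ordered by the same comparator (smaller keys to the left).

```c
#include <stddef.h>
#include <string.h>

struct bt_node {
    void* key;
    void* value;
    struct bt_node* left;
    struct bt_node* right;
};

/* Returns <0, 0 or >0, like strcmp. */
typedef int (*bt_cmp_fn)(const void* a, const void* b);

/* Iterative search; the caller supplies the key comparison. */
struct bt_node* bt_find(struct bt_node* root, const void* key, bt_cmp_fn cmp) {
    while (root != NULL) {
        int c = cmp(key, root->key);
        if (c == 0) {
            return root;
        }
        root = (c < 0) ? root->left : root->right;
    }
    return NULL;
}

/* Example comparator for keys that are NUL-terminated strings. */
int cmp_cstr(const void* a, const void* b) {
    return strcmp((const char*)a, (const char*)b);
}
```

Swapping cmp_cstr for a comparator over integers or hash strings reuses the same search code for a different key type.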
To navigate or traverse a tree, you are advised to use an in-order traversal. Please refer to the following documents to revise tree traversals:
• Tree-Traversal - Wikipedia
• Visualgo – BST
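As a reminder of what in-order means, here is a small recursive sketch over the bt_node struct from above. The visitor callback and the key_log helper are hypothetical names introduced only for this example.

```c
#include <stddef.h>

struct bt_node {
    void* key;
    void* value;
    struct bt_node* left;
    struct bt_node* right;
};

typedef void (*bt_visit_fn)(const struct bt_node* node, void* ctx);

/* In-order: left subtree, then the node itself, then right subtree. */
void bt_in_order(const struct bt_node* node, bt_visit_fn visit, void* ctx) {
    if (node == NULL) {
        return;
    }
    bt_in_order(node->left, visit, ctx);
    visit(node, ctx);
    bt_in_order(node->right, visit, ctx);
}

/* Example visitor: records the keys in the order they are visited. */
struct key_log {
    const void* keys[16];
    size_t count;
};

void log_key(const struct bt_node* node, void* ctx) {
    struct key_log* log = ctx;
    if (log->count < 16) {
        log->keys[log->count++] = node->key;
    }
}
```

On a tree whose leaves are the chunks, this order visits the leaves from left to right, which matches the order of the chunks in the file if the tree was built that way.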
Qualities of a merkle tree: A merkle tree is typically a perfect (full and complete) binary tree, but it can also be represented as just a complete binary tree (Refer to Errata, Variations and Notes).
• Given a depth of d, the total number of nodes in your tree will be 2^d - 1.
• All levels are full (necessary for a perfect binary tree).
• A merkle tree will have 2^(d-1) nodes at depth d; these are the leaf nodes and refer to your chunks.
• A merkle tree will have 2^(d-1) - 1 non-leaf nodes.
• All leaves have the same depth (no skewing).
All nodes in a merkle tree have a hash value. The hash of a leaf node corresponds to the hash value of a data chunk, derived by computing the hash of the data chunk itself.
All other, non-leaf nodes derive their hash value by hashing their children's hash values together. Let's break down the diagram.
• L1-L4 are data blocks, these refer to chunks in a file.
• Your leaf nodes 0-0 to 1-1 use a hash function to compute the hash of those data blocks. Given this part already, we have enough information to validate individual blocks.
Pseudocode Example: self.hash = Hash(DataBlock[i])
• Your non-leaf nodes 0, 1 and root compute their hashes by combining the hashes of their children into a long string and computing the hash of that (Refer: Errata, Variations and Notes).
Pseudocode Example: self.hash = Hash(left.hash + right.hash)
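In C, the "+" in the pseudocode is a concatenation of the two 64-character hexadecimal strings, left first, followed by one hash over the resulting 128 characters. The sketch below shows only that combining step. The hash itself is a toy stand-in based on FNV-1a and is NOT SHA-256; it is here so the example compiles on its own, and all names are hypothetical. In the assignment you would pass the SHA-256 routine you are using instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define HEX_LEN 64

/* Hash callback: reads `len` bytes, writes HEX_LEN hex chars plus NUL. */
typedef void (*hash_hex_fn)(const void* data, size_t len, char* out);

/* Toy stand-in (four FNV-1a passes). NOT SHA-256, illustration only. */
void toy_hash_hex(const void* data, size_t len, char* out) {
    const unsigned char* p = data;
    for (int part = 0; part < 4; part++) {
        uint64_t h = 14695981039346656037ULL + (uint64_t)part;
        for (size_t i = 0; i < len; i++) {
            h ^= p[i];
            h *= 1099511628211ULL;
        }
        /* 16 hex digits per pass; the last pass also writes the NUL. */
        snprintf(out + part * 16, 17, "%016llx", (unsigned long long)h);
    }
}

/* Parent hash = hash(left_hex followed by right_hex). Order matters. */
void combine_child_hashes(const char* left_hex, const char* right_hex,
                          hash_hex_fn hash, char* out) {
    char joined[2 * HEX_LEN]; /* the hashed bytes exclude any NUL */
    memcpy(joined, left_hex, HEX_LEN);
    memcpy(joined + HEX_LEN, right_hex, HEX_LEN);
    hash(joined, sizeof joined, out);
}
```

Swapping the two arguments produces a different parent hash, which is why the left and right order has to match the order used when the package was created.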
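[Figure: merkle tree diagram showing data blocks L1-L4, leaf nodes 0-0 to 1-1, non-leaf nodes 0 and 1, and the root]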

The following relates the .bpkg file to your merkle tree's construction. Each node stores an expected hash value (taken from the .bpkg file) and a computed hash value, which you obtain by 1) computing the hash of the data block if it is a leaf node, or 2) computing the hash of the concatenation of the left and right child hashes if it is a non-leaf node.
The following is an expansion of the operations. We go through an example of computing the hash of the root node of a tree with 7 nodes (similar to the diagram):
Expansion Pseudocode, with steps:

We need to compute the hash of the left and right child

  1. Hash(root) = Hash(
         Hash(root.left) + Hash(root.right)
     )

Since the left and right child are not leaf nodes, we need to do it again

  2. Hash(root) = Hash(
         Hash(
             Hash(root.left.left) + Hash(root.left.right)
         )
         + Hash(
             Hash(root.right.left) + Hash(root.right.right)
         )
     )

We have found the leaf nodes. Compute the hash of the data blocks, the size is the chunk size as outlined in the .bpkg

  3. Hash(root) = Hash(
         Hash(
             Hash(DataBlock[0]) + Hash(DataBlock[1])
         )
         + Hash(
             Hash(DataBlock[2]) + Hash(DataBlock[3])
         )
     )

We concatenate the leaf children hashes that are assigned to their computed field

  4. Hash(root) = Hash(
         Hash(
             root.left.left.computed + root.left.right.computed
         )
         + Hash(
             root.right.left.computed + root.right.right.computed
         )
     )

Once again, concatenate the children hashes and compute the hash of that

  5. Hash(root) = Hash(
         root.left.computed + root.right.computed
     )
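The expansion above is a post-order recursion: children first, then the parent. The following sketch fills in the computed hash of every non-leaf node and counts the nodes whose computed hash differs from the expected one. It assumes a perfect tree whose leaves already have their computed hash set from the data blocks, uses a reduced hypothetical node struct with room for a NUL terminator, and again uses a toy FNV-1a stand-in that is NOT SHA-256.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define HEX_LEN 64

/* Hypothetical, reduced variant of a merkle tree node. */
struct mt_node {
    struct mt_node* left;
    struct mt_node* right;
    int is_leaf;
    char expected_hash[HEX_LEN + 1];
    char computed_hash[HEX_LEN + 1];
};

/* Toy stand-in (four FNV-1a passes). NOT SHA-256, illustration only. */
void toy_hash_hex(const void* data, size_t len, char* out) {
    const unsigned char* p = data;
    for (int part = 0; part < 4; part++) {
        uint64_t h = 14695981039346656037ULL + (uint64_t)part;
        for (size_t i = 0; i < len; i++) {
            h ^= p[i];
            h *= 1099511628211ULL;
        }
        snprintf(out + part * 16, 17, "%016llx", (unsigned long long)h);
    }
}

/* Post-order: compute both children, then hash left.computed followed
 * by right.computed into this node. Leaves are left untouched.
 * Returns how many nodes in the subtree have computed != expected. */
int mt_compute(struct mt_node* node) {
    if (node == NULL) {
        return 0;
    }
    int mismatches = 0;
    if (!node->is_leaf) {
        mismatches += mt_compute(node->left);
        mismatches += mt_compute(node->right);
        char joined[2 * HEX_LEN];
        memcpy(joined, node->left->computed_hash, HEX_LEN);
        memcpy(joined + HEX_LEN, node->right->computed_hash, HEX_LEN);
        toy_hash_hex(joined, sizeof joined, node->computed_hash);
    }
    if (memcmp(node->computed_hash, node->expected_hash, HEX_LEN) != 0) {
        mismatches++;
    }
    return mismatches;
}
```

A single bad leaf also invalidates every ancestor up to the root, so in a 7-node tree one corrupted chunk yields three mismatching nodes.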
To help you get started, you can use the following struct as well as some helpful scaffold data.
struct merkle_tree_node {
    void* key;
    void* value;
    struct merkle_tree_node* left;
    struct merkle_tree_node* right;
    int is_leaf;
    char expected_hash[64]; // Refer to SHA256 hexadecimal size
    char computed_hash[64];
};

struct merkle_tree {
    struct merkle_tree_node* root;
    size_t n_nodes;
};
Feel free to add to and modify the struct above.
Do note: You can construct a merkle tree that isn't a perfect binary tree. However, this may make management of your data more difficult. Refer to Errata, Variations and Notes.
Do note: Please make sure that when you compute the hash, you use the hexadecimal representation. This is very important for non-leaf nodes that are computing the hash from an ordered concatenation of their children's (left + right) hashes.
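If your hash routine hands back raw digest bytes, they need converting to hexadecimal text before being concatenated. A SHA-256 digest is 32 bytes, which gives the 64 hexadecimal characters used throughout this section. The helper below is a sketch with a hypothetical name; it assumes lowercase digits, and the scaffold may already provide an equivalent.

```c
#include <stddef.h>

/* Writes 2*n lowercase hex characters plus a terminating NUL to `out`,
 * so `out` must have room for 2*n + 1 bytes. */
void digest_to_hex(const unsigned char* digest, size_t n, char* out) {
    static const char digits[] = "0123456789abcdef";
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = digits[digest[i] >> 4];   /* high nibble */
        out[2 * i + 1] = digits[digest[i] & 0x0f]; /* low nibble */
    }
    out[2 * n] = '\0';
}
```

Hashing the raw bytes of the children instead of their hexadecimal text gives a different parent hash, so the representation has to match the one used to produce the .bpkg file.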
