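Building a Distributed Object Storage System from 0 to 1 - 06: Resumable Transfer

Resumable download flow

Resumable upload flow

Testing

Generate a random file and compute its hash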

Jack Ma@DESKTOP-L24D7IP MINGW64 /d/GoFiles/mock-exls
$ ls
cmd/  go.mod  go.sum  test.csv  tmpFile-2022-01-05

Jack Ma@DESKTOP-L24D7IP MINGW64 /d/GoFiles/mock-exls
$ openssl dgst -sha256 -binary tmpFile-2022-01-05 | base64
0i+qEfSXMGg8qi3ddCfTBhNsHpKrmwunxMerHiVcVos=

# tmpFile-2022-01-05 is a randomly generated file; it can also be given by its absolute path

Jack Ma@DESKTOP-L24D7IP MINGW64 /d/GoFiles/mock-exls
$ openssl dgst -sha256 -binary /d/GoFiles/mock-exls/tmpFile-2022-01-05 | base64
0i+qEfSXMGg8qi3ddCfTBhNsHpKrmwunxMerHiVcVos=

# 0i+qEfSXMGg8qi3ddCfTBhNsHpKrmwunxMerHiVcVos= is the hash of this file

Upload the file

curl -v 192.168.2.1:12346/handleObjs/testFile -XPOST -H "digest:SHA-256=0i+qEfSXMGg8qi3ddCfTBhNsHpKrmwunxMerHiVcVos=" -H "size:44124160"

[sam@hecs-37464 ~]$ curl -v 192.168.2.1:12346/handleObjs/testFile -XPOST -H "digest:SHA-256=0i+qEfSXMGg8qi3ddCfTBhNsHpKrmwunxMerHiVcVos=" -H "size:44124160"
* About to connect() to 192.168.2.1 port 12346 (#0)
*   Trying 192.168.2.1...
* Connected to 192.168.2.1 (192.168.2.1) port 12346 (#0)
> POST /handleObjs/testFile HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 192.168.2.1:12346
> Accept: */*
> digest:SHA-256=0i+qEfSXMGg8qi3ddCfTBhNsHpKrmwunxMerHiVcVos=
> size:44124160
> 
< HTTP/1.1 201 Created
< Location: /temp/eyJOYW1lIjoidGVzdEZpbGUiLCJTaXplIjo0NDEyNDE2MCwiSGFzaCI6IjBpK3FFZlNYTUdnOHFpM2RkQ2ZUQmhOc0hwS3Jtd3VueE1lckhpVmNWb3M9IiwiU2VydmVycyI6WyIxOTIuMTY4LjEuNDoxMjM0NSIsIjE5Mi4xNjguMS4yOjEyMzQ1IiwiMTkyLjE2OC4xLjU6MTIzNDUiLCIxOTIuMTY4LjEuNjoxMjM0NSIsIjE5Mi4xNjguMS4zOjEyMzQ1IiwiMTkyLjE2OC4xLjE6MTIzNDUiXSwiVXVpZHMiOlsiNDZiZTE3NzUtYzk5Yi00ZWI5LTgwOGUtOGMyZDY1MzY1NmM0IiwiYmFhMmRkZGUtYzA0Ny00NTEzLTg0NDktZGUyOWY1NTdlMDRjIiwiN2FlMWI1ZmUtNjQ2OC00MjAxLTgzNTEtMzc3OTIyNjcyNDQzIiwiNTkzZWIzNjktZmI5NS00MTJhLWI2ZTItZTNkMDFkYzQ2NmZmIiwiOTc5OGI0NDktMzU4Ny00ZDJkLWEzNjYtY2NmMGUwMDc5ZmEyIiwiNjA3YzhkMjYtZmEyNy00ZWNmLTliMGUtNGIzMDU1NGI1OGQzIl19
< Date: Thu, 06 Jan 2022 04:16:00 GMT
< Content-Length: 0
< 
* Connection #0 to host 192.168.2.1 left intact

As shown, the API service returns the token in the Location response header. We can now issue HEAD and PUT requests against this URL.

HEAD request

Check the upload progress, i.e. the number of bytes actually written for this token.

[sam@hecs-37464 ~]$ curl -I 192.168.2.1:12346/temp/eyJOYW1lIjoidGVzdEZpbGUiLCJTaXplIjo0NDEyNDE2MCwiSGFzaCI6IjBpK3FFZlNYTUdnOHFpM2RkQ2ZUQmhOc0hwS3Jtd3VueE1lckhpVmNWb3M9IiwiU2VydmVycyI6WyIxOTIuMTY4LjEuNDoxMjM0NSIsIjE5Mi4xNjguMS4yOjEyMzQ1IiwiMTkyLjE2OC4xLjU6MTIzNDUiLCIxOTIuMTY4LjEuNjoxMjM0NSIsIjE5Mi4xNjguMS4zOjEyMzQ1IiwiMTkyLjE2OC4xLjE6MTIzNDUiXSwiVXVpZHMiOlsiNDZiZTE3NzUtYzk5Yi00ZWI5LTgwOGUtOGMyZDY1MzY1NmM0IiwiYmFhMmRkZGUtYzA0Ny00NTEzLTg0NDktZGUyOWY1NTdlMDRjIiwiN2FlMWI1ZmUtNjQ2OC00MjAxLTgzNTEtMzc3OTIyNjcyNDQzIiwiNTkzZWIzNjktZmI5NS00MTJhLWI2ZTItZTNkMDFkYzQ2NmZmIiwiOTc5OGI0NDktMzU4Ny00ZDJkLWEzNjYtY2NmMGUwMDc5ZmEyIiwiNjA3YzhkMjYtZmEyNy00ZWNmLTliMGUtNGIzMDU1NGI1OGQzIl19

HTTP/1.1 200 OK
Content-Length: 0
Date: Thu, 06 Jan 2022 04:27:38 GMT

The response shows "Content-Length: 0", i.e. the current data length is 0.
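
A client can run the same check programmatically. The sketch below sends a HEAD request and reads Content-Length. It also treats 404 as "upload finished", which is how the token endpoint behaves later in this test. The names uploadProgress and fakeTokenServer are made up for illustration, and fakeTokenServer is only a stand-in for the API service so that the example runs without a cluster.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// uploadProgress (hypothetical helper) sends HEAD to the token URL. It returns
// the byte count from Content-Length, or done=true when the server answers 404.
func uploadProgress(tokenURL string) (written int64, done bool, err error) {
	resp, err := http.Head(tokenURL)
	if err != nil {
		return 0, false, err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		return resp.ContentLength, false, nil
	case http.StatusNotFound:
		return 0, true, nil
	default:
		return 0, false, fmt.Errorf("unexpected status %s", resp.Status)
	}
}

// fakeTokenServer is a stand-in for the API service, for demonstration only.
// It always reports the given number of written bytes.
func fakeTokenServer(written string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Length", written)
		w.WriteHeader(http.StatusOK)
	}))
}

func main() {
	srv := fakeTokenServer("32000")
	defer srv.Close()

	written, done, err := uploadProgress(srv.URL + "/temp/sample-token")
	fmt.Println(written, done, err)
}
```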

PUT request

First upload the first 50000 bytes of the file. The dd command is used here to simulate splitting the data into blocks:

[sam@hecs-37464 ~]$ dd if=tmpFile/tmpFile-2022-01-05 of=tmpFile/firstBlock bs=1000 count=50
50+0 records in
50+0 records out
50000 bytes (50 kB) copied, 0.000256658 s, 195 MB/s
[sam@hecs-37464 ~]$ ls tmpFile/
firstBlock  tmpFile-2022-01-05
# Upload part of the data
curl -v -XPUT --data-binary @tmpFile/firstBlock 192.168.2.1:12346/temp/eyJOYW1lIjoidGVzdEZpbGUiLCJTaXplIjo0NDEyNDE2MCwiSGFzaCI6IjBpK3FFZlNYTUdnOHFpM2RkQ2ZUQmhOc0hwS3Jtd3VueE1lckhpVmNWb3M9IiwiU2VydmVycyI6WyIxOTIuMTY4LjEuNDoxMjM0NSIsIjE5Mi4xNjguMS4yOjEyMzQ1IiwiMTkyLjE2OC4xLjU6MTIzNDUiLCIxOTIuMTY4LjEuNjoxMjM0NSIsIjE5Mi4xNjguMS4zOjEyMzQ1IiwiMTkyLjE2OC4xLjE6MTIzNDUiXSwiVXVpZHMiOlsiNDZiZTE3NzUtYzk5Yi00ZWI5LTgwOGUtOGMyZDY1MzY1NmM0IiwiYmFhMmRkZGUtYzA0Ny00NTEzLTg0NDktZGUyOWY1NTdlMDRjIiwiN2FlMWI1ZmUtNjQ2OC00MjAxLTgzNTEtMzc3OTIyNjcyNDQzIiwiNTkzZWIzNjktZmI5NS00MTJhLWI2ZTItZTNkMDFkYzQ2NmZmIiwiOTc5OGI0NDktMzU4Ny00ZDJkLWEzNjYtY2NmMGUwMDc5ZmEyIiwiNjA3YzhkMjYtZmEyNy00ZWNmLTliMGUtNGIzMDU1NGI1OGQzIl19

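The Content-Length: 50000 header in the curl output shows that the PUT request carried the 50000-byte firstBlock, but the amount actually written for the token may differ. Send another HEAD request to check the progress: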

curl -I 192.168.2.1:12346/temp/eyJOYW1lIjoidGVzdEZpbGUiLCJTaXplIjo0NDEyNDE2MCwiSGFzaCI6IjBpK3FFZlNYTUdnOHFpM2RkQ2ZUQmhOc0hwS3Jtd3VueE1lckhpVmNWb3M9IiwiU2VydmVycyI6WyIxOTIuMTY4LjEuNDoxMjM0NSIsIjE5Mi4xNjguMS4yOjEyMzQ1IiwiMTkyLjE2OC4xLjU6MTIzNDUiLCIxOTIuMTY4LjEuNjoxMjM0NSIsIjE5Mi4xNjguMS4zOjEyMzQ1IiwiMTkyLjE2OC4xLjE6MTIzNDUiXSwiVXVpZHMiOlsiNDZiZTE3NzUtYzk5Yi00ZWI5LTgwOGUtOGMyZDY1MzY1NmM0IiwiYmFhMmRkZGUtYzA0Ny00NTEzLTg0NDktZGUyOWY1NTdlMDRjIiwiN2FlMWI1ZmUtNjQ2OC00MjAxLTgzNTEtMzc3OTIyNjcyNDQzIiwiNTkzZWIzNjktZmI5NS00MTJhLWI2ZTItZTNkMDFkYzQ2NmZmIiwiOTc5OGI0NDktMzU4Ny00ZDJkLWEzNjYtY2NmMGUwMDc5ZmEyIiwiNjA3YzhkMjYtZmEyNy00ZWNmLTliMGUtNGIzMDU1NGI1OGQzIl19

As this shows, only 32000 bytes were actually written, so the next PUT request should start at byte 32000.
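
One way to read these numbers is that the server persists data only in whole 32000-byte blocks and keeps a trailing partial block only when it completes the object. That is an assumption inferred from the observed results (50000 bytes sent, 32000 stored), not something stated in this post. Under that assumption, the next offset can be predicted with the small model below; nextOffset is a made-up name.

```go
package main

import "fmt"

// nextOffset (hypothetical) models the ASSUMED server behaviour: data is
// persisted in whole blocks of blockSize bytes, and a trailing partial block
// is kept only when it completes the object (offset+sent reaches total).
func nextOffset(offset, sent, total, blockSize int64) int64 {
	if offset+sent >= total {
		return total
	}
	return offset + sent/blockSize*blockSize
}

func main() {
	const total, block = 44124160, 32000
	off := int64(0)
	// The three uploads performed in this test: 50000, 50000, then the rest.
	for _, sent := range []int64{50000, 50000, 44060160} {
		off = nextOffset(off, sent, total, block)
		fmt.Println(off) // prints 32000, then 64000, then 44124160
	}
}
```

In practice the client should not rely on such a model and should always ask the server with HEAD, as done in this test.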

# Simulate the second data block, starting at byte 32000
dd if=tmpFile/tmpFile-2022-01-05 of=tmpFile/secondBlock bs=1000 skip=32 count=50
# Upload the second data block
curl -v -XPUT --data-binary @tmpFile/secondBlock -H "range:bytes=32000-" 192.168.2.1:12346/temp/eyJOYW1lIjoidGVzdEZpbGUiLCJTaXplIjo0NDEyNDE2MCwiSGFzaCI6IjBpK3FFZlNYTUdnOHFpM2RkQ2ZUQmhOc0hwS3Jtd3VueE1lckhpVmNWb3M9IiwiU2VydmVycyI6WyIxOTIuMTY4LjEuNDoxMjM0NSIsIjE5Mi4xNjguMS4yOjEyMzQ1IiwiMTkyLjE2OC4xLjU6MTIzNDUiLCIxOTIuMTY4LjEuNjoxMjM0NSIsIjE5Mi4xNjguMS4zOjEyMzQ1IiwiMTkyLjE2OC4xLjE6MTIzNDUiXSwiVXVpZHMiOlsiNDZiZTE3NzUtYzk5Yi00ZWI5LTgwOGUtOGMyZDY1MzY1NmM0IiwiYmFhMmRkZGUtYzA0Ny00NTEzLTg0NDktZGUyOWY1NTdlMDRjIiwiN2FlMWI1ZmUtNjQ2OC00MjAxLTgzNTEtMzc3OTIyNjcyNDQzIiwiNTkzZWIzNjktZmI5NS00MTJhLWI2ZTItZTNkMDFkYzQ2NmZmIiwiOTc5OGI0NDktMzU4Ny00ZDJkLWEzNjYtY2NmMGUwMDc5ZmEyIiwiNjA3YzhkMjYtZmEyNy00ZWNmLTliMGUtNGIzMDU1NGI1OGQzIl19

# Check the write progress of the token
curl -I 192.168.2.1:12346/temp/eyJOYW1lIjoidGVzdEZpbGUiLCJTaXplIjo0NDEyNDE2MCwiSGFzaCI6IjBpK3FFZlNYTUdnOHFpM2RkQ2ZUQmhOc0hwS3Jtd3VueE1lckhpVmNWb3M9IiwiU2VydmVycyI6WyIxOTIuMTY4LjEuNDoxMjM0NSIsIjE5Mi4xNjguMS4yOjEyMzQ1IiwiMTkyLjE2OC4xLjU6MTIzNDUiLCIxOTIuMTY4LjEuNjoxMjM0NSIsIjE5Mi4xNjguMS4zOjEyMzQ1IiwiMTkyLjE2OC4xLjE6MTIzNDUiXSwiVXVpZHMiOlsiNDZiZTE3NzUtYzk5Yi00ZWI5LTgwOGUtOGMyZDY1MzY1NmM0IiwiYmFhMmRkZGUtYzA0Ny00NTEzLTg0NDktZGUyOWY1NTdlMDRjIiwiN2FlMWI1ZmUtNjQ2OC00MjAxLTgzNTEtMzc3OTIyNjcyNDQzIiwiNTkzZWIzNjktZmI5NS00MTJhLWI2ZTItZTNkMDFkYzQ2NmZmIiwiOTc5OGI0NDktMzU4Ny00ZDJkLWEzNjYtY2NmMGUwMDc5ZmEyIiwiNjA3YzhkMjYtZmEyNy00ZWNmLTliMGUtNGIzMDU1NGI1OGQzIl19

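Likewise, the 50000 bytes were not fully stored this time either: only 32000 bytes were written, for a total of 64000 bytes so far. Finally, cut the remaining data into a third block and upload it: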

[sam@hecs-37464 ~]$ dd if=tmpFile/tmpFile-2022-01-05 of=tmpFile/thirdBlock bs=1000 skip=64
44060+1 records in
44060+1 records out
44060160 bytes (44 MB) copied, 0.122806 s, 359 MB/s

# Upload the third data block
curl -v -XPUT --data-binary @tmpFile/thirdBlock -H "range:bytes=64000-" 192.168.2.1:12346/temp/eyJOYW1lIjoidGVzdEZpbGUiLCJTaXplIjo0NDEyNDE2MCwiSGFzaCI6IjBpK3FFZlNYTUdnOHFpM2RkQ2ZUQmhOc0hwS3Jtd3VueE1lckhpVmNWb3M9IiwiU2VydmVycyI6WyIxOTIuMTY4LjEuNDoxMjM0NSIsIjE5Mi4xNjguMS4yOjEyMzQ1IiwiMTkyLjE2OC4xLjU6MTIzNDUiLCIxOTIuMTY4LjEuNjoxMjM0NSIsIjE5Mi4xNjguMS4zOjEyMzQ1IiwiMTkyLjE2OC4xLjE6MTIzNDUiXSwiVXVpZHMiOlsiNDZiZTE3NzUtYzk5Yi00ZWI5LTgwOGUtOGMyZDY1MzY1NmM0IiwiYmFhMmRkZGUtYzA0Ny00NTEzLTg0NDktZGUyOWY1NTdlMDRjIiwiN2FlMWI1ZmUtNjQ2OC00MjAxLTgzNTEtMzc3OTIyNjcyNDQzIiwiNTkzZWIzNjktZmI5NS00MTJhLWI2ZTItZTNkMDFkYzQ2NmZmIiwiOTc5OGI0NDktMzU4Ny00ZDJkLWEzNjYtY2NmMGUwMDc5ZmEyIiwiNjA3YzhkMjYtZmEyNy00ZWNmLTliMGUtNGIzMDU1NGI1OGQzIl19

# Check the overall progress; expect 404 because the upload is complete
curl -I 192.168.2.1:12346/temp/eyJOYW1lIjoidGVzdEZpbGUiLCJTaXplIjo0NDEyNDE2MCwiSGFzaCI6IjBpK3FFZlNYTUdnOHFpM2RkQ2ZUQmhOc0hwS3Jtd3VueE1lckhpVmNWb3M9IiwiU2VydmVycyI6WyIxOTIuMTY4LjEuNDoxMjM0NSIsIjE5Mi4xNjguMS4yOjEyMzQ1IiwiMTkyLjE2OC4xLjU6MTIzNDUiLCIxOTIuMTY4LjEuNjoxMjM0NSIsIjE5Mi4xNjguMS4zOjEyMzQ1IiwiMTkyLjE2OC4xLjE6MTIzNDUiXSwiVXVpZHMiOlsiNDZiZTE3NzUtYzk5Yi00ZWI5LTgwOGUtOGMyZDY1MzY1NmM0IiwiYmFhMmRkZGUtYzA0Ny00NTEzLTg0NDktZGUyOWY1NTdlMDRjIiwiN2FlMWI1ZmUtNjQ2OC00MjAxLTgzNTEtMzc3OTIyNjcyNDQzIiwiNTkzZWIzNjktZmI5NS00MTJhLWI2ZTItZTNkMDFkYzQ2NmZmIiwiOTc5OGI0NDktMzU4Ny00ZDJkLWEzNjYtY2NmMGUwMDc5ZmEyIiwiNjA3YzhkMjYtZmEyNy00ZWNmLTliMGUtNGIzMDU1NGI1OGQzIl19

# Check whether the data nodes have finished the upload
ls files/?/objects

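As shown, once the third block has been uploaded, the token endpoint responds with 404. This means the upload is complete and there is no progress left to report. Next, check whether the data nodes have finished storing the data (the ls command above).

At this point, every data service node has completed the upload.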
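
The PUT requests used in this test all follow one pattern: send the bytes starting at the last confirmed offset and announce that offset in a range header. The sketch below builds such a request in Go. The function name newChunkPut and the sample-token path are made up for illustration, and the request is only constructed, not sent.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newChunkPut (hypothetical helper) builds a PUT request that sends
// data[offset:offset+n], clamped to the end of data, to tokenURL. The start
// position goes into a "range: bytes=<offset>-" header, as in the curl commands.
func newChunkPut(tokenURL string, data []byte, offset, n int64) (*http.Request, error) {
	if offset < 0 || offset > int64(len(data)) {
		return nil, fmt.Errorf("offset %d out of range", offset)
	}
	end := offset + n
	if end > int64(len(data)) {
		end = int64(len(data))
	}
	req, err := http.NewRequest(http.MethodPut, tokenURL, bytes.NewReader(data[offset:end]))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset))
	return req, nil
}

func main() {
	data := make([]byte, 100000) // stand-in for the file contents
	req, err := newChunkPut("http://192.168.2.1:12346/temp/sample-token", data, 32000, 50000)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.Header.Get("Range"), req.ContentLength)
}
```

A complete client would loop: HEAD to learn the offset, PUT the next chunk from that offset, and stop once HEAD returns 404.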

GET request

Use a GET request to download the object and compare it with the original file:

[sam@hecs-37464 ~]$ curl 192.168.2.1:12346/handleObjs/testFile > tmpFile/resOutput
[sam@hecs-37464 ~]$ ls tmpFile/
firstBlock  resOutput  secondBlock  thirdBlock  tmpFile-2022-01-05
[sam@hecs-37464 ~]$ openssl dgst -sha256 -binary tmpFile/tmpFile-2022-01-05 | base64
0i+qEfSXMGg8qi3ddCfTBhNsHpKrmwunxMerHiVcVos=
[sam@hecs-37464 ~]$ openssl dgst -sha256 -binary tmpFile/resOutput | base64
47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
[sam@hecs-37464 ~]$ diff -s tmpFile/resOutput tmpFile/tmpFile-2022-01-05
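Files tmpFile/resOutput and tmpFile/tmpFile-2022-01-05 are identical

Note that the digest printed for resOutput above, 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=, is the SHA-256 of empty input, which does not agree with the diff result. For a byte-identical download, that command should print the same digest as the original file, 0i+qEfSXMGg8qi3ddCfTBhNsHpKrmwunxMerHiVcVos=.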

Compare the third data block:

[sam@hecs-37464 ~]$ curl 192.168.2.1:12346/handleObjs/testFile -H "range:bytes=64000-" > tmpFile/resOutput3
[sam@hecs-37464 ~]$ ls tmpFile/
firstBlock  resOutput  resOutput3  secondBlock  thirdBlock  tmpFile-2022-01-05
[sam@hecs-37464 ~]$ diff -s tmpFile/resOutput3 tmpFile/thirdBlock
Files tmpFile/resOutput3 and tmpFile/thirdBlock are identical
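
The same ranged download can be done from Go. The sketch below requests everything from a given offset and compares it with the matching tail of the local data. The names fetchFrom and demo are made up, and the server is an in-memory stand-in built on http.ServeContent, not the API service from this series.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fetchFrom (hypothetical helper) downloads the object starting at offset by
// sending "Range: bytes=<offset>-".
func fetchFrom(url string, offset int64) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Range", fmt.Sprintf("bytes=%d-", offset))
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

// demo serves a small in-memory object from a stand-in server and reports
// whether the downloaded tail equals the local slice.
func demo() bool {
	object := []byte("0123456789abcdef")
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		// ServeContent honours Range requests out of the box.
		http.ServeContent(w, r, "object", time.Time{}, bytes.NewReader(object))
	}))
	defer srv.Close()

	tail, err := fetchFrom(srv.URL, 10)
	return err == nil && bytes.Equal(tail, object[10:])
}

func main() {
	fmt.Println(demo())
}
```

A resumable downloader would append the returned bytes to the partial file and retry from the new length after each interruption.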

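Why not verify the second data block?

Because what we implement is resumable download. Whenever the network is interrupted, the position reached is recorded as offset, and when the network recovers the transfer continues from offset. The request header is therefore always -H "range:bytes=<offset>-". A bounded range such as -H "range:bytes=64000-96000" is not handled, and the code explicitly takes only the first value: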

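import (
	"net/http"
	"strconv"
	"strings"
)

// GetOffsetFromHeader returns the start offset from a range header of the
// form "bytes=<offset>-". It returns 0 when the header is absent or does not
// start with "bytes=".
func GetOffsetFromHeader(h http.Header) int64 {
	byteRange := h.Get("range")
	if len(byteRange) < 7 {
		return 0
	}
	if byteRange[:6] != "bytes=" {
		return 0
	}
	bytePos := strings.Split(byteRange[6:], "-")
	// Only the first value is used: a resumed transfer always continues from
	// the current offset, so anything after "-" is ignored.
	// Parse as decimal; base 0 would also accept prefixes such as "0x".
	offset, _ := strconv.ParseInt(bytePos[0], 10, 64)
	return offset
}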

At this point, both resumable upload and resumable download are complete.
