RGW Multipart Upload/Download Analysis

Upload

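1. The client sends an HTTP POST request to obtain an upload ID.
2. After receiving the upload ID, the client uploads the parts; the URL of each request must carry the upload ID and the part number (partNumber).
3. Once all parts have been uploaded, the client sends another HTTP POST request to mark the upload as complete.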
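
The three steps above map onto three request shapes. The following is a minimal sketch, not s3cmd's implementation: the function names are hypothetical, and only the method and URI are built (signing and sending are left out). The URI formats follow the format_uri() lines in the debug output shown later in this post.

```python
from urllib.parse import quote


def initiate_request(bucket, key):
    """Step 1: POST ?uploads asks the server for an upload ID."""
    return ("POST", f"/{bucket}/{quote(key)}?uploads")


def upload_part_request(bucket, key, upload_id, part_number):
    """Step 2: PUT one part; the URI carries partNumber and uploadId."""
    uid = quote(upload_id, safe="~")  # keep '~' literal, as in the log
    return ("PUT", f"/{bucket}/{quote(key)}?partNumber={part_number}&uploadId={uid}")


def complete_request(bucket, key, upload_id):
    """Step 3: POST with only uploadId marks the upload as complete."""
    uid = quote(upload_id, safe="~")
    return ("POST", f"/{bucket}/{quote(key)}?uploadId={uid}")


if __name__ == "__main__":
    uid = "2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI"
    print(initiate_request("mpbucket01", "123.dat"))
    print(upload_part_request("mpbucket01", "123.dat", uid, 11))
    print(complete_request("mpbucket01", "123.dat", uid))
```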

1. Initiating the upload

[root@sns-nvceph01 ~]# s3cmd put 111.txt s3://mpbucket01/123.dat -d
.
.
.
DEBUG: String 'root' encoded to 'root'
DEBUG: String 'root' encoded to 'root'
DEBUG: attr_header: {'x-amz-meta-s3cmd-attrs': u'atime:1553238345/ctime:1553241376/gid:0/gname:root/md5:43643d513924af57c46c65f60af8b817/mode:33188/mtime:1553241376/uid:0/uname:root'}
.
.
.
DEBUG: format_uri(): /mpbucket01/123.dat?uploads
DEBUG: Sending request method_string='POST', uri=u'/mpbucket01/123.dat?uploads', headers={'x-amz-meta-s3cmd-attrs': u'atime:1553238345/ctime:1553241376/gid:0/gname:root/md5:43643d513924af57c46c65f60af8b817/mode:33188/mtime:1553241376/uid:0/uname:root', 'content-type': 'text/plain', 'Authorization': u'AWS P13CYRER52386RFC7JYI:OAo9lLJIcbwmAOa2nFtHFLtUqTM=', 'x-amz-date': 'Fri, 26 Apr 2019 09:39:41 +0000', 'x-amz-storage-class': 'STANDARD'}, body=(0 bytes)
DEBUG: ConnMan.put(): connection put back to pool (http://10.4.44.88:7480#1)
DEBUG: Response:
{'data': '<?xml version="1.0" encoding="UTF-8"?><InitiateMultipartUploadResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Bucket>mpbucket01</Bucket><Key>123.dat</Key><UploadId>2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI</UploadId></InitiateMultipartUploadResult>',
 'headers': {'content-length': '248',
             'content-type': 'application/xml',
             'date': 'Fri, 26 Apr 2019 09:39:41 GMT',
             'x-amz-request-id': 'tx00000000000000544ca39-005cc2d1dd-11c8-default'},
 'reason': 'OK',
 'status': 200}
 .
 .
 .

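The listing above is an excerpt of the s3cmd debug output for a multipart upload. The client first issues a POST request to obtain an upload ID (2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI).
When uploading, s3cmd also adds a custom md5 field (inside the x-amz-meta-s3cmd-attrs header). It is used to verify integrity after download, because for an object uploaded in parts the ETag is no longer the MD5 of the whole file.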
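
As an illustration of the two values discussed above, here is a small sketch that pulls the UploadId out of the InitiateMultipartUploadResult XML and the md5 out of the x-amz-meta-s3cmd-attrs header. The helper names are hypothetical, and the attribute format (key:value pairs separated by "/") is inferred from the log rather than from s3cmd's source.

```python
import xml.etree.ElementTree as ET

# Namespace used by the response in the log above.
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}


def parse_upload_id(xml_text):
    """Return the text of <UploadId> from an initiate-upload response."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    return root.findtext("s3:UploadId", namespaces=S3_NS)


def parse_s3cmd_attrs(header_value):
    """Split 'k1:v1/k2:v2/...' into a dict (format assumed from the log)."""
    return dict(item.split(":", 1) for item in header_value.split("/"))


if __name__ == "__main__":
    xml_text = (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<InitiateMultipartUploadResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
        "<Bucket>mpbucket01</Bucket><Key>123.dat</Key>"
        "<UploadId>2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI</UploadId>"
        "</InitiateMultipartUploadResult>"
    )
    attrs = "gid:0/gname:root/md5:43643d513924af57c46c65f60af8b817/mode:33188"
    print(parse_upload_id(xml_text))
    print(parse_s3cmd_attrs(attrs)["md5"])
```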

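2. Uploading the parts

The client uploads each part with an HTTP PUT request. Parts can be uploaded concurrently and can even be sent to different RGW instances; the client itself is responsible for keeping track of the part numbers and for issuing the final complete operation.
The MD5 of each part must be recorded, because the complete request has to include the MD5 of every part.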
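
A minimal sketch of the bookkeeping described above: split a file into fixed-size parts and record the MD5 of each one under its part number. The function name is hypothetical; the part size is a parameter (the log below shows parts of 15728640 bytes). In a real client the recorded value would normally be compared with the etag returned for each PUT, as s3cmd does in its "MD5 sums: computed=..., received=..." line.

```python
import hashlib
import io


def iter_part_md5s(fileobj, part_size):
    """Yield (part_number, md5_hex) for consecutive parts; numbering starts at 1."""
    part_number = 1
    while True:
        chunk = fileobj.read(part_size)
        if not chunk:
            break
        yield part_number, hashlib.md5(chunk).hexdigest()
        part_number += 1


if __name__ == "__main__":
    # Tiny in-memory stand-in for a large file: 10 bytes in 4-byte parts.
    data = io.BytesIO(b"a" * 10)
    for number, digest in iter_part_md5s(data, part_size=4):
        print(number, digest)
```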

.
.
.
DEBUG: DeUnicodising u'111.txt' using UTF-8
DEBUG: Using signature v2
DEBUG: SignHeaders: u'PUT\n\n\n\nx-amz-date:Fri, 26 Apr 2019 09:39:43 +0000\n/mpbucket01/123.dat?partNumber=11&uploadId=2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI'
DEBUG: get_hostname(mpbucket01): 10.4.44.88:7480
DEBUG: ConnMan.get(): re-using connection: http://10.4.44.88:7480#11
DEBUG: format_uri(): /mpbucket01/123.dat?partNumber=11&uploadId=2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI
    65536 of 15728640     0% in    0s  1240.19 kB/s
DEBUG: ConnMan.put(): connection put back to pool (http://10.4.44.88:7480#12)
DEBUG: Response:
{'data': '',
 'headers': {'accept-ranges': 'bytes',
             'content-length': '0',
             'date': 'Fri, 26 Apr 2019 09:39:43 GMT',
             'etag': '"e88436e684d037a8704f5bafd14a0a0f"',
             'x-amz-request-id': 'tx00000000000000544ca59-005cc2d1df-11c8-default'},
 'reason': 'OK',
 'size': 15728640,
 'status': 200}
 15728640 of 15728640   100% in    0s    65.87 MB/s  done
DEBUG: MD5 sums: computed=e88436e684d037a8704f5bafd14a0a0f, received=e88436e684d037a8704f5bafd14a0a0f
.
.
.

This example shows the upload of part number 11.

3. Completing the multipart upload

After all parts have been uploaded, the client issues an HTTP POST request to complete the multipart upload.

.
.
.
DEBUG: MultiPart: Upload finished: 14 parts
DEBUG: MultiPart: Completing upload: 2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI
DEBUG: CreateRequest: resource[uri]=/123.dat
DEBUG: Using signature v2
DEBUG: SignHeaders: u'POST\n\n\n\nx-amz-date:Fri, 26 Apr 2019 09:39:44 +0000\n/mpbucket01/123.dat?uploadId=2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI'
DEBUG: Processing request, please wait...
DEBUG: get_hostname(mpbucket01): 10.4.44.88:7480
DEBUG: ConnMan.get(): re-using connection: http://10.4.44.88:7480#15
DEBUG: format_uri(): /mpbucket01/123.dat?uploadId=2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI
DEBUG: Sending request method_string='POST', uri=u'/mpbucket01/123.dat?uploadId=2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI', headers={'content-length': '1232', 'Authorization': u'AWS P13CYRER52386RFC7JYI:n1az/AZ5AjoxbQPUm2GrxMDq4sY=', 'x-amz-date': 'Fri, 26 Apr 2019 09:39:44 +0000'}, body=(1232 bytes)
DEBUG: ConnMan.put(): connection put back to pool (http://10.4.44.88:7480#16)
DEBUG: Response:
{'data': '<?xml version="1.0" encoding="UTF-8"?><CompleteMultipartUploadResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Location>http://10.4.44.88:7480/mpbucket01/123.dat</Location><Bucket>mpbucket01</Bucket><Key>123.dat</Key><ETag>b29c2c878e4e07876f1fb3b8c6ff8556-14</ETag></CompleteMultipartUploadResult>',
 'headers': {'content-length': '304',
             'content-type': 'application/xml',
             'date': 'Fri, 26 Apr 2019 09:39:44 GMT',
             'x-amz-request-id': 'tx00000000000000544ca5f-005cc2d1e0-11c8-default'},
 'reason': 'OK',
 'status': 200}
 .
 .
 .

The request body contains the MD5 of every part; the server uses these values to check that the previously uploaded parts are intact.
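
The log does not print the 1232-byte body itself. As a sketch, assuming the standard S3 CompleteMultipartUpload XML layout (a list of Part elements, each with a PartNumber and an ETag), the body could be assembled like this. The function name is hypothetical, and the exact bytes s3cmd sends may differ.

```python
import xml.etree.ElementTree as ET


def build_complete_body(parts):
    """Build the XML body from (part_number, etag) pairs, in ascending part order."""
    root = ET.Element("CompleteMultipartUpload")
    for number, etag in sorted(parts):
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(number)
        ET.SubElement(part, "ETag").text = etag
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Part 11 uses the etag from the log above; part 1's value is a placeholder.
    body = build_complete_body([
        (11, "e88436e684d037a8704f5bafd14a0a0f"),
        (1, "00000000000000000000000000000000"),
    ])
    print(body)
```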

Download

Again using s3cmd, download the file that was just uploaded:

[root@sns-nvceph01 ~]# s3cmd get  s3://mpbucket01/123.dat 123.dat -d
DEBUG: s3cmd version 2.0.2
DEBUG: ConfigParser: Reading file '/root/.s3cfg'
DEBUG: ConfigParser: access_key->P1...17_chars...I
DEBUG: ConfigParser: cloudfront_host->http://10.4.44.88:7480
DEBUG: ConfigParser: host_base->http://10.4.44.88:7480
.
.
.
DEBUG: SignHeaders: u'GET\n\n\n\nx-amz-date:Fri, 26 Apr 2019 10:59:43 +0000\n/mpbucket01/123.dat'
download: 's3://mpbucket01/123.dat' -> '123.dat'  [1 of 1]
DEBUG: get_hostname(mpbucket01): 10.4.44.88:7480
DEBUG: ConnMan.get(): creating new connection: http://10.4.44.88:7480
DEBUG: non-proxied HTTPConnection(10.4.44.88, 7480)
DEBUG: format_uri(): /mpbucket01/123.dat
DEBUG: Response:
{'headers': {'accept-ranges': 'bytes',
             'content-length': '220194451',
             'content-type': 'text/plain',
             'date': 'Fri, 26 Apr 2019 10:59:43 GMT',
             'etag': '"b29c2c878e4e07876f1fb3b8c6ff8556-14"',
             'last-modified': 'Fri, 26 Apr 2019 09:39:44 GMT',
             'x-amz-meta-s3cmd-attrs': 'atime:1553238345/ctime:1553241376/gid:0/gname:root/md5:43643d513924af57c46c65f60af8b817/mode:33188/mtime:1553241376/uid:0/uname:root',
             'x-amz-request-id': 'tx000000000000005453a33-005cc2e49f-11c8-default'},
 'reason': 'OK',
 's3cmd-attrs': {'atime': '1553238345',
                 'ctime': '1553241376',
                 'gid': '0',
                 'gname': 'root',
                 'md5': '43643d513924af57c46c65f60af8b817',
                 'mode': '33188',
                 'mtime': '1553241376',
                 'uid': '0',
                 'uname': 'root'},
 'status': 200}
     65536 of 220194451     0% in    0s  1570.98 kB/s
DEBUG: ConnMan.put(): connection put back to pool (http://10.4.44.88:7480#1)
 220194451 of 220194451   100% in    0s   250.68 MB/s  done
DEBUG: ReceiveFile: Computed MD5 = 43643d513924af57c46c65f60af8b817
DEBUG: DeUnicodising u'123.dat' using UTF-8
DEBUG: set mtime to 1556242784.0

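As the response shows, the etag returned on download is not the MD5 of the file; it is an MD5 computed from the MD5s of the individual parts, and the trailing -14 means that 14 part MD5s went into it. For how the ETag is calculated, see here.
Integrity can instead be checked by comparing the MD5 computed after download with the md5 that was stored in the user-defined metadata field at upload time.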
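
The ETag calculation mentioned above can be sketched as follows. This assumes the scheme commonly used by S3-compatible stores (MD5 over the concatenated binary MD5 digests of the parts, followed by "-" and the part count); it has not been verified here against the RGW version used in this post, and the function name is hypothetical.

```python
import hashlib


def multipart_etag(part_md5_hex):
    """Combine per-part MD5 hex digests into an 'md5-of-md5s' ETag with a -N suffix."""
    # Concatenate the raw 16-byte digests, not their hex text.
    digests = b"".join(bytes.fromhex(h) for h in part_md5_hex)
    return f"{hashlib.md5(digests).hexdigest()}-{len(part_md5_hex)}"


if __name__ == "__main__":
    # Two tiny stand-in parts instead of the 14 parts from the log.
    parts = [hashlib.md5(b"part one").hexdigest(),
             hashlib.md5(b"part two").hexdigest()]
    print(multipart_etag(parts))
```

Given the 14 recorded part MD5s from the upload, the result could be compared with the etag in the download response; the whole-file MD5 still has to come from the s3cmd-attrs metadata.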

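Upload failures

If an upload fails or is interrupted, some temporary objects are left behind in the pool and have to be deleted manually; see here for reference.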
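
One way to approach the cleanup from the client side is the standard S3 pair of operations: list the in-progress multipart uploads of a bucket, then abort each one by upload ID. The sketch below only builds the method and URI for those two requests; the function names are hypothetical, and whether an abort removes every leftover object in the pool on a given RGW version is an assumption that the source does not confirm.

```python
from urllib.parse import quote


def list_uploads_request(bucket):
    """GET ?uploads lists multipart uploads that were started but not completed."""
    return ("GET", f"/{bucket}?uploads")


def abort_request(bucket, key, upload_id):
    """DELETE with uploadId aborts one unfinished multipart upload."""
    uid = quote(upload_id, safe="~")  # keep '~' literal, as in the logs
    return ("DELETE", f"/{bucket}/{quote(key)}?uploadId={uid}")


if __name__ == "__main__":
    print(list_uploads_request("mpbucket01"))
    print(abort_request("mpbucket01", "123.dat", "2~TMUmSED2HdmQQMZhtkd_YwxiXGjedxI"))
```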
