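Abstract: The AWS CLI offers a basic yet flexible way to fetch data from S3 (Amazon Simple Storage Service), but advanced retrieval such as resumable downloads has to be implemented by the user. Basic retrieval can be done with CLI commands; advanced implementations rely on the language-specific APIs, such as Java or C#.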
1 Requesting data with the AWS CLI: s3api get-object
cmd: aws s3api get-object
https://docs.aws.amazon.com/cli/latest/reference/s3api/get-object.html
The example below demonstrates the use of --range to download a specific byte range from an object. Note that the byte range needs to be prefixed with "bytes=":
aws s3api get-object --bucket text-content --key dir/my_data --range bytes=8888-9999 my_data_range
Synopsis
get-object
--bucket <value>
[--if-match <value>]
[--if-modified-since <value>]
[--if-none-match <value>]
[--if-unmodified-since <value>]
--key <value>
[--range <value>]
[--response-cache-control <value>]
[--response-content-disposition <value>]
[--response-content-encoding <value>]
[--response-content-language <value>]
[--response-content-type <value>]
[--response-expires <value>]
[--version-id <value>]
[--sse-customer-algorithm <value>]
[--sse-customer-key <value>]
[--sse-customer-key-md5 <value>]
[--request-payer <value>]
[--part-number <value>]
outfile <value>
Description (dou):
Basic cmd:
aws s3api get-object --bucket text-content --key dir/my_data my_data_range
--bucket (string): the data bucket, i.e. the owner-defined data pool
--key (string): full path (key) of the requested data in the bucket
outfile: name of the local output file, user-defined
Partial download:
Method 1:
aws s3api get-object --bucket text-content --key dir/my_data --range bytes=8888-9999 my_data_range
--range (string): Downloads the specified byte range of an object.
Method 2:
aws s3api get-object --bucket text-content --key dir/my_data --part-number 1 my_data_range
--part-number (integer) Part number of the object being read. This is a positive integer between 1 and 10,000. Effectively performs a 'ranged' GET request for the part specified. Useful for downloading just a part of an object.
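The --range option above can be turned into a chunked download that picks up where it stopped. The following is a minimal sketch for a POSIX shell, not a definitive implementation: `byte_range` and `download_in_ranges` are hypothetical helper names (they are not AWS CLI features), and the `.chunkN` file naming and the fixed chunk size are assumptions made for illustration.

```shell
#!/bin/sh
# Print the --range value for chunk number $1 (0-based),
# given chunk size $2 and total object size $3, all in bytes.
byte_range() {
    start=$(( $1 * $2 ))
    end=$(( start + $2 - 1 ))
    # The last chunk is usually shorter: clamp to the final byte.
    if [ "$end" -ge "$3" ]; then
        end=$(( $3 - 1 ))
    fi
    echo "bytes=${start}-${end}"
}

# Download bucket $1, key $2 into file $3 in chunks of $4 bytes.
# Hypothetical helper; it is only defined here, because calling it
# needs AWS credentials and network access.
download_in_ranges() {
    size=$(aws s3api head-object --bucket "$1" --key "$2" \
        --query ContentLength --output text) || return 1
    count=$(( (size + $4 - 1) / $4 ))
    i=0
    while [ "$i" -lt "$count" ]; do
        # Chunks already on disk are skipped, so a rerun resumes.
        if [ ! -s "$3.chunk$i" ]; then
            # Write to a temporary name first, so that an interrupted
            # transfer never looks like a finished chunk.
            aws s3api get-object --bucket "$1" --key "$2" \
                --range "$(byte_range "$i" "$4" "$size")" \
                "$3.chunk$i.tmp" > /dev/null || return 1
            mv "$3.chunk$i.tmp" "$3.chunk$i"
        fi
        i=$(( i + 1 ))
    done
    # Concatenate the chunks in numeric order into the final file.
    : > "$3"
    i=0
    while [ "$i" -lt "$count" ]; do
        cat "$3.chunk$i" >> "$3" || return 1
        i=$(( i + 1 ))
    done
}
```

With the example names used above, a call would look like `download_in_ranges text-content dir/my_data my_data 8388608`; if it fails partway, running the same call again only fetches the chunks that are still missing.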
For the full AWS CLI command reference, see:
https://docs.aws.amazon.com/cli/latest/reference/
2 Requesting data with the SDK (e.g. C#)
2.1 Getting Started with the AWS SDK for .NET
https://docs.aws.amazon.com/sdk-for-net/v3/developer-guide/net-dg-setup.html
- Create an AWS Account and Credentials
- Create a profile and save it to the .NET credentials file
var options = new CredentialProfileOptions
{
    AccessKey = "access_key",
    SecretKey = "secret_key"
};
var profile = new Amazon.Runtime.CredentialManagement.CredentialProfile("basic_profile", options);
profile.Region = RegionEndpoint.USWest1;
var netSDKFile = new NetSDKCredentialsFile();
netSDKFile.RegisterProfile(profile);
The RegisterProfile method is used to register a new profile. Your application typically calls this method only once for each profile.
- Install the .NET Development Environment
- Microsoft .NET Framework 3.5 or later
- Microsoft Visual Studio 2010 or later
- Install AWSSDK Assemblies
- Go to AWS SDK for .NET (the linked download is for VS2013+; VS2010-2012 uses a separate download).
- In the Downloads section, choose Download MSI Installer to download the installer.
- To start installation, run the downloaded installer and follow the on-screen instructions.
- Start a New Project
- Create a new project from Template
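(Pitfall 1: create the project in VS from the AWS template, rather than creating an ordinary project and adding the DLLs by hand.)
(Pitfall 2: if the new project fails to compile because the namespace Amazon cannot be found, check the project's .NET version and manually add the DLLs for that version from the AWS SDK install directory, usually under Program Files (x86).)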
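2.2 Code for resumable requests
The main concern here is a large file on an unstable network: a download that drops after running for a while is painful. The idea is to download the object in parts, so that the download can be continued from the part where it stopped.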
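The resume idea can be sketched independently of the SDK. The block below is a minimal shell sketch under two assumptions: each part is saved as `<name>.<part number>` (the naming used by the commands in this post), and a part file only exists on disk once it has been downloaded completely. `first_missing_part` and `merge_parts` are hypothetical helper names introduced for illustration.

```shell
#!/bin/sh
# Print the first part number N (starting at 1) for which the file
# "$1.N" is missing or empty, i.e. the part to resume from.
first_missing_part() {
    n=1
    while [ -s "$1.$n" ]; do
        n=$(( n + 1 ))
    done
    echo "$n"
}

# Concatenate "$1.1" through "$1.$2" into "$1".
# Counting explicitly keeps the numeric order; a plain glob such as
# "$1".* would sort part 10 before part 2.
merge_parts() {
    n=1
    : > "$1"
    while [ "$n" -le "$2" ]; do
        cat "$1.$n" >> "$1" || return 1
        n=$(( n + 1 ))
    done
}
```

After an interruption, `first_missing_part AOI_2_Vegas_Roads_Train.tar.gz` gives the part number to restart from, and once every part is present `merge_parts` rebuilds the original file from the parts.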
(1) VS 2013 (my setup). Create a new project from the AWS S3 sample template.
(2) Configure profile.
Press Ctrl+K, and then press A.
Choose the New (or Edit) Account Profile icon to the right of the Profile list.
https://docs.aws.amazon.com/toolkit-for-visual-studio/latest/user-guide/credentials.html
(3) Set region
The region refers to the location of the bucket. The region must be correct; otherwise the request fails with the error “The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.” (Pitfall 3: you must specify the correct region, i.e. the endpoint.)
Endpoints currently do not support cross-region requests, so make sure that you create your endpoint in the same region as your bucket. You can find the location of your bucket by using the Amazon S3 console, or by using the get-bucket-location command.
Detail: https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-endpoints-s3.html
Synopsis
get-bucket-location
--bucket <value>
[--cli-input-json <value>]
[--generate-cli-skeleton <value>]
https://docs.aws.amazon.com/cli/latest/reference/s3api/get-bucket-location.html
Example:
The following command retrieves the location constraint for a bucket named my-bucket, if a constraint exists:
aws s3api get-bucket-location --bucket my-bucket
Output:
{
"LocationConstraint":"us-west-2"
}
In my case, I get null. (Pitfall 4: for us-east-1, i.e. US East (N. Virginia), the returned region is null.)
aws s3api get-bucket-location --bucket spacenet-dataset
Output:
{
"LocationConstraint":"null"
}
According to the service documentation, S3 returns a null location if the bucket is in the US East (N. Virginia) region, so this is expected behavior. To use such a bucket, construct the client with the RegionEndpoint.USEast1 region.
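On the command line, the same special case can be handled with a small mapping step. This is a minimal sketch: `normalize_region` is a hypothetical helper name, and it assumes the value comes from get-bucket-location with `--query LocationConstraint --output text`, where a null constraint is printed as `None`.

```shell
#!/bin/sh
# Map the LocationConstraint reported by get-bucket-location to a
# region name. A null or empty constraint means US East (N. Virginia).
normalize_region() {
    case "$1" in
        ""|null|None) echo "us-east-1" ;;
        *) echo "$1" ;;
    esac
}

# Usage (left commented out: it needs credentials and network access):
# loc=$(aws s3api get-bucket-location --bucket spacenet-dataset \
#     --query LocationConstraint --output text)
# region=$(normalize_region "$loc")
```

The resulting region name can then be passed to later commands through the CLI's global `--region` option.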
(Pitfall 5: setting the region through the selection in VS has no effect; change the region by editing App.config instead.)
App.config
<add key="AWSRegion" value="us-east-1" />
Other ways to select the AWS region (endpoint):
https://docs.aws.amazon.com/sdk-for-net/v3/developer-guide/net-dg-region-selection.html
Other: China (Beijing) region endpoint: cn-north-1
(4) Modify code
My cmd: aws s3api get-object --bucket spacenet-dataset --key SpaceNet_Roads_Competition/AOI_2_Vegas_Roads_Train.tar.gz --request-payer requester --part-number 1 AOI_2_Vegas_Roads_Train.tar.gz.1
Code reference:
https://docs.aws.amazon.com/AmazonS3/latest/dev/AuthUsingAcctOrUserCredDotNet.html
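// In Main()
bucketName = "spacenet-dataset";
keyName = "SpaceNet_Roads_Competition/AOI_3_Paris_Roads_Test_Public.tar.gz";
outPath = "E:\\data\\";
RP = RequestPayer.Requester; // requester-pays bucket: the caller is billed for the transfer
// Loop over the part numbers (valid values are 1 to 10,000); starting at 17 resumes from that part.
// Note: a permanent error, such as a part number past the last part, keeps the inner loop
// retrying forever, so stop the program manually once all parts are saved.
for (PartNum = 17; PartNum < 10001; PartNum++)
{
    bool flag = false;
    // Retry the same part until it downloads successfully.
    do
    {
        flag = ReadingAnObject();
    }
    while (flag == false);
}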
// Updated ReadingAnObject: downloads one part and returns true on success
static bool ReadingAnObject()
{
    bool flag = false;
    try
    {
        GetObjectRequest request = new GetObjectRequest()
        {
            BucketName = bucketName,
            Key = keyName,
            RequestPayer = RP,
            PartNumber = PartNum
        };
        using (GetObjectResponse response = client.GetObject(request))
        {
            string title = response.Metadata["x-amz-meta-title"];
            Console.WriteLine("The object's title is {0}", title);
            // string dest = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), keyName);
            // Save each part as <key>.<part number> under outPath.
            string dest = Path.Combine(outPath, keyName) + "." + PartNum.ToString();
            // if (!File.Exists(dest))
            {
                response.WriteResponseStreamToFile(dest);
            }
        }
        flag = true;
    }
    catch (AmazonS3Exception amazonS3Exception)
    {
        if (amazonS3Exception.ErrorCode != null &&
            (amazonS3Exception.ErrorCode.Equals("InvalidAccessKeyId") ||
             amazonS3Exception.ErrorCode.Equals("InvalidSecurity")))
        {
            Console.WriteLine("Please check the provided AWS Credentials.");
            Console.WriteLine("If you haven't signed up for Amazon S3, please visit http://aws.amazon.com/s3");
        }
        else
        {
            Console.WriteLine("An error occurred with the message '{0}' when reading an object", amazonS3Exception.Message);
        }
    }
    return flag;
}
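The do/while loop above retries a failed part without any limit. For the CLI route, a bounded variant might look like the sketch below. This is an illustration in POSIX shell, not the SDK's own retry mechanism; `retry_cmd` is a hypothetical helper name, and the attempt limit and delay are arbitrary choices.

```shell
#!/bin/sh
# Run a command until it succeeds, at most $1 times, sleeping $2
# seconds between attempts. The remaining arguments are the command.
retry_cmd() {
    _retry_max=$1
    _retry_delay=$2
    shift 2
    _retry_attempt=1
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$_retry_attempt" -ge "$_retry_max" ]; then
            echo "giving up after $_retry_attempt attempts: $*" >&2
            return 1
        fi
        _retry_attempt=$(( _retry_attempt + 1 ))
        sleep "$_retry_delay"
    done
}

# Usage (left commented out: it needs credentials and network access):
# retry_cmd 5 10 aws s3api get-object --bucket spacenet-dataset \
#     --key SpaceNet_Roads_Competition/AOI_2_Vegas_Roads_Train.tar.gz \
#     --request-payer requester --part-number 1 \
#     AOI_2_Vegas_Roads_Train.tar.gz.1
```

Capping the attempts means a permanent failure, such as wrong credentials or a part number that does not exist, ends with an error message instead of looping indefinitely.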