Amazon S3 Tutorial – The Ultimate Guide (PDF Download)

Latest recommended article published on 2024-04-02 11:09:44

danpob13624

Tags: python, java, big data, distributed systems, linux

Original article: https://www.javacodegeeks.com/2017/03/amazon-s3-tutorial.html

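This article is a detailed tutorial on Amazon S3, covering an introduction, the data model, usage, storage classes, encryption, versioning, the command-line tools, and pricing. Amazon S3 is a web service that provides scalable cloud storage for use cases such as data backup, website hosting, and content distribution. The article shows how to create and manage buckets and objects and how to use lifecycle management and different storage classes to reduce cost. It also covers authentication, uploading and downloading objects, multipart uploads, and delete operations.

Summary generated automatically by CSDN.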

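Editor's note: Amazon S3 (Simple Storage Service) is a web service offered by Amazon Web Services. Amazon S3 provides storage through web service interfaces (REST, SOAP, and BitTorrent).

Amazon does not make the details of S3's design public, though it clearly manages data with an object storage architecture. According to Amazon, S3's design aims to provide scalability, high availability, and low latency at commodity costs.

S3 is designed to provide 99.999999999% durability and 99.99% availability of objects over a given year, though there is no service-level agreement for durability. (Source: Wikipedia)

Now, we provide a comprehensive guide so that you can develop your own Amazon S3 based applications. We cover a wide range of topics, from setup and configuration to API usage and pricing. With this guide, you will be able to get your own projects up and running in minimum time. Enjoy!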

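Introduction

As Amazon's web shop grew over the years, the need for a scalable IT infrastructure became more and more urgent. This led Amazon to build its own service-based infrastructure. In its search for new business models, Amazon was one of the first pioneers to offer the services it had built for its own business to customers as well.

Amazon S3 is therefore a storage service designed to store "any amount of data at any time anywhere on the web". It offers a simple-to-use web service interface that application developers can use for different use cases. Projects can store their web pages as well as the artifacts their application needs on Amazon S3. Beyond that, applications may store usage data in S3. Amazon also integrates the storage service into other services, so data stored in S3 can also be used for analytical investigations or face recognition, for example.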

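Data Model

The basic unit for storing data inside Amazon S3 is the "object". An object consists of the actual data enriched with metadata. The metadata is a set of name-value pairs that provide additional information about the object. Next to default information like the last modification date, this also includes standard HTTP metadata such as Content-Type. It is also possible to provide user-defined metadata, for example the application used to create the object.

Objects are organized in buckets. You can think of a bucket as a collection of objects that share the same namespace. Buckets are also used for access control, usage reporting, and transfer charges. For example, the image file geek.jpg that is located in the bucket javacodegeeks has the URL http://javacodegeeks.s3.amazonaws.com/photos/geek.jpg.

Because a bucket spans a namespace, every object within a bucket has a unique key. Hence the combination of "web service endpoint + bucket + key + version" identifies an object in Amazon S3. In the example above, javacodegeeks is the bucket, s3.amazonaws.com is the web service endpoint, and photos/geek.jpg is the key. The version is optional and can be left out.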
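How the four parts combine into an address can be sketched in a few lines of plain Java. This is only an illustration: the class ObjectAddress and its methods are hypothetical helpers, not part of the AWS SDK, and the sketch assumes that the key needs no URL encoding. The versionId query parameter is how the S3 REST API addresses a specific object version.

```java
/** Illustrative helper (hypothetical, not part of the AWS SDK) that assembles an object URL. */
class ObjectAddress {

    /** Virtual-hosted-style URL: the bucket becomes part of the host name. */
    static String url(String endpoint, String bucket, String key) {
        return "http://" + bucket + "." + endpoint + "/" + key;
    }

    /** Same URL, but addressing one specific version of the object. */
    static String url(String endpoint, String bucket, String key, String versionId) {
        return url(endpoint, bucket, key) + "?versionId=" + versionId;
    }
}
```

Calling ObjectAddress.url("s3.amazonaws.com", "javacodegeeks", "photos/geek.jpg") reproduces the URL from the example above.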

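Although Amazon states that you can access the data from anywhere around the world, for many applications it is crucial to know where the data resides. Therefore you can choose the region a bucket is located in. This helps end users to optimize latency and to address regulatory requirements. At the time of writing, the following regions were available:

US East (N. Virginia)
US East (Ohio)
US West (N. California)
US West (Oregon)
Canada (Central)
Asia Pacific (Mumbai)
Asia Pacific (Seoul)
Asia Pacific (Singapore)
Asia Pacific (Sydney)
Asia Pacific (Tokyo)
EU (Frankfurt)
EU (Ireland)
EU (London)
South America (São Paulo)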

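Updates of a key in a bucket are atomic. This means that the operation either succeeds and all data is written, or it fails and no data is written. The atomicity of the operation guarantees that no corrupted data is left over when something goes wrong. Because the data written to a key is distributed over Amazon's machines, a successful write operation also means that the data has been replicated within Amazon's cluster and is safe against the failure of machines within the cluster.

But as the information about new data has to be distributed over the cluster, operations that follow a modification may not yet reflect the update. For example, after a new object has been added to a bucket, a listing of the bucket's contents may not return the new object. The other way round, an object may still be listed as a member of the bucket after it has been deleted. Another case is when data is updated and read directly after the update; the process may then still read the old data. (This describes the eventual-consistency behavior at the time this tutorial was written; Amazon announced strong read-after-write consistency for S3 in December 2020.)

Amazon S3 does not support object locking for concurrent writers. If your application needs to protect an existing object against modification by other processes, you have to implement this in your own application. The same is true for subsequent updates of different objects, also known as transactions. As Amazon S3 does not provide any kind of transactional behavior, moving data from one key to another, for example, may end in a state where the data was successfully created at the new location but the subsequent deletion of the old key failed.

If the data is easily reproducible, thumbnail images for example, you can use Amazon S3's "Reduced Redundancy Storage" (RRS) option. RRS reduces the storage cost, but in return the data is replicated to fewer storage devices than it would be without this option. With RRS enabled, the expected loss of data is 0.01% of objects per year, which can be acceptable depending on the concrete use case.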

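Usage

APIs

Amazon S3 offers developers two different APIs: a REST interface and a SOAP interface. SOAP over HTTP is deprecated, while SOAP over HTTPS is still supported. New features might not be available over SOAP, however, so Amazon recommends using the REST interface. With REST, the application uses standard HTTP requests (like GET, PUT, etc.) to download or upload data. Amazon even puts the object's metadata into HTTP headers during data retrieval.

Setup

Now that we have learned about the concepts of Amazon S3, we can start building our first application. As build system we use Apache Maven in version 3.x or higher.

This enables us to create a project directly from an archetype: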

mvn archetype:generate -DgroupId=com.javacodegeeks -DartifactId=amazon-s3 -DarchetypeArtifactId=maven-archetype-quickstart -DinteractiveMode=false

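This creates the following directory structure:

amazon-s3
|-- pom.xml
`-- src

As a first step, we edit the pom.xml file in the root directory of our project and add the following dependencies: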

<properties>
	<aws-sdk.version>1.11.86</aws-sdk.version>
	<google-guava.version>19.0</google-guava.version>
	<junit.version>4.12</junit.version>
	<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>

<dependencies>
	<dependency>
		<groupId>com.amazonaws</groupId>
		<artifactId>aws-java-sdk-core</artifactId>
		<version>${aws-sdk.version}</version>
	</dependency>
	<dependency>
		<groupId>com.amazonaws</groupId>
		<artifactId>aws-java-sdk-s3</artifactId>
		<version>${aws-sdk.version}</version>
	</dependency>
	<dependency>
		<groupId>com.google.guava</groupId>
		<artifactId>guava</artifactId>
		<version>${google-guava.version}</version>
	</dependency>
	<dependency>
		<groupId>junit</groupId>
		<artifactId>junit</artifactId>
		<version>${junit.version}</version>
	</dependency>
</dependencies>

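As you can see, we add the two artifacts aws-java-sdk-core and aws-java-sdk-s3 provided by Amazon. They contain the core functionality of the AWS SDK as well as the Java classes to communicate with the S3 service. Additionally, we add Google's Guava library and the junit library for unit testing.

Having added the necessary dependencies, we add a build section that defines the target version of our class files and uses the maven-assembly-plugin to build a single jar that contains all dependencies. This way we can later start the application by simply providing only one jar on the classpath (java -cp target/amazon-s3-1.0-SNAPSHOT-jar-with-dependencies.jar ...):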

<build>
	<plugins>
		<plugin>
			<groupId>org.apache.maven.plugins</groupId>
			<artifactId>maven-compiler-plugin</artifactId>
			<version>3.6.1</version>
			<configuration>
				<source>1.8</source>
				<target>1.8</target>
			</configuration>
		</plugin>
		<plugin>
			<artifactId>maven-assembly-plugin</artifactId>
			<configuration>
				<archive>
					<manifest>
						<mainClass>com.javacodegeeks.App</mainClass>
					</manifest>
				</archive>
				<descriptorRefs>
					<descriptorRef>jar-with-dependencies</descriptorRef>
				</descriptorRefs>
			</configuration>
			<executions>
				<execution>
					<id>make-assembly</id>
					<phase>package</phase>
					<goals>
						<goal>single</goal>
					</goals>
				</execution>
			</executions>
		</plugin>
	</plugins>
</build>

Now you can invoke the build and create a simple jar application that just outputs "Hello World!":

>mvn package
>java -jar target/amazon-s3-1.0-SNAPSHOT-jar-with-dependencies.jar
Hello World!

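Now that our Java application is set up, we can create an AWS account. If you already have an AWS account, you can skip the following steps and directly log in to the AWS console. To create a new AWS account, point your browser to https://aws.amazon.com/s3 and click on "Sign Up":

Amazon S3 Sign In

Fill in your email address and click on "Sign in using our secure server". This leads you to the next page, where you have to enter your name, re-type your email address, and enter a password.

Amazon S3 Login Credentials

On the following page you have to provide contact information:

Amazon S3 Contact Information

To complete the sign-up process, the following pages ask for payment information, verify your identity, and ask you to choose a support plan. Finally you can confirm all your input and create an AWS account.

Authentication

Having signed up for Amazon S3, it is now time to create a dedicated user. This can be done through the AWS console by searching for the term "IAM" (short for Identity and Access Management). This brings you to the following screen:

Amazon S3 Add User

A click on "Add user" lets you enter the user name and its access type:

Amazon S3 User Details

As the user we are going to create should only interact with the S3 API, we do not need to check the option "AWS Management Console access" but only the option "Programmatic access". On the following screen we add the user to a new group named "s3group". This is done by clicking on "Create group", providing the group's name, and choosing the permission set "AmazonS3FullAccess":

Amazon S3 Permissions

On the following page you can review your changes and save them. On this page you will also see the access key and the secret key for the new user.

The Amazon S3 SDK can read the credentials for your account from a profile file that is created inside the home directory of your OS user, located at:

~/.aws/credentials on Linux, OS X, or Unix
C:\Users\USERNAME\.aws\credentials on Windows

This file contains the access key ID as well as the secret access key in the following format: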

[default]
aws_access_key_id = <your_access_key_id>
aws_secret_access_key = <your_secret_access_key>

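For testing purposes it can also be handy to provide the credentials as environment variables. On Linux, OS X, and Unix you can set them the following way: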

export AWS_ACCESS_KEY_ID=<your_access_key_id>
export AWS_SECRET_ACCESS_KEY=<your_secret_access_key>

On a Windows system this can be done like this:

set AWS_ACCESS_KEY_ID=<your_access_key_id>
set AWS_SECRET_ACCESS_KEY=<your_secret_access_key>

The Java SDK even supports setting the access key ID and the secret access key by means of system properties:

java -Daws.accessKeyId=<your_access_key_id> -Daws.secretKey=<your_secret_access_key> -jar <your.jar>

If you have multiple access keys, you can store them in different sections of the profile file:

[default]
aws_access_key_id=<your_access_key_id>
aws_secret_access_key=<your_secret_access_key>

[profile2]
aws_access_key_id=<your_access_key_id>
aws_secret_access_key=<your_secret_access_key>

In this case you select the profile to be used by passing its name to the ProfileCredentialsProvider:

AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
                        .withCredentials(new ProfileCredentialsProvider("profile2"))
                        .build();

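Alternatively you can also use the environment variable AWS_PROFILE:

export AWS_PROFILE="myProfile"

On Windows (without quotes, because cmd would make them part of the value):

set AWS_PROFILE=myProfile

The environment variable AWS_CREDENTIAL_PROFILES_FILE lets you specify an alternative location for the profile file.

If you do not supply any credential provider, the SDK searches for credentials in the following order:

Environment variables
Java system properties
The default credential profiles file

When using Amazon ECS, there are further options (see the AWS SDK for Java documentation).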
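The lookup order listed above can be pictured as a simple "first match wins" chain. The following plain-Java sketch is only an illustration of that idea and not the SDK's actual implementation: the class CredentialLookupSketch and its method are hypothetical, and the value read from the profile file is passed in as a ready-made string instead of being parsed from ~/.aws/credentials.

```java
import java.util.Optional;
import java.util.function.Function;

/** Illustrative stand-in for the lookup order; not the SDK's real provider chain. */
class CredentialLookupSketch {

    /**
     * Returns the first access key ID found, checking the environment variable first,
     * then the Java system property, then the value taken from the profile file.
     */
    static Optional<String> resolveAccessKeyId(Function<String, String> env,
                                               Function<String, String> systemProperties,
                                               String profileFileValue) {
        String fromEnv = env.apply("AWS_ACCESS_KEY_ID");
        if (fromEnv != null && !fromEnv.isEmpty()) {
            return Optional.of(fromEnv);
        }
        String fromProperty = systemProperties.apply("aws.accessKeyId");
        if (fromProperty != null && !fromProperty.isEmpty()) {
            return Optional.of(fromProperty);
        }
        // Optional.empty() when no source provided a value.
        return Optional.ofNullable(profileFileValue);
    }
}
```

In a real program the first two arguments could be System::getenv and System::getProperty; passing them in as functions keeps the sketch easy to try out with fixed values.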

Last but not least, you can provide the credentials programmatically:

BasicAWSCredentials awsCreds = new BasicAWSCredentials("your_access_key_id", "your_secret_access_key");
AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
                        .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                        .build();

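Listing Buckets

The knowledge about authentication enables us to implement our first simple client that only lists all buckets in our account:

import com.amazonaws.ClientConfiguration;
import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.PropertiesCredentials;
import com.amazonaws.regions.AwsEnvVarOverrideRegionProvider;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.Bucket;

import java.io.IOException;
import java.io.InputStream;
import java.util.List;

public class ListBucketsApp {

    public static void main(String[] args) throws IOException {
        ClientConfiguration clientConf = new ClientConfiguration();
        // Optional low-level setting: connection timeout of one minute.
        // If you sit behind a proxy, configure it here (setProxyHost, setProxyPort, ...)
        // with values read from configuration; never hard-code real credentials.
        clientConf.setConnectionTimeout(60 * 1000);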
        AWSCredentials credentials = getAwsCredentials();
        AWSStaticCredentialsProvider credentialsProvider = new AWSStaticCredentialsProvider(credentials);
        AwsEnvVarOverrideRegionProvider regionProvider = new AwsEnvVarOverrideRegionProvider();
        AmazonS3 amazonS3 = AmazonS3ClientBuilder.standard()
                .withClientConfiguration(clientConf)
                .withCredentials(credentialsProvider)
                .withRegion(regionProvider.getRegion())
                .build();
        List<Bucket> buckets = amazonS3.listBuckets();
        for (Bucket bucket : buckets) {
            System.out.println(bucket.getName() + ": " + bucket.getCreationDate());
        }
    }

    private static AWSCredentials getAwsCredentials() throws IOException {
        AWSCredentials credentials;
        try (InputStream is = ListBucketsApp.class.getResourceAsStream("/credentials.properties")) {
            if (is == null) {
                throw new RuntimeException("Unable to load credentials from properties file.");
            }
            credentials = new PropertiesCredentials(is);
        }
        return credentials;
    }
}

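First of all we construct a ClientConfiguration and set the connection timeout to one minute. This part is optional but illustrates how to set low-level parameters for the Amazon S3 client. The small helper method getAwsCredentials constructs an instance of AWSCredentials. In this case we have chosen PropertiesCredentials. It needs an InputStream or File that points to a properties file containing the two properties accessKey and secretKey, with the access key ID and the secret key as values.

Having an instance of AWSCredentials allows us to create a credentials provider like AWSStaticCredentialsProvider. Alternatively we could also have used the EnvironmentVariableCredentialsProvider or the ProfileCredentialsProvider. As described above, the first one reads the credentials from environment variables, while the second one uses the already mentioned profile file to extract the necessary information. For example, this looks like the following:

AmazonS3 s3Client = AmazonS3ClientBuilder.standard()
                        .withCredentials(new ProfileCredentialsProvider())
                        .build();

The provider concept can also be used for the region. The AWS SDK provides, for example, the AwsEnvVarOverrideRegionProvider, which inspects the environment variable AWS_REGION for the region to be used. This region string can then be passed to the method withRegion() of the AmazonS3ClientBuilder.

Having provided the client configuration, the credentials, and the region, we can finally build an instance of AmazonS3. On this instance we can invoke the method listBuckets() to retrieve a list of all available buckets.

You can run this sample by setting the classpath for the JVM to the created jar and providing the class name: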

mvn clean install
export AWS_REGION=eu-central-1
java -cp target/amazon-s3-1.0-SNAPSHOT-jar-with-dependencies.jar com.javacodegeeks.ListBucketsApp

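If you see the following error message, you may have forgotten to set the environment variable AWS_REGION: Unable to load region information from any provider in the chain. Please choose the region according to your location (see the AWS documentation for the list of region codes).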

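Creating Buckets

Only viewing existing data is boring, hence we create an additional class that lets us create a new bucket in a specified region: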

public class CreateBucketApp {
    private static final Logger LOGGER = Logger.getLogger(CreateBucketApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            LOGGER.log(Level.WARNING, "Please provide the following arguments: <bucket-name>");
            return;
        }
        AmazonS3 amazonS3 = AwsClientFactory.createClient();
        CreateBucketRequest request = new CreateBucketRequest(args[0],
                AwsClientFactory.createRegionProvider().getRegion());
        Bucket bucket = amazonS3.createBucket(request);
        LOGGER.log(Level.INFO, "Created bucket " + bucket.getName() + ".");
    }
}

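To make the class more readable, we have moved the code that constructs the AmazonS3 client and the AwsRegionProvider into a simple factory class. The sample above allows the user to provide the bucket name as an argument on the command line and then passes it to the CreateBucketRequest object. The latter can then be provided to the method createBucket() of the AmazonS3 client.

If you do not want to create the CreateBucketRequest object yourself, you can also use the method createBucket(String), which only takes the bucket name. The region is then extracted from the URL of the Amazon endpoint.

Similar to the example above, we start the application after having built it and provide a bucket name on the command line:

mvn clean install
export AWS_REGION=eu-central-1
java -cp target/amazon-s3-1.0-SNAPSHOT-jar-with-dependencies.jar com.javacodegeeks.CreateBucketApp mybucket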

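Putting Objects

After having created a bucket, we can insert an object into it. The following code shows how to do that in a simplified version: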

public class PutObjectApp {
    private static final Logger LOGGER = Logger.getLogger(PutObjectApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            LOGGER.warning("Please provide the following parameters: <bucket-name> <key> <file>");
            return;
        }
        String bucketName = args[0];
        String key = args[1];
        Path filePath = Paths.get(args[2]);
        AmazonS3 amazonS3 = AwsClientFactory.createClient();
        PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, key, filePath.toFile());
        PutObjectResult result = amazonS3.putObject(putObjectRequest);
        LOGGER.info("Put file '" + filePath + "' under key " + key + " to bucket " + bucketName);
    }
}

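First we check that the user has provided at least the three arguments that are necessary for an upload: the bucket name, the key, and the file. For the sake of simplicity we do not check the values for correctness in our client; for example, there are rules for naming a key. The following characters are safe to use:

Alphanumeric characters [0-9a-zA-Z]
Special characters !, -, _, ., *, ', (, and )

Additionally you can use the slash "/" in the key name, although Amazon S3 does not support file hierarchies. So if you name a key videos/vacations2017.mpg, the Amazon console uses the slash to group keys that start with videos/ together, but the data model remains a flat structure.

The following characters need special handling, like URL encoding, because normal browsers may not work properly with them or handle them unintentionally: &, $, @, =, ;, :, and +. In general you should avoid using curly or square brackets ({, }, [, ]), the caret, the tilde, the pipe, or the pound character.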
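A client-side check for the naming rules above could look like the following plain-Java sketch. It is only an illustration: the class KeyNameCheck is a hypothetical helper, and it deliberately accepts nothing but the safe characters listed above plus the slash, so keys that would merely need URL encoding are rejected as well.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that accepts only the "safe" key characters plus the slash. */
class KeyNameCheck {

    // Alphanumerics, the special characters ! - _ . * ' ( ) and the slash used for grouping.
    private static final Pattern SAFE_KEY = Pattern.compile("[0-9a-zA-Z!\\-_.*'()/]+");

    /** Returns true if the key is non-empty and consists of safe characters only. */
    static boolean isSafeKey(String key) {
        return key != null && SAFE_KEY.matcher(key).matches();
    }
}
```

A call such as KeyNameCheck.isSafeKey(args[1]) before building the PutObjectRequest would reject keys like "a&b" or keys containing spaces.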

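Having a reference to the AmazonS3 object, we can call the putObject() method and pass the constructed PutObjectRequest to it. This PutObjectRequest gets exactly the parameters that we previously requested from the user.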

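Multipart Upload

The put object operation we have seen in the last section is only suitable for small objects (a single put request is limited to 5 GB). If you need to upload larger objects (up to 5 TB), you want to use the "multipart upload" feature of Amazon S3. Instead of uploading the data in one single request, it uses multiple requests, which can even be issued in parallel to increase throughput.

Before we can start with the upload, we must initiate the multipart upload: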

InitiateMultipartUploadRequest initRequest = new InitiateMultipartUploadRequest(bucketName, key);
InitiateMultipartUploadResult initResponse = amazonS3.initiateMultipartUpload(initRequest);

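The request object needs the name of the bucket as well as the key of the new object. Subsequent requests are associated with this multipart upload through the "upload ID" that is contained in the InitiateMultipartUploadResult response. You can see this when we construct the UploadPartRequest: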

UploadPartRequest uploadRequest = new UploadPartRequest()
		.withBucketName(bucketName)
		.withKey(key)
		.withUploadId(initResponse.getUploadId())
		.withPartNumber(part) // parts start with 1
		.withFileOffset(filePosition)
		.withFile(filePath.toFile())
		.withPartSize(partSize);
UploadPartResult uploadPartResult = amazonS3.uploadPart(uploadRequest);
partETags.add(uploadPartResult.getPartETag());

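Next to the already mentioned "upload ID", the UploadPartRequest also needs the bucket name and the key. Beyond that we also have to provide the part number (numbering starts with one, not zero), the offset in the file that should be uploaded, and the size of this part. Parts must be at least 5 MB and must not exceed 5 GB; only the last part may be smaller than 5 MB. One upload can contain up to 10,000 parts.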
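The three limits (at least 5 MB and at most 5 GB per part, at most 10,000 parts) determine which part sizes work for a given file. The following plain-Java sketch shows one way to pick a part size; the class MultipartPlan and its methods are hypothetical helpers for illustration and are not part of the AWS SDK.

```java
/** Hypothetical helper that derives a part size from the multipart upload limits. */
class MultipartPlan {
    static final long MIN_PART_SIZE = 5L * 1024 * 1024;          // 5 MB
    static final long MAX_PART_SIZE = 5L * 1024 * 1024 * 1024;   // 5 GB
    static final int MAX_PARTS = 10_000;

    /** Smallest allowed part size that fits the file into at most 10,000 parts. */
    static long choosePartSize(long fileSize) {
        long needed = (fileSize + MAX_PARTS - 1) / MAX_PARTS;    // ceiling division
        long partSize = Math.max(MIN_PART_SIZE, needed);
        if (partSize > MAX_PART_SIZE) {
            throw new IllegalArgumentException("File is too large for one multipart upload");
        }
        return partSize;
    }

    /** Number of parts needed; the last part may be smaller than partSize. */
    static long partCount(long fileSize, long partSize) {
        return (fileSize + partSize - 1) / partSize;
    }
}
```

For a 12 MB file this yields a part size of 5 MB and three parts, the last one being 2 MB.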

The response of the UploadPartRequest contains an eTag. These eTags must be collected and later passed to the CompleteMultipartUploadRequest:

CompleteMultipartUploadRequest completeMultipartUploadRequest = new CompleteMultipartUploadRequest(bucketName,
	key, initResponse.getUploadId(), partETags);
amazonS3.completeMultipartUpload(completeMultipartUploadRequest);

If we want to abort an already initiated multipart upload, we can do this by invoking the method abortMultipartUpload() of the Amazon S3 client:

amazonS3.abortMultipartUpload(new AbortMultipartUploadRequest(bucketName, 
	key, initResponse.getUploadId()));

The complete application then looks as follows:

public class MultipartUploadApp {
    private static final Logger LOGGER = Logger.getLogger(MultipartUploadApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            LOGGER.warning("Please provide the following parameters: <bucket-name> <key> <file>");
            return;
        }
        String bucketName = args[0];
        String key = args[1];
        Path filePath = Paths.get(args[2]);
        AmazonS3 amazonS3 = AwsClientFactory.createClient();

        InitiateMultipartUploadRequest initRequest = new InitiateMultipartUploadRequest(bucketName, key);
        InitiateMultipartUploadResult initResponse = amazonS3.initiateMultipartUpload(initRequest);
        LOGGER.info("Initiated upload with id: " + initResponse.getUploadId());

        long fileSize = Files.size(filePath);
        long partSize = 5 * 1024 * 1024; // 5MB
        long filePosition = 0;
        List<PartETag> partETags = new ArrayList<>();

        try {
            int part = 1;
            while (filePosition < fileSize) {
                partSize = Math.min(partSize, fileSize - filePosition);

                UploadPartRequest uploadRequest = new UploadPartRequest()
                        .withBucketName(bucketName)
                        .withKey(key)
                        .withUploadId(initResponse.getUploadId())
                        .withPartNumber(part) // parts start with 1
                        .withFileOffset(filePosition)
                        .withFile(filePath.toFile())
                        .withPartSize(partSize);
                UploadPartResult uploadPartResult = amazonS3.uploadPart(uploadRequest);
                partETags.add(uploadPartResult.getPartETag());
                LOGGER.info("Uploaded part: " + part);

                filePosition += partSize;
                part++;
            }

            CompleteMultipartUploadRequest completeMultipartUploadRequest = new CompleteMultipartUploadRequest(bucketName,
                    key, initResponse.getUploadId(), partETags);
            amazonS3.completeMultipartUpload(completeMultipartUploadRequest);
        } catch (SdkClientException e) {
            LOGGER.warning("Failed to upload file: " + e.getMessage());
            amazonS3.abortMultipartUpload(new AbortMultipartUploadRequest(bucketName, key, initResponse.getUploadId()));
            LOGGER.info("Aborted upload.");
        }
    }
}

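While the application above illustrates the general handling of multipart uploads, it is unfortunately a rather lengthy piece of code just to upload one larger file. As uploads like this often have to be implemented, the Amazon SDK ships with the TransferManager, which simplifies the handling of multipart uploads a lot: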

TransferManager tm = TransferManagerBuilder.defaultTransferManager();

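It attempts to use multiple threads for uploads, which can have a significant impact on throughput and reliability. As the TransferManager manages threads and connections, a single instance can be used throughout the whole application.

Instead of using the default configuration for the TransferManager, you can also provide your own: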

AmazonS3 amazonS3 = AwsClientFactory.createClient();
TransferManagerBuilder transferManagerBuilder = TransferManagerBuilder.standard();
transferManagerBuilder.setS3Client(amazonS3);
transferManagerBuilder.setExecutorFactory(() -> Executors.newFixedThreadPool(4));
TransferManager tm = transferManagerBuilder.build();

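Next to setting the concrete AmazonS3 instance as in our example, you can also configure the thread pool that should be used or set the file size above which the TransferManager uses multipart uploads. If the file size is below the configured threshold, the TransferManager simply uses the single-request upload discussed before.

Once the upload has been started, you can register a progress listener. This way the user application can provide information about the current upload status: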

PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, 
	key, filePath.toFile());
Upload upload = tm.upload(putObjectRequest);

upload.addProgressListener((ProgressEvent progressEvent) -> {
	LOGGER.info("Progress: " + progressEvent);
});

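Finally the main thread of our application has to wait for the upload to finish:

try {
	upload.waitForCompletion();
} catch (InterruptedException e) {
	// Restore the interrupt flag so that callers can still see the interruption.
	Thread.currentThread().interrupt();
	LOGGER.warning("Upload was interrupted: " + e.getLocalizedMessage());
}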

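Listing Objects

Now that we have uploaded one or more objects into a bucket, it is time to list these objects. For this purpose the Amazon S3 API provides the ListObjectsV2Request, as shown in the following sample application: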

public class ListObjectsApp {
    private static final Logger LOGGER = Logger.getLogger(ListObjectsApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            LOGGER.warning("Please provide the following parameters: <bucket-name> <delimiter> <prefix>");
            return;
        }
        String bucketName = args[0];
        String delimiter = args[1];
        String prefix = args[2];

        AmazonS3 amazonS3 = AwsClientFactory.createClient();
        ListObjectsV2Request listObjectsRequest = new ListObjectsV2Request();
        listObjectsRequest.setBucketName(bucketName);
        listObjectsRequest.setDelimiter(delimiter);
        listObjectsRequest.setPrefix(prefix);
        ListObjectsV2Result result = amazonS3.listObjectsV2(listObjectsRequest);

        for (S3ObjectSummary summary : result.getObjectSummaries()) {
            LOGGER.info(summary.getKey() + ":" + summary.getSize());
        }
    }
}

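Listing object keys can get a little bit tricky, because a bucket is a flat hierarchy of objects; that is, even if you use, for example, the slash as a delimiter between your own hierarchy levels, Amazon S3 does not care about it. But to ease the handling of such hierarchies, the list objects request offers the possibility to filter by prefix and delimiter. If your application, for example, manages backup files, you might have object keys like the following: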

know-how.txt
documents/programming/java/amazon-s3.java
documents/programming/python/amazon-s3.py
photos/2016/12/31/img_3213.jpg
...

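To list only what is located under documents/programming, you have to set the prefix to documents/programming/ and the delimiter to /. Note that the prefix ends with the delimiter, i.e. the request only considers keys that start with this prefix including the trailing slash. Keys that contain a further delimiter after the prefix (such as documents/programming/java/amazon-s3.java) are not returned as object summaries; they are grouped under common prefixes (here documents/programming/java/), which the result exposes through getCommonPrefixes(). To get all keys below a prefix regardless of depth, leave the delimiter out.

By specifying only the slash as delimiter and providing no prefix, you can query only the objects located at the "root level" of the hierarchy. As objects at the root level do not contain a slash in their key, the delimiter / makes the service return only the file know-how.txt.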
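How prefix and delimiter interact is easier to see with a small in-memory simulation. The following plain-Java sketch mimics the filtering on a list of keys; it is an illustration only (the class PrefixDelimiterSketch is hypothetical, nothing is sent to Amazon S3, and paging is ignored).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** In-memory illustration of listing with prefix and delimiter; no S3 call is made. */
class PrefixDelimiterSketch {

    /** Result of a simulated listing: returned keys plus rolled-up common prefixes. */
    static class Listing {
        final List<String> keys = new ArrayList<>();
        final TreeSet<String> commonPrefixes = new TreeSet<>();
    }

    static Listing list(List<String> allKeys, String prefix, String delimiter) {
        Listing result = new Listing();
        for (String key : allKeys) {
            if (!key.startsWith(prefix)) {
                continue; // the key is outside the requested prefix
            }
            String rest = key.substring(prefix.length());
            int idx = delimiter.isEmpty() ? -1 : rest.indexOf(delimiter);
            if (idx < 0) {
                // No further delimiter: the key itself is part of the result.
                result.keys.add(key);
            } else {
                // Deeper key: only the prefix up to the next delimiter is reported.
                result.commonPrefixes.add(prefix + rest.substring(0, idx + delimiter.length()));
            }
        }
        return result;
    }
}
```

With the four sample keys above, an empty prefix and the delimiter / return the key know-how.txt and the common prefixes documents/ and photos/, while the prefix documents/programming/ with the same delimiter returns no keys and the common prefixes documents/programming/java/ and documents/programming/python/.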

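Deleting Objects

At this point of the tutorial we know how to create buckets and objects and how to list them. One basic operation is still missing: deletion. Hence it is not surprising that we can issue a DeleteObjectRequest to delete an existing object from a bucket: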

public class DeleteObjectApp {
    private static final Logger LOGGER = Logger.getLogger(DeleteObjectApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            LOGGER.warning("Please provide the following parameters: <bucket-name> <key>");
            return;
        }
        String bucketName = args[0];
        String key = args[1];

        AmazonS3 amazonS3 = AwsClientFactory.createClient();
        DeleteObjectRequest deleteObjectRequest = new DeleteObjectRequest(bucketName, key);
        amazonS3.deleteObject(deleteObjectRequest);
        LOGGER.info("Deleted object with key '" + key + "'.");
    }
}

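The code is straightforward and easy to understand. After having read the provided bucket name and key from the argument list, a new AmazonS3 client is created. The DeleteObjectRequest needs to know the bucket name and of course the key. If no exception is thrown, the object has been successfully removed from the bucket.

As it is often necessary to delete a complete set of objects, the SDK provides an appropriate API for that, too. One can pass a list of object keys to the DeleteObjectsRequest, and they are then deleted with a single request: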

public class DeleteObjectsApp {
    private static final Logger LOGGER = Logger.getLogger(DeleteObjectsApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            LOGGER.warning("Please provide the following parameters: <bucket-name> <key> [key...]");
            return;
        }
        String bucketName = args[0];
        List<DeleteObjectsRequest.KeyVersion> keyVersionList = new ArrayList<>(args.length - 1);
        for (int i = 1; i < args.length; i++) {
            DeleteObjectsRequest.KeyVersion keyVersion = new DeleteObjectsRequest.KeyVersion(args[i]);
            keyVersionList.add(keyVersion);
        }

        AmazonS3 amazonS3 = AwsClientFactory.createClient();
        DeleteObjectsRequest deleteObjectsRequest = new DeleteObjectsRequest(bucketName);
        deleteObjectsRequest.setKeys(keyVersionList);
        List<DeleteObjectsResult.DeletedObject> deletedObjects;
        try {
            DeleteObjectsResult result = amazonS3.deleteObjects(deleteObjectsRequest);
            deletedObjects = result.getDeletedObjects();
        } catch (MultiObjectDeleteException e) {
            deletedObjects = e.getDeletedObjects();
            List<MultiObjectDeleteException.DeleteError> errors = e.getErrors();
            for (MultiObjectDeleteException.DeleteError error : errors) {
                LOGGER.info("Failed to delete object with key '" + error.getKey()
					+ "': " + error.getMessage());
            }
        }
        for (DeleteObjectsResult.DeletedObject deletedObject : deletedObjects) {
            LOGGER.info("Deleted object with key '" + deletedObject.getKey() + "'.");
        }
    }
}

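The deleteObjects() call returns a list of all objects that have been deleted successfully. If an exception is thrown, the corresponding MultiObjectDeleteException contains the information about which objects have been deleted successfully and which have not. Hence it is possible to save bandwidth and money by using the deleteObjects() call for a multi-object deletion.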
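One detail the sample above does not handle: a single multi-object delete request accepts at most 1,000 keys, so longer key lists have to be split into batches first. The following plain-Java sketch shows such a split; the class DeleteBatches is a hypothetical helper, and each resulting batch would then be converted into KeyVersion objects and sent with its own DeleteObjectsRequest.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits a key list into batches for multi-object deletes. */
class DeleteBatches {
    // A single multi-object delete request accepts at most 1,000 keys.
    static final int MAX_KEYS_PER_REQUEST = 1000;

    /** Splits keys into consecutive batches of at most batchSize entries. */
    static List<List<String>> partition(List<String> keys, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int from = 0; from < keys.size(); from += batchSize) {
            int to = Math.min(from + batchSize, keys.size());
            batches.add(new ArrayList<>(keys.subList(from, to)));
        }
        return batches;
    }
}
```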

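Copying Objects

Another important operation when dealing with large objects is the copy call. Instead of downloading the data and uploading it again under a new key, it is possible to do this with one API invocation on the server side. Therefore the SDK provides the copyObject() method: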

public class CopyObjectApp {
    private static final Logger LOGGER = Logger.getLogger(CopyObjectApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 4) {
            LOGGER.warning("Please provide the following parameters: <src-bucket-name> <src-key> <dst-bucket-name> <dst-key>");
            return;
        }
        String srcBucketName = args[0];
        String srcKey = args[1];
        String dstBucketName = args[2];
        String dstKey = args[3];

        AmazonS3 amazonS3 = AwsClientFactory.createClient();
        CopyObjectRequest copyObjectRequest = new CopyObjectRequest(srcBucketName, srcKey, dstBucketName, dstKey);
        CopyObjectResult result = amazonS3.copyObject(copyObjectRequest);
        String eTag = result.getETag();
        LOGGER.info("Copied object from bucket " + srcBucketName + " with key " + srcKey + " to bucket "
                + dstBucketName + " with key " + dstKey + ", eTag: " + eTag);
    }
}

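It takes as arguments the bucket name and key of the source object and the bucket name and key of the destination object.

Getting Objects

Having uploaded objects, your application also needs to be able to download them later on. The GetObjectRequest has been designed to do exactly this: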

public class GetObjectApp {
    private static final Logger LOGGER = Logger.getLogger(GetObjectApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            LOGGER.warning("Please provide the following parameters: <bucket-name> <key> <file>");
            return;
        }
        String bucketName = args[0];
        String key = args[1];
        Path filePath = Paths.get(args[2]);
        AmazonS3 amazonS3 = AwsClientFactory.createClient();

        GetObjectRequest getObjectRequest = new GetObjectRequest(bucketName, key);
        // S3Object holds an open HTTP connection; try-with-resources releases it after the copy.
        try (S3Object s3Object = amazonS3.getObject(getObjectRequest)) {
            S3ObjectInputStream stream = s3Object.getObjectContent();
            FileUtils.copyInputStreamToFile(stream, filePath.toFile());
            LOGGER.info("Downloaded S3 object: " + s3Object.getKey() + " from bucket " + s3Object.getBucketName());
        }
    }
}

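It simply takes the name of the bucket as well as the object's key as arguments of the constructor. Invoking the Amazon S3 client's method getObject() then returns an S3Object. The actual file content can be retrieved by invoking getObjectContent(). In the sample code above we use Apache's commons-io library to simplify the conversion of the InputStream into a file.

Deleting Buckets

Finally it is time to remove the buckets we have created: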

public class DeleteBucketApp {
    private static final Logger LOGGER = Logger.getLogger(DeleteBucketApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            LOGGER.log(Level.WARNING, "Please provide the following arguments: <bucket-name>");
            return;
        }
        String bucketName = args[0];
        AmazonS3 amazonS3 = AwsClientFactory.createClient();
        DeleteBucketRequest deleteBucketRequest = new DeleteBucketRequest(bucketName);
        amazonS3.deleteBucket(deleteBucketRequest);
        LOGGER.log(Level.INFO, "Deleted bucket " + bucketName + ".");
    }
}

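Analogous to the create bucket example, we have to create a DeleteBucketRequest and pass it to deleteBucket(). Note that a bucket has to be empty before it can be deleted.

Storage Classes

Amazon S3 provides different storage classes that fit different requirements:

STANDARD
STANDARD_IA
REDUCED_REDUNDANCY
GLACIER

The STANDARD class is designed for performance-sensitive use cases, i.e. the data is updated and accessed frequently and must be available in real time. If no storage class is specified when uploading data, STANDARD is the default class.

The IA in STANDARD_IA stands for infrequent access, which also explains this class. It can be used for data that is stored for a longer time but accessed infrequently. Although these objects are available in real time, access costs more money. To put an object into a bucket using the STANDARD_IA storage class, you simply set the storage class on the PutObjectRequest: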

PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, key, filePath.toFile());
putObjectRequest.setStorageClass(StorageClass.StandardInfrequentAccess);

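At the time of writing, PUT, COPY, and POST requests as well as GET requests for STANDARD_IA were typically twice as expensive as their counterparts in the STANDARD class. Beyond that you also have to pay for the transition from STANDARD to STANDARD_IA. In return, the storage cost in STANDARD_IA is 27% cheaper than in the STANDARD class. Hence you should think carefully about whether to use STANDARD or STANDARD_IA.

Some data can be reproduced very easily, for example thumbnail images computed from a larger version of an image. Losing this data is not as critical as losing the original data. Therefore Amazon introduced a separate storage class for this kind of data: REDUCED_REDUNDANCY. The durability of this class is only 99.99% instead of 99.999999999% as for STANDARD and STANDARD_IA. This means the expected loss of objects is about 0.01% per year. If you store, for example, 10,000 objects, you can expect to lose about one object per year. If an object is lost, the Amazon S3 service returns a 405 error.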
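The arithmetic behind the "one object per year" figure is simply the number of objects multiplied by the annual loss rate (one minus the durability). A minimal plain-Java sketch, with the hypothetical class ExpectedLoss introduced only for illustration:

```java
/** Hypothetical helper for the expected-loss arithmetic; not part of the AWS SDK. */
class ExpectedLoss {

    /**
     * Expected number of objects lost per year.
     * durability is given as a fraction, e.g. 0.9999 for 99.99%.
     */
    static double expectedAnnualLoss(long objectCount, double durability) {
        return objectCount * (1.0 - durability);
    }
}
```

For 10,000 objects and a durability of 0.9999 the result is (up to floating-point rounding) 1.0. With the eleven nines of the STANDARD class the same 10,000 objects give an expected loss of about 0.0000001 objects per year, which corresponds to one object in roughly 10,000,000 years.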

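Last but not least, Amazon also offers a storage class for objects that should be archived: GLACIER. This class is suitable if you have data that is accessed infrequently and that does not need real-time access. The last point means that you have to restore the data before you can access it.

Please note that, at the time this tutorial was written, GLACIER could not be specified as the storage class when an object is first uploaded. Instead you have to use one of the other three storage classes (STANDARD, STANDARD_IA, or REDUCED_REDUNDANCY) and then move the object to the GLACIER class using Amazon's "Object Lifecycle Management".

Object Lifecycle Management

Amazon allows you to define certain actions that are applied automatically to objects residing in a certain bucket. These actions are attached as a "lifecycle subresource" to the bucket and consist of the following two kinds of actions:

Transition actions: objects are moved from one storage class to another.
Expiration actions: objects are removed when they expire.

While transition actions can be used to manage the storage classes of the objects within one bucket, expiration actions specify when certain objects can be deleted by Amazon automatically. The latter is interesting if you store, for example, log files in S3 that your application only needs for a certain amount of time. When this time has expired, the log files can be removed.

As we have learned before, we can use lifecycle management to move objects from one of the real-time access storage classes to the GLACIER class. The following are some points you must know before doing so:

You can move an object to GLACIER, but you cannot transition it back to STANDARD, STANDARD_IA, or REDUCED_REDUNDANCY. If you need to do that, you first have to restore the object and subsequently copy it with the desired storage class setting.
You cannot access objects that have been moved from S3 to the GLACIER storage class through the Amazon Glacier API.
Amazon stores an 8 KB data block of meta information for each object in S3, so that you can list all objects in real time.
Each object in GLACIER occupies an additional data block with about 32 KB of meta information. Storing huge amounts of small files can therefore cost additional money.
It can take up to 5 hours until objects are restored from GLACIER as a temporary copy. The objects remain in the GLACIER storage until you delete them there.

It is of course possible to set up the lifecycle configuration rules for a bucket using Amazon's SDK. The following sample code demonstrates this: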

public class LifeCycleConfigurationApp {
    private static final Logger LOGGER = Logger.getLogger(LifeCycleConfigurationApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            LOGGER.log(Level.WARNING, "Please provide the following arguments: <bucket-name>");
            return;
        }
        String bucketName = args[0];

        BucketLifecycleConfiguration.Rule rule =
                new BucketLifecycleConfiguration.Rule()
                        .withId("Transfer to IA, then GLACIER, then remove")
                        .withFilter(new LifecycleFilter(
                                new LifecycleTagPredicate(new Tag("archive", "true"))))
                        .addTransition(new BucketLifecycleConfiguration.Transition()
                                .withDays(30)
                                .withStorageClass(StorageClass.StandardInfrequentAccess))
                        .addTransition(new BucketLifecycleConfiguration.Transition()
                                .withDays(365)
                                .withStorageClass(StorageClass.Glacier))
                        .withExpirationInDays(365 * 5)
                        .withStatus(BucketLifecycleConfiguration.ENABLED);
        BucketLifecycleConfiguration conf =
                new BucketLifecycleConfiguration()
                        .withRules(rule);

        AmazonS3 amazonS3 = AwsClientFactory.createClient();
        amazonS3.setBucketLifecycleConfiguration(bucketName, conf);
    }
}

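First of all we create a Rule with the identifier "Transfer to IA, then GLACIER, then remove". As the name indicates, we are going to move objects after a certain amount of time to the storage class STANDARD_IA and later on to GLACIER. Instead of applying the rule to all objects within a bucket, we can filter for objects that meet specific criteria. In this example we only want to move objects that are tagged with archive=true. The code to tag an object during upload looks, for example, like this: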

putObjectRequest.setTagging(new ObjectTagging(Arrays.asList(new Tag("archive", "true"))));

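Next we add two transitions, one that moves the objects after 30 days to the STANDARD_IA storage and another one that moves the objects after one year to the GLACIER class. If the objects still exist after five years, we want to remove them completely. Hence we set the expiration time appropriately. Finally we create a BucketLifecycleConfiguration and add the rule we have just created. This configuration can then be sent using Amazon's S3 client.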
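The timeline that this rule describes can be written down as a small function. The following plain-Java sketch is only a model of the rule for illustration: the class LifecycleTimeline is hypothetical, it assumes that the object was uploaded with the STANDARD class and carries the tag archive=true, and it ignores that Amazon applies lifecycle actions asynchronously, so the real switch may happen somewhat later than the exact day.

```java
/** Hypothetical model of the lifecycle rule from the sample; nothing is sent to Amazon S3. */
class LifecycleTimeline {

    /** Storage class (or EXPIRED) that the rule assigns to an object of the given age. */
    static String storageClassAfter(int ageInDays) {
        if (ageInDays >= 365 * 5) {
            return "EXPIRED";       // removed by the expiration action
        }
        if (ageInDays >= 365) {
            return "GLACIER";       // second transition
        }
        if (ageInDays >= 30) {
            return "STANDARD_IA";   // first transition
        }
        return "STANDARD";          // assumed storage class at upload time
    }
}
```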

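When working with storage classes, some general rules must be considered:

Once objects have been transitioned to STANDARD_IA, you cannot transition them back to STANDARD or REDUCED_REDUNDANCY.
Once objects have been transitioned to GLACIER, you cannot transition them back to any other class.
You cannot transition objects to REDUCED_REDUNDANCY.

Beyond the general rules above, one also has to take Amazon's pricing into account, because transitioning an object into STANDARD_IA, for example, costs money.

Encryption

Encryption means encoding an object in Amazon S3 in a way that only authorized persons can decode and therefore use it. Basically one can protect the data while it is sent to Amazon's servers and/or while the data is stored at Amazon.

To protect data during transmission, you can choose to use Secure Sockets Layer (SSL) or its successor Transport Layer Security (TLS) for the transfer of HTTP requests (HTTPS). Using Amazon's Java SDK, one can simply set the protocol using the ClientConfiguration: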

ClientConfiguration clientConf = new ClientConfiguration();
clientConf.setProtocol(Protocol.HTTPS);
AWSCredentialsProvider credentialsProvider = getAwsCredentialsProvider();
AwsEnvVarOverrideRegionProvider regionProvider = createRegionProvider();
return AmazonS3ClientBuilder.standard()
		.withClientConfiguration(clientConf)
		.withCredentials(credentialsProvider)
		.withRegion(regionProvider.getRegion())
		.build();

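HTTPS provides authentication, i.e. one can be reasonably sure to communicate with Amazon's servers, and encryption, i.e. the data is encoded in a way that does not allow others to read or manipulate it. But using HTTPS only protects the transfer of the data. After the data has reached Amazon's servers, it is again unprotected and can be read from storage. To protect data while it resides on Amazon's servers, you basically have two choices:

Server-side encryption: Amazon S3 encrypts the data before it is saved to disk.
Client-side encryption: your client application manages the encryption process and sends the data already encrypted to Amazon.

The first choice (server-side encryption) has the advantage that Amazon has already implemented the encryption as well as the key management for you. But it also means that the data is unprotected for a short while on Amazon's machines (the moment before it is stored). Using client-side encryption lets your own application encrypt the data and manage the keys. This way the data is already protected in the memory of your client machine, and Amazon never sees the original content. But you have to take care of algorithms and key management yourself.

Amazon offers three different types of server-side encryption:

Amazon S3-Managed Keys (SSE-S3)
AWS KMS-Managed Keys (SSE-KMS)
Customer-Provided Keys (SSE-C)

Amazon S3-Managed Keys (SSE-S3)

Using SSE-S3 encrypts each object with a unique key using multi-factor encryption. Additionally this unique key is encrypted with a master key that rotates regularly. The S3 service uses the 256-bit Advanced Encryption Standard (AES-256) to encrypt the data. Please note that server-side encryption only encrypts the object data but not its metadata.

To use SSE-S3 encryption with the Java SDK, you have to set the SSE algorithm as metadata on the PutObjectRequest: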

PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, key, filePath.toFile());

ObjectMetadata objectMetadata = new ObjectMetadata();
objectMetadata.setSSEAlgorithm(ObjectMetadata.AES_256_SERVER_SIDE_ENCRYPTION);
putObjectRequest.setMetadata(objectMetadata);

The returned PutObjectResult tells you the encryption algorithm used by Amazon:

PutObjectResult result = amazonS3.putObject(putObjectRequest);
result.getSSEAlgorithm();

If you need to change the algorithm, you can simply use a CopyObjectRequest and set the new algorithm on it:

CopyObjectRequest copyObjRequest = new CopyObjectRequest(
	sourceBucket, sourceKey, targetBucket, targetKey);
	
ObjectMetadata objectMetadata = new ObjectMetadata();
objectMetadata.setSSEAlgorithm(ObjectMetadata.AES_256_SERVER_SIDE_ENCRYPTION);

copyObjRequest.setNewObjectMetadata(objectMetadata);

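AWS KMS-Managed Keys (SSE-KMS)

Using server-side encryption with AWS KMS-managed keys (SSE-KMS) means using Amazon's Key Management Service (AWS KMS), which is designed to scale for large distributed applications. In contrast to SSE-S3, the AWS KMS service uses a customer master key (CMK) that is created by the customer, using either the API of the AWS KMS service or the IAM console of AWS. Both allow you to create keys, define policies for them, and obtain audit logs about the usage of each key.

When you add an SSE-KMS-encrypted object for the first time to a bucket in a specific region, Amazon creates a default key (CMK) for you. This key is used in subsequent calls unless you specify another key. To indicate that SSE-KMS should be used, the corresponding REST request must have a header with the key x-amz-server-side-encryption and the value aws:kms. The additional header x-amz-server-side-encryption-aws-kms-key-id is used to specify the key ID. The Java SDK sets the header x-amz-server-side-encryption automatically when you specify the SSEAwsKeyManagementParams: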

SSEAwsKeyManagementParams awsKMParams = new SSEAwsKeyManagementParams();
putObjectRequest.setSSEAwsKeyManagementParams(awsKMParams);

If you have already created an AWS KMS key, you can provide its ID directly as an argument to the constructor:

SSEAwsKeyManagementParams awsKMParams = new SSEAwsKeyManagementParams(awsKmsKeyId);
putObjectRequest.setSSEAwsKeyManagementParams(awsKMParams);

可以在此处找到有关Amazon密钥管理的更多信息。

客户提供的密钥（SSE-C）

如果您不喜欢Amazon存储用于加密数据的密钥的想法，那么您也可以在请求中提供自己的自定义密钥（SSE-C）。然后，Amazon使用此密钥在服务器端加密或解密数据，但不存储密钥。它们仅存储随机加盐的加密密钥的HMAC值，以验证将来的请求。 But this also means that if you lose the key provided for encryption, the information stored in the object is lost as your application is responsible for the key management.

The following sample shows how to upload an object to a bucket and download it again using a custom key:

public class PutAndGetObjectEncryptedSSECApp {
    private static final Logger LOGGER = Logger.getLogger(PutAndGetObjectEncryptedSSECApp.class.getName());

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        if (args.length < 3) {
            LOGGER.warning("Please provide the following parameters: <bucket-name> <key> <file>");
            return;
        }
        String bucketName = args[0];
        String key = args[1];
        Path filePath = Paths.get(args[2]);
        AmazonS3 amazonS3 = AwsClientFactory.createClient();

        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256, new SecureRandom());
        SecretKey secretKey = generator.generateKey();
        SSECustomerKey sseKey = new SSECustomerKey(secretKey);

        PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, key, filePath.toFile());
        putObjectRequest.setSSECustomerKey(sseKey);

        PutObjectResult result = amazonS3.putObject(putObjectRequest);

        LOGGER.info("Put file '" + filePath + "' under key " + key + " to bucket " 
			+ bucketName + " " + result.getSSEAlgorithm());

        GetObjectRequest getObjectRequest = new GetObjectRequest(bucketName, key);
        getObjectRequest.setSSECustomerKey(sseKey);
        S3Object s3Object = amazonS3.getObject(getObjectRequest);
        FileUtils.copyInputStreamToFile(s3Object.getObjectContent(), filePath.toFile());
    }
}

No additional libraries are necessary to generate the key. Instead you can use the functionality of the JRE to create a KeyGenerator and let it generate a key for you. This key is wrapped inside an SSECustomerKey, which is subsequently passed to the PutObjectRequest as well as to the GetObjectRequest.
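At the REST level, an SSE-C request carries the key in three headers: x-amz-server-side-encryption-customer-algorithm, x-amz-server-side-encryption-customer-key (the Base64-encoded key) and x-amz-server-side-encryption-customer-key-MD5 (the Base64-encoded MD5 digest of the key). The Java SDK fills them in for you when you set an SSECustomerKey. The following minimal sketch only illustrates how these values can be derived with the JDK; the class SseCustomerKeyHeaders and its methods are hypothetical names and not part of the AWS SDK.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper for illustration only; not part of the AWS SDK.
class SseCustomerKeyHeaders {

    // Generates a random 256-bit AES key with the JRE, as in the sample above.
    static SecretKey generateKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256, new SecureRandom());
            return generator.generateKey();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("AES is not available in this JRE", e);
        }
    }

    // Derives the three SSE-C request headers from the given key.
    static Map<String, String> headersFor(SecretKey key) {
        try {
            byte[] raw = key.getEncoded();
            byte[] md5 = MessageDigest.getInstance("MD5").digest(raw);
            Map<String, String> headers = new LinkedHashMap<>();
            headers.put("x-amz-server-side-encryption-customer-algorithm", "AES256");
            headers.put("x-amz-server-side-encryption-customer-key",
                    Base64.getEncoder().encodeToString(raw));
            headers.put("x-amz-server-side-encryption-customer-key-MD5",
                    Base64.getEncoder().encodeToString(md5));
            return headers;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available in this JRE", e);
        }
    }

    public static void main(String[] args) {
        // Print only the header names and value lengths, never the key itself.
        headersFor(generateKey()).forEach((name, value) ->
                System.out.println(name + " (" + value.length() + " characters)"));
    }
}
```

Because Amazon does not store the key, the same three headers have to accompany every later request that reads the object.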

5.4 AWS KMS–Managed Customer Master Key (CSE-KMS)

Server-side encryption may not be enough for your requirements, because in all cases Amazon knows the key you are using and at some point in time has the raw data in its hands. To keep the raw data away from Amazon, you can use a KMS key but encrypt the data inside the client. In this case the AmazonS3EncryptionClient from the SDK first sends a request to the KMS service to retrieve a data key for encryption. KMS answers with a randomly chosen data key and returns it in two versions: a plain text version that the client uses to encrypt the data, and a ciphered version of the same data key. The ciphered version is sent as metadata along with the encrypted data to Amazon S3. When downloading the data, the client downloads the encrypted object together with the ciphered data key. It then sends the ciphered data key to the KMS service and receives a plain text version of it that can be used to decrypt the object.

The AmazonS3EncryptionClient can be created just like the normal Amazon S3 client:

ClientConfiguration clientConf = new ClientConfiguration();
clientConf.setConnectionTimeout(60 * 1000);
AWSCredentialsProvider credentialsProvider = getAwsCredentialsProvider();
AwsEnvVarOverrideRegionProvider regionProvider = createRegionProvider();
CryptoConfiguration cryptoConfiguration = new CryptoConfiguration();
cryptoConfiguration.setAwsKmsRegion(RegionUtils.getRegion(regionProvider.getRegion()));
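// Pass the ID of your customer master key (CMK) here; the value is left empty in this sample.
KMSEncryptionMaterialsProvider materialsProvider = new
		KMSEncryptionMaterialsProvider("");
AmazonS3 amazonS3 = AmazonS3EncryptionClientBuilder.standard()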
		.withClientConfiguration(clientConf)
		.withCredentials(credentialsProvider)
		.withRegion(regionProvider.getRegion())
		.withCryptoConfiguration(cryptoConfiguration)
		.withEncryptionMaterials(materialsProvider)
		.build();

In addition to the usual client settings, we provide a CryptoConfiguration and a KMSEncryptionMaterialsProvider. While the latter supplies the ID of your customer master key, the CryptoConfiguration delivers the region for the KMS service. The client constructed as described above implements the same interface as the “normal” S3 client, hence all operations can be executed as already shown.

5.5 Client-Side Master Key (CSE-C)

The “AWS KMS–Managed Customer Master Key” allows you to encrypt the data inside the client, but Amazon still knows the data key you are using. If you need to manage the keys in your own application, you can also generate the master key yourself. In this scenario the client generates a random data key that is used to encrypt the object before the upload. It then encrypts the data key with the master key provided by the client application and sends the encrypted data key together with the encrypted data to Amazon. This way Amazon knows neither the raw data nor the key used to encrypt it. During the download the client loads the encrypted data key together with the object and decrypts it using its own master key. The additional material information tells the client which master key to use.
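The envelope scheme described above can be sketched with the JDK alone. The following toy model is not the SDK's implementation or storage format: the class EnvelopeEncryptionSketch, its Envelope type and the choice of AES/GCM for the data and AESWrap for the data key are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Toy model of client-side envelope encryption; NOT the SDK's wire format.
class EnvelopeEncryptionSketch {

    // What would be uploaded: the ciphertext plus the encrypted (wrapped) data key.
    static final class Envelope {
        final byte[] wrappedDataKey;
        final byte[] iv;
        final byte[] cipherText;

        Envelope(byte[] wrappedDataKey, byte[] iv, byte[] cipherText) {
            this.wrappedDataKey = wrappedDataKey;
            this.iv = iv;
            this.cipherText = cipherText;
        }
    }

    // Creates a random 256-bit AES key (used for master and data keys alike).
    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256, new SecureRandom());
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the data with a fresh data key and wraps that key with the master key.
    static Envelope encrypt(byte[] plainText, SecretKey masterKey) {
        try {
            SecretKey dataKey = newAesKey();
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
            aes.init(Cipher.ENCRYPT_MODE, dataKey, new GCMParameterSpec(128, iv));
            byte[] cipherText = aes.doFinal(plainText);

            Cipher wrapper = Cipher.getInstance("AESWrap");
            wrapper.init(Cipher.WRAP_MODE, masterKey);
            return new Envelope(wrapper.wrap(dataKey), iv, cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Unwraps the data key with the master key and decrypts the data with it.
    static byte[] decrypt(Envelope envelope, SecretKey masterKey) {
        try {
            Cipher wrapper = Cipher.getInstance("AESWrap");
            wrapper.init(Cipher.UNWRAP_MODE, masterKey);
            SecretKey dataKey = (SecretKey) wrapper.unwrap(
                    envelope.wrappedDataKey, "AES", Cipher.SECRET_KEY);

            Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
            aes.init(Cipher.DECRYPT_MODE, dataKey, new GCMParameterSpec(128, envelope.iv));
            return aes.doFinal(envelope.cipherText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey masterKey = newAesKey(); // stays inside the client application
        Envelope envelope = encrypt("secret".getBytes(StandardCharsets.UTF_8), masterKey);
        System.out.println(new String(decrypt(envelope, masterKey), StandardCharsets.UTF_8));
    }
}
```

Only the Envelope would ever leave the client in this model. Without the master key, neither the data key nor the data can be recovered.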

The following code shows how to create an AmazonS3EncryptionClient with a static encryption materials provider that takes a symmetric SecretKey as input:

public static AmazonS3 createEncryptionClient(SecretKey secretKey) throws IOException {
	ClientConfiguration clientConf = new ClientConfiguration();
	clientConf.setConnectionTimeout(60 * 1000);
	AWSCredentialsProvider credentialsProvider = getAwsCredentialsProvider();
	AwsEnvVarOverrideRegionProvider regionProvider = createRegionProvider();
	CryptoConfiguration cryptoConfiguration = new CryptoConfiguration();
	cryptoConfiguration.setAwsKmsRegion(RegionUtils.getRegion(regionProvider.getRegion()));
	EncryptionMaterials materials = new EncryptionMaterials(secretKey);
	StaticEncryptionMaterialsProvider materialsProvider = new StaticEncryptionMaterialsProvider(materials);
	return AmazonS3EncryptionClientBuilder.standard()
			.withClientConfiguration(clientConf)
			.withCredentials(credentialsProvider)
			.withRegion(regionProvider.getRegion())
			.withCryptoConfiguration(cryptoConfiguration)
			.withEncryptionMaterials(materialsProvider)
			.build();
}

As the AmazonS3EncryptionClient also implements the interface AmazonS3 , it can be used like the “normal” S3 client:

public class PutAndGetObjectEncryptedCSECApp {
    private static final Logger LOGGER = Logger.getLogger(PutAndGetObjectEncryptedCSECApp.class.getName());

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        if (args.length < 3) {
            LOGGER.warning("Please provide the following parameters: <bucket-name> <key> <file>");
            return;
        }
        String bucketName = args[0];
        String key = args[1];
        Path filePath = Paths.get(args[2]);

        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256, new SecureRandom());
        SecretKey secretKey = generator.generateKey();

        AmazonS3 amazonS3 = AwsClientFactory.createEncryptionClient(secretKey);

        PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, key, filePath.toFile());
        PutObjectResult result = amazonS3.putObject(putObjectRequest);

        LOGGER.info("Put file '" + filePath + "' under key " + key + " to bucket " + 
			bucketName + " " + result.getSSEAlgorithm());

        GetObjectRequest getObjectRequest = new GetObjectRequest(bucketName, key);
        S3Object s3Object = amazonS3.getObject(getObjectRequest);
        FileUtils.copyInputStreamToFile(s3Object.getObjectContent(), filePath.toFile());
    }
}

Versioning

To protect your data against unintended deletion or overwrites, you can enable versioning for a bucket. With versioning enabled, uploading new data to an existing key creates a new version of the object instead of overwriting the old data, and deleting an object only sets a delete marker. Hence an object like image.jpg can exist in multiple versions (like version 1, version 2, etc.) and you can retrieve the old versions if necessary.

Versioning is enabled or suspended on a complete bucket and not on single objects. Once you have enabled versioning on a bucket you cannot disable it but only suspend it. This means that buckets can be in one of the following states:

unversioned: This is the default state, i.e. objects are not versioned.
versioning-enabled: Objects are versioned.
versioning-suspended: Versioning has been enabled on this bucket but is now suspended.

Whenever you change the state of a bucket, the new state applies to all objects in the bucket but no object is changed. That means for example that if you enable versioning for the first time on a bucket, all existing objects get the version null . On the other hand it also means that if you suspend versioning, all existing versions are kept.

The following code demonstrates how to enable (or suspend) versioning on an existing bucket:

public class BucketVersioningApp {
    private static final Logger LOGGER = Logger.getLogger(BucketVersioningApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            LOGGER.log(Level.WARNING, "Please provide the following arguments: <bucket-name> <versioning-status>");
            return;
        }
        String bucketName = args[0];
        String versioningStatus = args[1];
        AmazonS3 amazonS3 = AwsClientFactory.createClient();

        if (!BucketVersioningConfiguration.ENABLED.equals(versioningStatus)
                && !BucketVersioningConfiguration.SUSPENDED.equals(versioningStatus)) {
            LOGGER.log(Level.SEVERE, "Please provide a valid versioning status.");
            return;
        }

        BucketVersioningConfiguration conf = new BucketVersioningConfiguration();
        conf.setStatus(versioningStatus);

        SetBucketVersioningConfigurationRequest request =
                new SetBucketVersioningConfigurationRequest(bucketName, conf);
        amazonS3.setBucketVersioningConfiguration(request);

        conf = amazonS3.getBucketVersioningConfiguration(bucketName);
        LOGGER.info("Bucket " + bucketName + " has this versioning status: " + conf.getStatus());
    }
}

As arguments you have to provide the name of the bucket and the new versioning status. This can be one of the following string constants: Enabled , Suspended . The status Off is not allowed because you can only suspend versioning on a bucket but never disable it. Amazon's S3 client expects an object of type SetBucketVersioningConfigurationRequest as request object for setBucketVersioningConfiguration() . The SetBucketVersioningConfigurationRequest is filled with the bucket name and the BucketVersioningConfiguration .

The method getBucketVersioningConfiguration() can be used to retrieve the current versioning configuration.

Once you have enabled versioning on a bucket you can upload different versions of the same object for the same key and Amazon S3 will automatically create the versions on the server for you. The following example demonstrates this (assuming that you have enabled versioning on the bucket):

public class VersioningExampleApp {
    private static final Logger LOGGER = Logger.getLogger(VersioningExampleApp.class.getName());

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            LOGGER.warning("Please provide the following parameters: <bucket-name> <key>");
            return;
        }
        String bucketName = args[0];
        String key = args[1];
        AmazonS3 amazonS3 = AwsClientFactory.createClient();
        Charset charset = Charset.forName("UTF-8");

        byte[] bytes = "Version 1".getBytes(charset);
        PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, key,
                new ByteArrayInputStream(bytes), new ObjectMetadata());
        amazonS3.putObject(putObjectRequest);

        bytes = "Version 2".getBytes(charset);
        putObjectRequest = new PutObjectRequest(bucketName, key,
                new ByteArrayInputStream(bytes), new ObjectMetadata());
        amazonS3.putObject(putObjectRequest);

        List<String> versionIds = new ArrayList<>();
        ListVersionsRequest listVersionsRequest = new ListVersionsRequest();
        listVersionsRequest.setBucketName(bucketName);
        VersionListing versionListing;
        do {
            versionListing = amazonS3.listVersions(listVersionsRequest);
            for (S3VersionSummary objectSummary : versionListing.getVersionSummaries()) {
                LOGGER.info(objectSummary.getKey() + ": " + objectSummary.getVersionId());
                if (objectSummary.getKey().equals(key)) {
                    versionIds.add(objectSummary.getVersionId());
                }
            }
            listVersionsRequest.setKeyMarker(versionListing.getNextKeyMarker());
            listVersionsRequest.setVersionIdMarker(versionListing.getNextVersionIdMarker());
        } while (versionListing.isTruncated());

        for (String versionId : versionIds) {
            GetObjectRequest getObjectRequest = new GetObjectRequest(bucketName, key);
            getObjectRequest.setVersionId(versionId);
            S3Object s3Object = amazonS3.getObject(getObjectRequest);
            StringWriter stringWriter = new StringWriter();
            IOUtils.copy(s3Object.getObjectContent(), stringWriter, charset);
            LOGGER.info(versionId + ": " + stringWriter.toString());
        }
		
        for (String versionId : versionIds) {
            DeleteVersionRequest dvr = new DeleteVersionRequest(bucketName, key, versionId);
            amazonS3.deleteVersion(dvr);
            LOGGER.info("Deleted version " + versionId + ".");
        }
    }
}

For the sake of simplicity we just upload two versions of the same object with different content (here the strings Version 1 and Version 2). The ListVersionsRequest allows us to list all objects and their versions. To do so, we have to set at least the bucket name and then issue the request by invoking the method listVersions() on the Amazon S3 client. This returns an object of type VersionListing that contains a list of S3VersionSummary objects. The example above iterates over all summaries and collects those that correspond to the key of the object we have just created.

As the list of objects and versions may be huge for buckets with plenty of objects, the version list is chunked. To retrieve the next chunk we have to take the “key marker” and the “version id marker” that are provided in the last response and set them on the next request. This way the Amazon servers know where to proceed.

Once we have collected all versions of the object we have uploaded, we can retrieve them using the already known GetObjectRequest. Now that we are working with versions, we additionally have to set the version ID on the request object.

If you want to delete a specific version, you cannot do this using the normal DeleteObjectRequest. That request only marks the latest version as deleted, so that a subsequent GetObjectRequest returns a 404 error. To permanently remove objects from a bucket you have to use the DeleteVersionRequest. In addition to the obligatory bucket name and key, it also takes the version ID. This way you can delete specific versions from a bucket.

Please note that submitting a simple delete request on a delete marker just creates another delete marker. Hence a delete marker can be thought of as another version of the object that merely denotes that the object has been removed. To permanently remove a delete marker, you have to use the DeleteVersionRequest just like for a “normal” object version.
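The interplay of versions and delete markers can be made tangible with a small in-memory model. The class VersionedBucketSketch below is a hypothetical toy, not an AWS API: it uses simple counters as version IDs (real version IDs are opaque strings generated by Amazon) and models a single versioning-enabled bucket.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// In-memory toy model of one versioning-enabled bucket; not an AWS API.
class VersionedBucketSketch {

    // One entry in a key's version history; content == null denotes a delete marker.
    static final class Version {
        final String versionId;
        final String content;

        Version(String versionId, String content) {
            this.versionId = versionId;
            this.content = content;
        }

        boolean isDeleteMarker() {
            return content == null;
        }
    }

    private final Map<String, List<Version>> versionsByKey = new HashMap<>();
    private int counter = 0;

    // Upload: always appends a new version instead of overwriting the old data.
    String put(String key, String content) {
        return append(key, content);
    }

    // Simple delete: appends a delete marker and keeps all older versions.
    String delete(String key) {
        return append(key, null);
    }

    // Read without version ID: empty (think "404") if the latest entry is a delete marker.
    Optional<String> get(String key) {
        List<Version> versions = versionsByKey.get(key);
        if (versions == null || versions.isEmpty()) {
            return Optional.empty();
        }
        return Optional.ofNullable(versions.get(versions.size() - 1).content);
    }

    // Versioned delete: permanently removes exactly one version or delete marker.
    boolean deleteVersion(String key, String versionId) {
        List<Version> versions = versionsByKey.get(key);
        return versions != null && versions.removeIf(v -> v.versionId.equals(versionId));
    }

    List<Version> listVersions(String key) {
        return Collections.unmodifiableList(
                versionsByKey.getOrDefault(key, Collections.emptyList()));
    }

    private String append(String key, String content) {
        String versionId = "v" + (++counter);
        versionsByKey.computeIfAbsent(key, k -> new ArrayList<>())
                .add(new Version(versionId, content));
        return versionId;
    }

    public static void main(String[] args) {
        VersionedBucketSketch bucket = new VersionedBucketSketch();
        bucket.put("image.jpg", "Version 1");
        bucket.put("image.jpg", "Version 2");
        String marker = bucket.delete("image.jpg");
        System.out.println(bucket.get("image.jpg").isPresent()); // hidden by the delete marker
        bucket.deleteVersion("image.jpg", marker);
        System.out.println(bucket.get("image.jpg").orElse("not found"));
    }
}
```

In this model, removing the delete marker with deleteVersion() makes the previous version visible again, which mirrors the behavior described above.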

Command Line

This section explains how to use Amazon's Command Line Client (aws-cli) to interact with the S3 service. The project is hosted on GitHub, so you have access to the latest sources.

Installation

The aws-cli tool is written in Python and can be obtained through pip:

$ pip install --upgrade --user awscli

The --upgrade option lets pip update all requirements of the tool, while the --user argument lets pip install aws-cli into a subdirectory of your user directory to avoid conflicts with already installed libraries.

If you haven't installed Python and pip yet, Amazon provides detailed tutorials that explain these installation steps for different operating systems.

Once everything has been set up correctly, you should be able to run the following command:

$ aws --version

Finally, you can uninstall the aws-cli with the following command:

$ pip uninstall awscli

Once you have got aws-cli running, you can configure it using the following command:

$ aws configure
AWS Access Key ID [None]: 
AWS Secret Access Key [None]: 
Default region name [None]: us-west-2
Default output format [None]: json

The configure command asks interactively for your AWS access key and secret access key. After you have provided the two keys, you have to set the default region (here: us-west-2) and the default output format (here: json). If you need different profiles, you can specify a profile name using the argument --profile:

$ aws configure --profile my-second-profile

Beyond using the CLI configuration you can also specify some configuration options as arguments to the command:

$ aws ... --output json --region us-west-2

Your access key and secret access key can be provided through environment variables (here for Linux, macOS, or Unix):

$ export AWS_ACCESS_KEY_ID=
$ export AWS_SECRET_ACCESS_KEY=
$ export AWS_DEFAULT_REGION=us-west-2

On Windows systems you have to set them the following way:

C:\> set AWS_ACCESS_KEY_ID=
C:\> set AWS_SECRET_ACCESS_KEY=
C:\> set AWS_DEFAULT_REGION=us-west-2

Of course, you can use named profiles stored in configuration files. The file ~/.aws/credentials (on Windows: %UserProfile%\.aws\credentials ) looks like this:

[default]
aws_access_key_id=
aws_secret_access_key=

The remaining configuration options are stored in ~/.aws/config (on Windows: %UserProfile%\.aws\config ):

[default]
region=us-west-2
output=json
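Both files use a simple INI-like layout: a section header in square brackets names the profile, followed by key=value lines. The following minimal sketch reads such content with the JDK only. The class AwsIniSketch is a hypothetical helper and not how aws-cli or the SDK parse these files; it deliberately ignores details of the real format.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not the parser used by aws-cli or the SDK.
class AwsIniSketch {

    // Returns a map from section name (e.g. "default") to its key/value settings.
    static Map<String, Map<String, String>> parse(List<String> lines) {
        Map<String, Map<String, String>> profiles = new LinkedHashMap<>();
        Map<String, String> current = null;
        for (String rawLine : lines) {
            String line = rawLine.trim();
            if (line.isEmpty() || line.startsWith("#") || line.startsWith(";")) {
                continue; // skip blank lines and comments
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                // A new section starts: remember it under its name.
                current = new LinkedHashMap<>();
                profiles.put(line.substring(1, line.length() - 1).trim(), current);
            } else if (current != null) {
                int eq = line.indexOf('=');
                if (eq > 0) {
                    current.put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
                }
            }
        }
        return profiles;
    }

    public static void main(String[] args) {
        List<String> config = Arrays.asList("[default]", "region=us-west-2", "output=json");
        System.out.println(parse(config).get("default").get("region"));
    }
}
```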

Buckets

The aws-cli tool provides high-level commands for the creation and deletion of buckets. Just provide the string s3 as first argument to aws, followed by the shortcut for “make bucket” (mb) or “remove bucket” (rb):

$ aws s3 mb s3://my-bucket

The command above creates a new bucket with the name my-bucket while the following one removes this bucket again:

$ aws s3 rb s3://my-bucket

Please note that you can only delete buckets that are already empty. Removing buckets with contents can be done by providing the option --force :

$ aws s3 rb s3://my-bucket --force

To list the contents of a bucket use the following command:

$ aws s3 ls s3://my-bucket

The aws tool also supports filtering by prefix, hence you can easily list all objects in the “folder” my-path :

$ aws s3 ls s3://my-bucket/my-path

Last but not least it is possible to list all available buckets:

$ aws s3 ls

Objects

The command line interface supports high-level operations for uploading, moving and removing objects. These commands are modeled after the well-known Unix command line tools cp, mv and rm.

Uploading a local file to S3 can therefore be written like this:

$ aws s3 cp local.txt s3://my-bucket/remote.txt

Additionally one can also specify the storage class for the new remote object:

$ aws s3 cp local.txt s3://my-bucket/remote.txt --storage-class REDUCED_REDUNDANCY

The new file can be removed by using the command rm :

$ aws s3 rm s3://my-bucket/remote.txt

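It is also possible to move complete sets of files from S3 to the local machine:

$ aws s3 mv s3://my-bucket/my-path ./my-dir --exclude '*' --include '*.jpg' --recursive

As the options indicate, the command only moves files that end with .jpg and does this recursively on the complete folder structure. Because all files are included by default, the --exclude '*' filter has to come first; the subsequent --include '*.jpg' then re-adds the JPEG files.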

7.4 Synchronization

Often it is useful to synchronize complete folder structures and their content either from the local machine to S3 or vice versa. For this purpose the aws-cli tool comes with the handy command sync:

$ aws s3 sync my-dir s3://my-bucket/my-path

The command above updates all files in S3 that have a different size and/or modification timestamp than their counterparts in the local directory. As it does not remove objects from S3 by default, you must specify the --delete option to let the tool also remove files in S3 that are not present in your local copy:

$ aws s3 sync my-dir s3://my-bucket/my-path --delete
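The rule described above can be written down as a small planning function. The sketch below follows the description in this text (a file is uploaded when it is missing remotely or differs in size or timestamp; remote-only files are removed only with the delete flag). The class SyncPlanSketch and its types are hypothetical, and the comparison performed by the real tool may differ in details.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy planner that mirrors the sync rule as described in the text; not aws-cli code.
class SyncPlanSketch {

    // Minimal file description: size in bytes and modification time in epoch millis.
    static final class FileInfo {
        final long size;
        final long lastModified;

        FileInfo(long size, long lastModified) {
            this.size = size;
            this.lastModified = lastModified;
        }
    }

    // Result of the planning step: keys to upload and keys to delete remotely.
    static final class Plan {
        final SortedSet<String> uploads = new TreeSet<>();
        final SortedSet<String> deletions = new TreeSet<>();
    }

    static Plan plan(Map<String, FileInfo> local, Map<String, FileInfo> remote, boolean delete) {
        Plan plan = new Plan();
        for (Map.Entry<String, FileInfo> entry : local.entrySet()) {
            FileInfo l = entry.getValue();
            FileInfo r = remote.get(entry.getKey());
            // Upload if the file is new or differs in size or timestamp.
            if (r == null || r.size != l.size || r.lastModified != l.lastModified) {
                plan.uploads.add(entry.getKey());
            }
        }
        if (delete) {
            // Corresponds to --delete: remove remote files that no longer exist locally.
            for (String key : remote.keySet()) {
                if (!local.containsKey(key)) {
                    plan.deletions.add(key);
                }
            }
        }
        return plan;
    }

    public static void main(String[] args) {
        Map<String, FileInfo> local = new HashMap<>();
        local.put("a.txt", new FileInfo(10, 1000L));
        local.put("b.txt", new FileInfo(20, 2000L));
        Map<String, FileInfo> remote = new HashMap<>();
        remote.put("a.txt", new FileInfo(10, 1000L));
        remote.put("old.txt", new FileInfo(5, 500L));

        Plan plan = plan(local, remote, true);
        System.out.println("upload: " + plan.uploads + ", delete: " + plan.deletions);
    }
}
```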

When synchronizing your local copy with the remote files in S3 you can also specify the storage class and the access privileges:

$ aws s3 sync my-dir s3://my-bucket/my-path --acl public-read --storage-class STANDARD_IA

The --acl option accepts, among others, the values private, public-read and public-read-write.

Pricing

When you sign up for Amazon S3 you get 5 GB of standard storage, 20,000 GET and 2,000 PUT requests and 15 GB of data transfer for free as part of the AWS Free Usage Tier. Once this allowance is used up, you have to pay for all requests and all storage depending on the storage class, as listed on the pricing page.

The storage prices also vary by region. While one GB currently costs, for example, $0.026 in the region “US West”, it costs $0.0405 in the region “South America”. In Europe, for example, the region “Ireland” is cheaper than “Frankfurt” or “London”.

In addition to the storage, you also have to pay for the HTTP requests. The pricing here depends on the type of request. Requests that upload or download larger amounts of data cost more than simple GET requests. In the “US West” region you have to pay, for example, $0.004 per 10,000 GET requests and $0.005 per 1,000 PUT, COPY, POST or LIST requests. DELETE requests are free, as long as the object resides in the standard storage class.
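To get a feeling for these numbers, the following sketch adds up storage and request costs using the example prices quoted above for “US West”. It is a rough estimate under stated assumptions: the storage price is taken as per GB and month, the free tier and data transfer are ignored, and the class S3CostSketch is a hypothetical helper. Always check the current pricing page for real figures.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Rough cost estimate based on the example prices quoted in the text; illustrative only.
class S3CostSketch {

    static final BigDecimal STORAGE_PER_GB = new BigDecimal("0.026"); // assumed per GB and month
    static final BigDecimal GET_PER_10000 = new BigDecimal("0.004");  // per 10,000 GET requests
    static final BigDecimal PUT_PER_1000 = new BigDecimal("0.005");   // per 1,000 PUT/COPY/POST/LIST

    static BigDecimal monthlyCost(long storedGb, long getRequests, long putRequests) {
        BigDecimal storage = STORAGE_PER_GB.multiply(BigDecimal.valueOf(storedGb));
        // Dividing by a power of ten always yields an exact decimal result.
        BigDecimal gets = GET_PER_10000.multiply(BigDecimal.valueOf(getRequests))
                .divide(BigDecimal.valueOf(10_000));
        BigDecimal puts = PUT_PER_1000.multiply(BigDecimal.valueOf(putRequests))
                .divide(BigDecimal.valueOf(1_000));
        return storage.add(gets).add(puts).setScale(4, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // 100 GB stored, 1,000,000 GET requests and 50,000 PUT requests:
        // 2.60 (storage) + 0.40 (GET) + 0.25 (PUT) = 3.25 US dollars
        System.out.println(monthlyCost(100, 1_000_000, 50_000));
    }
}
```

With these example prices, storage dominates the bill for this workload. For request-heavy applications the PUT price per request (twelve and a half times the GET price here) is the figure to watch.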