[AWS] Reading All Data from a DynamoDB Table with the Scan Operation

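Amazon DynamoDB is a fully managed NoSQL database service that provides fast, predictable performance with seamless scalability. With DynamoDB you hand off the administrative work of operating and scaling a distributed database, so you do not have to worry about hardware provisioning, setup and configuration, replication, software patching, or cluster scaling. DynamoDB also offers encryption at rest, which removes the operational burden and complexity involved in protecting sensitive data.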

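Background

Sometimes you need a complete copy of the data stored in a DynamoDB table. This is done with the Scan operation.

A Scan operation in Amazon DynamoDB reads every item in a table or secondary index. By default, Scan returns all data attributes of every item in the table or index. However, a single Scan request retrieves at most 1 MB of data, so a large table has to be read with multiple Scan requests. Also keep the table's read limit in mind: if the scan exceeds it, requests are throttled and alarms are triggered.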
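Because of the 1 MB page limit, a full read is a loop: issue a Scan, process the returned items, then pass the returned LastEvaluatedKey as the ExclusiveStartKey of the next request, and stop when no key comes back. The sketch below imitates that loop against an in-memory list so that it runs without AWS. `PagedScanSketch`, `Page`, `scan`, and the page size are hypothetical stand-ins, not SDK classes, and the integer key only plays the role of the real key map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory imitation of paged scanning; not part of the AWS SDK.
class PagedScanSketch {

    // One page of results plus the key to resume from (null when the scan is complete).
    static final class Page {
        final List<String> items;
        final Integer lastEvaluatedKey;

        Page(List<String> items, Integer lastEvaluatedKey) {
            this.items = items;
            this.lastEvaluatedKey = lastEvaluatedKey;
        }
    }

    // Returns at most pageSize items located after exclusiveStartKey (null = start of table).
    static Page scan(List<String> table, Integer exclusiveStartKey, int pageSize) {
        int from = (exclusiveStartKey == null) ? 0 : exclusiveStartKey + 1;
        int to = Math.min(from + pageSize, table.size());
        List<String> items = new ArrayList<>(table.subList(from, to));
        // A resume key is returned only while unread items remain.
        Integer last = (to < table.size()) ? Integer.valueOf(to - 1) : null;
        return new Page(items, last);
    }

    // Keeps requesting pages until no resume key is returned.
    static List<String> scanAll(List<String> table, int pageSize) {
        List<String> all = new ArrayList<>();
        Integer key = null;
        do {
            Page page = scan(table, key, pageSize);
            all.addAll(page.items);
            key = page.lastEvaluatedKey;
        } while (key != null);
        return all;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(scanAll(table, 2));
    }
}
```

With a real table the same do/while shape applies; only `scan` is replaced by the client call and the key becomes a map of attribute values.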

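My table is structured as follows: each item has a string attribute `query` and a list attribute `asin_list` whose elements are strings.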

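Java environment setup

(1) Installing the JDK and Maven is not covered here.

(2) Put the user's credentials in a configuration file, or write them directly in the code (solution 2 below writes them in the code).

(3) Maven configuration

Because the data is written to a CSV file, a CSV utility package is included as well.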

<dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>com.amazonaws</groupId>
                <artifactId>aws-java-sdk-bom</artifactId>
                <version>1.11.327</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-java-sdk-dynamodb</artifactId>
        </dependency>
        <dependency>
            <groupId>net.sourceforge.javacsv</groupId>
            <artifactId>javacsv</artifactId>
            <version>2.0</version>
        </dependency>
    </dependencies>
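For the configuration file mentioned in step (2): the `PropertiesCredentials` class used in solution 1 expects two properties, `accessKey` and `secretKey`. The sketch below shows that layout with placeholder values and checks it with plain `java.util.Properties`. The class `KeyFileSketch` and the helper `hasAwsKeys` are hypothetical names introduced only for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper class; it only illustrates the layout of key.properties.
class KeyFileSketch {

    // Placeholder contents of key.properties; replace the values with your own.
    static final String SAMPLE =
            "accessKey=XXXXXX\n" +
            "secretKey=XXXXXXXXXXX\n";

    // Returns true when both expected properties are present and non-empty.
    static boolean hasAwsKeys(String fileContents) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileContents));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String access = props.getProperty("accessKey", "");
        String secret = props.getProperty("secretKey", "");
        return !access.isEmpty() && !secret.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(hasAwsKeys(SAMPLE));
    }
}
```

Keeping the keys in a file that stays out of version control is generally safer than hardcoding them in source.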

Solution 1: reading a small table

import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.PropertiesCredentials;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;

import java.io.File;
import java.io.IOException;
import java.util.List;
import java.util.Map;


public class DynamoDBUtils {

    public static void main(String[] args) throws IOException {
        AWSCredentials credentials = new PropertiesCredentials(new File("src/main/resources/key.properties"));
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().
                withCredentials(new AWSStaticCredentialsProvider(credentials)).
                withRegion("us-east-1").build();

        ScanRequest scanRequest = new ScanRequest().withTableName("yucheng");
        ScanResult result = client.scan(scanRequest);
        for (Map<String, AttributeValue> item : result.getItems()) {
            System.out.println(item);
            String query = item.get("query").getS();
            System.out.println(query);
            List<AttributeValue> asin_list = item.get("asin_list").getL();
            for (AttributeValue value : asin_list) {
                System.out.print(value.getS() + " ");
            }
            System.out.println();
        }
    }
}

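For each item, the program prints the whole item, then its `query` value, then the `asin_list` entries separated by spaces.

Solution 2: reading a large table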


import com.amazonaws.auth.AWSCredentials;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.dynamodbv2.model.ScanRequest;
import com.amazonaws.services.dynamodbv2.model.ScanResult;

import java.io.IOException;
import java.nio.charset.Charset;
import java.util.ArrayList;

import java.util.List;
import java.util.Map;

import com.csvreader.CsvWriter;


public class DynamoDBUtils {

    private static String region = "us-east-1"; // replace with your own region
    private static String AWS_ACCESS_KEY_ID = "XXXXXX"; // replace with your own access key ID
    private static String AWS_SECRET_ACCESS_KEY = "XXXXXXXXXXX"; // replace with your own secret access key


    public static void main(String[] args) {
        try {
            f();
        } catch (InterruptedException | IOException e) {
            e.printStackTrace();
        }
    }


    public static void f() throws IOException, InterruptedException {
        String filePath = "XXXX.tsv"; // replace with your own output path
        AWSCredentials credentials = new BasicAWSCredentials(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY);
        AmazonDynamoDB client = AmazonDynamoDBClientBuilder.standard().withCredentials(new AWSStaticCredentialsProvider(credentials)).withRegion(region).build();
        CsvWriter csvWriter = new CsvWriter(filePath, '\t', Charset.forName("UTF-8"));

        int count = 0;
        Map<String, AttributeValue> lastKeyEvaluated = null;
        do {
            count++;
            if (count % 10 == 0) {
                System.out.println(count);
            }
            ScanRequest scanRequest = new ScanRequest()
                    .withTableName("yucheng")
                    .withExclusiveStartKey(lastKeyEvaluated);

            ScanResult result = client.scan(scanRequest);
            for (Map<String, AttributeValue> item : result.getItems()) {
                String query = item.get("query").getS();

                List<AttributeValue> asinList = item.get("asin_list").getL();
                List<String> asins = new ArrayList<>();

                for (AttributeValue value : asinList) {
                    asins.add(value.getS());
                }

                String[] line = new String[2];
                line[0] = query;
                line[1] = String.join(",", asins);
                csvWriter.writeRecord(line);

            }
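            lastKeyEvaluated = result.getLastEvaluatedKey();
            // Pause between pages to limit the read rate; otherwise the table
            // cannot keep up and DynamoDB raises alarms.
            // No pause is needed once the last page has been read.
            if (lastKeyEvaluated != null) {
                Thread.sleep(40000);
            }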
        } while (lastKeyEvaluated != null);
        csvWriter.close();
    }

}
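The fixed 40-second pause above is a tuning choice for this particular table. As a rough guide, one read capacity unit covers one strongly consistent read of up to 4 KB per second, and an eventually consistent read (the Scan default) costs half as much, so a full 1 MB page costs about 128 read capacity units. The sketch below estimates a pause per page from a read budget. `ScanPacingSketch`, `pageCostRcu`, `pauseMillis`, and their parameters are hypothetical names, and the estimate ignores other readers of the table and burst capacity.

```java
// Hypothetical helper for estimating how long to pause between Scan pages.
class ScanPacingSketch {

    // Read capacity is accounted in 4 KB blocks.
    static final int RCU_BLOCK_BYTES = 4 * 1024;

    // Approximate capacity units consumed by one page of the given size.
    static double pageCostRcu(long pageBytes, boolean consistentRead) {
        long blocks = (pageBytes + RCU_BLOCK_BYTES - 1) / RCU_BLOCK_BYTES; // round up
        return consistentRead ? blocks : blocks / 2.0;
    }

    // Milliseconds to wait per page so the average rate stays within the budget.
    static long pauseMillis(long pageBytes, boolean consistentRead, double rcuBudgetPerSecond) {
        if (rcuBudgetPerSecond <= 0) {
            throw new IllegalArgumentException("rcuBudgetPerSecond must be positive");
        }
        double cost = pageCostRcu(pageBytes, consistentRead);
        return (long) Math.ceil(cost * 1000.0 / rcuBudgetPerSecond);
    }

    public static void main(String[] args) {
        long oneMegabyte = 1024L * 1024L;
        // Assumed budget of 4 read capacity units per second for the scan.
        System.out.println(pauseMillis(oneMegabyte, false, 4.0) + " ms");
    }
}
```

The result could replace the constant passed to `Thread.sleep` in the loop above; leaving part of the table's capacity unused by the scan keeps room for regular traffic.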

 
