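Introduction to AWS DynamoDB
- Amazon DynamoDB is a fully managed, serverless NoSQL key-value database designed to run high-performance applications at any scale.
- DynamoDB delivers consistent single-digit-millisecond response times at any scale, with virtually unlimited storage.
- DynamoDB provides built-in security, continuous backups, automated multi-Region replication, in-memory caching, and data export tools.
Introduction to Redshift
- Amazon Redshift is a fast, powerful, fully managed, petabyte-scale data warehouse service. Users can start with a few hundred GB of data and later scale up to petabytes.
- Redshift is an online analytical processing (OLAP) system: it supports complex analytical queries, focuses on decision support, and returns query results that are easy to understand.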
Resource preparation
VPC
- VPC
  - CIDR block: 10.10.0.0/16
- internet gateway
- elastic IP address
- NAT gateway: uses the elastic IP address as its public IP
- public subnet
  - three Availability Zones
- private subnet
  - three Availability Zones
- public route table: the route table associated with the public subnets
  - destination: 0.0.0.0/0 target: internet-gateway-id (allows communication with the outside world)
  - destination: 10.10.0.0/16 local (internal communication)
- private route table: the route table associated with the private subnets
  - destination: 10.10.0.0/16 local (internal communication)
  - destination: 0.0.0.0/0 target: nat-gateway-id (allows resources inside the VPC to reach the outside world)
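- web server security group
  - Allow any IP to access port 443
  - Allow your own IP to access port 22, so that you can SSH to the server and insert data into the database
- glue redshift connection security group
  - Contains a single self-referencing rule that allows the same security group to access all TCP ports
  - This security group is required when creating the Glue connection:
  - Reference: glue connection security group must have a self-referencing rule to allow AWS Glue components to communicate. Specifically, add or confirm that there is a rule of Type All TCP, Protocol is TCP, Port Range includes all ports, and whose Source is the same security group name as the Group ID.
- private redshift security group
  - Allow access to port 5439 from inside the VPC (10.10.0.0/24; note that this range covers only the first public subnet, while the VPC CIDR block is 10.10.0.0/16)
  - Allow the glue connection security group to access port 5439
- public redshift security group
  - Allow access to port 5439 from inside the VPC (10.10.0.0/24; note that this range covers only the first public subnet, while the VPC CIDR block is 10.10.0.0/16)
  - Allow the public IP range of Kinesis Data Firehose in the cluster's region to access port 5439: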
    - 13.58.135.96/27 for US East (Ohio)
    - 52.70.63.192/27 for US East (N. Virginia)
    - 13.57.135.192/27 for US West (N. California)
    - 52.89.255.224/27 for US West (Oregon)
    - 18.253.138.96/27 for AWS GovCloud (US-East)
    - 52.61.204.160/27 for AWS GovCloud (US-West)
    - 35.183.92.128/27 for Canada (Central)
    - 18.162.221.32/27 for Asia Pacific (Hong Kong)
    - 13.232.67.32/27 for Asia Pacific (Mumbai)
    - 13.209.1.64/27 for Asia Pacific (Seoul)
    - 13.228.64.192/27 for Asia Pacific (Singapore)
    - 13.210.67.224/27 for Asia Pacific (Sydney)
    - 13.113.196.224/27 for Asia Pacific (Tokyo)
    - 52.81.151.32/27 for China (Beijing)
    - 161.189.23.64/27 for China (Ningxia)
    - 35.158.127.160/27 for Europe (Frankfurt)
    - 52.19.239.192/27 for Europe (Ireland)
    - 18.130.1.96/27 for Europe (London)
    - 35.180.1.96/27 for Europe (Paris)
    - 13.53.63.224/27 for Europe (Stockholm)
    - 15.185.91.0/27 for Middle East (Bahrain)
    - 18.228.1.128/27 for South America (São Paulo)
    - 15.161.135.128/27 for Europe (Milan)
    - 13.244.121.224/27 for Africa (Cape Town)
    - 13.208.177.192/27 for Asia Pacific (Osaka)
    - 108.136.221.64/27 for Asia Pacific (Jakarta)
    - 3.28.159.32/27 for Middle East (UAE)
    - 18.100.71.96/27 for Europe (Spain)
    - 16.62.183.32/27 for Europe (Zurich)
    - 18.60.192.128/27 for Asia Pacific (Hyderabad)
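The public Redshift security group needs the Firehose CIDR block that matches the region the cluster runs in. The following is a minimal sketch of how that lookup could be expressed, assuming a small hand-copied subset of the list above. The names `FIREHOSE_CIDRS` and `redshift_ingress_rule` are hypothetical and not part of any AWS SDK; the returned dictionary only mirrors the shape of the security group ingress rule used in this setup.

```python
# Hypothetical helper: pick the Kinesis Data Firehose CIDR block for a region
# and build an ingress rule for the Redshift port (5439).
# Only a subset of the regions listed above is copied here for illustration.
FIREHOSE_CIDRS = {
    "us-east-1": "52.70.63.192/27",        # US East (N. Virginia)
    "ap-southeast-1": "13.228.64.192/27",  # Asia Pacific (Singapore)
    "eu-west-1": "52.19.239.192/27",       # Europe (Ireland)
}

REDSHIFT_PORT = 5439


def redshift_ingress_rule(region: str) -> dict:
    """Return an ingress rule allowing Firehose in `region` to reach Redshift.

    Raises KeyError if the region is not in the (partial) table above.
    """
    cidr = FIREHOSE_CIDRS[region]
    return {
        "IpProtocol": "tcp",
        "FromPort": REDSHIFT_PORT,
        "ToPort": REDSHIFT_PORT,
        "CidrIp": cidr,
    }


if __name__ == "__main__":
    print(redshift_ingress_rule("ap-southeast-1"))
```

Raising on an unknown region is deliberate in this sketch: silently falling back to another region's range would open the port to the wrong addresses.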
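Serverless file for all VPC resources:
- Replace custom:bucketNamePrefix with the prefix of a bucket you have created (the deployment bucket is named com.<prefix>.deploy-bucket)
- Run the following command to deploy: sls deploy -c vpc.yml
- vpc.yml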
service: dynamodb-to-redshift-vpc

custom:
  bucketNamePrefix: "jessica"

provider:
  name: aws
  region: ${opt:region, "ap-southeast-1"}
  stackName: ${self:service}
  deploymentBucket:
    name: com.${self:custom.bucketNamePrefix}.deploy-bucket
    serverSideEncryption: AES256

resources:
  Parameters:
    VpcName:
      Type: String
      Default: "test-vpc"

  Resources:
    VPC:
      Type: "AWS::EC2::VPC"
      Properties:
        CidrBlock: "10.10.0.0/16"
        EnableDnsSupport: true
        EnableDnsHostnames: true
        InstanceTenancy: default
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}"

    # Internet Gateway
    InternetGateway:
      Type: "AWS::EC2::InternetGateway"
      Properties:
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_InternetGateway"
    VPCGatewayAttachment:
      Type: "AWS::EC2::VPCGatewayAttachment"
      Properties:
        VpcId: !Ref VPC
        InternetGatewayId: !Ref InternetGateway

    # web server security group
    WebServerSecurityGroup:
      Type: AWS::EC2::SecurityGroup
      Properties:
        GroupDescription: Allow access from public
        VpcId: !Ref VPC
        SecurityGroupIngress:
          - IpProtocol: tcp
            FromPort: 443
            ToPort: 443
            CidrIp: "0.0.0.0/0"
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_WebServerSecurityGroup"

    # public route table
    RouteTablePublic:
      Type: "AWS::EC2::RouteTable"
      Properties:
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_RouteTablePublic"
    RouteTablePublicInternetRoute:
      Type: "AWS::EC2::Route"
      DependsOn: VPCGatewayAttachment
      Properties:
        RouteTableId: !Ref RouteTablePublic
        DestinationCidrBlock: "0.0.0.0/0"
        GatewayId: !Ref InternetGateway

    # public subnet
    SubnetAPublic:
      Type: "AWS::EC2::Subnet"
      Properties:
        AvailabilityZone: !Select [0, !GetAZs ""]
        CidrBlock: "10.10.0.0/24"
        MapPublicIpOnLaunch: true
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_SubnetAPublic"
    RouteTableAssociationAPublic:
      Type: "AWS::EC2::SubnetRouteTableAssociation"
      Properties:
        SubnetId: !Ref SubnetAPublic
        RouteTableId: !Ref RouteTablePublic
    SubnetBPublic:
      Type: "AWS::EC2::Subnet"
      Properties:
        AvailabilityZone: !Select [1, !GetAZs ""]
        CidrBlock: "10.10.32.0/24"
        MapPublicIpOnLaunch: true
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_SubnetBPublic"
    RouteTableAssociationBPublic:
      Type: "AWS::EC2::SubnetRouteTableAssociation"
      Properties:
        SubnetId: !Ref SubnetBPublic
        RouteTableId: !Ref RouteTablePublic
    SubnetCPublic:
      Type: "AWS::EC2::Subnet"
      Properties:
        AvailabilityZone: !Select [2, !GetAZs ""]
        CidrBlock: "10.10.64.0/24"
        MapPublicIpOnLaunch: true
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_SubnetCPublic"
    RouteTableAssociationCPublic:
      Type: "AWS::EC2::SubnetRouteTableAssociation"
      Properties:
        SubnetId: !Ref SubnetCPublic
        RouteTableId: !Ref RouteTablePublic

    # redshift security group
    PrivateRedshiftSecurityGroup:
      Type: AWS::EC2::SecurityGroup
      Properties:
        GroupDescription: Allow access from inside vpc
        VpcId: !Ref VPC
        SecurityGroupIngress:
          - IpProtocol: tcp
            FromPort: 5439
            ToPort: 5439
            CidrIp: 10.10.0.0/24
          - IpProtocol: tcp
            FromPort: 5439
            ToPort: 5439
            SourceSecurityGroupId: !GetAtt GlueRedshiftConnectionSecurityGroup.GroupId
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_PrivateRedshiftSecurityGroup"

    # redshift security group
    PublicRedshiftSecurityGroup:
      Type: AWS::EC2::SecurityGroup
      Properties:
        GroupDescription: Allow access from inside vpc and Kinesis Data Firehose CIDR block
        VpcId: !Ref VPC
        SecurityGroupIngress:
          - IpProtocol: tcp
            FromPort: 5439
            ToPort: 5439
            CidrIp: 10.10.0.0/24
          - IpProtocol: tcp
            FromPort: 5439
            ToPort: 5439
            CidrIp: 13.228.64.192/27
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_PublicRedshiftSecurityGroup"

    GlueRedshiftConnectionSecurityGroup:
      Type: AWS::EC2::SecurityGroup
      Properties:
        GroupDescription: Allow self referring for all tcp ports
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_GlueRedshiftConnectionSecurityGroup"
    GlueRedshiftConnectionSecurityGroupSelfReferringInboundRule:
      Type: "AWS::EC2::SecurityGroupIngress"
      Properties:
        GroupId: !GetAtt GlueRedshiftConnectionSecurityGroup.GroupId
        IpProtocol: tcp
        FromPort: 0
        ToPort: 65535
        SourceSecurityGroupId: !GetAtt GlueRedshiftConnectionSecurityGroup.GroupId
        SourceSecurityGroupOwnerId: !Sub "${aws:accountId}"

    # nat gateway
    EIP:
      Type: "AWS::EC2::EIP"
      Properties:
        Domain: vpc
    NatGateway:
      Type: "AWS::EC2::NatGateway"
      Properties:
        AllocationId: !GetAtt "EIP.AllocationId"
        SubnetId: !Ref SubnetAPublic

    # private route table
    RouteTablePrivate:
      Type: "AWS::EC2::RouteTable"
      Properties:
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_RouteTablePrivate"
    RouteTablePrivateRoute:
      Type: "AWS::EC2::Route"
      Properties:
        RouteTableId: !Ref RouteTablePrivate
        DestinationCidrBlock: "0.0.0.0/0"
        NatGatewayId: !Ref NatGateway

    # private subnet
    SubnetAPrivate:
      Type: "AWS::EC2::Subnet"
      Properties:
        AvailabilityZone: !Select [0, !GetAZs ""]
        CidrBlock: "10.10.16.0/24"
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_SubnetAPrivate"
    RouteTableAssociationAPrivate:
      Type: "AWS::EC2::SubnetRouteTableAssociation"
      Properties:
        SubnetId: !Ref SubnetAPrivate
        RouteTableId: !Ref RouteTablePrivate
    SubnetBPrivate:
      Type: "AWS::EC2::Subnet"
      Properties:
        AvailabilityZone: !Select [1, !GetAZs ""]
        CidrBlock: "10.10.48.0/24"
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_SubnetBPrivate"
    RouteTableAssociationBPrivate:
      Type: "AWS::EC2::SubnetRouteTableAssociation"
      Properties:
        SubnetId: !Ref SubnetBPrivate
        RouteTableId: !Ref RouteTablePrivate
    SubnetCPrivate:
      Type: "AWS::EC2::Subnet"
      Properties:
        AvailabilityZone: !Select [2, !GetAZs ""]
        CidrBlock: "10.10.80.0/24"
        VpcId: !Ref VPC
        Tags:
          - Key: Name
            Value: !Sub "VPC_${VpcName}_SubnetCPrivate"
    RouteTableAssociationCPrivate:
      Type: "AWS::EC2::SubnetRouteTableAssociation"
      Properties:
        SubnetId: !Ref SubnetCPrivate
        RouteTableId: !Ref RouteTablePrivate

  Outputs:
    VPC:
      Description: "VPC."
      Value: !Ref VPC
      Export:
        Name: !Sub "${self:provider.stackName}"
    SubnetsPublic:
      Description: "Subnets public."
      Value:
        !Join [
          ",",
          [!Ref SubnetAPublic, !Ref SubnetBPublic, !Ref SubnetCPublic],
        ]
      Export:
        Name: !Sub "${self:provider.stackName}-PublicSubnets"
    SubnetsPrivate:
      Description: "Subnets private."
      Value:
        !Join [
          ",",
          [!Ref SubnetAPrivate, !Ref SubnetBPrivate, !Ref SubnetCPrivate],
        ]
      Export:
        Name: !Sub "${self:provider.stackName}-PrivateSubnets"
    DefaultSecurityGroup:
      Description: "VPC Default Security Group"
      Value: !GetAtt VPC.DefaultSecurityGroup
      Export:
        Name: !Sub "${self:provider.stackName}-DefaultSecurityGroup"
    WebServerSecurityGroup:
      Description: "VPC Web Server Security Group"
      Value: !Ref WebServerSecurityGroup
      Export:
        Name: !Sub "${self:provider.stackName}-WebServerSecurityGroup"
    PrivateRedshiftSecurityGroup:
      Description: "The id of the RedshiftSecurityGroup"
      Value: !Ref PrivateRedshiftSecurityGroup
      Export:
        Name: !Sub "${self:provider.stackName}-PrivateRedshiftSecurityGroup"
    PublicRedshiftSecurityGroup:
      Description: "The id of the RedshiftSecurityGroup"
      Value: !Ref PublicRedshiftSecurityGroup
      Export:
        Name: !Sub "${self:provider.stackName}-PublicRedshiftSecurityGroup"
    GlueRedshiftConnectionSecurityGroup:
      Description: "The id of the self referring security group"
      Value: !Ref GlueRedshiftConnectionSecurityGroup
      Export:
        Name: !Sub "${self:provider.stackName}-GlueSelfRefringSecurityGroup"
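As a quick sanity check on the addressing in vpc.yml, the sketch below uses Python's standard `ipaddress` module to confirm that the six subnet CIDR blocks sit inside the VPC block and do not overlap, and to list which subnets the 10.10.0.0/24 source range in the Redshift security groups actually covers. The CIDR values are copied from the template; the function names `check_layout` and `covered_by` are hypothetical helpers introduced only for this illustration.

```python
import ipaddress

# CIDR blocks copied from vpc.yml
VPC_CIDR = ipaddress.ip_network("10.10.0.0/16")
PUBLIC_SUBNETS = ["10.10.0.0/24", "10.10.32.0/24", "10.10.64.0/24"]
PRIVATE_SUBNETS = ["10.10.16.0/24", "10.10.48.0/24", "10.10.80.0/24"]
# Source range used by both Redshift security groups for port 5439
REDSHIFT_SG_SOURCE = ipaddress.ip_network("10.10.0.0/24")


def check_layout(vpc, subnets):
    """Return (subnets outside the VPC, pairs of overlapping subnets)."""
    nets = [ipaddress.ip_network(s) for s in subnets]
    outside = [str(n) for n in nets if not n.subnet_of(vpc)]
    overlapping = [
        (str(a), str(b))
        for i, a in enumerate(nets)
        for b in nets[i + 1:]
        if a.overlaps(b)
    ]
    return outside, overlapping


def covered_by(source, subnets):
    """Return the subnets that fall entirely inside the `source` range."""
    return [s for s in subnets if ipaddress.ip_network(s).subnet_of(source)]


if __name__ == "__main__":
    all_subnets = PUBLIC_SUBNETS + PRIVATE_SUBNETS
    print(check_layout(VPC_CIDR, all_subnets))
    print(covered_by(REDSHIFT_SG_SOURCE, all_subnets))
```

With the values above, only the first public subnet falls inside 10.10.0.0/24, so clients in the other subnets reach Redshift on port 5439 only through the Glue connection security group rule or the Firehose rule, not through the CIDR rule.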
Redshift Cluster
- Private Cluster subnet group
  - Create a private subnet group that contains the private subnets
- Private Cluster: used to test syncing data to Redshift with a Glue job. PubliclyAccessible must be set to false, otherwise the Glue job cannot connect
  - ClusterSubnetGroupName
    - Use the private subnet group
  - VpcSecurityGroupIds
    - Use privat
- ClusterSubnetGroupName