14. Other Amazon Database services

Amazon Neptune

  • Amazon Neptune is a fast, reliable, fully managed graph database service that makes it easy to build and run applications that work with highly connected datasets.
  • The core of Neptune is a purpose-built, high-performance graph database engine.
  • This engine is optimized for storing billions of relationships and querying the graph with millisecond latency.
  • Neptune supports the popular graph query languages Apache TinkerPop Gremlin and W3C’s SPARQL, enabling you to build queries that efficiently navigate highly connected datasets (a Gremlin sketch follows this list).
  • Neptune powers graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.
  • Neptune is highly available, with read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across Availability Zones.
  • Neptune provides data security features, with support for encryption at rest and in transit.
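
The following is a minimal sketch of a Gremlin traversal against Neptune using the open-source gremlinpython driver. The cluster endpoint, vertex labels, and property names are hypothetical placeholders, and the snippet assumes the cluster does not require IAM database authentication.

```python
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.process.anonymous_traversal import traversal

# Hypothetical cluster endpoint; substitute your cluster's endpoint.
endpoint = "wss://my-neptune-cluster.cluster-xxxxxxxx.us-east-1.neptune.amazonaws.com:8182/gremlin"

conn = DriverRemoteConnection(endpoint, "g")
g = traversal().withRemote(conn)

# Insert two vertices and a "purchased" edge between them.
g.addV("person").property("name", "alice").next()
g.addV("product").property("name", "graph-db-book").next()
(g.V().has("person", "name", "alice").as_("a")
  .V().has("product", "name", "graph-db-book")
  .addE("purchased").from_("a").next())

# Recommendation-style read: what has alice purchased?
print(g.V().has("person", "name", "alice").out("purchased").values("name").toList())

conn.close()
```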

Key Service Components

  • Primary DB instance – Supports read and write operations, and performs all of the data modifications to the cluster volume. Each Neptune DB cluster has one primary DB instance that is responsible for writing (that is, loading or modifying) graph database contents.
  • Neptune replica – Connects to the same storage volume as the primary DB instance and supports only read operations. Each Neptune DB cluster can have up to 15 Neptune Replicas in addition to the primary DB instance. This provides high availability by locating Neptune Replicas in separate Availability Zones and by distributing load from read clients (see the endpoint sketch after this list).
  • Cluster volume – Neptune data is stored in the cluster volume, which is designed for reliability and high availability. A cluster volume consists of copies of the data across multiple Availability Zones in a single AWS Region. Because your data is automatically replicated across Availability Zones, it is highly durable, and there is little possibility of data loss.
    • Neptune also automatically detects failures in the disk volumes that make up the virtual cluster volume. When a segment of a disk volume fails, Neptune immediately repairs that segment, using data in other disk volumes in the virtual cluster volume to ensure that the data in the repaired segment is current.
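
As a quick way to see these components for an existing cluster, the sketch below uses the boto3 Neptune client to list the writer, the replicas, and the cluster and reader endpoints. The cluster identifier and region are hypothetical placeholders.

```python
import boto3

CLUSTER_ID = "my-neptune-cluster"  # hypothetical identifier

neptune = boto3.client("neptune", region_name="us-east-1")
cluster = neptune.describe_db_clusters(DBClusterIdentifier=CLUSTER_ID)["DBClusters"][0]

# The cluster endpoint always routes to the primary (writer) instance;
# the reader endpoint load-balances connections across the replicas.
print("Writer endpoint:", cluster["Endpoint"])
print("Reader endpoint:", cluster["ReaderEndpoint"])

for member in cluster["DBClusterMembers"]:
    role = "primary" if member["IsClusterWriter"] else "replica"
    print(member["DBInstanceIdentifier"], "->", role)
```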

Amazon Quantum Ledger Database (Amazon QLDB)

  • Amazon Quantum Ledger Database (Amazon QLDB) is a fully managed ledger database that provides a transparent, immutable, and cryptographically verifiable transaction log owned by a central trusted authority.
  • You can use Amazon QLDB to track all application data changes, and maintain a complete and verifiable history of changes over time.
  • Ledgers are typically used to record a history of economic and financial activity in an organization.
  • In Amazon QLDB, the journal is the core of the database. Structurally similar to a transaction log, the journal is an immutable, append-only data structure that stores your application data along with the associated metadata. All write transactions, including updates and deletes, are committed to the journal first (a driver sketch follows this list).
  • QLDB uses the journal to determine the current state of your ledger data by materializing it into queryable, user-defined tables.
  • In addition, the journal handles concurrency, sequencing, cryptographic verification, and availability of the ledger data.
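
A minimal sketch using the pyqldb driver, assuming a ledger named vehicle-registration already exists; the table name and document contents are hypothetical.

```python
from pyqldb.driver.qldb_driver import QldbDriver

# Hypothetical ledger name; the ledger must already exist.
driver = QldbDriver(ledger_name="vehicle-registration")

# Each execute_lambda call runs as a single transaction that is committed
# to the journal first and then materialized into queryable tables.
driver.execute_lambda(lambda txn: txn.execute_statement("CREATE TABLE cars"))
driver.execute_lambda(lambda txn: txn.execute_statement(
    "INSERT INTO cars ?", {"VIN": "1N4AL11D75C109151", "Owner": "alice"}))

driver.close()
```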

Journal 

Diagram titled "QLDB: the journal is the database," showing the journal architecture: an application connects to a ledger and commits transactions to the journal, which are materialized into tables.

  • In this example, an application connects to a ledger and runs transactions that insert, update, and delete documents in a table named cars.
  • The data is first written to the journal in sequenced order.
  • Then the data is materialized into the table with built-in views. These views let you query both the current state and the complete history of the car, with each revision assigned a version number (a query sketch follows the diagrams below).
  • You can also export or stream data directly from the journal.
  • The following diagram shows the mapping constructs of the core components between a traditional RDBMS and Amazon QLDB.

Diagram of the core components of a traditional RDBMS (database, table, index, row, column, etc.) mapping to the corresponding QLDB components (ledger, table, index, Ion document, document attribute, etc.).
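
Continuing the hypothetical cars table, the sketch below queries the user view (the current state) and the built-in history() function (all committed revisions), again via the pyqldb driver.

```python
from pyqldb.driver.qldb_driver import QldbDriver

driver = QldbDriver(ledger_name="vehicle-registration")  # hypothetical ledger

def read_views(txn):
    # User view: the current, materialized state of the document.
    current = txn.execute_statement(
        "SELECT * FROM cars WHERE VIN = ?", "1N4AL11D75C109151")
    # History view: every committed revision, with system metadata such as
    # the version number QLDB assigns to each revision.
    revisions = txn.execute_statement(
        "SELECT metadata.version, data FROM history(cars) WHERE data.VIN = ?",
        "1N4AL11D75C109151")
    # Materialize the cursors inside the transaction.
    return list(current), list(revisions)

current, revisions = driver.execute_lambda(read_views)
print(f"{len(revisions)} revision(s) found")
driver.close()
```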

Data storage

  • Journal storage – The disk space that is used by a ledger's journal. The journal is append-only and contains the complete, immutable, and verifiable history of all the changes to your data.
  • Indexed storage – The disk space that is used by a ledger's tables, indexes, and indexed history. Indexed storage consists of ledger data that is optimized for high-performance queries.

Amazon Timestream

  • Amazon Timestream is a fast, scalable, fully managed, purpose-built time series database that makes it easy to store and analyze trillions of time series data points per day.
  • Timestream saves you time and cost in managing the lifecycle of time series data by keeping recent data in memory and moving historical data to a cost-optimized storage tier based on user-defined policies.
  • Timestream’s purpose-built query engine lets you access and analyze recent and historical data together, without having to specify its location.
  • Amazon Timestream has built-in time series analytics functions, helping you identify trends and patterns in your data in near real-time (a query sketch follows this list).
  • Timestream is serverless and automatically scales up or down to adjust capacity and performance.
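
A minimal query sketch using the boto3 timestream-query client. The database, table, and measure names are hypothetical; the query uses Timestream's built-in bin() and ago() time series functions to aggregate recent CPU measurements.

```python
import boto3

query_client = boto3.client("timestream-query", region_name="us-east-1")

# Hypothetical database, table, and measure names; adjust to your own schema.
QUERY = """
SELECT bin(time, 1m) AS minute,
       avg(measure_value::double) AS avg_cpu
FROM "devops"."host_metrics"
WHERE measure_name = 'cpu_utilization'
  AND time > ago(15m)
GROUP BY bin(time, 1m)
ORDER BY minute
"""

response = query_client.query(QueryString=QUERY)
for row in response["Rows"]:
    # Each row is a list of Datum objects aligned with the ColumnInfo metadata.
    print([datum.get("ScalarValue") for datum in row["Data"]])
```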

Timestream Key Benefits

  • Serverless with auto-scaling 
  • Data lifecycle management
  • Simplified data access
  • Purpose-built for time series 
  • Always encrypted 
  • High availability
  • Durability 

Timestream Use Cases

  • Monitoring metrics to improve the performance and availability of your applications.
  • Storage and analysis of industrial telemetry to streamline equipment management and maintenance.
  • Tracking user interaction with an application over time.
  • Storage and analysis of IoT sensor data.

Timestream Concepts

  • Time series data is a sequence of data points recorded over a time interval. This type of data is used for measuring events that change over time. Examples include:
    • stock prices over time
    • temperature measurements over time
    • CPU utilization of an EC2 instance over time
  • Record - A single data point in a time series.
  • Dimension - An attribute that describes the metadata of a time series.
  • Measure - The actual value being measured by the record.
  • Timestamp - Indicates when a measure was collected for a given record.
  • Table - A container for a set of related time series.
  • Database - A top-level container for tables. (A write sketch that maps these concepts to an API call follows this list.)
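
The sketch below maps these concepts onto a boto3 timestream-write call: each record carries dimensions (metadata), a measure name and value, and a timestamp, and is written into a table inside a database. All names are hypothetical, and the database and table must already exist.

```python
import time
import boto3

write_client = boto3.client("timestream-write", region_name="us-east-1")

record = {
    # Dimensions: metadata describing the time series.
    "Dimensions": [
        {"Name": "region", "Value": "us-east-1"},
        {"Name": "host", "Value": "host-1"},
    ],
    # Measure: the value actually being recorded.
    "MeasureName": "cpu_utilization",
    "MeasureValue": "42.5",
    "MeasureValueType": "DOUBLE",
    # Timestamp: when the measure was collected (milliseconds here).
    "Time": str(int(time.time() * 1000)),
    "TimeUnit": "MILLISECONDS",
}

write_client.write_records(
    DatabaseName="devops",       # hypothetical database
    TableName="host_metrics",    # hypothetical table
    Records=[record],
)
```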

Timestream Architecture 

Amazon Keyspaces (for Apache Cassandra)

  • Amazon Keyspaces (for Apache Cassandra) is a scalable, highly available, and managed Apache Cassandra–compatible database service. 
  • Amazon Keyspaces is serverless, so you pay for only the resources that you use, and the service automatically scales tables up and down in response to application traffic.
  • Amazon Keyspaces (for Apache Cassandra) stores three copies of your data in multiple Availability Zones for durability and high availability. 
  • Encryption at rest is automatically enabled when you create a new Amazon Keyspaces table, and all client connections require Transport Layer Security (TLS); a connection sketch follows the diagram below.

Diagram of Amazon Keyspaces interacting with a client application.
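
A connection sketch using the open-source cassandra-driver for Python. Keyspaces listens on port 9142 and requires TLS; the credentials shown are hypothetical service-specific credentials generated for an IAM user, and the certificate file is the Starfield root CA referenced in the Keyspaces documentation, downloaded locally.

```python
from ssl import SSLContext, PROTOCOL_TLSv1_2, CERT_REQUIRED
from cassandra.cluster import Cluster
from cassandra.auth import PlainTextAuthProvider

# TLS is mandatory: load the Starfield root certificate for verification.
ssl_context = SSLContext(PROTOCOL_TLSv1_2)
ssl_context.load_verify_locations("sf-class2-root.crt")
ssl_context.verify_mode = CERT_REQUIRED

# Hypothetical service-specific credentials for an IAM user.
auth_provider = PlainTextAuthProvider(
    username="keyspaces-user-at-111122223333", password="EXAMPLE-PASSWORD")

cluster = Cluster(
    ["cassandra.us-east-1.amazonaws.com"],  # Keyspaces service endpoint
    port=9142,
    ssl_context=ssl_context,
    auth_provider=auth_provider,
)
session = cluster.connect()
print(session.execute("SELECT keyspace_name FROM system_schema.keyspaces").all())
```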

  • Storage – You can visualize your Cassandra data in tables, with each row representing a record and each column representing a field within that record.
  • Table Design: Query First – There are no JOINs in CQL, so you design tables around the queries your application needs to run rather than normalizing around the data.
  • Partitions – Your data is stored in partitions on disk. The number of partitions your data is stored in, and how it is distributed across the partitions, is determined by your partition key.
  • Primary Key
    • In Cassandra, data is stored as a key-value pair.
    • To that end, every Cassandra table must have a primary key, which is the key to each row in the table.
      • The primary key is the composite of a required partition key and optional clustering columns (a table definition sketch follows this list).
    • The data that comprises the primary key must be unique across all records in a table.
    • Partition key 
      • The partition key portion of the primary key is required and determines which partition of your cluster the data is stored in.
      • The partition key can be a single column, or it can be a compound value composed of two or more columns. 
    • Clustering column 
      • The optional clustering column portion of your primary key determines how the data is clustered and sorted within each partition.
      • If you include a clustering column in your primary key, it can consist of one or more columns.
      • If the clustering column consists of multiple columns, the sort order is determined by the order in which the columns are listed, from left to right.
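
A sketch of a table definition that illustrates these pieces, using a hypothetical device_readings table executed through the Keyspaces session from the connection sketch above: the compound partition key decides which partition a row lands in, and the clustering column orders rows within each partition.

```python
# Assumes `session` is the connected Keyspaces session from the sketch above
# and that a keyspace named "iot" already exists. In Keyspaces, table creation
# is asynchronous, so the INSERT below assumes the table is already active.
from datetime import datetime, timezone

CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS iot.device_readings (
    device_id  text,        -- partition key, column 1
    sensor     text,        -- partition key, column 2
    reading_ts timestamp,   -- clustering column: orders rows within a partition
    value      double,
    PRIMARY KEY ((device_id, sensor), reading_ts)
) WITH CLUSTERING ORDER BY (reading_ts DESC)
"""
session.execute(CREATE_TABLE)

# Every row with the same (device_id, sensor) lands in the same partition,
# sorted by reading_ts; the full primary key must be unique per row.
session.execute(
    "INSERT INTO iot.device_readings (device_id, sensor, reading_ts, value) "
    "VALUES (%s, %s, %s, %s)",
    ("device-1", "temperature", datetime.now(timezone.utc), 21.5),
)
```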

Reference

https://docs.aws.amazon.com/qldb/latest/developerguide/what-is.html

https://docs.aws.amazon.com/timestream/latest/developerguide/what-is-timestream.html

https://docs.aws.amazon.com/keyspaces/latest/devguide/what-is-keyspaces.html
