Storage systems take one of two architectural approaches to providing scalability, performance, and resiliency: eventual consistency or strong consistency. Object storage systems such as Amazon S3 and Swift are eventually consistent, which provides massive scalability and keeps data highly available even during hardware failures. Block storage systems and filesystems are strongly consistent, which databases and other real-time data require, but which limits scalability and may reduce data availability when hardware fails.
Eventually Consistent Storage Systems | Strongly Consistent Storage Systems |
Amazon S3 | Block storage |
OpenStack Swift | Filesystems |
A key reason Swift serves highly available, unstructured application data so well is that its design, like that of Amazon S3, incorporates eventual consistency. Swift protects objects by storing multiple copies of the data, so if one node fails, the data can be retrieved from another node. Even if multiple nodes fail, data remains available to the user as long as a copy is still reachable. Eventual consistency means the system guarantees that all copies of the data will eventually converge on the most up-to-date version, while the data stays available if hardware fails. This design is ideal when performance and scalability are critical, particularly for massive, highly distributed infrastructures that serve large amounts of unstructured data to sites around the world.
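The behavior described above can be illustrated with a toy model. This is a minimal sketch and not Swift's actual implementation: the `Replica` class, the `write`, `read`, and `replicate` functions, and the integer timestamps are all hypothetical names and assumptions chosen for illustration. It shows a write succeeding while one node is down, and a later background pass bringing the stale copy up to date.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One storage node holding a copy of each object (hypothetical model)."""
    name: str
    online: bool = True
    store: dict = field(default_factory=dict)  # key -> (timestamp, value)


def write(replicas, key, value, timestamp):
    """Write to every reachable replica; offline replicas are skipped for now."""
    written = 0
    for replica in replicas:
        if replica.online:
            replica.store[key] = (timestamp, value)
            written += 1
    return written


def read(replicas, key):
    """Return the newest value found on any reachable replica."""
    found = [r.store[key] for r in replicas if r.online and key in r.store]
    if not found:
        raise KeyError(key)
    return max(found, key=lambda version: version[0])[1]


def replicate(replicas, key):
    """Background pass: copy the newest version to every reachable replica."""
    versions = [r.store[key] for r in replicas if r.online and key in r.store]
    if not versions:
        return
    newest = max(versions, key=lambda version: version[0])
    for replica in replicas:
        if replica.online:
            replica.store[key] = newest


nodes = [Replica("a"), Replica("b"), Replica("c")]
write(nodes, "photo", "v1", timestamp=1)
nodes[2].online = False                    # node c fails
write(nodes, "photo", "v2", timestamp=2)   # still succeeds on a and b
print(read(nodes, "photo"))                # served from a surviving copy
nodes[2].online = True                     # node c returns holding stale v1
replicate(nodes, "photo")                  # all three copies now hold v2
```

Between the moment node c returns and the `replicate` pass, its copy is out of date. That window is the "eventual" part of the guarantee, and it is the price paid for accepting the write while a node was down.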
Strong consistency is required when every read must be guaranteed to return the most recent data. With this approach, all nodes in the storage system must be queried to ensure that every update has been written to all nodes and that the read returns the most recent copy. Databases with transactions require strong consistency; backup files, log files, and unstructured data do not. Because of this architecture, storage systems with strong consistency are difficult to scale, especially in multi-site configurations. As the data grows or becomes more distributed (for example, across multiple regions), or when hardware fails, the chance that data is unavailable in a strongly consistent storage system increases. Swift's eventual consistency design avoids these drawbacks, which makes it an ideal choice for highly scalable, distributed storage of unstructured data.
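The availability cost of this approach can be sketched in a few lines. The following is a deliberately simplified model that follows the description above (every node must take part in every operation). The `Unavailable` exception, the `strong_write` and `strong_read` functions, and the dictionary layout of a node are hypothetical names introduced for illustration, and real strongly consistent systems may coordinate their nodes differently.

```python
class Unavailable(Exception):
    """Raised when an operation cannot reach every node."""


def strong_write(nodes, key, value):
    """Apply the write only if every node is reachable (all or nothing)."""
    if not all(node["online"] for node in nodes):
        raise Unavailable("write rejected: not every node is reachable")
    for node in nodes:
        node["data"][key] = value


def strong_read(nodes, key):
    """Query every node and return the value only if all copies agree."""
    if not all(node["online"] for node in nodes):
        raise Unavailable("read rejected: not every node is reachable")
    values = {node["data"].get(key) for node in nodes}
    if len(values) != 1:
        raise Unavailable("read rejected: copies disagree")
    return values.pop()


nodes = [{"online": True, "data": {}} for _ in range(3)]
strong_write(nodes, "order-17", "paid")
print(strong_read(nodes, "order-17"))   # every copy agrees on the value

nodes[0]["online"] = False              # a single hardware failure
try:
    strong_write(nodes, "order-17", "shipped")
except Unavailable as error:
    print(error)                        # the write is refused, not deferred
```

In this model every additional node or site is one more component that must be reachable for each operation, which is why availability drops as the system grows or spreads across regions.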
Each of these architectural approaches has its own definition, appropriate use cases, and tradeoffs, all of which you need to understand to identify which architecture best suits your data.