David Childs' work on extended set theory (XST) provides a mathematical foundation for managing data. His research has focused on giving data a mathematical identity and on using mathematical expressions to describe behavior. Extended set theory augments classical set theory: instead of building set operations on ordered pairs, it defines operations on n-tuples. In extended set theory, an n-tuple is simply a set whose elements carry integer scopes (written as superscripts on the element values).
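The idea of an n-tuple as a set with integer scopes can be made concrete with a small sketch. The code below is only an illustration, not Childs' notation or any XST implementation: it assumes a scoped element can be modeled as a (value, scope) pair, so that the tuple <a, b> becomes the set {a^1, b^2}. The function names `as_extended_set` and `as_tuple` are hypothetical.

```python
def as_extended_set(values):
    """Model the n-tuple <v1, ..., vn> as the scoped set {v1^1, ..., vn^n}.

    Assumption: each scoped element is represented as a (value, scope) pair.
    """
    return frozenset(
        (value, position) for position, value in enumerate(values, start=1)
    )


def as_tuple(extended_set):
    """Recover the n-tuple when the scopes are exactly 1..n, each used once."""
    by_scope = {}
    for value, scope in extended_set:
        if scope in by_scope:
            raise ValueError(f"scope {scope} is used by more than one element")
        by_scope[scope] = value
    n = len(by_scope)
    if set(by_scope) != set(range(1, n + 1)):
        raise ValueError("scopes are not the consecutive integers 1..n")
    return tuple(by_scope[i] for i in range(1, n + 1))


# Order is carried by the scopes, not by the container, so these differ.
pair_ab = as_extended_set(("a", "b"))  # models {a^1, b^2}
pair_ba = as_extended_set(("b", "a"))  # models {b^1, a^2}
```

Because each scoped tuple is an ordinary set, the usual set operations (union, intersection, difference) apply to it directly, which is the property the paragraph above relies on.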
Techniques for partitioning data have become an area of interest for database designers and system architects. Common approaches include sharding, hash partitioning, range partitioning and list partitioning, but extended set theory can also play a role. The advantage of Childs' set-theoretic data model is that it enables us to model data at a logical level and dynamically restructure it to match query requirements. That means a query can operate on an informationally dense data set, requiring fewer I/O operations to deliver excellent performance against gigabyte-sized data sets, such as those generated for the Transaction Processing Performance Council's TPC-H queries. Benchmarks have shown that extended set theory offers some performance advantages when processing large data sets. Indeed, some of the execution times for decision-support queries have been startling when compared to IBM DB2, Microsoft SQL Server and Oracle.
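For readers unfamiliar with the conventional schemes named above, the following sketch shows how hash, range and list partitioning each map a key to a partition. It illustrates those generic schemes only and says nothing about how XST restructures data; the function names, the CRC32 hash choice and the sample boundaries are all assumptions made for the example.

```python
import zlib
from bisect import bisect_right


def hash_partition(key, num_partitions):
    """Hash partitioning: a stable hash of the key, modulo the partition count.

    CRC32 is an arbitrary choice here; it is deterministic across runs.
    """
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def range_partition(key, upper_bounds):
    """Range partitioning: upper_bounds is a sorted list of exclusive upper
    bounds. Keys at or above the last bound fall into one final partition."""
    return bisect_right(upper_bounds, key)


def list_partition(key, value_lists, default=None):
    """List partitioning: each partition name owns an explicit set of values."""
    for name, values in value_lists.items():
        if key in values:
            return name
    return default


# Hypothetical sample configuration.
bounds = [100, 200]  # partitions: key < 100, 100 <= key < 200, key >= 200
regions = {"emea": {"FR", "DE"}, "amer": {"US", "CA"}}
```

With this configuration, `range_partition(150, bounds)` selects the middle partition and `list_partition("FR", regions)` selects `"emea"`. In each scheme the partition is fixed by the key alone, in contrast to the query-driven restructuring described for the set-theoretic model.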
Many researchers and developers concerned about Big Data and cloud computing scalability have accepted Brewer's CAP Theorem and the need to partition data for performance reasons. XST treats all data representations as mathematical objects and provides a mathematical foundation for operations on data that has been partitioned. Childs' XST might prove to be an important tool for addressing data partitioning problems, assuming more developers and architects are willing to investigate a mathematical foundation for managing data. A generation of developers and architects did exactly that when they accepted Codd's relational model and the SQL databases that followed.