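This semester's database systems course required an article on NoSQL databases. After going through a fair amount of material over the past couple of days, I finally squeezed out 2,000 words. My understanding is not deep, but I now have some grasp of the topic. The article first introduces the motivation behind the rise of NoSQL databases and the situations in which using them is or is not recommended. It then covers several basic NoSQL concepts, including the characteristics a NoSQL database should have, the classification of data models, and the differences between consistency models. After that, it walks through the CRUD (create/read/update/delete) operations of two concrete databases, MongoDB and Google BigTable, to discuss their strengths and weaknesses relative to traditional relational databases. My writing is not great, so this is only a brief overview. Interested readers can consult the articles listed in the references.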
1 Introduction
Relational databases have been around for more than 30 years and have been essential to various fields, such as business and education. Almost all database systems we use today are RDBMSs, including Oracle, SQL Server, MySQL and so on. The reasons for the dominance of relational databases are not trivial. In all, with their various constraints and their normalization model, relational databases have continually offered the best mix of simplicity, robustness, flexibility, performance, scalability and compatibility in managing generic data [1]. As a result, even though several so-called revolutions have flared up briefly, all of them fizzled out without making a dent in the dominance of relational databases.
However, a good mix of those benefits does not mean that an RDBMS performs better in each of these areas than an alternative solution pursuing one of these benefits in isolation. This was not much of a problem before, because the advantages of a universally dominant RDBMS outweighed the need to push any of these boundaries. But recently, especially with the rise of Web 2.0 applications, one of these benefits has become more and more critical: scalability. One of the most significant differences between Web 2.0 and the traditional WWW is the greater collaboration among Internet users, content providers and enterprises [2]. This leads to scalability requirements that can, first of all, change very quickly and, secondly, grow very large. As scaling a traditional relational DBMS is hard, we need a data management system that can scale well horizontally, that is, scale OLTP-style workloads to thousands or millions of users, using hundreds or thousands of nodes. This is the most important motivation for "NoSQL" databases, a name commonly read as "Not only SQL".
There are also many other motivations for NoSQL databases, which reveal their advantages as well. For example, one motivation is agility, or speed of development. Companies have always looked to adapt to the market more quickly and to embrace agile development methodologies. In this respect, NoSQL databases have far more relaxed, or even nonexistent, data model restrictions compared with RDBMSs. The result is that application changes and database schema changes do not have to be managed as one complicated change unit, which in theory allows applications to iterate faster [3]. In addition, in many cases companies are driven by the desire to identify viable alternatives to expensive proprietary RDBMS software. NoSQL databases, on the other hand, are much more economical because they use clusters of cheap commodity servers. As a result, the cost per gigabyte or per transaction/second for NoSQL can be many times lower than the cost for an RDBMS, allowing you to store and process more data at a much lower price point [3].