Schemaless

In the NoSQL world it is common to talk about schemaless databases or data models.

It would be more precise to say “dynamic schema”.  MongoDB still has structure: there are databases, a system catalog of collections, documents within collections, and explicitly declared indexes for a collection.  The big difference is that “columns”, or rather fields in the document data model, are not predeclared.  Each field in a document is dynamic and can be present or missing.  Each value has a datatype too, so the model isn’t typeless but rather dynamically typed, or what some might call duck typing.

Here’s an example in the mongo shell.  We may have a couple docs:

> db.persons.find()
{ "name" : "jane", "age" : 25 }
{ "name" : "ben", "age" : 30 }

We could then add a new person with an extra attribute:

> db.persons.insert({name:'julie',age:28,likes:'baseball'})
> db.persons.find()
{ "name" : "jane", "age" : 25 }
{ "name" : "ben", "age" : 30 }
{ "name" : "julie", "age" : 28, "likes" : "baseball" }

No “alter table” necessary.  This is very helpful with agile development methodologies. 

We can take it a step further, however.  The type of a field’s value need not be consistent from document to document.  Now, in practice, it is very common for the contents of a collection to be homogeneous.  But we have the option.  For example, suppose we want to add “likes” for ben, but ben likes a couple of things.  What to do?

> db.persons.update({name:'ben'},{$set:{likes:['math','baseball']}})
> db.persons.find()
{ "name" : "jane", "age" : 25 }
{ "name" : "julie", "age" : 28, "likes" : "baseball" }
{ "name" : "ben", "age" : 30, "likes" : [ "math", "baseball" ] }
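
To make the "not typeless" point concrete, here is a minimal sketch in plain JavaScript (it runs without a MongoDB server).  It copies the three documents above into an array and reports which types each field takes across them.  The helpers typeName and fieldTypes are my own hypothetical names for illustration, not part of MongoDB or its shell.

```javascript
// Plain JavaScript sketch; no MongoDB server is involved.
// The array mirrors the persons collection shown above.
const persons = [
  { name: 'jane', age: 25 },
  { name: 'julie', age: 28, likes: 'baseball' },
  { name: 'ben', age: 30, likes: ['math', 'baseball'] },
];

// Hypothetical helper: name the type of one value,
// distinguishing arrays and null from other objects.
function typeName(value) {
  if (Array.isArray(value)) return 'array';
  if (value === null) return 'null';
  return typeof value;
}

// Hypothetical helper: map each field name to the sorted list of
// types that field takes across all the documents.
function fieldTypes(docs) {
  const seen = new Map();
  for (const doc of docs) {
    for (const [field, value] of Object.entries(doc)) {
      if (!seen.has(field)) seen.set(field, new Set());
      seen.get(field).add(typeName(value));
    }
  }
  const result = {};
  for (const [field, types] of seen) result[field] = [...types].sort();
  return result;
}

console.log(fieldTypes(persons));
```

Every value still carries a type; the only thing missing is a declaration up front that "likes" must always be one or the other.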

In this example, things work out particularly elegantly: even though one “likes” value is an array and the other a string, we can still run interesting queries across both.  This is because when querying for a value, if the field’s value is an array, MongoDB looks into the array:

> db.persons.find({likes:'baseball'})
{ "name" : "julie", "age" : 28, "likes" : "baseball" }
{ "name" : "ben", "age" : 30, "likes" : [ "math", "baseball" ] }
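
The matching rule behind that query can be sketched in a few lines of plain JavaScript.  This is only an illustration of the behavior described above for a scalar query value, not MongoDB's actual implementation; matchesEquality and find are hypothetical helper names, and the array stands in for the collection.

```javascript
// Plain JavaScript sketch of the rule; not MongoDB's implementation.
// The array mirrors the persons collection shown above.
const persons = [
  { name: 'jane', age: 25 },
  { name: 'julie', age: 28, likes: 'baseball' },
  { name: 'ben', age: 30, likes: ['math', 'baseball'] },
];

// Hypothetical helper: a field matches a scalar query value if it
// equals that value, or if it is an array that contains that value.
function matchesEquality(fieldValue, queryValue) {
  if (Array.isArray(fieldValue)) return fieldValue.includes(queryValue);
  return fieldValue === queryValue;
}

// Hypothetical helper: keep the documents whose field matches.
// A missing field is undefined here and so never matches.
function find(docs, field, queryValue) {
  return docs.filter((doc) => matchesEquality(doc[field], queryValue));
}

// julie (string) and ben (array) both match; jane has no likes field.
console.log(find(persons, 'likes', 'baseball').map((doc) => doc.name));
```

The caller never has to know whether a given document stores one value or several, which is what makes the mixed string/array case workable in practice.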

Likewise we can index the field:

> db.persons.createIndex( { likes : 1 } )

All very handy and useful.  But you might ask “won’t my data get rather dirty with no schema constraints?”  I had this concern when we started; I assumed we would just add some constraint rules later when needed.  Oddly, there hasn’t been a lot of demand for the feature, so far.  Empirically, it seems the data doesn’t get too noisy.

One other very important note: the dynamic schema is not just for developer friendliness!  There is another good reason for it.  Imagine changing the schema in a database cluster involving 2,000 servers.  It might be tricky to apply that change to global state consistently across every server.  One goal here is to store very big data sets.  Alter table is probably not going to fly with billions or trillions of documents.

P.S. For compactness, the examples above do not show the _id field that MongoDB or its driver automatically adds to all documents.

P.P.S. Dynamic schema is not unique to MongoDB; some other products in the space do it too.  Of course I’m biased: this is my favorite.
