Creating a noSql database, what is the best source code to look at?

I have always wanted a NoSQL database that was purpose-built for storing large volumes of nested/threaded comments. Implementation would probably be done in Java because that is what I am best at. I really like how dead simple it is to set up an ElasticSearch cluster and throw data into it, and I want my product to share those same qualities. Here are the features I have in mind:

1) auto/manual sharding across clusters
2) auto/manual indexing across clusters
3) full-text search (probably via Lucene or ElasticSearch)
4) REST/JSON API
5) retrieve any comment by ID
6) comments can be retrieved with or without child nodes
7) comment trees can be retrieved with a specified depth
8) comment trees can be filtered by time or rank when retrieved
9) entire comment trees can be re-parented.
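
To make features 5, 6, 7 and 9 concrete, here is a minimal single-node, in-memory sketch in Java. It is only an illustration of the tree operations, not a database: there is no persistence, sharding, indexing, or search, and filtering by time or rank (feature 8) is left out. The names `CommentStore`, `Comment`, `subtree`, and `reparent` are hypothetical and not taken from any existing product.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory store for threaded comments (illustration only). */
final class CommentStore {

    /** One comment; parentId is null for a top-level comment. */
    static final class Comment {
        final String id;
        final String text;
        String parentId;
        final List<String> childIds = new ArrayList<>();

        Comment(String id, String parentId, String text) {
            this.id = id;
            this.parentId = parentId;
            this.text = text;
        }
    }

    private final Map<String, Comment> byId = new HashMap<>();

    /** Adds a comment; the parent, if given, must already exist. */
    void add(String id, String parentId, String text) {
        if (byId.containsKey(id)) {
            throw new IllegalArgumentException("duplicate id: " + id);
        }
        if (parentId != null) {
            require(parentId).childIds.add(id);
        }
        byId.put(id, new Comment(id, parentId, text));
    }

    /** Feature 5: retrieve any comment by ID (null if unknown). */
    Comment get(String id) {
        return byId.get(id);
    }

    /**
     * Features 6 and 7: IDs of the subtree under rootId in pre-order.
     * maxDepth 0 returns just the comment itself, 1 adds direct replies,
     * and a negative value returns the whole tree.
     */
    List<String> subtree(String rootId, int maxDepth) {
        List<String> out = new ArrayList<>();
        collect(require(rootId), maxDepth, out);
        return out;
    }

    private void collect(Comment c, int depthLeft, List<String> out) {
        out.add(c.id);
        if (depthLeft == 0) {
            return;
        }
        for (String childId : c.childIds) {
            collect(byId.get(childId), depthLeft - 1, out);
        }
    }

    /** Feature 9: move a whole subtree under a new parent (null = top level). */
    void reparent(String id, String newParentId) {
        Comment c = require(id);
        Comment newParent = (newParentId == null) ? null : require(newParentId);
        // Walk up from the new parent; reaching id would create a cycle.
        for (String p = newParentId; p != null; p = byId.get(p).parentId) {
            if (p.equals(id)) {
                throw new IllegalArgumentException("cannot move a comment under itself");
            }
        }
        if (c.parentId != null) {
            byId.get(c.parentId).childIds.remove(id); // remove(Object), not remove(int)
        }
        if (newParent != null) {
            newParent.childIds.add(id);
        }
        c.parentId = newParentId;
    }

    private Comment require(String id) {
        Comment c = byId.get(id);
        if (c == null) {
            throw new IllegalArgumentException("unknown id: " + id);
        }
        return c;
    }
}
```

Because children are stored as a list of IDs on the parent, re-parenting touches only the moved node and its old and new parents; the descendants come along for free. In a sharded design that same choice would presumably force a decision about whether a whole tree lives on one shard, which is one of the things worth studying in the existing codebases.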

What I'm looking for are exceptional pieces of code or specific algorithms that I can study before digging into this project. Can anyone suggest a few places to get started?

Much of this will be a feature of the app you build rather than the database you use. The rest (possibly excluding full-text search), any existing NoSQL database should be able to handle. Why exactly can't you use an already existing DB? –  cHao Aug 8 '12 at 3:30 
 
Do you want to write your own, or do you want to use one that you like and that is written in Java? –  Edmon Aug 8 '12 at 3:30
 
About 80% of the reason for wanting to write my own is for fun; the other 20% is because I have never really been fully satisfied with the traditional solutions for storing nested comments. I think it would be cool to be able to fire up a cluster to store/search Reddit-scale volumes of comments. –  bostonBob Aug 8 '12 at 3:52

Since your question is tagged Java, I suggest looking into OrientDB.

Here is the source code:

http://code.google.com/p/orient/source/browse/

and the architecture:

http://code.google.com/p/orient/wiki/Presentations

For the big-boy stuff (clustering, hyper-scaling), take a look at HBase and Accumulo:

http://hbase.apache.org/source-repository.html

http://accumulo.apache.org/source.html

Hope this helps.
Edmon
