Creating a NoSQL database: what is the best source code to look at?

Reposted 2015-07-07 13:23:49

I have always wanted a NoSQL database purpose-built for storing large volumes of nested/threaded comments. The implementation would probably be in Java, because that is what I am best at. I really like how dead simple it is to set up an ElasticSearch cluster and throw data into it, and I want my product to share those same qualities. Here are the features I have in mind:

1) auto/manual sharding across clusters
2) auto/manual indexing across clusters
3) full-text search (probably via Lucene or ElasticSearch)
4) REST/JSON API
5) retrieve any comment by ID
6) comments can be retrieved with or without child nodes
7) comment trees can be retrieved with a specified depth
8) comment trees can be retrieved filtered by time or rank
9) entire comment trees can be re-parented.
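
To make features 5, 6, 7 and 9 concrete, here is a minimal single-node, in-memory sketch in Java. It is only an illustration under my own assumptions, not how any existing database does it: the names CommentStore, Comment, subtree and reparent are hypothetical, and sharding, indexing, persistence and the REST layer are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory store for threaded comments (illustration only). */
final class CommentStore {

    /** One comment; parentId is null for a top-level comment. */
    static final class Comment {
        final String id;
        String parentId;
        final String body;
        final List<String> childIds = new ArrayList<>();

        Comment(String id, String parentId, String body) {
            this.id = id;
            this.parentId = parentId;
            this.body = body;
        }
    }

    private final Map<String, Comment> byId = new HashMap<>();

    /** Adds a comment; the parent (if any) must already exist. */
    public void add(String id, String parentId, String body) {
        if (byId.containsKey(id)) {
            throw new IllegalArgumentException("duplicate id: " + id);
        }
        if (parentId != null) {
            require(parentId).childIds.add(id);
        }
        byId.put(id, new Comment(id, parentId, body));
    }

    /** Feature 5: retrieve any comment by ID (null if absent). */
    public Comment get(String id) {
        return byId.get(id);
    }

    /**
     * Features 6 and 7: ids of the subtree rooted at id, in pre-order.
     * maxDepth 0 returns the comment alone, 1 adds direct replies, and so on;
     * a negative maxDepth means no depth limit.
     */
    public List<String> subtree(String id, int maxDepth) {
        List<String> out = new ArrayList<>();
        collect(require(id), maxDepth, out);
        return out;
    }

    private void collect(Comment c, int depthLeft, List<String> out) {
        out.add(c.id);
        if (depthLeft == 0) {
            return;
        }
        for (String childId : c.childIds) {
            collect(byId.get(childId), depthLeft - 1, out);
        }
    }

    /** Feature 9: moves a comment and its whole subtree under a new parent. */
    public void reparent(String id, String newParentId) {
        Comment c = require(id);
        Comment newParent = require(newParentId);
        // Walk from the new parent up to the root; meeting id means a cycle.
        for (Comment p = newParent; p != null;
             p = (p.parentId == null) ? null : byId.get(p.parentId)) {
            if (p.id.equals(id)) {
                throw new IllegalArgumentException(
                        "cannot move a comment under its own descendant");
            }
        }
        if (c.parentId != null) {
            byId.get(c.parentId).childIds.remove(id);
        }
        c.parentId = newParentId;
        newParent.childIds.add(id);
    }

    private Comment require(String id) {
        Comment c = byId.get(id);
        if (c == null) {
            throw new IllegalArgumentException("unknown id: " + id);
        }
        return c;
    }
}
```

Because each comment stores only its parent ID and child IDs, re-parenting touches three records no matter how large the moved subtree is. The trade-off in this sketch is that a depth-limited read has to walk the tree one comment at a time.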

What I'm looking for are exceptional pieces of code or specific algorithms that I can study before digging into this project. Can anyone suggest a few places to get started?

Much of this will be a feature of the app you build rather than the database you use. The rest (possibly excluding full-text search), any existing NoSQL database should be able to handle. Why exactly can't you use an already existing DB? –  cHao Aug 8 '12 at 3:30 
 
Do you want to write your own, or do you want to use one that you like and that is written in Java? –  Edmon Aug 8 '12 at 3:30
 
About 80% of the reason for wanting to write my own is for fun; the other 20% is because I have never been fully satisfied with the traditional solutions for storing nested comments. I think it would be cool to be able to fire up a cluster to store/search Reddit-scale volumes of comments. –  bostonBob Aug 8 '12 at 3:52

Since your question is tagged Java, I suggest looking into OrientDB.

Here is the source code:

http://code.google.com/p/orient/source/browse/

and the architecture:

http://code.google.com/p/orient/wiki/Presentations

For the big-boy stuff (clustering, hyper-scaling), take a look at HBase and Accumulo:

http://hbase.apache.org/source-repository.html

http://accumulo.apache.org/source.html

Hope this helps.
Edmon
