Creating a NoSQL database: what is the best source code to look at?

Reposted 2015-07-07 13:23:49

I have always wanted a NoSQL database that was purpose-built for storing large volumes of nested/threaded comments. The implementation would probably be in Java because that is what I am best at. I really like how dead simple it is to set up an Elasticsearch cluster and throw data into it, and I want my product to share those qualities. Here are the features I have in mind:

1) auto/manual sharding across clusters
2) auto/manual indexing across clusters
3) full-text search (probably via Lucene or Elasticsearch)
5) retrieve any comment by ID
6) comments can be retrieved with or without child nodes
7) comment trees can be retrieved with a specified depth
8) comment trees can be retrieved filtered by time or rank
9) entire comment trees can be re-parented
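
To make features 5, 6, 7 and 9 concrete, here is a minimal single-node, in-memory sketch of the tree operations. It is only an illustration under stated assumptions: the names `CommentStore`, `Comment`, `subtree` and `reparent` are hypothetical and not taken from any existing database, and sharding, indexing, search and time/rank filtering are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory comment store; an assumption for illustration only. */
class CommentStore {
    /** One comment plus links to its parent and children. */
    static final class Comment {
        final String id;
        final String text;
        final long createdAt;                       // epoch millis
        String parentId;                            // null for a top-level comment
        final List<String> childIds = new ArrayList<>();

        Comment(String id, String parentId, String text, long createdAt) {
            this.id = id;
            this.parentId = parentId;
            this.text = text;
            this.createdAt = createdAt;
        }
    }

    private final Map<String, Comment> byId = new HashMap<>();

    /** Adds a comment; parentId must be null or refer to an existing comment. */
    Comment add(String id, String parentId, String text, long createdAt) {
        if (byId.containsKey(id)) {
            throw new IllegalArgumentException("duplicate id: " + id);
        }
        if (parentId != null && !byId.containsKey(parentId)) {
            throw new IllegalArgumentException("unknown parent: " + parentId);
        }
        Comment c = new Comment(id, parentId, text, createdAt);
        byId.put(id, c);
        if (parentId != null) {
            byId.get(parentId).childIds.add(id);
        }
        return c;
    }

    /** Feature 5: look up any comment by ID (null if absent). */
    Comment get(String id) {
        return byId.get(id);
    }

    /**
     * Features 6 and 7: IDs of the subtree rooted at id, in pre-order.
     * maxDepth 0 returns only the comment itself; a negative value means no limit.
     */
    List<String> subtree(String id, int maxDepth) {
        List<String> out = new ArrayList<>();
        collect(id, maxDepth, out);
        return out;
    }

    private void collect(String id, int remaining, List<String> out) {
        Comment c = byId.get(id);
        if (c == null) {
            return;
        }
        out.add(id);
        if (remaining == 0) {
            return;
        }
        for (String child : c.childIds) {
            collect(child, remaining - 1, out);
        }
    }

    /** Feature 9: move the whole subtree rooted at id under newParentId. */
    void reparent(String id, String newParentId) {
        Comment c = byId.get(id);
        if (c == null || !byId.containsKey(newParentId)) {
            throw new IllegalArgumentException("unknown comment or parent");
        }
        // Walk up from the new parent; meeting id means the move would create a cycle.
        for (String p = newParentId; p != null; p = byId.get(p).parentId) {
            if (p.equals(id)) {
                throw new IllegalArgumentException("cannot move a comment under its own subtree");
            }
        }
        if (c.parentId != null) {
            byId.get(c.parentId).childIds.remove(id);
        }
        c.parentId = newParentId;
        byId.get(newParentId).childIds.add(id);
    }
}
```

In this sketch re-parenting only touches the moved comment and its old and new parents, because each node stores a parent pointer rather than a full path. A design that stores materialized paths would make depth-limited reads cheaper but would have to rewrite every descendant on a move, which is the kind of trade-off I want to study.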

What I'm looking for are exceptional pieces of code or specific algorithms that I can study before digging into this project. Can anyone suggest a few places to get started?

Much of this will be a feature of the app you build rather than the database you use. The rest (possibly excluding full-text search), any existing NoSQL database should be able to handle. Why exactly can't you use an already existing DB? –  cHao Aug 8 '12 at 3:30 
Do you want to write your own, or do you want to use one that you like and that is written in Java? –  Edmon Aug 8 '12 at 3:30
About 80% of the reason for wanting to write my own is for fun; the other 20% is that I have never been fully satisfied with the traditional solutions for storing nested comments. I think it would be cool to be able to fire up a cluster to store and search Reddit-scale volumes of comments. –  bostonBob Aug 8 '12 at 3:52

Since your question is tagged Java, I suggest looking into OrientDB.

Here is the source code:

and the architecture:

For the big-boy stuff (clustering, hyper-scaling), take a look at HBase and Accumulo:

Hope this helps.
