database/nosql
macyang
Chance is waiting for prepared people and my Status is read the fucking source code.
Oracle Advanced Queuing
1. What Is Advanced Queuing? When Web-based business applications communicate with each other, producer applications enqueue messages and consumer applications dequeue messages. Advanced Queuing...
Original · 2009-11-26 23:48:00 · 899 views · 0 comments
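Do You Understand All of These HBase Configuration Parameters?
During testing I found that understanding what each of these parameters means is very important, and that tuning them can improve performance, so please read carefully what each property stands for!
<?xml version="1.0"?><?xml-stylesheet type="text/xsl" href="configuration.xsl"?><!--/** * Copyright 2009 The Apache Software Foundation * * Licensed to the Apache Softw...
Original · 2011-02-27 13:14:00 · 23405 views · 1 comment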
Thoughts on Redis
Source (retrieved through a proxy from beyond the firewall): Thoughts on Redis
Like many people already doing so, I’ve been digging into Redis and other NoSQL products.
Unlike all those anti-SQL fanboys, I have 15+ years of experience in RDBMS, and I know the relational algebra has a l...
Original · 2011-03-02 13:12:00 · 1339 views · 0 comments
HBase Memstore Flush
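I had always assumed that HBase only flushes the Memstores that have reached the threshold. I knew that the configuration parameter hbase.hregion.memstore.flush.size controls this threshold, but I had not realized that it applies at the HRegion level. Only after reading the test results in Visualizing HBase Flushes And Compactions did I discover that my understanding had been wrong. A quick Google search turned up a ticket on the Apache JIRA that raises the following issue: Today, the flush decision is...
Original · 2011-03-01 22:58:00 · 4648 views · 0 comments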
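The region-level flush decision described in the HBase Memstore Flush entry can be pictured with a toy model. This is only a sketch under stated assumptions: the `Region` class, its methods, and the 64 MB threshold are hypothetical and are not HBase code. The one behaviour it models is the one the entry reports, namely that the threshold is compared against the total memstore size of the region and that a flush then writes out every memstore in it.

```python
class Region:
    """Toy model: one memstore per column family, flush decided per region."""

    def __init__(self, families, flush_size):
        # flush_size stands in for hbase.hregion.memstore.flush.size.
        self.flush_size = flush_size
        self.memstores = {family: 0 for family in families}
        self.store_files = {family: [] for family in families}

    def put(self, family, nbytes):
        self.memstores[family] += nbytes
        # The check uses the total across all families, not one memstore.
        if sum(self.memstores.values()) >= self.flush_size:
            self.flush()

    def flush(self):
        # Every non-empty memstore is written out, however small it is.
        for family, size in self.memstores.items():
            if size > 0:
                self.store_files[family].append(size)
                self.memstores[family] = 0


MB = 1024 * 1024
region = Region(["big", "small"], flush_size=64 * MB)  # hypothetical threshold
for _ in range(10):
    region.put("small", 1024)     # a trickle of writes
    region.put("big", 32 * MB)    # the bulk of the data

print({family: len(files) for family, files in region.store_files.items()})
print(region.store_files["small"])
```

In this model the `small` family ends up with as many store files as `big`, each only 2048 bytes, which is the kind of side effect the entry is pointing at.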
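HBase Region Routing, Assignment, and Splitting
I plan to study three HBase Region operations in detail, in the following three parts. Region routing: when you want to read a key/value, for example, you have to determine which Region Server stores it, and that is the region routing problem. The article “HBase中的Client如何路由到正确的RegionServer” (how the HBase client routes to the correct RegionServer) already covers this part in great detail, so I will not say more here; follow the link and read it carefully if you are interested. The most important Family in the table is info, which contains three Columns: regioninfo, server, serverstartco...
Original · 2011-03-05 00:11:00 · 6729 views · 0 comments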
NoSQL Solution: Evaluation and Comparison: MongoDB vs Redis, Tokyo Cabinet, and Berkeley DB [CHART]
Source: http://perfectmarket.com/blog/not_only_nosql_review_solution_evaluation_guide_chart
Below the original post, the author of Redis also added his own comments, which are worth a careful look!
You may think this is yet another blog on NoSQL (Not Only SQL) hype.
Yes, it is.
But if at thi...
Original · 2011-03-02 20:20:00 · 1692 views · 0 comments
Automating partitioning, sharding and failover with MongoDB
Source: http://blog.boxedice.com/2010/08/03/automating-partitioning-sharding-and-failover-with-mongodb/
Two of the most eagerly anticipated features for MongoDB, the database backend we use for our server monitoring service, Server Density...
Original · 2011-02-14 17:33:00 · 1140 views · 0 comments
Notes on MongoDB
MongoDB was at version 1.3 when the author wrote this article. The issues are explained very clearly and it is worth a careful read.
Source: http://www.paperplanes.de/2010/2/25/notes_on_mongodb.html
For an article in a German magazine I've been researching MongoDB over the last week or so. While I didn't need a lot...
Original · 2011-02-15 14:12:00 · 1059 views · 0 comments
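MongoDB Monitoring: Keep in it RAM
Source: http://blog.boxedice.com/2010/12/13/mongodb-monitoring-keep-in-it-ram/
The gist is that when using MongoDB you should make sure the space taken by indexes is smaller than physical memory. If the indexes take up too much space, consider sharding (otherwise index reads may have to go to disk, which is even slower!). Likewise, if index size plus data size exceeds physical memory, read performance will not be good either. I used mongostat, a tool that ships with MongoDB, to look at page fa...
Original · 2011-03-03 18:41:00 · 1354 views · 0 comments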
MongoDB: Index Size and Memory & Possible performance impact Options
This is a discussion from the mongodb-user Google Group. The poster starts with the following question:
Hi,
on our main mongo box, we have 4G memory (wohooo). Our data size is 1G and index size is 5.6G. This box is 1 node of a 3 node replica set. We have 1 collection and a f...
Original · 2011-03-03 19:03:00 · 1420 views · 0 comments
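Is there a limit to the number of columns in an HBase row?
Someone on Quora asked the question below. I had read it before without paying much attention to the comments, and I had even tested it myself and confirmed that a row is not split when it grows very large. Reading it again today, I see that it already contains a lot of information:
I am wondering if I should have lot of rows or lot of columns in a row? which is better if I have to index them as well
Todd gives three main reasons for not recommending stuffing a large number of columns into one...
Original · 2011-03-04 09:28:00 · 1413 views · 0 comments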
What are the problems that a NoSQL database tries to solve?
A comment by Edmond Lau on Quora that summarizes things very well; follow the link for more comments.
Source: http://www.quora.com/What-are-the-problems-that-a-NoSQL-database-tries-to-solve?redirected_qid=194774
The main problems that a NoSQL aims to solve typically revolv...
Original · 2011-02-14 20:42:00 · 787 views · 0 comments
Allow regions of specific table to be load-balanced
Description: From our experience, cluster can be well balanced and yet, one table's regions may be badly concentrated on few region servers. For example, one table has 839 regions (380 regions at time of table creation) out of which 202 are on one server. I...
Original · 2011-02-15 18:18:00 · 1408 views · 0 comments
Notes on NoSQL Databases v0.2 (NoSQL数据库笔谈 v0.2)
Yan Kai (颜开) has compiled a fairly complete set of material on NoSQL databases.
The link below is the original post on his blog, but the links inside it now always ask for a username/password:
http://www.yankay.com/nosql%E6%95%B0%E6%8D%AE%E5%BA%93%E7%AC%94%E8%B0%88v0-2/
So it is better to read the PDF he put on Google Docs directly:
https://doc...
Original · 2011-02-16 09:23:00 · 1228 views · 0 comments
How are bloom filters used in HBase?
Lars George and Nicolas Spiegelberg, two of HBase's main committers, explain on Quora how bloom filters are used in HBase, with reference to the issue https://issues.apache.org/jira/browse/HBASE-1200.
Source: http://www.quora.com/How-are-bloom-filters-used-in-HBase
Lars George, HBase Committer
The bloom filters in HBa...
Original · 2011-02-13 18:29:00 · 3333 views · 0 comments
HFile: A Block-Indexed File Format to Store Sorted Key-Value Pairs
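Schubert Zhang is also an HBase committer. Besides introducing HFile, and more importantly, he gives a performance test report for HBase 0.20.0 in this article and briefly analyzes the results.
Article source: http://cloudepr.blogspot.com/2009/09/hfile-block-indexed-file-format-to.html
Performance test report: http://www.slideshare.net/schubertzhang/hbase-0200-performance-evaluation
1. I...
Original · 2011-02-16 11:11:00 · 1709 views · 0 comments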
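BigTable Model with Cassandra and HBase
This article comes from: http://aio4s.com/blog/2010/11/08/technology/bigtable-model-cassandra-hbase.html
Some of its content is a little off (the region routing process, for example), so readers need to judge for themselves what is right and what is wrong, but overall the article is very well written.
Of course, I still recommend reading the Google Bigtable paper carefully several times; you are sure to get something new out of it each time:
http://labs.googl...
Original · 2011-02-16 14:35:00 · 1061 views · 0 comments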
Finding the most accessed Table/Region on an HBase Region Server
Source: http://whynosql.com/finding-the-most-accessed-tableregion-on-an-hbase-region-server/
(Requires a proxy to get past the firewall!)
Each HBase region server hosts many regions, possibly hundreds or even thousands. How do you find out which one of them is a hots...
Original · 2011-02-17 12:46:00 · 841 views · 0 comments
How does HBase perform load balancing?
MauMau asks the following question (the HBase version is presumably 0.20.xx):
[Q1] Load balancing
Does HBase move regions to a newly added region server (logically, not physically on storage) immediately? If not immediately, what timing? On what criteria does the master unassign and assign regions...
Original · 2011-03-07 14:13:00 · 1061 views · 0 comments
Write Performance: HBase VS Cassandra with consistency level ALL
Source: http://www.quora.com/How-does-HBase-write-performance-differ-from-write-performance-in-Cassandra-with-consistency-level-ALL
While setting a write consistency level of ALL with a read level of ONE in Cassandra provides a strong...
Original · 2011-03-07 22:39:00 · 7292 views · 0 comments
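Some Thoughts on Learning Redis
1) I tried Redis briefly, testing the common Redis commands through the web-based command line to get an intuitive feel for what Redis can do.
2) Because Redis is a data structure server, you need to understand what functionality (commands) each structure provides and then choose the Redis data structure that fits your actual situation.
3) Dig into the design of Redis, mainly through the official documentation. This helps you understand its architecture and see which features it offers (such as replication, snaps...
Original · 2011-03-10 22:40:00 · 814 views · 0 comments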
HBase does not support MemStore to BlockCache
Original post: http://web.archiveorange.com/archive/v/kaweJLpl8BCoXAfi3S8p
The person who asked this question has some good ideas. Just today I was wondering whether HBase writes pass through the LruBlockCache; after reading the code, the answer is no.
we are trying to read efficiently a hot column family (in_memory=true, blockcaching=true) that get writ...
Original · 2011-02-21 21:40:00 · 916 views · 0 comments
Visualizing HBase Flushes And Compactions
Something this good I really cannot keep to myself; the author, Bruno Dumon, wrote it very well!
Source: http://outerthought.org/blog/465-ot.html
I was looking into more detail at how HBase compactions work, and given my experience collecting metrics for Lily, and also inspired by th...
Original · 2011-02-22 22:05:00 · 1327 views · 0 comments
How to Choose a MongoDB Shard Key
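This article is quite well written. When you run MongoDB in sharding mode, choosing a good shard key matters a great deal: it affects how many chunks the balancer will later have to move for you, how many shards a client request is routed to, and so on.
Source: http://techojito.posterous.com/how-to-choose-a-mongodb-shard-key
How to Choose a MongoDB Shard Key
One of the benefits of MongoDB is it...
Original · 2011-03-12 23:44:00 · 2024 views · 0 comments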
MongoDB Sharding: A Detailed Overview and 15 Minute High Speed Read
Scaling is a key feature of MongoDB. And even though manual sharding is supported by most databases, MongoDB supports the concept of autosharding. This 15 minute high speed post provides a detailed overview of autosharding in MongoDB and, speci...
Original · 2011-03-13 19:32:00 · 1007 views · 0 comments
MongoDB sharding: understand it first
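1. What is a shard?
2. When to shard?
   1) When disk space runs out
   2) When a single mongod cannot keep up with the requests that clients send
   3) When you want more of the data to be held in memory
3. Incrementing Shard Keys Versus Random Shard Keys?
Choosing an incrementing shard key, such as a timestamp, is bad for writes: all write requests end up being sent to a single shard, which leaves that shard heavily loaded while the other shards sit idle. Also, do not choose a shard key...
Original · 2011-03-13 22:56:00 · 1214 views · 0 comments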
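The write hotspot that the "MongoDB sharding: understand it first" entry attributes to incrementing shard keys can be illustrated with a small simulation. This is a minimal sketch, assuming a hypothetical cluster of four shards that each own one fixed key range; the split points, the `shard_for` and `scrambled` helpers, and the key values are all made up for illustration. It ignores chunk splitting and the balancer, so it shows only where new writes land at one moment.

```python
import bisect
import hashlib
from collections import Counter

# Hypothetical cluster: 4 shards, each owning one contiguous key range.
# SPLIT_POINTS are the boundaries between neighbouring ranges.
SPLIT_POINTS = [250_000, 500_000, 750_000]
NUM_SHARDS = len(SPLIT_POINTS) + 1


def shard_for(key: int) -> int:
    """Range-based routing: index of the range that contains the key."""
    return bisect.bisect_right(SPLIT_POINTS, key)


def scrambled(key: int) -> int:
    """Map a key to a pseudo-random position in [0, 1_000_000)."""
    digest = hashlib.md5(str(key).encode()).digest()
    return int.from_bytes(digest[:8], "big") % 1_000_000


# 10,000 new documents whose timestamp-like keys lie above every split point.
new_keys = range(900_000, 910_000)

incrementing = Counter(shard_for(k) for k in new_keys)
randomized = Counter(shard_for(scrambled(k)) for k in new_keys)

print("incrementing:", dict(incrementing))
print("randomized:  ", dict(randomized))
```

With the incrementing keys, every one of the 10,000 writes is routed to the last shard; with the scrambled keys, the writes are spread across all four ranges.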
Translate SQL to MongoDB MapReduce
I keep hearing people complaining that MapReduce is not as easy as SQL. But there are others saying SQL is not easy to grok. I’ll keep myself away from this possible flame war and just point you out to this ☞ SQL to MongoDB translation PDF put toge...
Original · 2011-03-14 19:39:00 · 2104 views · 0 comments
HBase read tuning tip
Joel asks the following question:
=================
Hi All,
I have an application with two HBase tables.
One table is written to frequently, by a crawler writing web pages.
Another table is wri...
Original · 2011-02-21 13:22:00 · 765 views · 0 comments
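Ways to Import Data from MySQL into MongoDB
These are the ways I currently know of to import data from MySQL into MongoDB; I will add more as I find them.
1) Write your own program that selects the data from MySQL and then calls insert to write it into MongoDB.
2) Use MySQL tools to export the data as CSV/JSON files, then import them with mongoimport, which ships with MongoDB.
(When the data volume is very large, pre-splitting plus several mongoimport processes can speed up the import.)
...
Original · 2011-03-17 00:18:00 · 14367 views · 1 comment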
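Method 2 in the MySQL-to-MongoDB entry can be sketched roughly as follows. This is a sketch under stated assumptions: the rows are a hard-coded stand-in for the result of a MySQL SELECT, and the file names, the `mydb` database, and the `users` collection are hypothetical. Splitting the export into several CSV files is only one reading of "pre-splitting"; the excerpt does not say whether the author meant this or pre-splitting chunks in a sharded cluster. The script builds the mongoimport command lines but does not run them.

```python
import csv
import os
import tempfile


def write_csv_parts(header, rows, parts, out_dir):
    """Write rows round-robin into `parts` CSV files, each with a header line."""
    paths = [os.path.join(out_dir, f"part_{i}.csv") for i in range(parts)]
    files = [open(p, "w", newline="", encoding="utf-8") for p in paths]
    try:
        writers = [csv.writer(f) for f in files]
        for writer in writers:
            writer.writerow(header)
        for n, row in enumerate(rows):
            writers[n % parts].writerow(row)
    finally:
        for f in files:
            f.close()
    return paths


def mongoimport_command(path, db, collection):
    """Build (not run) a mongoimport invocation for one CSV part."""
    return ["mongoimport", "--db", db, "--collection", collection,
            "--type", "csv", "--headerline", "--file", path]


# Stand-in for rows fetched from MySQL with a SELECT.
header = ["id", "name"]
rows = [(i, f"user{i}") for i in range(10)]

out_dir = tempfile.mkdtemp()
paths = write_csv_parts(header, rows, parts=3, out_dir=out_dir)
for path in paths:
    # Each command could be started as its own process to import in parallel.
    print(" ".join(mongoimport_command(path, "mydb", "users")))
```

Each printed command imports one part, with `--headerline` telling mongoimport to take the field names from the first line of the file.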
Mongodb --- Manual sharding
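I recently saw a discussion about manual sharding on the Google Group. I have not tried it in practice yet, but the approach looks feasible. As everyone knows, Google Groups requires a proxy to get past the firewall, so I am posting it here for easy reference.
Zer0's question:
-----------------------------
Sorry for my English I've read all the documents at home page and search many other sites but I still can not c...
Original · 2011-03-16 23:42:00 · 1673 views · 0 comments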
Redis Pays 2x the Memory for Persistence: Is It Worth It?
Below is an article by the author of Redis that explains why Redis does not use a compaction approach to merge the AOF file!
Source: A few key problems in Redis persistence, Saturday, 02 October 10
Recommended reading first: http://redis.io/topics/persistence
Redis: the strength is the data model, and the...
Original · 2011-03-17 23:10:00 · 2814 views · 0 comments
The Problem with group Implemented via MongoDB MapReduce
A group written with MapReduce is just too slow!!
1) Source: MongoDB: The Definitive Guide
The price of using MapReduce is speed: group is not particularly speedy, but MapReduce is slower and is not supposed to be used in “real time.” You run MapReduce as a background job, it creates a collection of res...
Original · 2011-03-17 18:56:00 · 4725 views · 0 comments
Building indexes using HBase: mapping strings, numbers and dates onto bytes
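The article below is Bruno Dumon's explanation of how to build secondary indexes on top of HBase. Besides the article he also provides a concrete library, which I will use first for some simple functional verification.
Source: http://brunodumon.wordpress.com/2010/02/17/building-indexes-using-hbase-mapping-strings-numbers-and-dates-onto-bytes/
Implementation: http://www.lilyproject.org/lily/...
Original · 2011-05-09 23:15:00 · 1960 views · 0 comments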
MongoDB Schema Design
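When moving from a relational database to document storage, perhaps the easiest mistake to make is to carry the old table-design mindset over to the new structure.
MongoDB documentation reading notes: elegant NoSQL (MongoDB 文档阅读笔记 优雅的 NoSQL)
Schema Design
Schema Design with MongoDB - April 27
MongoDB Schema Design
Original · 2011-03-21 15:08:00 · 1317 views · 0 comments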
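HBase File Locality in HDFS
Shame on me: I had already reposted the earlier articles from beyond the firewall, but I forgot this one.
The gist of the article is whether HBase ensures that the data managed by a RegionServer can be read locally, or at least from the nearest location.
Source: http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html (requires a proxy)
Chinese translation: http://www.spnguru.com/?p=42
Original · 2011-03-23 00:06:00 · 2140 views · 0 comments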
Excessive Disk Space
You may notice that for a given set of data the MongoDB datafiles in /data/db are larger than the data set inserted into the database. There are several reasons for this.
local.* files and replication
The replication oplog is preallocate...
Original · 2011-03-23 12:23:00 · 640 views · 0 comments
Secondary indexes in HBase
Creating secondary indexes in HBase-0.19.3:
You need to enable indexing in HBase before you can create a secondary index on columns. Edit the file $HBASE_INSTALL_DIR/conf/hbase-site.xml and add the following property to it.
...
Original · 2011-05-11 23:58:00 · 2143 views · 0 comments
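How Many Column Families Should You Choose in HBase?
The main point below is that when designing an HBase schema you should try to have only one column family. The reason comes down to flush and compaction, both of which are triggered at the Region level. When one column family holds a large amount of data, the flush it triggers also flushes the memstores of the other column families in the region, even though those memstores may hold only a little data and do not need flushing yet. In addition, compaction is triggered when the number of store files (not the total sto...
Original · 2011-05-14 20:56:00 · 9025 views · 1 comment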
Hbase manual split
HBase 0.90.2 provides a RegionSplitter class (org.apache.hadoop.hbase.util.RegionSplitter) for manually splitting regions. The text below mainly covers how to turn off automatic region splitting, why to use manual splits, what benefits manual splitting brings, and how many pre-split regions is appropriate.
The RegionSplitter class p...
Original · 2011-05-14 19:58:00 · 2647 views · 0 comments
App Engine datastore tip: monotonically increasing values are bad
The key phrase in the title is "monotonically increasing values are bad". Among the NoSQL databases I know, HBase and MongoDB both have this problem, so how you handle monotonically increasing row keys is critical. The diagrams drawn by the author, Ikai Lan, are also fun, really great!
Original post: http://ikaisays.com/2011/01/25/app-engine-datastore-tip-monotonically-increasing-values-are-bad/
Original · 2011-05-14 21:23:00 · 1633 views · 0 comments