MongoDB Sharding: A Detailed Overview and 15 Minute High Speed Read

Latest recommended article published on 2024-07-23 19:46:13

macyang

Category: database/nosql. Tags: sharding, mongodb, database, collections, server, application

Article link: https://blog.csdn.net/macyang/article/details/6246729

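Note: the configuration process here starts the config server and the routing server (mongos) first, and only then adds the shard server. My own habit is to start the shard server first!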

Scaling is a key feature of MongoDB. Although manual sharding can be done with most databases, MongoDB also supports autosharding. This 15 minute high speed post provides a detailed overview of autosharding in MongoDB and, specifically, how to create shards that support it.

The process of splitting up data and storing portions of it on different machines is called sharding. By splitting up data across machines, it becomes possible to store more data and handle much more load without requiring large or powerful machines, e.g., machines with powerful CPUs and/or massive amounts of RAM.

There are two types of sharding: manual sharding and autosharding.

In manual sharding, the application code manages storing different data on different servers and querying the appropriate server to get it back. Manual sharding can be done with virtually any database software package.
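
As a rough illustration of what manual sharding asks of application code, the sketch below picks a server for each record by hashing its key. The server names and the hash function are hypothetical, chosen only for this example; they are not part of MongoDB or of any particular database.

```javascript
// Hypothetical server list: in manual sharding the application must know every server.
const servers = ["db1.example.com", "db2.example.com", "db3.example.com"];

// Simple string hash (illustrative only, not a production-quality hash).
function hashKey(key) {
  let hash = 0;
  for (const ch of key) {
    hash = (hash * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit integer
  }
  return hash;
}

// The application decides where each record lives, for both writes and reads.
function serverFor(key) {
  return servers[hashKey(key) % servers.length];
}

console.log(serverFor("smith"));
```

With this scheme, adding or removing a server changes `servers.length` and therefore where most keys map, so the application would have to move data itself. That is the kind of overhead autosharding is meant to remove.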

In MongoDB autosharding, some of the administrative overhead required in manual sharding is eliminated. The cluster of database servers, or shards, splits up and rebalances data automatically.

Autosharding

The basic concept behind MongoDB’s sharding is to break up collections into smaller chunks, each holding a range of documents. These chunks can be distributed across shards so that each shard is responsible for a subset of the total dataset.

As an example, consider the following. When you set up sharding you choose a key from a collection and use that key’s values to split up the data. This key is called the shard key.

Suppose we had a collection of contacts. If we chose “lastName” as our shard key, one shard could hold documents where “lastName” starts with A-F, the next shard could hold last names from G-P, and the final shard could hold last names Q-Z. As you add or remove shards, MongoDB would rebalance this data so that each shard was getting a balanced amount of traffic and a practical amount of data.
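
The contacts example can be sketched as a lookup from the first letter of “lastName” to a shard. This is only a model of the idea: the shard names and the `shardFor` helper are hypothetical, and MongoDB tracks and adjusts the real ranges itself.

```javascript
// Hypothetical ranges mirroring the contacts example; MongoDB manages these itself.
const ranges = [
  { shard: "shard1", from: "A", to: "F" },
  { shard: "shard2", from: "G", to: "P" },
  { shard: "shard3", from: "Q", to: "Z" },
];

// Return the shard whose range covers the first letter of lastName.
function shardFor(lastName) {
  const initial = lastName.charAt(0).toUpperCase();
  const range = ranges.find((r) => initial >= r.from && initial <= r.to);
  return range ? range.shard : null; // null when the name starts outside A-Z
}

console.log(shardFor("Evans"));  // shard1
console.log(shardFor("Nguyen")); // shard2
console.log(shardFor("Smith"));  // shard3
```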

So when should you decide to start sharding? Consider the following reasons:

You’ve run out of disk space on your current machine.
You want to write data faster than a single mongod can handle.
You want to keep a larger portion of data in memory to improve performance.

Setting up Sharding

Three different components are involved in sharding:

shard

A shard is a container that holds a subset of a collection’s data. A shard is either a single mongod server (for development/testing), or a replica set (for production).

mongos

This is the router process. It routes requests and aggregates responses. It doesn’t store any data or configuration information, although it does cache information from the config servers.

config server

Config servers store the configuration of the cluster, for example which data is located on which shard. mongos uses this information to determine request routing.
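
To make the division of roles concrete, here is a toy in-memory model of the three components. Everything in it (the `shards` object, `configMetadata`, and the `find` function) is a hypothetical stand-in written for this example and says nothing about how MongoDB implements them. It assumes a query on the shard key is sent to the matching shard only, while any other query is sent to every shard.

```javascript
// Shard role: each shard holds a subset of the collection's documents.
const shards = {
  shard1: [{ lastName: "Adams" }, { lastName: "Evans" }],
  shard2: [{ lastName: "Nguyen" }],
};

// Config server role: records which key range lives on which shard.
const configMetadata = [
  { shard: "shard1", from: "A", to: "F" },
  { shard: "shard2", from: "G", to: "Z" },
];

// mongos role: consult the metadata, forward the request, merge the responses.
function find(query) {
  let targets = Object.keys(shards); // no shard key in the query: ask every shard
  if (query.lastName !== undefined) {
    const initial = query.lastName.charAt(0).toUpperCase();
    targets = configMetadata
      .filter((r) => initial >= r.from && initial <= r.to)
      .map((r) => r.shard);
  }
  const results = targets.flatMap((name) =>
    shards[name].filter((doc) =>
      Object.entries(query).every(([key, value]) => doc[key] === value)
    )
  );
  return { targets, results };
}

console.log(find({ lastName: "Nguyen" }).targets); // only shard2 is contacted
console.log(find({}).results.length);              // all three documents are returned
```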

Starting the Servers

First we need to start up our config server and mongos. The config server comes first because mongos uses it to get its configuration.

$ mkdir -p ~/dbs/config
$ ./mongod --dbpath ~/dbs/config --port 20000

Now we start a mongos process for an application to connect to. Routing servers do not even need a data directory, but they need to know the location of the config server.

$ ./mongos --port 30000 --configdb localhost:20000

Shard administration is always done through a mongos.

Adding a Shard

Start a normal mongod instance (or replica set), since that is all a shard is.

$ mkdir -p ~/dbs/shard1
$ ./mongod --dbpath ~/dbs/shard1 --port 10000

Now connect to the mongos process started earlier and add the shard to the cluster.

First, start up a shell connected to the mongos process (port 30000 in this example).

Then add this shard with the addshard database command.

The “allowLocal” key is necessary only if you are running the shard on localhost and lets MongoDB know that you’re in development and know what you are doing.
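
A minimal sketch of these two steps, assuming the shell was started against the admin database of the mongos on port 30000 (for example with ./mongo localhost:30000/admin). The shard address is assumed from the mongod started on port 10000 above, and the command name and the “allowLocal” key follow the text; the variable name is made up for this example.

```javascript
// Command document for adding the shard started on port 10000 (address assumed from above).
var addShardCommand = {
  addshard: "localhost:10000", // host:port of the shard's mongod
  allowLocal: true             // only needed because the shard runs on localhost
};

// `db` and `printjson` exist only inside the mongo shell, so the call is guarded.
// Inside a shell connected to mongos on the admin database this runs the command.
if (typeof db !== "undefined") {
  printjson(db.runCommand(addShardCommand));
}
```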

Sharding Data

In order to allow MongoDB to distribute data, you have to explicitly turn sharding on at both the database and collection levels. The first step is to enable sharding for the database, which is acme in this example.
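
A minimal sketch of enabling sharding for acme, assuming the enablesharding database command is run from a shell connected to mongos on the admin database. The variable name is made up for this example.

```javascript
// Command document; "acme" is the database name used in the text.
var enableShardingCommand = { enablesharding: "acme" };

// `db` and `printjson` exist only inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.runCommand(enableShardingCommand));
}
```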

Once you’ve enabled sharding on the acme database, a collection is sharded by running the shardcollection command as follows:

> db.runCommand({"shardcollection" : "acme.products", "key" : {"_id" : 1}})

Now the collection will be sharded by the “_id” key. When data is added to acme.products, it will be automatically distributed across the shards based on the values of “_id”.

I hope this post enlightens you on the possibilities that MongoDB’s auto-sharding feature provides for ease of scaling.

Source: http://blog.brianbuikema.com/2011/01/mongodb-sharding-a-detailed-overview-and-15-minute-high-speed-read/
