CMS - Configuration management service based on MongoDB

Article link: https://blog.csdn.net/ebay/article/details/43529485

Author: Su, Ralph

Abstract

A configuration management database (CMDB) is commonly used to store the configuration items managed within an organization or company. A CMDB is typically designed as a centralized database access point.

As the CMDB of eBay's cloud service, CMS is a configuration management service built on top of MongoDB. It provides a REST service with its own query language. CMS now stores eBay marketplace configuration items ranging from assets and networks to application service topologies. Handling a peak load of 10 million requests per day, CMS serves as a reliable infrastructure service for eBay's cloud service.

In this article, we present the architectural considerations behind CMS and its design.

Architecture Considerations

Why MongoDB?

Unlike most current CMDBs, which are built on relational databases, CMS uses MongoDB as its backend database. This choice has several pros and cons.

Pros

1. CMS is designed to store configuration items, so its data set is small enough to fit in memory, which is where MongoDB performs best.

2. CMS is designed to serve more read requests than write requests. Combined with #1, MongoDB can provide easily maintained read scalability through replica set deployment.

3. CMS is not designed to provide the strong transactions of an RDB. Instead, to ensure data consistency, CMS provides MVCC (multi-version concurrency control) at the object level.

4. A CMDB requires frequent schema change and evolution, which is harder to do with a typical RDB migration. MongoDB's schema-less document design makes it a feasible option for a CMDB.
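
The object-level MVCC mentioned in item 3 can be illustrated with a minimal optimistic version check. This is a sketch under assumptions, not the actual CMS implementation: the in-memory `EntityStore`, the `_version` field, and `VersionConflict` are hypothetical names, and a plain dict stands in for a MongoDB collection.

```python
class VersionConflict(Exception):
    """Raised when a writer's expected version is stale."""


class EntityStore:
    """In-memory stand-in for a collection; every entity carries a version counter."""

    def __init__(self):
        self._docs = {}

    def create(self, entity_id, fields):
        self._docs[entity_id] = {**fields, "_version": 0}

    def get(self, entity_id):
        # Return a copy so callers cannot mutate stored state directly.
        return dict(self._docs[entity_id])

    def update(self, entity_id, expected_version, changes):
        doc = self._docs[entity_id]
        # Compare-and-set: apply the write only if nobody updated the entity
        # since the caller read it.
        if doc["_version"] != expected_version:
            raise VersionConflict(
                f"expected version {expected_version}, found {doc['_version']}"
            )
        doc.update(changes)
        doc["_version"] += 1
        return doc["_version"]


store = EntityStore()
store.create("app-1", {"name": "search", "env": "Staging"})

first = store.get("app-1")
second = store.get("app-1")  # a concurrent reader sees the same version

store.update("app-1", first["_version"], {"env": "Production"})
try:
    # The second writer still holds version 0, so this write is rejected.
    store.update("app-1", second["_version"], {"env": "QA"})
    conflict = False
except VersionConflict:
    conflict = True
```

Against a real MongoDB collection, the same idea is commonly expressed by putting the expected version into the update filter, so that the comparison and the write happen in one single-document operation.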

Cons

1. MongoDB is a schema-less document store. Although configuration items need a flexible schema, they are not schema-less. Solution: CMS provides metadata definitions to define the schema.

2. No transactions. Solution: CMS implements its own MVCC on top of MongoDB.

3. MongoDB only supports key/value-based queries against a single collection. Solution: CMS provides its own query language to support joins in queries.
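
To illustrate the third point, here is a minimal sketch of an application-side join across two collections. It does not show the CMS query language; the collection contents, the `services` reference field, and the `join` helper are hypothetical.

```python
def join(left_docs, ref_field, right_docs):
    """Resolve ids stored in `ref_field` against `right_docs`, like a one-hop join."""
    # Index the right-hand side by _id so each reference is one dict lookup.
    right_by_id = {doc["_id"]: doc for doc in right_docs}
    joined = []
    for doc in left_docs:
        resolved = [right_by_id[r] for r in doc.get(ref_field, []) if r in right_by_id]
        joined.append({**doc, ref_field: resolved})
    return joined


# Two lists of dicts stand in for two MongoDB collections.
applications = [{"_id": "app-1", "name": "search", "services": ["svc-1", "svc-2"]}]
services = [
    {"_id": "svc-1", "name": "indexer"},
    {"_id": "svc-2", "name": "frontend"},
]

result = join(applications, "services", services)
service_names = [s["name"] for s in result[0]["services"]]
```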

CMS is a CMDB, so why does it have repository and metadata concepts, which are outside the typical CMDB scope?

As a software product, CMS is designed as a configuration management service built on three core components: metadata management, entity management, and query services. This design lets CMS meet the requirements of a CMDB, and it also makes CMS a more general persistence service in which users can flexibly define metadata and store data according to their own metadata definitions.

Design

CMS is a metadata-driven system. The metadata definition describes how data is stored in and fetched from CMS.
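
As a rough illustration of what "metadata-driven" means, the sketch below validates an entity against a metaclass definition before it would be stored. The metaclass layout (`fields`, `type`, `required`) and the `validate` helper are assumptions made for this example, not the CMS metadata format.

```python
APPLICATION_METACLASS = {
    "name": "ApplicationService",
    "fields": {
        "name": {"type": str, "required": True},
        "port": {"type": int, "required": False},
    },
}


def validate(metaclass, entity):
    """Return a list of violations; an empty list means the entity conforms."""
    errors = []
    fields = metaclass["fields"]
    for field_name, spec in fields.items():
        if field_name not in entity:
            if spec["required"]:
                errors.append(f"missing required field: {field_name}")
        elif not isinstance(entity[field_name], spec["type"]):
            errors.append(f"wrong type for field: {field_name}")
    # A flexible schema is not the same as no schema: unknown fields are rejected.
    for field_name in entity:
        if field_name not in fields:
            errors.append(f"unknown field: {field_name}")
    return errors


ok = validate(APPLICATION_METACLASS, {"name": "search", "port": 8080})
bad = validate(APPLICATION_METACLASS, {"port": "8080", "owner": "team-a"})
```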

<<CMS Arch - Diagram>>

Metadata module

Metadata is the “table” definition for data stored in CMS. This module is simple and intuitive: it reads metadata definitions from, and stores them to, the backend database. The main concepts of the metadata module are the metaclass and the relationship. It also holds index definitions, which the query service uses to improve database query performance. All metadata is cached in memory using a simple write-through cache.
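
The write-through caching described above can be sketched as follows. `MetadataCache` and the dict-based backend are hypothetical stand-ins; the point is only that every write goes to the backend and to the cache together, so a cached metaclass never lags behind the stored one.

```python
class MetadataCache:
    """Write-through cache: writes hit the backend first, then the cache."""

    def __init__(self, backend):
        self._backend = backend  # stand-in for the backend database
        self._cache = {}
        self.backend_reads = 0

    def put(self, name, metaclass):
        self._backend[name] = metaclass  # persist first
        self._cache[name] = metaclass    # then refresh the cached copy

    def get(self, name):
        if name not in self._cache:
            # Cache miss: load from the backend once and remember the result.
            self.backend_reads += 1
            self._cache[name] = self._backend[name]
        return self._cache[name]


backend = {"Compute": {"fields": ["name"]}}
cache = MetadataCache(backend)

cache.get("Compute")
cache.get("Compute")  # served from memory, no second backend read
cache.put("Compute", {"fields": ["name", "location"]})
```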

Entity management module and data access module

The entity management module provides CRUD operations on CMS runtime data. A runtime data item (called an entity) is a JSON structure stored in the backend database and interpreted through the metadata definition. This module controls the data storage strategy and provides the MVCC check, data relationship checks (strong reference and dangling reference checks), default value handling, and access control. A typical visitor pattern is used to process the data.
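
Here is a minimal sketch of two of these checks, default value handling and the dangling reference check, written in a visitor style. The visitor classes, the metaclass layout, and the set of existing ids are hypothetical and only illustrate the idea.

```python
class DefaultValueVisitor:
    """Fills in defaults for fields the caller left out."""

    def visit(self, field_name, spec, entity):
        if field_name not in entity and "default" in spec:
            entity[field_name] = spec["default"]


class DanglingReferenceVisitor:
    """Rejects references to entities that do not exist."""

    def __init__(self, existing_ids):
        self._existing_ids = existing_ids

    def visit(self, field_name, spec, entity):
        if spec.get("reference") and field_name in entity:
            missing = [r for r in entity[field_name] if r not in self._existing_ids]
            if missing:
                raise ValueError(f"dangling reference in {field_name}: {missing}")


def process(metaclass, entity, visitors):
    """Walk every field in the metaclass and let each visitor inspect it."""
    entity = dict(entity)  # work on a copy so the caller's dict is untouched
    for visitor in visitors:
        for field_name, spec in metaclass["fields"].items():
            visitor.visit(field_name, spec, entity)
    return entity


metaclass = {
    "fields": {
        "env": {"default": "Staging"},
        "services": {"reference": True},
    }
}
visitors = [DefaultValueVisitor(), DanglingReferenceVisitor({"svc-1"})]

stored = process(metaclass, {"services": ["svc-1"]}, visitors)

try:
    process(metaclass, {"services": ["svc-404"]}, visitors)
    rejected = False
except ValueError:
    rejected = True
```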

Data Storage

Several storage strategies were considered.

Data distribution

“Every repository has its own MongoDB database as its storage.”

All data in the same collection

This is a trivial solution for small data sets. It has some limitations:

Indexes on different metaclasses must avoid naming conflicts, since they share the same namespace from the storage point of view.
Unique index must also be sparse since
When the data set of one metaclass grows, it also raises the access cost for the other metaclasses, because queries need to search through more documents.

Data in a separate collection per metaclass

This is a more RDB-like way to store data. Every metaclass has a dedicated collection for its data, so different metaclasses can have independent index definitions. This is the suggested data storage distribution strategy.

Separate metaclasses into different databases/replica sets

This is a CMS capability for working around MongoDB's database-level write lock. It helps when some metaclasses grow too quickly and affect other collections in the same database.
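
The collection-per-metaclass layout, together with the option of moving a fast-growing metaclass to its own database, amounts to a small routing table. The sketch below uses hypothetical names throughout (`locate`, the `OVERRIDES` table, and the database and metaclass names) and is not the CMS implementation.

```python
# Metaclasses that have been moved out of their repository's default database.
OVERRIDES = {
    ("cmsdb", "AuditLog"): "cmsdb_auditlog",
}


def locate(repository, metaclass):
    """Return the (database, collection) pair that stores a metaclass."""
    # By default each repository maps to one database and each metaclass to
    # one collection, so indexes are independent per metaclass.
    database = OVERRIDES.get((repository, metaclass), repository)
    return database, metaclass


default_location = locate("cmsdb", "ApplicationService")
moved_location = locate("cmsdb", "AuditLog")
```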

Storage Format

To store data in MongoDB, CMS introduces an encoded storage format. Every field of an entity has a dbname, which is the name used internally in storage. With this design, renaming a field is as easy as updating the metadata while keeping the dbname unchanged.
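
A minimal sketch of this indirection follows. The field names, the dbnames `F01` and `F02`, and the `encode`/`decode` helpers are hypothetical; only the idea of mapping logical field names to stable storage names comes from the text above.

```python
def encode(field_to_dbname, entity):
    """Translate logical field names to stored dbnames before writing."""
    return {field_to_dbname[name]: value for name, value in entity.items()}


def decode(field_to_dbname, stored):
    """Translate stored dbnames back to logical field names after reading."""
    dbname_to_field = {db: name for name, db in field_to_dbname.items()}
    return {dbname_to_field[db]: value for db, value in stored.items()}


mapping = {"env": "F01", "owner": "F02"}
stored = encode(mapping, {"env": "Staging", "owner": "team-a"})

# Renaming a field touches only the metadata; stored documents stay as they are.
renamed_mapping = {"environment": "F01", "owner": "F02"}
reloaded = decode(renamed_mapping, stored)
```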

Hierarchy format

The hierarchy format uses a separate JSON key inside the entity for each field, including field properties such as _lastmofied and _length. This design makes the data easy to manipulate, but loading it from MongoDB creates more Java maps and therefore consumes more memory. The flattened format was introduced to reduce this overhead.

{

"E6v": {

"v": "Staging",