Berkeley DB 和关系型数据库的比较(1)

最新推荐文章于 2022-10-23 15:11:59 发布

bakers

最新推荐文章于 2022-10-23 15:11:59 发布

阅读量3.4k

点赞数

分类专栏： [8]鸟语世界 [13]数据库 [12]信息检索文章标签：数据库 database oracle system 存储 file

[13]数据库同时被 3 个专栏收录

10 篇文章 0 订阅

订阅专栏

[12]信息检索

3 篇文章 0 订阅

订阅专栏

[8]鸟语世界

2 篇文章 0 订阅

订阅专栏

A Comparison of Oracle Berkeley DB and

Relational Database Management Systems

Berkeley DB 和关系型数据库的对比

INTRODUCTION

介绍

Oracle is well known for its industry-leading database engine, Oracle Database.

Oracle Database is an extremely reliable, highly scalable, client-server, relational

database management system (RDBMS). It serves as the information repository for

a wide range of applications in a large variety of industries around the world.

Oracle 以其领先的工业级数据库引擎而著称，Oracle 数据库是一种非常可靠的，高速发展的关系数据库管理系统，在全世界很多工业化应用中，Oracle 为广大应用程序提供了信息仓储服务。

In addition to Oracle Database, the company offers a number of other database

products for systems and applications that have special performance, resource or

run-time requirements. Oracle’s TimesTen database is a high-performance inmemory

engine that provides relational database services with outstanding

responsiveness and throughput. The Oracle Lite database serves the mobile and

device markets, providing easy-to-use storage with synchronization services that

allow occasionally-connected devices to share information with a centralized

Oracle Database server. Oracle Berkeley DB is the company’s only non-relational

storage engine, and is aimed at applications and devices that need fast local storage

without using SQL for data access.

除了Oracle数据库之外，公司也提供了大量的数据库产品，针对存在特殊的平台，资源，时间性能的需求。Oracle’s TimesTen 数据库拥有杰出的响应速度和系统吞吐量。Oracle Lite 数据库服务于移动和驱动市场，提供了简单易用的存储，并且支持允许发生偶然连接用于共享集中Oracle数据库服务器的同步数据服务。Oracle Berkeley DB是公司仅有的非关系数存储引擎，旨在解决本地数据在没有使用SQL进行数据访问的情况下满足快速的存储需求。

Because Oracle Berkeley DB is non-relational, it is very different from Oracle’s

other database products. Users familiar with Oracle Database are often surprised

by Berkeley DB, and wonder how and where it can best be used. This paper

explores both the similarities and the differences between RDBMS products and

Berkeley DB, in order to help customers decide when Berkeley DB is, and is not, a

good choice for the system that they are building.

由于 Oracle Berkeley DB是非关系型的，它很土同于Oracle公司的其他数据库产品，熟悉Oracle数据的用户经常对Berkeley DB感到惊奇，疑惑如何，怎样使用它，这篇文章探索了RDBMS产品与Berkeley DB之间的相似和不同之处，从而帮助客户决定何时能够对Berkeley DB作出选择，何时不能选择，在我们创建的系统中。

WHAT IS BERKELEY DB?

Berkeley DB 是什么？

Berkeley DB is a high performance, scalable and reliable non-relational storage

system. An outgrowth of research performed initially at the University of

California at Berkeley , and later continued at Harvard University , Berkeley DB was

designed from the very beginning to be an embeddable database engine, so no

separate server is required, and no runtime human administration is needed.

Berkeley DB 是一种具有高效，快速升级，高可靠性等众多优点的非关系型存储系统。最开始的研究成功来之加州大学伯克利分校。后来在哈佛大学得到了发展，一开始 Berkeley DB就被设计成一种嵌入式的数据库引擎，因此不需要单独的服务器，在线管理员也不需要。

Philosophically, Berkeley DB is a tool for software developers, not for IT

professionals or DBAs. It is intended to provide fast, reliable data storage for

applications. The only way to use Berkeley DB is to write code. There is no

standalone server and no SQL query tool for working with Berkeley DB databases.

Developers often find it easiest to think of Berkeley DB not as a database system,

but rather as a library with a good B-tree, hash table and persistent queue for

storing records in an application. That is a reasonable simplification. Unlike simple

B-trees or hash tables, though, Berkeley DB offers enterprise-grade, concurrent,

transactional storage services, so that multiple threads or processes can operate on

the same collection of B-trees and hash tables at the same time without risk of data

corruption or loss.

从哲学上说，Berkeley DB是针对软件开发人员的一种工具，而不是为IT专业人员或数据库管理员，它更倾向于为应用程序提供快速，可靠的数据存储。使用Berkeley DB的唯一方法就是编写代码，没有长久的服务器和SQL 查询工具当使用Berkeley DB 存出数据记录的时候。开发者经常发现把Berkeley DB 想象成一种具有优秀B-Tree,hash table 和persistent queue的库比当成一个数据库系统更加简单、那是很合理的简化，不像简单的B-Trees或hash tables, 然而Berkeley DB提供了企业级，支持并发，事务型存储服务，因此在B-Trees 和hash tables上的多线程或多个进程同时操作不会引起数据丢失或冲突。

The Oracle Berkeley DB product line consists of three different embeddable

offerings.

Oracle Berleley DB产品线包含了三个可嵌入的组件。

In the lower left corner, the product named Berkeley DB is an embeddable library

written in the C programming language. It supports different on-disk storage

structures, provides full transactional storage services and offers replication for

fault tolerance and high availability. The product supports C, C++ and Java

language access, as well as a wide variety of popular scripting languages.

在下脚左边，名为 Berkeley DB 的产品是一个可嵌入的库，该库使用C语言完成，它支持不同的硬盘存储结构，提供了完全的事务存储服务和对容错的复制和可用性，这个产品支持C,C++和JAVA 访问，也支持许多流行脚本语言。

Developers building XML-based applications can choose the Berkeley DB XML

product, layered on top of the C engine. Berkeley DB XML understands XML

schemas, is able to parse and index XML documents, and uses XPATH and

XQuery for data retrieval. It supports C++, Java and a variety of popular scripting

languages.

开发者创建基于XML的应用程序能选择Berkeley DB XML 这个产品，处于C 引擎的顶端，Berkeley DB XML支持XML模式，能够解析和索引XML文档，也可以使用XPATH和XQUERY进行数据查询，它支持C++ ,Java 和大量流行脚本语言。

Java developers that want the same embeddable storage services as C programmers

get, but who need to deploy a 100% pure Java system, can choose Berkeley DB

Java Edition. This embeddable engine runs as a single JAR in a Java Virtual

Machine. It provides the same transactional storage and retrieval services as the C

product, but is implemented entirely in Java.

那些想要和C程序员具有相同的嵌入式存储服务的Java开发人员，当他们想要开发一个100%的纯Java系统时，他们可以选择Berkeley DB Java Edition, 他和个嵌入式引擎以一个JAR文件在Java虚拟机上运行，它提供相同的事务存储和检索服务就像C产品一样，但完全在Java环境下执行。

CHOOSING THE RIGHT TOOL FOR THE JOB

为你的工作选择正确工具

Sometimes, it is easy to decide how to store data in an application.

有时候，在应用程序中决定怎样存储数据是很容易的。

For example, large accounting or customer support systems often come already

integrated with a relational database product. These systems use the relational

engine to store and retrieve the data they manage, and allow administrators and

others to query and manipulate that data separately, using SQL and SQL-based

reporting tools.

例如，大账户或客户支持系统经常和关系型数据库产品相集成，那些系统使用关系型数据库引擎存储或检索他们所管理的数据，并且允许管理员和其他人单独查询和操作数据，使用SQL，或基于SQL的报告工具。

In other cases, a simple file system is all that is needed. Word processing

applications, spreadsheets and software development tools like editors and

compilers read from, and write to, the file system. Users of these applications work

with files by remembering what folders they appear in, and what they are named.

No more elaborate query services are required.

然而，有时应用程序不需要所有的关系型数据库引擎，但一些低级的文件系统又达不到要求，在这种情况下，RDBMS是冗余的，它提供了太多功能，并且太大并且复杂，对比起来，文件系统又显得力不从心，它的功能太少，并缺乏数据库系统所拥有的完整，并发和性能优势。、

Berkeley DB is aimed squarely at that middle ground.

Berkeley DB 正好处于中间状态。

Important Services

重要的服务

When choosing a repository for a new application or service, an architect must

think about the data storage and retrieval services that the application requires.

当为一个应用程序或服务选择知识库的时候，架构师必须考虑所需要的数据存储和检索服务。

Storage

存储

File systems offer fairly rudimentary storage services, as a rule. They allow

applications to read or write any kind of data, but make no guarantees about how

simultaneous updates to the same region of a single file are handled. Applications

that need to make several interdependent changes to different files – for example,

adding a password file entry for a new user, creating a home directory and

allocating a mailbox – get no help from the file system, and must handle that

interdependency themselves. If the system loses power or suffers a software crash

in the middle of a change to a file, most file systems make no promises about what

data will survive after restarting.

文件系统提供很基本的存储服务，他们允许应用程序读写任何数据，但当该数据被访问的时候，不能保证同步更新，需要对几个不同文件创建互相依赖的变化，例如为一个用户添加密码文件，创建主目录和分配一个邮箱- 文件系统将不能奏效，不需访问互相依赖的文件自身，如果系统对现存的操作失去控制或软件出现异常，大部分文件系统不能保证重启之后那些数据仍然存在。

Most relational databases, by contrast, offer much richer storage services.

Competing updates to the same information in a table are handled according to

simple rules that guarantee the correct outcome. If an application needs to make

several related changes to different tables, it simply starts a transaction, makes the

changes, and commits the transaction. The database system guarantees that none

of the changes will be permanent until the transaction commits, but once it

commits, the changes are all guaranteed to be present, even after a system crash

and restart. Most relational systems do impose strict rules about the kinds of

information they will store – applications must turn their data into records in

tables, and must translate from between the application’s internal representation,

and the database representation, at runtime.

相对照，大部分关系型数据库提供了更丰富的存储服务，完成对一张表中的相同信息进行更新，通过简单的规则就可以保证正确的结果。如果应用程序需要更新多个不同的表，仅仅需要创建一个事务，更新数据库，提交事务，数据库系统就可以保证直到事务结束后才接收其他数据更新，然而，一旦更新提交了，新结果保证可以被呈现，甚至系统出现瘫痪掉或重启后，大部分关系型系统对它们存储的信息强加一些严格的规则-应用程序必须把数据转换到表中，在运行时，应用程序必须在程序内部展现和数据库展现中做出翻译/转换。

Queries

查询

File systems organize information by filename. Files are stored in folders. In order

to use a file, the application must know the name of the file, and of the (likely

nested) folder in which it is stored. Of course applications exist that can search a

file system for particular file contents, owners and so on, the only built-in query

facility in most file systems is name-based lookup.

文件系统以文件名组织信息，文件存储在文件夹中，为了使用文件，应用程序必须知道文件的名称和文件路径，当然应用程序可以用文件名方式来查询文件内容。

Relational systems allow applications and users to search for information by typing

queries in SQL. These queries can examine the contents of individual records in

individual tables, and can combine information from different tables easily. Table

names are important, but otherwise, a query in a relational system describes the data

it wants instead of naming it.

关系型数据库系统允许应用程序和用户使用SQL进行数据查询，查询能够在每个表中的每个字段上进行，也可以在不同的表中合并数据记录，表名是不同的，除此之外，在关系型数据库中的查询描述他想要的数据而不是为它命名。

Relational systems, by virtue of SQL, provide an important kind of flexibility: They

allow arbitrary queries in the future. In a file system, a file name is fixed and is the

only way to look up a file. In a relational system, any combination of columns in

any combination of tables can be used to search for, and to combine, data. Rather

than prescribing search criteria in advance, relational systems support ad hoc

queries.

关系型系统，在SQL的帮助下，提供了一种重要的特性：他们支持任意可预见的查询，在文件系统中，文件名是固定的，文件名也是查询一个文件的仅有方式，在关系型系统中，不管是不是来自同一个表，任意列都可以用来查询，也可以组合成数据，比起预先指定查询标准，关系型系统支持多表联合查询(ad boc)

Berkeley DB

Berkeley DB offers many of the storage services – transactions, concurrency, and

recovery – of relational database engines. It is closer to a file system in that it stores

any kind of data in any format, but provides no easy tools for ad hoc queries.

Some applications need transactions and recovery, but not the ability to support

arbitrary end-user searches against the data they store. For those applications,

Berkeley DB is often a better choice than either the file system or a relational

database engine. Berkeley DB combines the simplicity of file system-style data

storage and retrieval with the enterprise-grade scalability, reliability and

transactional guarantees of a high-end relational system like Oracle Database.

Berkeley DB 提供许多存储服务- 事务，并发，恢复，这些关系型数据库引擎，更贴切的说它和文件系统很相近，因为可以任一种格式存储数据，然而没有提供很简单的工具在ad boc查询的时候。

开发人员必须编写代码在Berkeley DB中进行那个数据查询。

一些应用程序需要事务和数据恢复，但不支持任意的最终用户在数据库再次查询数据，对于这些应用来说，Berkeley DB是一种不错的选择。Berkeley DB 组合了文件系统的简单特性和像Oracle等高端关系库的复杂，高级特性。

KEY DIFFERENCES

主要区别

Berkeley DB is different from relational systems in several important ways.

Berkeley DB不同于关系型系统在以下几点。

Berkeley DB Is a Library

Berkeley DB是一个库

Most enterprise-grade relational database engines are heavyweight servers. They are

often installed on a special purpose machine somewhere in the data center.

Applications that want to store data must connect, usually over a network, to the

server, and must read and write data remotely.

大多数企业级关系型数据库都是重量级的服务器。他们被安装在数据中心的特殊目标机器上，想去存储数据的应用程序必须连接，通过网络访问服务器，并且只能远程读写数据。

Berkeley DB, by contrast, is a library. Each of the three Berkeley DB products

links into the same address space as the application that uses it. If multiple

applications want to share a single database, they can do so. Each links the

Berkeley DB code into its address space, and Berkeley DB uses shared memory

and operating system primitives for mutual exclusion to be sure that each thread of

control cooperates with the others.

对比起来，Berkeley DB 只是一个库，三个Berkeley DB产品中的每个都连接到应用程序使用的地址空间，如果多个应用程序想要分享单独的数据库，他们可以这样做，共享内存和操作系统而不会产生冲突。

bakers

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
复制链接

分享到 QQ

分享到新浪微博

扫一扫

专栏目录