7024-1.16 edition

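This article covers the core concepts of the database environment: the definition of data, the different types of data structure, the role of the database management system, and the origins and characteristics of the data warehouse. It explains how the database approach addresses the limitations of traditional file processing systems, and the role that database applications such as ERP and data warehouses play in the enterprise. It also covers the life-cycle (SDLC) and prototyping approaches to database development.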

Chapter 1: The Database Environment

Definition of data, metadata, database, information, database management system

  • Data: stored representations of objects and events that have meaning and importance in the user's environment
    • Structured: numbers, text, dates 
    • Unstructured: images, video, documents
  • Database: organized collection of logically related data
  • Relational database: A database that represents data as a collection of tables in which all data relationships are represented by common values in related tables.
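  • Data Warehouse (DW, DWH): as the name suggests, a large collection of stored data, built for an enterprise's analytical reporting and decision support by filtering and integrating diverse business data. It gives the enterprise a degree of business intelligence (BI) capability, guiding business process improvement and the monitoring and control of time, cost, and quality. A data warehouse brings data from heterogeneous source databases under unified management, removes poor-quality data, converts formats, and finally reorganizes the source data under a sound modeling approach so that it better supports front-end visual analysis. Its inputs are many kinds of data sources; its outputs feed enterprise data analysis, data mining, and reporting.
    • Why it emerged:
      • Accumulation of historical data: rarely used historical data piles up in the operational database and degrades query performance.
      • Enterprise data analysis needs: when each department builds its own independent data extraction system, the results are inconsistent data, heavily wasted resources, and risks around database access permissions.
    • Main characteristics: a data warehouse is a collection of data, oriented toward data analysis and used to support management decisions, that is:
      1. Subject-oriented
      2. Integrated
      3. Non-volatile
      4. Time-variant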
  • Information: data processed to be useful
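To make the relational definition concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema (customer, orders, customer_id) and the sample rows are illustrative assumptions, not taken from the chapter. The two tables are related only through the shared customer_id value, and a summary query turns the stored rows (data) into something a manager could act on (information).

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with two related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (order_id INTEGER PRIMARY KEY,
                             customer_id INTEGER, amount REAL);
        INSERT INTO customer VALUES (1, 'Ada'), (2, 'Lin');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
    """)
    return conn

def total_per_customer(conn):
    """Join on the common customer_id value and summarize the raw order rows."""
    rows = conn.execute("""
        SELECT c.name, SUM(o.amount)
        FROM customer AS c
        JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        ORDER BY c.name
    """)
    return rows.fetchall()

if __name__ == "__main__":
    for name, total in total_per_customer(build_demo_db()):
        print(name, total)
```

Nothing links the tables except the matching customer_id values, which is exactly what the relational definition describes.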

Graphical displays turn data into useful information that managers can use for decision making and interpretation.

Descriptions of the properties or characteristics of the data, including data types, field sizes, allowable values, and data context.

  • Metadata: data that describes the properties and context of user data
  • Database Management System (DBMS): a software system that is used to create, maintain, and provide controlled access to user databases. A DBMS manages data resources much as an operating system manages hardware resources.

A database management system (DBMS) is a software system that supports the use of the database approach. The primary purpose of a DBMS is to provide a systematic method of creating, updating, storing, and retrieving the data stored in a database. It enables end users and application programmers to share data, and it enables data to be shared among multiple applications rather than propagated and stored in new files for every new application. A DBMS also provides facilities for controlling data access, enforcing data integrity, managing concurrency control, and restoring a database.
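As a rough illustration of "enforcing data integrity", the sketch below uses Python's built-in sqlite3 module. The account table, its CHECK rule, and the transfer function are hypothetical names invented for this example. The point is that the rule lives in the DBMS, so every application sharing the data is held to it, and a failed transaction is rolled back as a whole.

```python
import sqlite3

def make_accounts():
    """Hypothetical table whose CHECK rule is enforced by the DBMS itself."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "id INTEGER PRIMARY KEY, "
        "balance REAL NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move funds in one transaction; undo both updates if a rule is violated."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

An overdraft attempt fails on the second UPDATE, and the first UPDATE is undone with it, so the stored data never passes through an invalid state that other users could see.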

  • Context: helps users understand data
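Metadata is itself stored and queryable. The following sketch, again using Python's sqlite3 module, defines a hypothetical student table and then asks the DBMS to describe it. The table, its columns, and the describe_table helper are assumptions made for illustration; PRAGMA table_info is SQLite's own mechanism and other DBMS products expose metadata differently (for example through catalog views).

```python
import sqlite3

def describe_table(conn, table):
    """Return (column name, declared type, NOT NULL flag) for each column."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk).
    # The table name cannot be bound as a parameter, so pass only trusted names.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(name, col_type, bool(notnull)) for _, name, col_type, notnull, _, _ in rows]

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name       VARCHAR(40) NOT NULL,
        gpa        REAL CHECK (gpa BETWEEN 0.0 AND 4.0)
    )
""")

if __name__ == "__main__":
    for column in describe_table(conn, "student"):
        print(column)
```

The declared types, the field size in VARCHAR(40), and the allowable range in the CHECK clause are all metadata: they describe the user data without being user data themselves.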

Differences among DB, DBMS, and DBS

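Database (DB): an organized collection of a large amount of shareable data, structured in a defined way and stored long-term in a computer.

  1. The data in a database is organized according to a defined structure, the data model. The data items are related to one another and carry a semantic interpretation; data and its interpretation are inseparable.
  2. Database management system (DBMS): the system software that manages and maintains the database. It is the interface between the database and its users, and its main role is to provide unified management and control of the database while it is being created, run, and maintained.
    1. From the operating system's perspective, the DBMS is a client: it is built on top of the operating system and relies on it for low-level services such as creating processes, reading and writing disk files, and managing CPU and memory.
    2. From the database's perspective, the DBMS is the manager: it is the core of the database system, the system software provided for creating, using, and maintaining the database, and it is responsible for unified management and control of the database.
    3. From the user's perspective, the DBMS is a tool or bridge: a layer of data management software that sits between the operating system and the user. Every database command issued by a user or an application program is executed through the DBMS.
    4. Main functions: for users, database definition, creation, manipulation, and maintenance; for the database system, transaction execution, security control, and organization and storage management.
    5. Common DBMS products include Oracle, DB2, SQL Server, Sybase, FoxPro, and MySQL.
  3. Database system (DBS): a system made up of computer software, hardware, and data resources that stores large amounts of related data in an organized, dynamic way and makes it easy for multiple users to access. It is what results when database technology is introduced into a general computer system, and can be expressed as DBS = computer system (hardware, software platform, people) + DBMS + DB. A database system therefore includes the database, the DBMS, the supporting software and hardware environment, and the people involved. With the support of the operating system (OS), the DBMS manages and maintains the database and provides the interface through which users operate on it.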

Issues with file processing systems

  • Program-Data Dependence
    • All programs maintain metadata for each file they use
    • Any change to a file's structure requires changing the file description in every program that accesses the file.

Problems with Data Dependency

  1. Each application programmer must maintain his/her own data
  2. Each application program needs to include code for the metadata of each file
  3. Each application program must have its own processing routines for reading, inserting, updating, and deleting data
  4. Lack of coordination and central control
  5. Non-standard file formats
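  • Duplication of Data
    • Different systems/programs have separate copies of the same data
    • Independently developed applications lead to unplanned duplicate data files. Duplicate files require extra storage space, and keeping all of them up to date takes extra work.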

Problems with Data Redundancy

  1. Waste of space to have duplicate data
  2. Causes more maintenance headaches
  3. The biggest problem:
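    1. Data changes in one file could cause inconsistencies.
    2. Compromises in data integrity.
  • Limited Data Sharing
    • No centralized control of data
    • Each application has its own private files, so users have little opportunity to share data outside their own applications.
  • Lengthy Development Times
    • Programmers must design their own file formats
    • Each new application requires the developer to start by designing new file formats and descriptions, and then to write the file access logic for each new program.
  • Excessive Program Maintenance
    • Can consume up to 80% of the information systems budget
    • Together, the preceding factors place a heavy program maintenance burden on organizations that rely on traditional file processing systems.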
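The program-data dependence problem can be sketched in a few lines of plain Python. The fixed-width record layout, the field widths, and the function names below are all hypothetical; the sketch only shows the failure mode, in which each program embeds its own copy of the file's metadata.

```python
# Hypothetical fixed-width layout, hard-coded in every program that reads the file.
NAME_WIDTH = 10
CITY_WIDTH = 8

def write_record(name, city, name_width=NAME_WIDTH):
    """Produce one fixed-width line: name field followed by city field."""
    return name.ljust(name_width) + city.ljust(CITY_WIDTH)

def billing_read(line):
    """A 'billing program' that locates fields by character position."""
    return {
        "name": line[:NAME_WIDTH].strip(),
        "city": line[NAME_WIDTH:NAME_WIDTH + CITY_WIDTH].strip(),
    }

# Record written with the layout the billing program expects.
old_line = write_record("Ada", "Paris")

# The file's owner later widens the name field to 20 characters,
# but the billing program still carries the old layout.
new_line = write_record("Ada", "Paris", name_width=20)

if __name__ == "__main__":
    print(billing_read(old_line))
    print(billing_read(new_line))
```

After the layout change the billing program does not crash; it silently reads an empty city, because the positions it hard-coded now fall inside the padded name field. Every program that touches the file would need the same manual fix, which is the maintenance burden described above.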

SOLUTION: The DATABASE Approach

  • Central repository of shared data
  • Data is managed by a controlling agent
  • Stored in a standardized, convenient form

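The database approach emphasizes the integration and sharing of data across the entire organization. It requires a change in thinking, a reorientation of how problems are viewed, that starts with top management.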


Advantages of the database approach 

  • Program-data independence
  • Planned data redundancy
  • Improved data consistency
  • Improved data sharing
  • Increased application development productivity
  • Enforcement of standards
  • Improved data quality
  • Improved data accessibility and responsiveness
  • Reduced program maintenance
  • Improved decision support
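Program-data independence, the first advantage in the list above, can be demonstrated with a small sketch in Python's sqlite3 module. The customer table and the customer_names function are assumptions made for this example. Because the DBMS holds the metadata, an existing program keeps working after the table's structure changes.

```python
import sqlite3

def customer_names(conn):
    """Existing application query: refers to columns by name, not by position."""
    return [row[0] for row in conn.execute("SELECT name FROM customer ORDER BY name")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customer (name) VALUES (?)", [("Lin",), ("Ada",)])
before = customer_names(conn)

# The structure evolves: another application needs an email column.
conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
after = customer_names(conn)

if __name__ == "__main__":
    print(before, after)
```

The query function was not edited or recompiled, yet it returns the same result before and after the change. In a file processing system the equivalent change would have required updating every program that reads the file.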

Cost/risk of the database approach 

  1. Specialized personnel
  2. Installation and management cost and complexity
  3. Conversion costs
  4. Need for explicit backup and recovery
  5. Organizational conflict

Elements of the database approach 

  • Data models
    • Graphical system capturing nature and relationship of data
    • Enterprise Data Model – high-level entities and relationships for the organization
    • Project Data Model – more detailed view, matching data structure in database or data warehouse
  • Database management system
    • A software system that is used to create, maintain, and provide controlled access to user databases
  • Use of Internet Technology
    • Networks and telecommunications, distributed databases, client-server, and 3-tier architectures
  • Database Applications
    • Application programs used to perform database activities (create, read, update, and delete) for database users
       

One-to-many/many-to-many relationships 
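The two relationship types can be sketched as relational tables using Python's sqlite3 module. The department/employee and student/course schemas below are hypothetical examples, not the figures from the chapter. A one-to-many relationship is carried by a foreign key on the "many" side, while a many-to-many relationship is resolved through a separate associative table.

```python
import sqlite3

def build_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- One-to-many: each employee row carries the dept_id of its one department.
        CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT,
                               dept_id INTEGER REFERENCES department (dept_id));
        -- Many-to-many: an associative table pairs students with courses.
        CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE enrollment (
            student_id INTEGER REFERENCES student (student_id),
            course_id  INTEGER REFERENCES course (course_id),
            PRIMARY KEY (student_id, course_id));
        INSERT INTO department VALUES (1, 'Sales');
        INSERT INTO employee VALUES (1, 'Ada', 1), (2, 'Lin', 1);
        INSERT INTO student VALUES (1, 'Mei'), (2, 'Tom');
        INSERT INTO course VALUES (1, 'Databases'), (2, 'Networks');
        INSERT INTO enrollment VALUES (1, 1), (1, 2), (2, 1);
    """)
    return conn

def employees_in(conn, dept_name):
    """Follow the one-to-many link from a department to its employees."""
    sql = """SELECT e.name FROM employee AS e
             JOIN department AS d ON d.dept_id = e.dept_id
             WHERE d.name = ? ORDER BY e.name"""
    return [r[0] for r in conn.execute(sql, (dept_name,))]

def courses_of(conn, student_name):
    """Follow the many-to-many link from a student to courses."""
    sql = """SELECT c.title FROM course AS c
             JOIN enrollment AS en ON en.course_id = c.course_id
             JOIN student AS s ON s.student_id = en.student_id
             WHERE s.name = ? ORDER BY c.title"""
    return [r[0] for r in conn.execute(sql, (student_name,))]

def students_in(conn, course_title):
    """Follow the same many-to-many link in the other direction."""
    sql = """SELECT s.name FROM student AS s
             JOIN enrollment AS en ON en.student_id = s.student_id
             JOIN course AS c ON c.course_id = en.course_id
             WHERE c.title = ? ORDER BY s.name"""
    return [r[0] for r in conn.execute(sql, (course_title,))]
```

A student can appear in many enrollment rows and so can a course, which is why neither table can hold the relationship on its own.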

Components of the database environment 

  • CASE Tools: computer-aided software engineering
  • Repository: centralized storehouse of metadata
  • Database Management System (DBMS): software for managing the database
  • Database: storehouse of the data
  • Application Programs: software using the data
  • User Interface: text and graphical displays to users
  • Data/Database Administrators: personnel responsible for maintaining the database
  • System Developers: personnel responsible for designing databases and software
  • End Users: people who use the applications and databases

Database applications 

1. Create 2. Read 3. Update 4. Delete
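A database application wraps these four activities for its users. The sketch below is one possible shape for such a program, written with Python's sqlite3 module; the ProductStore class and the product table are hypothetical names chosen for illustration.

```python
import sqlite3

class ProductStore:
    """Tiny database application exposing create, read, update, and delete."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
        )

    def create(self, name, price):
        cur = self.conn.execute(
            "INSERT INTO product (name, price) VALUES (?, ?)", (name, price)
        )
        self.conn.commit()
        return cur.lastrowid  # key assigned by the DBMS

    def read(self, product_id):
        # Returns a (name, price) tuple, or None if the row does not exist.
        return self.conn.execute(
            "SELECT name, price FROM product WHERE id = ?", (product_id,)
        ).fetchone()

    def update(self, product_id, price):
        self.conn.execute(
            "UPDATE product SET price = ? WHERE id = ?", (price, product_id)
        )
        self.conn.commit()

    def delete(self, product_id):
        self.conn.execute("DELETE FROM product WHERE id = ?", (product_id,))
        self.conn.commit()
```

The application contains no file formats or access logic of its own; it only issues requests, and the DBMS carries them out.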

The Range of Database Applications

  • Personal databases 
  • Workgroup databases
    • with wireless local area network
  • Departmental/divisional databases
    • Three-tiered client/server database architecture
  • Enterprise database

  • Enterprise Database Applications
    • Enterprise Resource Planning (ERP)
      • Integrate all enterprise functions (manufacturing, finance, sales, marketing, inventory, accounting, human resources)
    • Data Warehouse
      • Integrated decision support system derived from various operational databases
    • Big Data and Business Analytics
      • Massive amounts of real-time and multimedia data processed by computer clusters in data center for decision support and business forecasting
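The idea of a data warehouse as an integrated store derived from several operational databases can be sketched in plain Python. Everything here is a simplifying assumption: the two source extracts, their field names and formats, and the integrate function are invented for illustration, and a real warehouse load would use dedicated extraction and loading tooling.

```python
from datetime import date

# Hypothetical extracts from two operational systems with different formats.
STORE_SALES = [
    {"cust": "C1", "amount_cents": 2500, "day": "2024-01-05"},
]
WEB_SALES = [
    {"customer_id": "C1", "total": "15.00", "date": "05/01/2024"},  # DD/MM/YYYY
    {"customer_id": "", "total": "9.99", "date": "06/01/2024"},     # missing key
]

def integrate(store_rows, web_rows, load_date):
    """Convert both sources to one format and drop rows that fail a quality check."""
    warehouse = []
    for r in store_rows:
        warehouse.append({
            "customer": r["cust"],
            "amount": r["amount_cents"] / 100,       # cents to currency units
            "sale_date": date.fromisoformat(r["day"]),
            "source": "store",
            "loaded_on": load_date,                  # time-variant: record the load date
        })
    for r in web_rows:
        if not r["customer_id"]:
            continue                                 # reject poor-quality rows
        d, m, y = r["date"].split("/")
        warehouse.append({
            "customer": r["customer_id"],
            "amount": float(r["total"]),
            "sale_date": date(int(y), int(m), int(d)),
            "source": "web",
            "loaded_on": load_date,
        })
    return warehouse

rows = integrate(STORE_SALES, WEB_SALES, load_date=date(2024, 1, 31))
```

After integration both sales share one customer key, one amount unit, and one date type, so an analytical query can treat them as a single subject area regardless of which system produced them.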

An enterprise data warehouse

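Enterprise data model

high-level entities and relationships for the organization

A graphical model that shows the organization's high-level entities and the relationships among them.

Characteristics:

  1. It is a model of the organization that provides useful information about how the organization functions and about its important constraints.
  2. By focusing on entities, relationships, and business rules, the enterprise data model emphasizes the integration of data and processes.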

Not on the exam.

Two Major Approaches to Database and System Development

SDLC: System Development Life Cycle

  • Detailed, well-planned development process
  • Time-consuming, but comprehensive
  • Long development cycle
  1. Planning: 
    1. Purpose: preliminary understanding
    2. Deliverable: request for study
    3. Database activity: enterprise modeling and early conceptual data modeling
  2. Analysis: 
    1. Purpose: thorough requirements analysis and structuring
    2. Deliverable: functional system specifications
    3. Database activity: Thorough and integrated conceptual data modeling
  3. Logical Design: 
    1. Purpose: information requirements elicitation and structure
    2. Deliverable: detailed design specifications
    3. Database activity: logical database design (transactions, forms, displays, views, data integrity and security)
  4. Physical Design:
    1. Purpose: develop technology and organizational specifications
    2. Deliverable: program/data structures, technology purchases, organization redesigns
    3. Database activity: physical database design (define database to DBMS, physical data organization, database processing programs)
  5. Implementation:
    1. Purpose: programming, testing, training, installation, documenting
    2. Deliverable: operational programs, documentation, training materials
    3. Database activity: database implementation, including coded programs, documentation, installation and conversion
  6. Maintenance:
    1. Purpose: monitor, repair, enhance
    2. Deliverable: periodic audits
    3. Database activity: database maintenance, performance analysis and tuning, error corrections

Prototyping Database Methodology

  • Rapid application development (RAD)
  • Cursory attempt at conceptual data modeling
  • Define database during development of initial prototype
  • Repeat implementation and maintenance activities with new prototype versions

Conceptual data modeling

  • Analyze requirements
  • Develop preliminary data model

Database maintenance  

  • Tune database for improved performance
  • Fix errors in database

Logical database design

  • Analyze requirements in detail
  • Integrate database views into conceptual data model

Physical database design and definition

  • Define new database contents to DBMS
  • Decide on physical organization for new data
  • Design database processing programs
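As one possible reading of these physical design steps, the sketch below defines a table to the DBMS and then makes a physical organization decision by adding an index. It uses Python's sqlite3 module; the customer table, the idx_customer_city index, and the plan_for helper are hypothetical. EXPLAIN QUERY PLAN is SQLite-specific, and the wording of its output varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define the new database contents to the DBMS.
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Physical organization decision: index the column used for lookups.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

def plan_for(conn, sql, params=()):
    """Return the descriptive text of each step in SQLite's query plan."""
    # The description is the last column of each EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]

plan = plan_for(conn, "SELECT name FROM customer WHERE city = ?", ("Paris",))

if __name__ == "__main__":
    print(plan)
```

The application's SELECT statement is the same with or without the index; only the access path chosen by the DBMS changes, which is why physical design can be tuned without rewriting programs.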

Database implementation

  • Code database processing
  • Install new database contents, usually from existing data sources

Database maintenance

  • Analyze database to ensure it meets application needs 
  • Fix errors in database