Oracle Performance Tuning 11g2 (2-1)

在windows live writer中编辑的时候,以及预览的时候都是整整齐齐的,但是一上传到博客中就变得乌七八黑了

看的时候觉得很简单,但是一翻译起来就头大了,这样一篇文件,花了2天才翻译完。2天时间足够看4篇文章了。总结一下就是看的时候不一定真的看懂了,大概的明白意思就是了。但是翻译的时候就要逐字逐字的,逐词逐句的思考,有时还要前后对照着翻译。常干这活,会折寿的!

 

延续上一篇的Oracle Performance Tuning 11g2 (2-0)

2.5 Application Design Principles         应用程序设计的原则

This section describes the following design decisions that are involved in building applications: 下面介绍设计的一些方法:

  • Simplicity In Application Design                 简化应用设计

  • Data Modeling                                         数据模型

  • Table and Index Design                           表和索引设计

  • Using Views                                           使用视图

  • SQL Execution Efficiency                         SQL执行的效率

  • Implementing the Application                   编写代码

  • Trends in Application Development           掌控应用开发进度

2.5.1 Simplicity In Application Design              简化应用设计

Applications are no different than any other designed and engineered product. Well-designed structures, computers, and tools are usually reliable, easy to use and maintain, and simple in concept. In the most general terms, if the design looks correct, then it probably is. This principle should always be kept in mind when building applications.

应用设计和其他的工程设计是一样的。被良好设计的架构,计算机,工具是通常可以依赖的,是容易使用、维护,和理解的,至少在概念上是这样子的。在多数的项目组中,假如架构设计看起来正确的话,项目也会基本上会成功。这种理念应该永记心底。 ---&gt简单来讲,要向乔大爷学习,简单的想甜一下!(或者是美的想甜一下)

Consider the following design issues:

  • If the table design is so complicated that nobody can fully understand it, then the table is probably poorly designed.

  • If SQL statements are so long and involved that it would be impossible for any optimizer to effectively optimize it in real time, then there is probably a bad statement, underlying transaction, or table design.

  • If there are indexes on a table and the same columns are repeatedly indexed, then there is probably a poor index design.

  • If queries are submitted without suitable qualification for rapid response for online users, then there is probably a poor user interface or transaction design.

  • If the calls to the database are abstracted away from the application logic by many layers of software, then there is probably a bad software development method.

  • 如果表设计的没人能看懂,就说明设计太差了

  • 如果SQL语句写的又臭又长,以至于优化器都不知道如何去实时的优化,那么这是一个很差的语句,很差的事务和表的设计

  • 假如一个表上的某一个列存在在两个索引上,(两个参数都引用了一个列的话),那这个索引设计就是有问题的。(个人经验是表设计的问题)

  • 如果一个查询条件无法快速的响应给联机使用者,那么这个用户接口或事务设计是糟糕的。(oracle自己的certview慢的能死人的,还好意思在这里讲)

  • 假如数据库访问功能被许多中间层包裹,太多的抽象以至于脱离了应用逻辑,那这个应用设计是糟糕的(这个一般容易出现在面向对象语言中,特别是JAVA中)


2.5.2 Data Modeling

Data modeling is important to successful relational application design. You must perform this modeling in a way that quickly represents the business practices. Heated debates may occur about the correct data model. The important thing is to apply greatest modeling efforts to those entities affected by the most frequent business transactions. In the modeling phase, there is a great temptation to spend too much time modeling the non-core data elements, which results in increased development lead times. Use of modeling tools can then rapidly generate schema definitions and can be useful when a fast prototype is required.

在关系型数据库中,数据模型是非常重要的。你必须要能快速的将这个模型反应到业务需求上。在设计这个数据模型过程出现争论是正常的(没有讨论的模型设计,结果一般比较差,维护类项目中新增功能除外)。最重要的事情是这个模型要能代表最频繁,最关键业务的应用。在这个模型设计阶段,很容易出现在非关键业务数据上花费过多的时间,导致开发延期(在银行的项目开发中这事经常出现的)。使用一些模型开发工具可以加快原型设计。

 



2.5.3 Table and Index Design

Table design is largely a compromise between flexibility and performance of core transactions. To keep the database flexible and able to accommodate unforeseen workloads, the table design should be very similar to the data model, and it should be normalized to at least 3rd normal form. However, certain core transactions required by users can require selective denormalization for performance purposes.

Examples of this technique include storing tables pre-joined, the addition of derived columns, and aggregate values. Oracle Database provides numerous options for storage of aggregates and pre-joined data by clustering and materialized view functions. These features allow a simpler table design to be adopted initially.

Again, focus and resources should be spent on the business critical tables, so that optimal performance can be achieved. For non-critical tables, shortcuts in design can be adopted to enable a more rapid application development. However, if prototyping and testing a non-core table becomes a performance problem, then remedial design effort should be applied immediately.

Index design is also a largely iterative process, based on the SQL generated by application designers. However, it is possible to make a sensible start by building indexes that enforce primary key constraints and indexes on known access patterns, such as a person's name. As the application evolves, and as you perform testing on realistic amounts of data, you may need to improve the performance of specific queries by building a better index. Consider the following list of indexing design ideas when building a new index:

表的设计多数是一个在性能和灵活性之间做的一个折衷方案结果,应该主要关注那些核心业务(次要业务交给毕业生设计都可以)。为了保持数据库的灵活性,以及能够适应不可预知的压力,表的设计应该尽可能的和数据模型相似,同时应该至少符合第三范式(大学学完就忘完了,当年那个用数学算啊算的,数字一忘,哪里还记得怎么做)。然而在一些核心的业务上,用户需要极高的响应时间时,可以为了性能可以不按常规来设计。

具个例子就是通过表的预先合并,加上额外附加列或者是根据某种规则预先演算出列值,聚合值。oracle通过cluster和MV提供了支持此功能的函数。这些特性简化了表的设计。

再次说明:一定要把精力和各种人力等资源放到业务的关键业务的核心表设计上,这样就能保证基本上业务系统能正常运转。对于非核心业务,虽然设计阶段未完整设计,但是可以在后期进行加快处理。如果原型和非核心表的设计上出现性能问题,通过补救设计来立刻修正它。(在银行核心系统业务中,80%的时间是和行方在那些非核心业务上周旋。也许觉得不可思议,但是行方的理念通常是核心业务本来就应该是你们做好的,所以他们只能在八杆子打不着的地方乱捅!)

索引的设计也是一个迭代的过程。但是从一开始就可以在一些很明显的列上,比如姓名上,加上主键约束或索引。随着项目的推进,各种功能的增加,以及在上线前对系统在相当数量的真实数据上进行测试时(将生产数据导到测试环境中),可以去增加或调整索引。当设计时使用下面的方法进行设计:

  • Appending Columns to an Index or Using Index-Organized Tables   将一些列加到索引上,或者使用IOT表

  • Using a Different Index Type                    根据不同的情况使用不同的索引类型

  • Finding the Cost of an Index                    找出成本最高的索引

  • Serializing within Indexes                         通过索引进行序列化

  • Ordering Columns in an Index                  通过索引进行排序

2.5.3.1 Appending Columns to an Index or Using Index-Organized Tables

One of the easiest ways to speed up a query is to reduce the number of logical I/Os by eliminating a table access from the execution plan. This can be done by appending to the index all columns referenced by the query. These columns are the select list columns, and any required join or sort columns. This technique is particularly useful in speeding up online applications response times when time-consuming I/Os are reduced. This is best applied when testing the application with properly sized data for the first time.

The most aggressive form of this technique is to build an index-organized table (IOT). However, you must be careful that the increased leaf size of an IOT does not undermine the efforts to reduce I/O.

加快查询速度的最容易的办法之一就是,在执行计划中通过避免全表扫描来以最大程序的减少IO消耗。可以通过将查询的列全部放到索引里面。在索引里的这些列是你select时要查询的那些列,加上需要排序或者join时用到的列。这种技术通过减少IO消耗,加速联机应用程序的响应时间非常有帮助。当数据量达到一定程序时再去使用。

聚集度最高的方法是建立IOT表(排好的数据挤在一个或相邻的块中,扫描非常快)。但是这种方法对于要经常插入数据的业务而言,所造成的IO开销也是不少的。(也就是说对于IOT表而言,最好是以查询为主的应用或者是参数表的应用)

 



2.5.3.2 Using a Different Index Type

There are several index types available, and each index has benefits for certain situations. The following list gives performance ideas associated with each index type.

oracle给我们提供许多类型的索引,每一种都有它强劲的地方。下面给出每种索引他们的优点



2.5.3.2.1 B-Tree Indexes

These indexes are the standard index type, and they are excellent for primary key and highly-selective indexes. Used as concatenated indexes, the database can use B-tree indexes to retrieve data sorted by the index columns.

B树索引:这是最标准的索引类型,它们对于需要按主键或者选择度很高的列进行查询时,这种索引表现出色。通过使用索引,数据库可以直接通过B树,获取到已经排序好数据。

注: 选择度最高的就是主键的,因为他们从来不会重复;一般而言在OLTP中,这个选择度要尽可能地达到或接近主键级别,一般有2~5条记录重复是正常的。

比如一个POS机系统(去超市时刷卡的机器),每天产生200W的流水,但是多数客户每天可能只有一笔消费记录,或者称消费流水;有钱人可能会有好几笔,甚至几十笔。这都没有问题,因为消费再多,与200万相比也是非常少的。在银行中会有一个风险控制系统(简称风控系统),这个系统会根据POS的编号去查看一个POS机每天的记录数(当然从各种维度下去检查的,不至这一个条件的),同时会对每张卡消费都做记录,因此如果一个人疯狂刷卡时,银行的风控系统就会发现的。一旦某个人交易数太多,或者POS刷卡数太多,达到银行设置的警戒线,就会被发现。但是如好又多之类的大超市,他们客流量大,所以交易量也大,银行会对这些客户做额外的阈值设置。对于这些大客户,他们交易量比较大,所以选择度就低、效率也低吗? 非也,因为在查询的时候都有一个条件----时间,一般精确到秒的,时间域也是在索引列中放置的,所以在OLTP系统中,对于大客户一样是选择度非常高的。慢的地方是在做批量处理时,要将这些信息取出来,会稍慢一些的。

 



2.5.3.2.2 Bitmap Indexes

These indexes are suitable for low cardinality data. Through compression techniques, they can generate a large number of rowids with minimal I/O. Combining bitmap indexes on non-selective columns allows efficient AND and OR operations with a great number of rowids with minimal I/O. Bitmap indexes are particularly efficient in queries with COUNT(), because the query can be satisfied within the index.

位图索引:这种索引适合建立在低基数数据列上(一个列可能只有个别几个值,如男或女,狗或猫等等)。通过压缩的技术,它们可以通过最小的IO获取到最大的ROWID值。如果查询条件是选择度低比较低,同时又使用AND/OR去过滤时,oracle会通过合并bitmap索引(或是位运算)达到减少IO,又提高检索ROWID的效果。bitmap索引对于要查询count值的SQL语句时效率额外的高,因为他们都直接通过索引去获取了(不会再通过rowid去反查原表数据了。这里讲的是如果select count(*) from t1 where id1 = ‘狗男女’这样的方式时,就一定会走到索引上,而且交易非常的高。如果什么条件也没有,那还是要全表扫描的了。那么这种效率有多高呢?我觉得是第二高,第一名是直接把count结果通过group by放到一个mv中,oracle通过query rewrite enable直接跳到mv中,将查询结果取出来)

 



2.5.3.2.3 Function-based Indexes

These indexes allow access through a B-tree on a value derived from a function on the base data. Function-based indexes have some limitations with regards to the use of nulls, and they require that you have the query optimizer enabled.

Function-based indexes are particularly useful when querying on composite columns to produce a derived result or to overcome limitations in the way data is stored in the database. An example is querying for line items in an order exceeding a certain value derived from (sales price - discount) x quantity, where these were columns in the table. Another example is to apply the UPPER function to the data to allow case-insensitive searches.

函数索引: 这种索引和B树索引是一样的,只不过它允许在查询的列上使用函数(即由函数根据基值演算出的值)。它有许多的限制,特别是NULL值,同时需要我们打开查询优化器来使用。

函数索引对于那些需要根据一些组合列算出一个值,这个值已经存在在数据库中时它非常有用。具个例子:当需要根据【(销售金额-回扣)*销售量】来从订单中获取商品值时,因为这个销售的价格已经是在销售后就存在在数据库中了。另外一个例子是不知道某个字段是大写还是小写的,这时直接将所有的都按UPPER来查询,函数索引就派上用场了

注:理解了如何创建之后,都不用看上面的话了。函数索引有两种创建方法,一种是传统的方式,一种是11g中新增的虚拟列(这个应该是学DB2的,因为DB2没有函数索引,只能使用称之为生成列的方式实现,这个生成列和虚拟列除了要不要存储到硬盘的区别外,其他一样,连创建的语法都是generated always as,呵呵。有很大可能是oracle抄DB2的。这样DB2的项目在搬迁到oracle时就不需要再做太多的改动了)

 



2.5.3.2.4 Partitioned Indexes

Partitioning a global index allows partition pruning to take place within an index access, which results in reduced I/Os. By definition of good range or list partitioning, fast index scans of the correct index partitions can result in very fast query times.

全局索引分区允许在访问索引时就进行分区裁减,这样大大减少了IO。通过合理的定义范围(range)或者值(list)分区,可以在极快的速度进行快速索引扫描(还有hash分区的)。

注:不知道哪个DBA以前告诉我们领导说分区对性能没什么提升,导致后来我们一说要分区,领导就觉得在干没意义的工作!花了我N长的演示他才相信以前的DBA错了。但是他还是觉得以前的DBA很好,看来情商才是最重要的啊,智商算个屌啊!

 



2.5.3.2.5 Reverse Key Indexes

These indexes are designed to eliminate index hot spots on insert applications. These indexes are excellent for insert performance, but they are limited because the database cannot use them for index range scans.

这种索引的使用是为了在插入数据时消除索引的热区竞争。这种索引在插入密集的应用中表现出色,但是它们缺点是无法进行范围查询。

 



2.5.3.3 Finding the Cost of an Index

Building and maintaining an index structure can be expensive, and it can consume resources such as disk space, CPU, and I/O capacity. Designers must ensure that the benefits of any index outweigh the negatives of index maintenance.

Use this simple estimation guide for the cost of index maintenance: each index maintained by an INSERT, DELETE, or UPDATE of the indexed keys requires about three times as much resource as the actual DML operation on the table. Thus, if you INSERT into a table with three indexes, then the insertion is approximately 10 times slower than an INSERT into a table with no indexes. For DML, and particularly for INSERT-heavy applications, the index design should be seriously reviewed, which might require a compromise between the query and INSERT performance.

See Also:

Oracle Database Administrator's Guide to learn how to monitor index usage

创建和维护索引是一种非常昂贵的工作,它会消耗IO,CPU,以及硬盘空间。设计人员要确保你创建这个索引带来的收益要远远高于维护它的成本时再使用。

使用这个简单的方法来评估索引维护成本:在插入删除更新索引列值时,大概会比不使用时多消耗3倍的资源。因此当你插入一个有3个索引的表时(1个表上有3个索引),通常会比没有索引时慢10倍。因此,对于那些需要大量插入的表或者应用,设计索引时要非常的谨慎的,你需要在查询与插入之间做一个权衡。

注:我曾经在给一个银行做维护时,把一个索引干掉之后,性能立刻提升了100笔左右。但是项目组人员很不满意,觉得这个索引未来可能会有用,真是无语了!如果你知道他们在这个表上建了10个索引的话,而且是要做大量插入修改动作,你会怎么想呢?

如果监控索引呢?使用alter index xxxx_index_name monitoring usage; 通过查询V$OBJECT_USAGE视图看索引是否真的被执行计划引用到

这个视图的解释如下: V$OBJECT_USAGE displays statistics about index usage gathered from the database for the indexes owned by the current user. You can use this view to monitor index usage. All indexes that have been used at least once can be monitored and displayed in this view.

 



2.5.3.4 Serializing within Indexes

Use of sequences, or timestamps, to generate key values that are indexed themselves can lead to database hotspot problems, which affect response time and throughput. This is usually the result of a monotonically growing key that results in a right-growing index. To avoid this problem, try to generate keys that insert over the full range of the index. This results in a well-balanced index that is more scalable and space efficient. You can achieve this by using a reverse key index or using a cycling sequence to prefix and sequence values.

如果一个索引列使用sequence,或者时间域的话,就可能会造成数据库“热区”问题,这是会影响响应时间和吞吐量的。这是因为它们不断的单调性增长造成索引的右侧倾斜(sequence从小到大的增长,时间也是)。为了解决这个问题,尽可能将索引值能分布到整个index范围内,而不是这种连续性的。当索引平衡时,系统的空间使用及效率都会更好。你可以通过使用反向索引或再创建一个循环sequence加到以前自己的索引头上组成一个新值来达到平衡的目的。

 



2.5.3.5 Ordering Columns in an Index

Designers should be flexible in defining any rules for index building. Depending on your circumstances, use one of the following two ways to order the keys in an index:

  • Order columns with most selectivity first. This method is the most commonly used because it provides the fastest access with minimal I/O to the actual rowids required. This technique is used mainly for primary keys and for very selective range scans.

  • Order columns to reduce I/O by clustering or sorting data. In large range scans, I/Os can usually be reduced by ordering the columns in the least selective order, or in a manner that sorts the data in the way it should be retrieved. See Chapter 14, "Using Indexes and Clusters".

设计人员应该在定义index时保持足够的灵活性。根据你的环境使用下面的两种方法来在index中排序:

在索引中的第一列应该是最具有选择度的(也就是说这一列最好能过滤过95%的数据)。这样就能以最快的方式,以最少的IO,根据ROWID获取数据。这种技术主要用于根据主键或者有极强的选择度的范围查询

通过cluster或者排序数据减少需要排序列的IO消耗。在一个大的范围查询中,IO可能被尽可能降低,通过对低选择度的列排序或者使用需要按顺序取数据时预先排这些数据。(低选择度不是说你要取表中30%数据这样的概念,如果是这样的话,oracle会直接表全表扫描的)

 



2.5.4 Using Views

Views can speed up and simplify application design. A simple view definition can mask data model complexity from the programmers whose priorities are to retrieve, display, collect, and store data.

However, while views provide clean programming interfaces, they can cause sub-optimal, resource-intensive queries. The worst type of view use is when a view references other views, and when they are joined in queries. In many cases, developers can satisfy the query directly from the table without using a view. Usually, because of their inherent properties, views make it difficult for the optimizer to generate the optimal execution plan.

视图可以加速和简化应用的开发。一个简单的视图定义可以隐藏复杂的数据模型,让主要从事增删改查的程序员不要过多的关心。

虽然视图可以提供一个简单明了的接口,但是他们也可能造成低效,特别是资源消耗敏感的查询语句。最差的视图是一个视图又使用了另一个视图,以及视图之间的join查询。许多时候开发人员可以直接从表里取数据而不是通过视图做。通常来说,由于视图与生俱来的特点,优化器很难去做出最优的执行计划。

 



2.5.5 SQL Execution Efficiency

In the design and architecture phase of any system development, care should be taken to ensure that the application developers understand SQL execution efficiency. To achieve this goal, the development environment must support the following characteristics:

在系统开发的任何设计阶段,请注意一定要确保开发人员要理解SQL的执行效率。为了达到这个目标,开发环境中必须要满足以下几个特点:

  • Good database connection management            良好的数据库连接管理模块

    Connecting to the database is an expensive operation that is highly unscalable. Therefore, the number of concurrent connections to the database should be minimized as much as possible. A simple system, where a user connects at application initialization, is ideal. However, in a Web-based or multitiered application, where application servers are used to multiplex database connections to users, this can be difficult. With these types of applications, design efforts should ensure that database connections are pooled and are not reestablished for each user request.

        连接数据库是一个非常昂贵的过程(连接mysql除外)。因此并发连接数据库应该越少越好。只有一个用户连接上来的简单的系统是最理想的。然而在一个WEB或者多层应用架构中,应用服务器多路连接到数据库给用户使,情况变得非常困难。在这种类型的应用中,设计人员应该确保数据库连接是使用“连接池”,并且确保每个用户连接上来时不是真正的再重新连接到数据库上(也就是从连接池中取一个连接使用)。

注:其实java连接基本上由weblogic解决,已经实现了连接池了;C程序基本上都是长连接,不会轻易断开的,除了shell脚本定期会连接一下,这个问题不大,毕竟几分钟才连接一次嘛。php好像在连接上是会有些问题的,不知道有没有框架去解决,这个尚未去研究。不过PHP作为网站语言,主要和MYSQL使用。和oracle打交道并不多。

  • Good cursor usage and management                 良好的游标使用和管理

    Maintaining user connections is equally important to minimizing the parsing activity on the system. Parsing is the process of interpreting a SQL statement and creating an execution plan for it. This process has many phases, including syntax checking, security checking, execution plan generation, and loading shared structures into the shared pool. There are two types of parse operations:  

    最大程度的降低SQL解析的重要性不亚于维护连接管理。解析就是翻译SQL语句以及创建一个执行计划的过程。这个过程有很多的阶段,包括了语法的检查,安全检查,执行计划产生 ,加载共享结构到shared pool中。oracle有两种类型的解析:

    • Hard parsing     硬解析

      A SQL statement is submitted for the first time, and no match is found in the shared pool. Hard parses are the most resource-intensive and unscalable, because they perform all the operations involved in a parse.          当SQL语句第一次执行的时候,在shared pool中找不到些执行历史记录。硬解析是非常消耗资源的,但又不可避免,因为处女解析总是要来一次的。

    • Soft parsing     软解析

      A SQL statement is submitted for the first time, and a match is found in the shared pool. The match can be the result of previous execution by another user. The SQL statement is shared, which is good for performance. However, soft parses are not ideal, because they still require syntax and security checking, which consume system resources.                               当SQL语句第一次执行的时候,在shared pool中找到了历史执行的记录。这个执行的记录有可能是其他用户执行的。为了性能考虑,这些SQL语句是共享的。然而软解析不是最理想的,因为他们仍然需要语法和安全检查,消耗资源。

    Because parsing should be minimized as much as possible, application developers should design their applications to parse SQL statements once and execute them many times. This is done through cursors. Experienced SQL programmers should be familiar with the concept of opening and re-executing cursors.

    Application developers must also ensure that SQL statements are shared within the shared pool. To achieve this goal, use bind variables to represent the parts of the query that change from execution to execution. If this is not done, then the SQL statement is likely to be parsed once and never re-used by other users. To ensure that SQL is shared, use bind variables and do not use string literals with SQL statements. For example:

        因为解析的过程应该尽可能的少,所以应用开发人员应该做到一次解析,多次执行。这个是可以通过游标来实现的。高手应该会非常熟悉这个概念的,包括了打开和重新执行游标。应用开发人员同样应该确保SQL是共享在shared pool中的。为了实现这个目标,绑定变量就代表了查询条件中那么变来变去的条件。假如不是使用绑定变量的话,可能一个SQL执行完后再也不会去执行了(OLTP中很多时候最多3次)。为了确保SQL是共享的,确保使用绑定变量和不要使用字符串值作为SQL的查询条件。(可问题是开发人员很多都不知道什么叫绑定变量,或许他们在使用,但是有可能是从网上抄的一个用法,国内公司培训培训啊)例如:

    Statement with string literals:        

    SELECT * FROM employees 
      WHERE last_name LIKE 'KING'; ---&gt使用字面值的做法

    Statement with bind variables:

    SELECT * FROM employees 
      WHERE last_name LIKE :1; ---&gt使用绑定变量的做法

    The following example shows the results of some tests on a simple OLTP application: 下面的例子显示了一个在OLTP在的测试结果

    Test                         #Users Supported
    No Parsing all statements           270 
    Soft Parsing all statements         150
    Hard Parsing all statements          60
    Re-Connecting for each Transaction   30

    These tests were performed on a four-CPU computer. The differences increase as the number of CPUs on the system increase. See Chapter 16, "SQL Tuning Overview" for information about optimizing SQL statements.  上面的测试是在一个4核电脑上测试的。随着CPU个数的增加,差异也会增加

 



2.5.6 Implementing the Application       应用开发实现

The choice of development environment and programming language is largely a function of the skills available in the development team and architectural decisions made when specifying the application. There are, however, some simple performance management rules that can lead to scalable, high-performance applications.

应用开发实现:开发环境和程序语言的选择很大程度上是由开发团队的技能决定的,以及当指定了开发环境时架构决定的。然而有些简单的性能管理方法可以让应用程序变得更高效。

  1. Choose a development environment suitable for software components, and do not let it limit your design for performance decisions. If it does, then you probably chose the wrong language or environment.  为软件模块选择一个合理的开发环境,不要限制你的设计和性能决策。如果不能的话,那可能是选择了错误的开发语言和环境(在银行除了C和JAVA很难做选择的。想用C++?前提是银行愿意为你买C++编译器;想用bash,python?前提是银行愿意给你安装这些软件;因此很多时候没得选择的,除了C和JAVA没得选择)

    • User interface       用户界面

      The programming model can vary between HTML generation and calling the windowing system directly. The development method should focus on response time of the user interface code. If HTML or Java is being sent over a network, then try to minimize network volume and interactions.

            开发模型或者是使用html的或者是使用传统的windows窗口。开发的重点应该集中在响应时间上。假如使用JAVA语言BS架构来开发,请确保尽量降低每次网页获取的数据量和交互的次数。简单说就是别一打开任何窗口或功能时,都先从数据库里扫一堆数据回来展示,那样业务人员在使用界面时就很不顺畅,不顺畅他们就会骂人。

    • Business logic      程序逻辑

      Interpreted languages, such as Java and PL/SQL, are ideal to encode business logic. They are fully portable, which makes upgrading logic relatively easy. Both languages are syntactically rich to allow code that is easy to read and interpret. If business logic requires complex mathematical functions, then a compiled binary language might be needed. The business logic code can be on the client computer, the application server, and the database server. However, the application server is the most common location for business logic.

            像JAVA和PLSQL这种解释型语言(C是编译型语言,编译语言通常要会解释型语言运行快几倍到几十几百倍),是非常适合去做这种逻辑业务处理的。他们非常的轻便,所以修改升级都很容易。两种语言拥有完善的语法,非常方便阅读和解析。如果业务需要非常复杂的数学级运算或者非常精确的功能,那么编译型语言(C,C++)将更合适。业务逻辑处理可以在客户端电脑(PC上,如JAVASCRIPT语言),应用服务器上或者数据库服务器上。通常来说应用服务器上处理是最普遍的。

    • User requests and resource allocation     用户请求和资源分配

      Most of this is not affected by the programming language, but tools and fourth generation languages that mask database connection and cursor management might use inefficient mechanisms. When evaluating these tools and environments, check their database connection model and their use of cursors and bind variables.

            这种情况一般不会影响编程语言的选择,但是一些工具和第4代语言(汇编是第一代,C是第二代,JAVA第三代,PLSQL是第四代语言),他们掩盖了数据库的连接机制以及游标管理,有可能会导致低效的运行。当评估这些工具和环境时,请检查一下他们的连接管理、游标、绑定变量等使用机制。

    • Data management and transactions        数据及事务管理

      Most of this is not affected by the programming language.        这种对编程语言的选择是没有任何影响的

  2. When implementing a software component, implement its function and not the functionality associated with other components. Implementing another component's functionality results in sub-optimal designs and implementations. This applies to all components.

    当实现一个软件模块时, 尽可能只要实现这个功能就行了,不要去和其他模块有太多的连接(我们通常称之为弱耦合). 因为其他模块设计和实现可能不太理想. 耦合度高的话,这被应用到所有的模块中,即所有模块效率都低了。


  3. Do not leave gaps in functionality or have software components under-researched in design, implementation, or testing. In many cases, gaps are not discovered until the application is rolled out or tested at realistic volumes. This is usually a sign of poor architecture or initial system specification. Data archival and purge modules are most frequently neglected during initial system design, build, and implementation.

    不要在模块中留下一些缺陷,或者导致软件模块需要重新设计,开发和测试的情况. 在许多情况下, 缺口或许直到应用程序上线或者在生产环境中测试时才能被发现.这通常是一个差劲的架构或者最初系统设计有问题的一个征兆.  数据归档以及清理模块通常是在最初设计、开发和实现中最容易被忽略的. (通常这种情况发生在批量程序中, 批量被拉起来以后,通常就先做历史归档,再清理旧数据,再跑当天数据)


  4. When implementing procedural logic, implement in a procedural language, such as C, Java, or PL/SQL. When implementing data access (queries) or data changes (DML), use SQL. This rule is specific to the business logic modules of code where procedural code is mixed with data access (nonprocedural SQL) code. There is great temptation to put procedural logic into the SQL access. This tends to result in poor SQL that is resource-intensive. SQL statements with DECODE case statements are very often candidates for optimization, as are statements with a large amount of OR predicates or set operators, such as UNION and MINUS.

    当实现一个程序逻辑时,使用例如C,java,或者plsql来做。 当实现查询、修改时使用SQL做。这种特殊的逻辑处理方法由各种语言混合组成(即让各种语言做他们擅长的事情)。但会出现将程序逻辑搬到SQL语言的风险。因为SQL语言是资源敏感型的,所以这时写的SQL语句就可能比较差。使用DECODE方式的SQL通常可以被优化成包括大量的OR谓词或者union和minus组成的操作。(也就是说DECODE效率低,应该用or 加上 union或minus去写一个等价的SQL语句。那为什么开发人员会写大量的DECODE呢?因为DECODE的逻辑和程序开发的IF/ELSEIF/ELSE是很类似的,他们更喜欢这种逻辑处理的方式,而忘记了SQL就是SQL,不能把其他的东西加进来。可能有人会说,那你oracle为什么要提供这些呢?不提供的话开发人员不就不去用了吗?这就好比菜刀一样,在中国很危险,要实名制;而在美国,多数人吃饭全是刀叉却没出现人在饭桌上被杀一样。不能因为它危险所以就不提供。)


  5. Cache frequently accessed, rarely changing data that is expensive to retrieve on a repeated basis. However, make this cache mechanism easy to use, and ensure that it is indeed cheaper than accessing the data in the original method. This is applicable to all modules where frequently used data values should be cached or stored locally, rather than be repeatedly retrieved from a remote or expensive data store.

    当一些数据要被频繁访问的话尽可能去缓存它,很少去修改的数据从数据库访问是非常昂贵的。这种实现缓存的机制实现起来是非常简单的,确保它的确是直接访问数据库要效率更高时再使用。这种机制主要是针对那些所有模块都频繁去访问的数据应该缓存到本地来,而不是频繁地从远程或者昂贵的数据存储中获取。(11G中多了一些result cache,就是对付这些懒人的;但是问题又来了,DBA通过配置将访问的性能提升后,有些项目经理竟然认为实现缓存倒不如直接从数据库上取,没文化的领导最可怕啊!)


    The most common examples of candidates for local caching include the following:   下面是一些需要本地缓存的例子:

    • Today's date. SELECT SYSDATE FROM DUAL can account for over 60% of the workload on a database.  今天的日期,sysdate很消耗资源的

    • The current user name.      当前的用户名

    • Repeated application variables and constants, such as tax rates, discounting rates, or location information. 重复的常量如税率,贴现率,或位置信息

    • Caching data locally can be further extended into building a local data cache into the application server middle tiers. This helps take load off the central database servers. However, care should be taken when constructing local caches so that they do not become so complex that they cease to give a performance gain.

      可以将缓存的机制放到中间层实现。这可以减缓数据库的压力。但是设计的时候要注意别设计的太复杂,否则为了那一点的性能而得不偿失。

    • Local sequence generation.       本地的sequence生成(如果是多模块,多机器并发时,使用数据库更方便些)

    The design implications of using a cache should be considered. For example, if a user is connected at midnight and the date is cached, then the user's date value becomes invalid.

    但是如果使用缓存的话,设计时要注意些。比如缓存了日期的话,结果到了晚上12:00左右时,如果直接访问缓存的话,这个日期就可能会失效了。(一般情况下,我们很少去缓存这个日期,但是如果真要想去缓存的话,最好的办法是让应用服务器和数据库服务器做时间同步,就像RAC一样,那直接从本地取就更方便一些了)

  6. Optimize the interfaces between components, and ensure that all components are used in the most scalable configuration. This rule requires minimal explanation and applies to all modules and their interfaces.

    当设计模块之间的接口时,确保所有的模块可以运行在合适的范围内。这不需要再解释了,oracle说你懂的。


  7. Use foreign key references. Enforcing referential integrity through an application is expensive. You can maintain a foreign key reference by selecting the column value of the child from the parent and ensuring that it exists. The foreign key constraint enforcement supplied by Oracle—which does not use SQL—is fast, easy to declare, and does not create network traffic.

    使用外键约束。使用程序去管理这种约束通常是比较昂贵的。通过约束可以保证数据的一致性。通过oracle的约束机制--不是使用SQL实现的,所以运行速度很快,定义有很简单,还不会造成频繁访问时的网络问题。(但是中兴和广发两个变态的银行全用DB2数据库,我的程序就可能要运行在DB2数据库的?如何迁移呢?DB2慢的话我还要重写程序逻辑。所以在银行中多数项目基本上看不到约束的。----如果是参数表的定义,基本上放到JAVA端去做验证,后台C在运行时也会再验证一次,确保没问题。如果是业务逻辑处理的话,全由C去验证的。很多臭长臭长的代码都在处理这种一致性,如果大家都使用oracle该多好,oracle自己应该也这么想)


  8. Consider setting up action and module names in the application to use with End-to-End Application Tracing. This allows greater flexibility in tracing workload problems. See "End-to-End Application Tracing".

    考虑使用action和module名(这两个变量在v$session中,可以由我们来定义),以便进行end-to-end应用跟踪。这增强了应用程序的跟踪测试灵活性。(以前从来没用到啊,未来一定去用一下,毕竟是DBA级别的人了,不能再跟小三那样混)


 



2.5.7 Trends in Application Development

The two biggest challenges in application development today are the increased use of Java to replace compiled C or C++ applications, and increased use of object-oriented techniques, influencing the schema design.

Java provides better portability of code and availability to programmers. However, there are several performance implications associated with Java. Because Java is an interpreted language, it is slower at executing similar logic than compiled languages, such as C. As a result, resource usage of client computers increases. This requires more powerful CPUs to be applied in the client or middle-tier computers and greater care from programmers to produce efficient code.

Because Java is an object-oriented language, it encourages insulation of data access into classes not performing the business logic. As a result, programmers might invoke methods without knowledge of the efficiency of the data access method being used. This tends to result in minimal database access and uses the simplest and crudest interfaces to the database.

在现在的应用设计中,有两个大的挑战,一个是JAVA不断的在替换C/C++之类的编译型语言(不是C不好,主要开发效率低,开发人员能力差),一个是面向对象的设计理念。

JAVA语言开发迁移都比较的方便灵活。但是JAVA作为一个解释型语言,它要比C语言这种编译型语言慢很多。因此,客户端的程序资源占用就会增加。这就要求客户端拥有更强的CPU以及开发人员要编码更好的代码。(很多JAVA人员他们认为JAVA的性能和C已经差不多了,说这话的基本上我都认为他们水平很差。有次我给他们演示性能差距有多大:我处理一个好像5万多行的XML文件。使用C语言处理总共需要0.7秒,使用python需要1.2秒,使用JAVA处理需要4.2秒。因为PYTHON使用了C的底层库,所以和C是一路货,性能还可以。但是JAVA慢了刚好6倍,结果那JAVA的哥们说,4秒多钟你还接受不了啊,这就是他们的思维!不做解释的!)

因为JAVA是一个面向对象的语言,它鼓励将数据访问的封闭,而不执行业务逻辑(先封闭再通过方法执行)。结果程序员可能在没有任何高效的中间数据访问层知识的情况下去调用“方法”处理这些数据,也就是实现业务逻辑。这降低了数据库的访问以及使用最简单的数据库接口。

 

With this type of software design, queries do not always include all the WHERE predicates to be efficient, and row filtering is performed in the Java program. This is very inefficient. In addition, for DML operations—and especially for INSERTs—single INSERTs are performed, making use of the array interface impossible. In some cases, this is made more inefficient by procedure calls. More resources are used moving the data to and from the database than in the actual database calls.

In general, it is best to place data access calls next to the business logic to achieve the best overall transaction design.

The acceptance of object-orientation at a programming level has led to the creation of object-oriented databases within the Oracle Server. This has manifested itself in many ways, from storing object structures within BLOBs and only using the database effectively as an indexed card file to the use of the Oracle Database object-relational features.

If you adopt an object-oriented approach to schema design, then ensure that you do not lose the flexibility of the relational storage model. In many cases, the object-oriented approach to schema design ends up in a heavily denormalized data structure that requires considerable maintenance and REF pointers associated with objects. Often, these designs represent a step backward to the hierarchical and network database designs that were replaced with the relational storage method.

In summary, if you are storing your data in your database for the long-term, and if you anticipate a degree of ad hoc queries or application development on the same schema, then the relational storage method probably gives the best performance and flexibility.

使用这种类型的软件设计,查询语句不总是包括那么高效的where条件,直接通过JAVA程序进行数据库过滤。这种方式效率是不高的。另外,对于像insert之类的DML操作,通常都是单行插入操作,使用数组接口不太容易。有些情况下,存储过程的调用更加低效。因为多数的资源被“发送数据到数据库”和“从数据库接收”这种工作中消耗了。

通常来说,最好的方式最将数据访问层移到业务逻辑处理中实现,以达到最佳的总体事务设计

这种面向对象的设计已经引领着数据库服务器上创建面向对象的数据库。(不太理解什么是面向对象数据库,postgres好像是这种数据库,我没什么研究)这已经通过许多方式证明了,从存储到BLOBS中的对象字段和高效的索引卡目录到使用面向对象的数据库特性。

假如你采用了面向对象的设计模型,那么确信你没有丢失关系存储模型的灵活性。在许多情况下,面向对象的设计常常以需要相当大维护和REF指针关联对象的此类违反正常数据结构设计收场。(具体涵义我也不太懂,大致应该是设计层次过多,引用过多,最终导致无法维护,C++多级多层继承,容易出现混乱)很多时候,这种设计模型像倒退到以前的那种层次型或网状层的数据库管理系统一样。(我没见过层次型数据库,但是我见过IBM以前的一些文档,他们将表从层次型升级到关系型时还有一些是此类的图表在里面)

总的来说,假如你希望能将数据存储的久一些,同时希望能按需去查询或设计相似的应用,那么关系存储方式能给你最大的性能和灵活性。

 



2.6 Workload Testing, Modeling, and Implementation

This section describes workload estimation, modeling, implementation, and testing. This section covers the following topics:

  • Sizing Data

  • Estimating Workloads

  • Application Modeling

  • Testing, Debugging, and Validating a Design

2.6.1 Sizing Data

You could experience errors in your sizing estimates when dealing with variable length data if you work with a poor sample set. As data volumes grow, your key lengths could grow considerably, altering your assumptions for column sizes.

When the system becomes operational, it becomes more difficult to predict database growth, especially for indexes. Tables grow over time, and indexes are subject to the individual behavior of the application in terms of key generation, insertion pattern, and deletion of rows. The worst case is where you insert using an ascending key, and then delete most rows from the left-hand side but not all the rows. This leaves gaps and wasted space. If you have index use like this, then ensure that you know how to use the online index rebuild facility.

DBAs should monitor space allocation for each object and look for objects that may grow out of control. A good understanding of the application can highlight objects that may grow rapidly or unpredictably. This is a crucial part of both performance and availability planning for any system. When implementing the production database, the design should attempt to ensure that minimal space management takes place when interactive users are using the application. This applies for all data, temp, and rollback segments.

在处理很差的可变长的样板数据集时,你也许有错误估算的经历。随着数据量的增长,你关键数据长度可能相应增加,超出了你预想的列长度。(没看懂,自个看上面原文理解)

当你系统上线后,它可能变得更加难以预测,特别是索引值。表不断的增长,索引也增长,索引结点不断的产生,插入或删除。最差的情况是当使用一个递增的关键字作为索引,然后却又从最小的值开始删除一部分数据(不是删除所有的值)。这将导致缺口或空间浪费。假如你的索引是这种方式创建的,那你一定要知道如何去在线的重建索引。(alter index xxxx rebuild online; 即可)

DBA应该监控每个对象的空间分配情况以确保没有超出控制。如果对应用程序有较好的理解,就可以了解重要的对象快速增长或不可预测情况。这个是对于性能和可用性方面都非常重要的设计内容之一。在配置生产数据库时,当有交互式用户使用此应用时要确保要最少的管理空间。包括了所有的数据文件,临时文件,UNDO段。

 



2.6.2 Estimating Workloads

Considering the number of variables involved, estimation of workloads for capacity planning and testing purposes is extremely difficult. However, designers must specify computers with CPUs, memory, and disk drives, and eventually roll out an application. There are several techniques used for sizing, and each technique has merit. When sizing, it is best to use the following two methods to validate your decision-making process and provide supporting documentation:

考虑到有各种的可能性,评估压力容量计划和测试是相当困难的。然而设计人员必须要计划CPU,内存,硬盘,最终展开应用。有许多方法可以去做评估,每种都自己的优点。可以使用下面的两种方法来帮助评估:

  • Extrapolating From a Similar System    从类似的系统中推断

  • Benchmarking                                    用基准问题测试

2.6.2.1 Extrapolating From a Similar System

This is an entirely empirical approach where an existing system of similar characteristics and known performance is used as a basis system. The specification of this system is then modified by the sizing specialist according to the known differences. This approach has merit in that it correlates with an existing system, but it provides little assistance when dealing with the differences.

This approach is used in nearly all large engineering disciplines when preparing the cost of an engineering project, such as a large building, a ship, a bridge, or an oil rig. If the reference system is an order of magnitude different in size from the anticipated system, then some components may have exceeded their design limits.

这是一个完全靠经验的做的方式,当存在一个相似的系统,根据这个类似系统的性能就大概推断一下自己系统性能是多少了。然后根据两个系统不同的处再进行相应的修正一下。这个方法对于那么有相似处的地方可以借鉴一下,但是对于差异处就不好做判断了!

这种方法在许多大型的项目预算中广泛的使用,比如大厦建设,船舶,大桥,石油钻探设备等等。假如比较的是一个不同数量级的项目,那么有些模块就有可能超出设计的范畴。

注:这个比较的方法在软件中用于更加多了,基本上架构师认为一个模块,比如根据平安银行经验,可能在2天内能做完,那就按2天的工作量报价了。平安银行假如说项目金额是120万,那么在广州银行中相似的项目,如果多增加了一些其他的功能,报价就加到150万等等。但是有时候也不一定,因为要看销售人员的水平了,如果他们只想着完成任务,不管项目人员的死活,可能80W就敢做的。

 



2.6.2.2 Benchmarking   基准测试(这部分我翻译很差劲,最好自己去看原文)

The benchmarking process is both resource and time consuming, and it might not produce the correct results. By simulating an application in early development or prototype form, there is a danger of measuring something that has no resemblance to the actual production system. This sounds strange, but over the many years of benchmarking customer applications with the database development organization, Oracle has yet to see reliable correlation between the benchmark application and the actual production system. This is mainly due to the number of application inefficiencies introduced in the development process.

However, benchmarks have been used successfully to size systems to an acceptable level of accuracy. In particular, benchmarks are very good at determining the actual I/O requirements and testing recovery processes when a system is fully loaded.

Benchmarks by their nature stress all system components to their limits. As the benchmark stresses all components, be prepared to see all errors in application design and implementation manifest themselves while benchmarking. Benchmarks also test database, operating system, and hardware components. Because most benchmarks are performed in a rush, expect setbacks and problems when a system component fails. Benchmarking is a stressful activity, and it takes considerable experience to get the most out of a benchmarking exercise.

基准测试的过程是一个资源和时间消耗的过程,同时也可能不会产生一个正确的结果。通过模拟一个早期开发的应用程序或者设计原型,存在着可能测试了一个与生产系统没有相似处的风险(测试的很high,但是生产上根本不用这个功能)。听起来很奇怪,但是根据从事多年的基准测试应用和开发人员的经验,oracle已经能可靠地识别出基准与实际生产的关系。这主要是由于前面开发阶段介绍的低效的应用。

然而,基准测试已经被成功地使用在相对准确的可接受的评测系统中。特别是基准测试对于评测实际的IO请求和测试系统满荷进行时恢复进程数是非常有效的。

基准测试根据自己的特点,对系统中所有的模块进行加压。通过这个压力测试,去发现应用程序中设计时要求的错误日志信息及看看自己到底是半斤还是八两。基准测试也会去测试数据库,操作系统,和硬件部件。因为多数的基准测试时间都使用狂风暴雨般的方式期望能测出一点问题出来。基准测试是一个压力测试行动,它需要许多的经验。

 



2.6.3 Application Modeling

Modeling the application can range from complex mathematical modeling exercises to the classic simple calculations performed on the back of an envelope. Both methods have merit, with one attempting to be very precise and the other making gross estimates. The downside of both methods is that they do not allow for implementation errors and inefficiencies.

The estimation and sizing process is an imprecise science. However, by investigating the process, some intelligent estimates can be made. The whole estimation process makes no allowances for application inefficiencies introduced by poor SQL, index design, or cursor management. A sizing engineer should build in margin for application inefficiencies. A performance engineer should discover the inefficiencies and make the estimates look realistic. The Oracle performance method describes how to discover the application inefficiencies.

设计模型可以是一个非常复杂的数学模型,或者是一个粗略的估算(或计划)。两者各有优点,一个希望极度的精确,一个是总的估算。两者的底线是都不允许出现低效和错误。

这个评估和估算过程是一个不太准确的方式。通审查这个过程,就可以大概的得出一些评估。整个评估的过程不对那么低效的SQL,索引,游标管理做限制。一个分析工程师应该建立一个低效应用的界线。一个性能工程师应该能发现这些低效以及使这些评估更加真实。oracle性能优化技术介绍了如何去发现这些低效的应用。

 



2.6.4 Testing, Debugging, and Validating a Design

The testing process mainly consists of functional and stability testing. At some point in the process, performance testing is performed.

The following list describes some simple rules for performance testing an application. If correctly documented, then this list provides important information for the production application and the capacity planning process after the application has gone live.

测试主要集中在功能和稳定性上。性能或压力测试有时也是在这个时候才做的。

下面几项描述了一些简单的方法去测试应用程序的性能。假如准确地书面化,那么下面的几项提供了应用开发中非常重要的一些信息和已经上线系统的计划处理。

  • Use the Automatic Database Diagnostic Monitor (ADDM) and SQL Tuning Advisor for design validation  使用ADDM和STA去验证

  • Test with realistic data volumes and distributions     使用真实的数据量和分布情况进行测试

    All testing must be done with fully populated tables. The test database should contain data representative of the production system in terms of data volume and cardinality between tables. All the production indexes should be built and the schema statistics should be populated correctly.  所有的测试都应该覆盖到整个表。测试数据库应该包括能代表生产系统的相对真实数据和相应的数据量。所有生产上的索引也应该建立起来,以及收集了相应的统计数据。(注:如果测试的数据和生产数据差别比较大的话,这个索引有时候就失效了。比如计划的是查询类交易可能会占30%左右,但是结果在压力测试时90%都是金融类交易,这样就无法体现查询交易的真实情况,如果此时再使用AWR,ADDM,STA都越做越偏离的远)

  • Use the correct optimizer mode       使用正确的优化方式

    Perform all testing with the optimizer mode that you plan to use in production. All Oracle Database research and development effort is focused on the query optimizer. Therefore, the use of the query optimizer is recommended.  使用你计划在生产上使用的那些优化模式去做相应测试。所有的数据库研究和开发都应该集中在查询的优化上。因此建议使用查询优化器。  (这个是方法论的概念了!)

  • Test a single user performance       测试单一用户的性能

    Test a single user on an idle or lightly-used database for acceptable performance. If a single user cannot achieve acceptable performance under ideal conditions, then multiple users cannot achieve acceptable performance under real conditions. 在一个小型数据库上测试单一用户时可接受的性能。假如在理想情况下单用户都达不到要求,那么在真实条件下多用户更不可能满足要求了。

  • Obtain and document plans for all SQL statements   获取和记录所有语句的执行计划

    Obtain an execution plan for each SQL statement. Use this process to verify that the optimizer is obtaining an optimal execution plan, and that the relative cost of the SQL statement is understood in terms of CPU time and physical I/Os. This process assists in identifying the heavy use transactions that require the most tuning and performance work in the future.     获取所有语句的执行计划。这个确认优化器使用了最佳的执行计划,以及相关的CPU和IO等资源消耗情况。这个过程有助于了解那些重要的或者是重量级的事务,以便于在未来重点去调试。

  • Attempt multiuser testing              尝试多用户测试

    This process is difficult to perform accurately, because user workload and profiles might not be fully quantified. However, transactions performing DML statements should be tested to ensure that there are no locking conflicts or serialization problems.  这个过程很难精确的执行,因为用户压力和属性无法完整的描述。然而通过大量的DML事务处理应该能确认没有锁的冲突或者导致串行性价比问题。

  • Test with the correct hardware configuration     测试硬件配置

    Test with a configuration as close to the production system as possible. Using a realistic system is particularly important for network latencies, I/O subsystem bandwidth, and processor type and speed. Failing to use this approach may result in an incorrect analysis of potential performance problems.    尽可能和生产一样去配置系统。使用一个较真实系统对于测试网络延迟,IO带宽,CPU速度是非常重要的。如果没有这样做的话可能导致不正确的分析潜在的性能问题。(这个在银行是很难做到的,基本上测试环境很糟糕,或者有的银行是测试机非常的厉害如16核768G内存,但是生产系统8核16G内存等等)

  • Measure steady state performance                   测量稳定状态性能

    When benchmarking, it is important to measure the performance under steady state conditions. Each benchmark run should have a ramp-up phase, where users are connected to the application and gradually start performing work on the application. This process allows for frequently cached data to be initialized into the cache and single execution operations—such as parsing—to be completed before the steady state condition. Likewise, at the end of a benchmark run, there should be a ramp-down period, where resources are freed from the system and users cease work and disconnect.    当基准测试时,在一个稳定的状态下进行性能测试是非常重要的。基准测试开始时都有一个上升阶段,用户连接到系统上并开始执行一些工作。这个过程允许那些需要被频繁访问的数据现在开始初始化,也就是开始缓存进来并进行单一用户执行操作,例如解析过程,在真正进入稳定状态时已经全部做过一次了。同样的,在基准测试进入尾声,将会有一个下降的过程,因为用户停止进行测试以及断开连接,资源将会被释放出来。

 



2.7 Deploying New Applications   应用程序部署

This section describes the following design decisions involved in deploying applications:

  • Rollout Strategies             展示或推广策略

  • Performance Checklist       性能检查项

2.7.1 Rollout Strategies

When new applications are rolled out, two strategies are commonly adopted:

  • Big Bang approach - all users migrate to the new system at once                     大爆炸方式,所有用户一次性迁移过来

  • Trickle approach - users slowly migrate from existing systems to the new one   涓涓细流方式,用户慢慢地慢慢地迁移

Both approaches have merits and disadvantages. The Big Bang approach relies on reliable testing of the application at the required scale, but has the advantage of minimal data conversion and synchronization with the old system, because it is simply switched off. The Trickle approach allows debugging of scalability issues as the workload increases, but might mean that data must be migrated to and from legacy systems as the transition takes place.

It is difficult to recommend one approach over the other, because each method has associated risks that could lead to system outages as the transition takes place. Certainly, the Trickle approach allows profiling of real users as they are introduced to the new application, and allows the system to be reconfigured while only affecting the migrated users. This approach affects the work of the early adopters, but limits the load on support services. This means that unscheduled outages only affect a small percentage of the user population.

The decision on how to roll out a new application is specific to each business. Any adopted approach has its own unique pressures and stresses. The more testing and knowledge that you derive from the testing process, the more you realize what is best for the rollout.

每种方式都有自己的优缺点。快速全部的迁移方式需要对系统有强有力的测试,以及有系统稳定的信念,这种方式的优点是需要最少的数据转换工程和与旧系统的同步过程,因为一开始迁移时就全部迁移完了。而逐步迁移的方式则能在慢慢迁移中发现系统的不足,这样可以保证旧系统在这段时间内还可以发挥自己的余热,慢慢过渡。

建议使用哪种方式是比较难的,每种方式都有自己过渡时停运的风险。慢慢迁移方式允许对真实的用户做一些试运行,这样在系统重新更新时仅影响小部件用户。这种方式仅影响前期的试用者,但是对于压力是不够的。反正就是仅仅影响个别用户。

采用哪种方式进行迁移,要根据自己的业务需要决定。任何一种决定的途径都有自己的压力及着重点。随着你对测试过程了解的越多,你将会知道哪种迁移方式最合理。

(在银行中,这两种方式我都使用过。本质上没有什么太大区别。慢慢迁移可以做到缓解自己压力,因为新系统上线很难做到不出问题的,这样给自己的一个缓冲期是最明智的了;但是许多银行人员比较喜欢一次性上线,这样省得下次还得再迁移,再迁移,挺麻烦的(上线停运通常都是晚上23点以后,许多系统还是星期六做,要人命的)。不过我还使用过第三种方式,就是将生产数据同时在新旧系统上跑,新系统仅仅是为了验证使用,等跑了个把个月之后,发现和系统对比没有区别时,如果有问题就迅速修正,等一切都稳定之后,再将旧系统停掉,这样就无缝的迁移过来了)

 



2.7.2 Performance Checklist

To assist in the rollout, build a list of tasks that increase the chance of optimal performance in production and enable rapid debugging of the application. Do the following:  为了协助升迁,按下面的方式建立一个任务列表,增加你的上线成功机率,以及快速的问题跟踪:

  1. When you create the control file for the production database, allow for growth by setting MAXINSTANCES, MAXDATAFILES, MAXLOGFILES, MAXLOGMEMBERS, and MAXLOGHISTORY to values higher than what you anticipate for the rollout. This technique results in more disk space usage and larger control files, but saves time later should these need extension in an emergency.      当你创建控制文件时,增加MAXINSTANCES, MAXDATAFILES, MAXLOGFILES, MAXLOGMEMBERS, and MAXLOGHISTORY这些值参数,也就是尽量参数调大一些。虽然一开始多占用了一些空间,但是如果在未来需要紧急调整时,这将会节约不少时间。

  2. Set block size to the value used to develop the application. Export the schema statistics from the development or test environment to the production database if the testing was done on representative data volumes and the current SQL execution plans are correct. block size设置要和开发测试机一致。如果测试环境上的数据量和生产上是一致的,同时当年的SQL执行计划是正确的话,将测试机统计数据导出来,然后导到生产系统上。

  3. Set the minimal number of initialization parameters. Ideally, most other parameters should be left at default. If there is more tuning to perform, then this appears when the system is under load. See Chapter 4, "Configuring a Database for Performance" for information about parameter settings in an initial instance configuration.   设置最小数据的初始化参数。理想情况下,其他的参数默认设置就可以了。假如有许多需要调优的地方,那么在相应的压力下很快问题就可以会出现。

  4. Be prepared to manage block contention by setting storage options of database objects. Tables and indexes that experience high INSERT/UPDATE/DELETE rates should be created with automatic segment space management. To avoid contention of rollback segments, use automatic undo management. See Chapter 4, "Configuring a Database for Performance" for information about undo and temporary segments.   设置存储参数,管理块竞争。尽量使用ASSM以及自动UNDO_MANAGEMENT参数。

  5. All SQL statements should be verified to be optimal and their resource usage understood.  所有的SQL语句都应该检查过,并且知道他们对资源的消耗程序。(在银行里,任何一个小的账务系统都要比像阿里巴巴之类的互联网企业要复杂的多,有可能SQL语句有上千个之多,所以这个相对是比较难的。但是通常情况下,在开发阶段要迭代式的对SQL进行检查)

  6. Validate that middleware and programs that connect to the database are efficient in their connection management and do not logon or logoff repeatedly.   检查一下你的中间层软件,看看他们在连接数据库的策略是不是高效,确保他们不会频繁的连接退出,连接退出。

  7. Validate that the SQL statements use cursors efficiently. The database should parse each SQL statement once and then execute it multiple times. The most common reason this does not happen is because bind variables are not used properly and WHERE clause predicates are sent as string literals. If you use precompilers to develop the application, then make sure to reset the parameters MAXOPENCURSORS, HOLD_CURSOR, and RELEASE_CURSOR from the default values before precompiling the application.  验证你的SQL语句是否有效的使用了游标。数据库应该一次解析,多次执行。很多情况下,由于开发人员没有使用绑定变量以及直接在where查询条件中使用字面值参数导致。如果你使用预编译方式开发应用程序,确保在重新编译时重置了MAXOPENCURSORS, HOLD_CURSOR, and RELEASE_CURSOR 值

  8. Validate that all schema objects have been correctly migrated from the development environment to the production database. This includes tables, indexes, sequences, triggers, packages, procedures, functions, Java objects, synonyms, grants, and views. Ensure that any modifications made in testing are made to the production system.   确保你所有的表、索引、视图、sequence、synonym、存储过程、函数、触发器、授权等等已经从开发环境迁移到了生产环境。确保测试环境中的任何小补丁都在生产环境中同样应用。(一般我们是在上线前把所有的建表、索引等语句全部像mysql里创建库一样,写到xxx.sql文件中,然后在测试环境中建立一个“上线演练环境”,这个环境主要是演练上线执行脚本的,不对业务做测试的。执行完xxx.sql脚本后,对比一下UAT测试环境就知道了。)(银行里每个项目组都有:开发环境:供开发人员随意的修改的;SIT环境,相对稳定的开发环境,一般是开发模块稳定后,可以和其他人编写的模块进行联调时就放到这个环境下,供自己项目组的测试人员测试,这个模块下的代码要使用SVN进行控制的;UAT环境:供行方测试人员测试,这个阶段代码控制更严格;上线演练环境:对上线前的执行脚本或者迁移脚本进行演练,这个时候代码已经被冻结,一般不会让改动的; 压力测试环境:这个更简单了,就是做压力测试的,通常是有经验的专家来做,他可以直接对代码修改,以查询调优性能,最终将新的修改合并到SIT上重新进行测试。     通常情况下,开发环境,SIT,UAT,上线演练环境这几个都在开发服务器上,通过不同的用户区分开来;而压力测试环境会抽一个性能比较好的机器临时供项目组进行压力测试。一般来说,除了全新的项目会在上线前进行压力测试,其他的维护类项目,比如到了二期,三期时项目基本上已经稳定,没人会再去做压力测试了,一二个新增功能即使效率低也不会有太大影响,所以直接在测试环境中大概测一下,通常SQL执行计划走到自己预想的索引或全表扫描上,就算成功了。具体的性能如何都是直接通过观察生产环境来做最终的判断。)

  9. As soon as the system is rolled out, establish a baseline set of statistics from the database and operating system. This first set of statistics validates or corrects any assumptions made in the design and rollout process.    一旦成功的上线后,建立一个操作系统和数据库的统计基线。首次的统计数据通常是用来验证或者修正在开发中的那个预想(假设)是否正确。(一般只要不出现大的性能和功能上的问题,都是在系统业务量和功能稳定后再做的)

  10. Start anticipating the first bottleneck (which is inevitable) and follow the Oracle performance method to make performance improvement. For more information, see Chapter 3, "Performance Improvement Methods".    开始查看你的性能瓶颈(一般是肯定有的),然后按照oracle性能方法去提升。(这个不一定的,我们一定要讲究一次性成功的,偶尔出问题的地方是上线脚本可能有问题,或者是参数设计问题导致某个业务运转不正常,很少有性能方面问题。可能有人觉得不可思议,怎么可能一个新项目没性能问题,毕竟你们的环境都不一样? 答案是在评估服务器时通常要比测试环境高几倍的方式估算,以应对未来业务量的增长。比如在8核的测试环境中性能是单进程每秒钟2000笔交易,单服务器起10进程,那么一台服务器一秒钟可以并发2000*10笔交易。上线后银行用2台机器做集群,每台机器是16核,那么通常情况一秒钟交易量基本上可以按5-6万来算了;有些非常重要的系统,集群数更多。我听过一个联通的系统,每天处理4000万交易,用了20多台机器做集群;前方8台机器做转发,后台N台处理)

来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/1696240/viewspace-1169005/,如需转载,请注明出处,否则将追究法律责任。

转载于:http://blog.itpub.net/1696240/viewspace-1169005/

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值