Linux page cache and buffer cache tuning

Page cache tuning and how to reclaim the cache


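Page cache tuning

Tuning the page cache comes down to two kernel parameters. Some background on how the cache handles reads and writes helps in understanding them.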

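On a read, the kernel first looks in the page cache. On a hit, the data is returned directly. On a miss, the kernel reads the file from disk, stores the data in the page cache, and returns the requested data from there.

A write only places the data in the page cache and marks the affected pages dirty. Dirty pages are written back to the file system periodically and in batches, which reduces the number of disk operations and the system overhead.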
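One way to watch this write-behind behaviour is the Dirty line of /proc/meminfo, which reports how much cached data is still waiting to be written back. The sketch below is illustrative only; dirty_kb is a hypothetical helper name, not a standard tool.

```shell
#!/bin/sh
# dirty_kb: read meminfo-formatted text on stdin and print the Dirty value in kB.
dirty_kb() {
    awk '$1 == "Dirty:" { print $2 }'
}

# On a Linux host, show how much data is currently waiting for writeback.
if [ -r /proc/meminfo ]; then
    echo "Dirty page cache data: $(dirty_kb < /proc/meminfo) kB"
fi
```

Running it before and after a large file copy should show the figure rise and then fall again as the kernel writes the pages out.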

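  • vm.dirty_background_ratio: the percentage of available memory (free plus reclaimable pages; default 10) that dirty pages may occupy before the background writeback threads (pdflush on older kernels, per-device flusher threads on newer ones) start writing some of them to disk asynchronously. Dirty pages are page cache pages that have been modified but not yet written to disk. Raising or lowering this value is the main tuning lever.
  • vm.dirty_ratio: the percentage of available memory (default 20) that dirty pages may occupy before the system is forced to deal with them. By this point there are so many dirty pages that processes generating writes have to flush some of them to disk themselves, and many application processes may block on I/O while that happens.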

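vm.dirty_background_ratio is the soft threshold at which dirty data starts being flushed to disk in the background.

vm.dirty_ratio is the hard upper limit.

Ideally the amount of dirty data never climbs from the vm.dirty_background_ratio threshold all the way to the vm.dirty_ratio limit, because once it does, writing processes block while the flush runs. For the same reason, vm.dirty_background_ratio should be set lower than vm.dirty_ratio.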
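To get a feel for what the two percentages mean in absolute terms, they can be converted to kilobytes. This is a rough sketch: ratio_to_kb is a hypothetical helper, and using MemAvailable from /proc/meminfo as the base is an assumption that only approximates the figure the kernel computes internally.

```shell
#!/bin/sh
# ratio_to_kb MEM_KB PERCENT: print PERCENT percent of MEM_KB, rounded down.
ratio_to_kb() { echo $(( $1 * $2 / 100 )); }

# Approximate the two thresholds on the running system (read-only, no root).
if [ -r /proc/meminfo ] && [ -r /proc/sys/vm/dirty_ratio ]; then
    avail=$(awk '$1 == "MemAvailable:" { print $2 }' /proc/meminfo)
    bg=$(cat /proc/sys/vm/dirty_background_ratio)
    max=$(cat /proc/sys/vm/dirty_ratio)
    if [ -n "$avail" ]; then
        echo "background writeback starts near $(ratio_to_kb "$avail" "$bg") kB of dirty data"
        echo "writers start blocking near $(ratio_to_kb "$avail" "$max") kB of dirty data"
    fi
fi
```

With 16 GiB (16777216 kB) available and the defaults of 10 and 20, the thresholds come out at roughly 1.6 GiB and 3.2 GiB, which shows how much unwritten data a large-memory machine can accumulate.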

These two parameters correspond to the following files:

vm.dirty_background_ratio: /proc/sys/vm/dirty_background_ratio
vm.dirty_ratio: /proc/sys/vm/dirty_ratio

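As a general-purpose starting point, set vm.dirty_background_ratio to 5 and vm.dirty_ratio to 10. The right values depend on the environment, so test, and then test again. (These are reference values only, but in practice they work reasonably well.)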
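A minimal sketch of applying such a pair follows. dirty_conf is a hypothetical helper that only checks the ordering of the two values and prints the matching sysctl.conf lines; it does not change the system by itself.

```shell
#!/bin/sh
# dirty_conf BG MAX: print sysctl.conf lines for the two thresholds,
# refusing pairs where the background value is not below the hard limit.
dirty_conf() {
    bg=$1; max=$2
    if [ "$bg" -ge "$max" ]; then
        echo "error: dirty_background_ratio ($bg) must be below dirty_ratio ($max)" >&2
        return 1
    fi
    printf 'vm.dirty_background_ratio = %s\nvm.dirty_ratio = %s\n' "$bg" "$max"
}

# Show the current values first (read-only, no root needed).
if [ -r /proc/sys/vm/dirty_ratio ]; then
    cat /proc/sys/vm/dirty_background_ratio /proc/sys/vm/dirty_ratio
fi

# Print the suggested settings.
dirty_conf 5 10
```

As root, the printed lines can be appended to /etc/sysctl.conf and loaded with sysctl -p, or the values can be tried out temporarily with sysctl -w vm.dirty_background_ratio=5 and sysctl -w vm.dirty_ratio=10.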

 

 

 



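How to reclaim the cache

Linux exposes a control file, /proc/sys/vm/drop_caches, for releasing caches. The values it accepts are listed below.

To free the page cache, run:

    echo 1 > /proc/sys/vm/drop_caches

To free the dentry (directory entry) and inode caches, run the following. This releases reclaimable slab objects only and leaves the page cache, which holds most cached data, untouched:

    echo 2 > /proc/sys/vm/drop_caches

To free the page cache together with the dentry and inode caches, run the following. Only clean pages are released; dirty pages stay in memory until they have been written back:

    echo 3 > /proc/sys/vm/drop_caches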

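Run sync before any of the commands above. drop_caches never discards dirty pages, so no cached data is lost either way, but sync writes the dirty pages to disk first so that more of the cache can actually be released.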
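The sync-then-drop sequence can be wrapped in a small function. This is a sketch under assumptions: drop_caches_target and drop_caches are hypothetical helper names, and the actual write to /proc/sys/vm/drop_caches needs root.

```shell
#!/bin/sh
# drop_caches_target MODE: describe what each drop_caches value releases,
# and reject anything other than 1, 2 or 3.
drop_caches_target() {
    case "$1" in
        1) echo "page cache" ;;
        2) echo "dentries and inodes" ;;
        3) echo "page cache, dentries and inodes" ;;
        *) echo "invalid mode: $1" >&2; return 1 ;;
    esac
}

# drop_caches MODE: flush dirty pages, then drop the chosen caches. Needs root.
drop_caches() {
    target=$(drop_caches_target "$1") || return 1
    sync
    echo "$1" > /proc/sys/vm/drop_caches && echo "dropped: $target"
}

# Usage (as root): drop_caches 3
```

Comparing the output of free -m before and after the call shows how much memory was released.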

 

 

Using and tuning swap


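Using swap space

The swap file that backs a swap area is an ordinary file, but it is not created the way ordinary files usually are: it is normally written out in full with dd so that it contains no holes, and it must reside on a local disk.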

[root@localhost ~]# dd if=/dev/zero of=/data/swapfile bs=1024 count=65536

The file must be initialized as swap space before it can be used. mkswap sets up the swap area on the given device or file:

[root@localhost ~]# mkswap /data/swapfile

Finally, activate the swap area with swapon:

[root@localhost ~]# /usr/sbin/swapon /data/swapfile
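The three steps can be collected into one function. This is a sketch, not a hardened script: swap_blocks and make_swapfile are hypothetical names, the chmod step is an addition not present in the commands above, and everything inside make_swapfile needs root.

```shell
#!/bin/sh
# swap_blocks SIZE_MIB: number of 1024-byte blocks dd must write
# for a swap file of SIZE_MIB mebibytes.
swap_blocks() { echo $(( $1 * 1024 )); }

# make_swapfile PATH SIZE_MIB: create, protect, format and enable a swap file.
make_swapfile() {
    path=$1; size_mib=$2
    dd if=/dev/zero of="$path" bs=1024 count="$(swap_blocks "$size_mib")" &&
    chmod 600 "$path" &&   # swap contents should not be readable by other users
    mkswap "$path" &&
    swapon "$path" &&
    cat /proc/swaps        # the new file should now be listed
}

# Usage (as root): make_swapfile /data/swapfile 64
```

A size of 64 MiB reproduces the count=65536 used in the dd command above. A swap file enabled this way is not re-enabled after a reboot unless it is also listed in /etc/fstab.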

Tuning swap

The swappiness value has a large influence on how swap is used. With swappiness=0 the kernel uses physical memory as far as it can before turning to swap; with swappiness=100 it uses swap aggressively.

[root@slave034 ~]# cat /proc/sys/vm/swappiness
60

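The default is 60. This is often read as "swapping may begin once 100-60=40% of physical memory is in use", but swappiness is not a memory-usage threshold. It is a weight that tells the kernel how strongly to prefer swapping out anonymous memory over reclaiming page cache when memory runs short: the higher the value, the more likely swap is to be used.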

To change the value permanently, add the following line to /etc/sysctl.conf:

    vm.swappiness=10

Then run sysctl -p to load it. The setting now persists across reboots.
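Editing the file by hand works, but repeated edits can leave duplicate entries behind. The sketch below shows one way to make the change repeatable; set_sysctl_conf is a hypothetical helper, and the optional third argument exists only so that it can be tried on a scratch file first.

```shell
#!/bin/sh
# set_sysctl_conf KEY VALUE [FILE]: set KEY=VALUE in a sysctl-style file.
# An existing entry for KEY is replaced; otherwise a new line is appended.
# FILE defaults to /etc/sysctl.conf.
set_sysctl_conf() {
    key=$1; value=$2; file=${3:-/etc/sysctl.conf}
    touch "$file" || return 1
    awk -F= -v k="$key" -v v="$value" '
        { name = $1; gsub(/[ \t]/, "", name) }             # key without spaces
        name == k { if (!seen) print k "=" v; seen = 1; next }
        { print }                                          # keep other lines
        END { if (!seen) print k "=" v }                   # append if missing
    ' "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Usage (as root): set_sysctl_conf vm.swappiness 10 && sysctl -p
```

Commented-out lines are left alone because their key does not match exactly. For a quick experiment that does not survive a reboot, sysctl -w vm.swappiness=10 sets the value directly.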

Contents Module Overview 1 Lesson 1: Memory 3 Lesson 2: I/O 73 Lesson 3: CPU 111 Module 3: Troubleshooting Server Performance Module Overview Troubleshooting server performance-based support calls requires product knowledge, good communication skills, and a proven troubleshooting methodology. In this module we will discuss Microsoft® SQL Server™ interaction with the operating system and methodology of troubleshooting server-based problems. At the end of this module, you will be able to:  Define the common terms associated the memory, I/O, and CPU subsystems.  Describe how SQL Server leverages the Microsoft Windows® operating system facilities including memory, I/O, and threading.  Define common SQL Server memory, I/O, and processor terms.  Generate a hypothesis based on performance counters captured by System Monitor.  For each hypothesis generated, identify at least two other non-System Monitor pieces of information that would help to confirm or reject your hypothesis.  Identify at least five counters for each subsystem that are key to understanding the performance of that subsystem.  Identify three common myths associated with the memory, I/O, or CPU subsystems. Lesson 1: Memory What You Will Learn After completing this lesson, you will be able to:  Define common terms used when describing memory.  Give examples of each memory concept and how it applies to SQL Server.  Describe how SQL Server user and manages its memory.  List the primary configuration options that affect memory.  Describe how configuration options affect memory usage.  Describe the effect on the I/O subsystem when memory runs low.  List at least two memory myths and why they are not true. Recommended Reading  SQL Server 7.0 Performance Tuning Technical Reference, Microsoft Press  Windows 2000 Resource Kit companion CD-ROM documentation. Chapter 15: Overview of Performance Monitoring  Inside Microsoft Windows 2000, Third Edition, David A. Solomon and Mark E. 
Russinovich  Windows 2000 Server Operations Guide, Storage, File Systems, and Printing; Chapters: Evaluating Memory and Cache Usage  Advanced Windows, 4th Edition, Jeffrey Richter, Microsoft Press Related Web Sites  http://ntperformance/ Memory Definitions Memory Definitions Before we look at how SQL Server uses and manages its memory, we need to ensure a full understanding of the more common memory related terms. The following definitions will help you understand how SQL Server interacts with the operating system when allocating and using memory. Virtual Address Space A set of memory addresses that are mapped to physical memory addresses by the system. In a 32-bit operation system, there is normally a linear array of 2^32 addresses representing 4,294,967,269 byte addresses. Physical Memory A series of physical locations, with unique addresses, that can be used to store instructions or data. AWE – Address Windowing Extensions A 32-bit process is normally limited to addressing 2 gigabytes (GB) of memory, or 3 GB if the system was booted using the /3G boot switch even if there is more physical memory available. By leveraging the Address Windowing Extensions API, an application can create a fixed-size window into the additional physical memory. This allows a process to access any portion of the physical memory by mapping it into the applications window. When used in combination with Intel’s Physical Addressing Extensions (PAE) on Windows 2000, an AWE enabled application can support up to 64 GB of memory Reserved Memory Pages in a processes address space are free, reserved or committed. Reserving memory address space is a way to reserve a range of virtual addresses for later use. If you attempt to access a reserved address that has not yet been committed (backed by memory or disk) you will cause an access violation. Committed Memory Committed pages are those pages that when accessed in the end translate to pages in memory. 
Those pages may however have to be faulted in from a page file or memory mapped file. Backing Store Backing store is the physical representation of a memory address. Page Fault (Soft/Hard) A reference to an invalid page (a page that is not in your working set) is referred to as a page fault. Assuming the page reference does not result in an access violation, a page fault can be either hard or soft. A hard page fault results in a read from disk, either a page file or memory-mapped file. A soft page fault is resolved from one of the modified, standby, free or zero page transition lists. Paging is represented by a number of counters including page faults/sec, page input/sec and page output/sec. Page faults/sec include soft and hard page faults where as the page input/output counters represent hard page faults. Unfortunately, all of these counters include file system cache activity. For more information, see also…Inside Windows 2000,Third Edition, pp. 443-451. Private Bytes Private non-shared committed address space Working Set The subset of processes virtual pages that is resident in physical memory. For more information, see also… Inside Windows 2000,Third Edition, p. 455. System Working Set Like a process, the system has a working set. Five different types of pages represent the system’s working set: system cache; paged pool; pageable code and data in the kernel; page-able code and data in device drivers; and system mapped views. The system working set is represented by the counter Memory: cache bytes. System working set paging activity can be viewed by monitoring the Memory: Cache Faults/sec counter. For more information, see also… Inside Windows 2000,Third Edition, p. 463. System Cache The Windows 2000 cache manager provides data caching for both local and network file system drivers. By caching virtual blocks, the cache manager can reduce disk I/O and provide intelligent read ahead. Represented by Memory:Cache Resident bytes. 
For more information, see also… Inside Windows 2000,Third Edition, pp. 654-659. Non Paged Pool Range of addresses guaranteed to be resident in physical memory. As such, non-paged pool can be accessed at any time without incurring a page fault. Because device drivers operate at DPC/dispatch level (covered in lesson 2), and page faults are not allowed at this level or above, most device drivers use non-paged pool to assure that they do not incur a page fault. Represented by Memory: Pool Nonpaged Bytes, typically between 3-30 megabytes (MB) in size. Note The pool is, in effect, a common area of memory shared by all processes. One of the most common uses of non-paged pool is the storage of object handles. For more information regarding “maximums,” see also… Inside Windows 2000,Third Edition, pp. 403-404 Paged Pool Range of address that can be paged in and out of physical memory. Typically used by drivers who need memory but do not need to access that memory from DPC/dispatch of above interrupt level. Represented by Memory: Pool Paged Bytes and Memory:Pool Paged Resident Bytes. Typically between 10-30MB + size of Registry. For more information regarding “limits,” see also… Inside Windows 2000,Third Edition, pp. 403-404. Stack Each thread has two stacks, one for kernel mode and one for user mode. A stack is an area of memory in which program procedure or function call addresses and parameters are temporarily stored. In Process To run in the same address space. In-process servers are loaded in the client’s address space because they are implemented as DLLs. The main advantage of running in-process is that the system usually does not need to perform a context switch. The disadvantage to running in-process is that DLL has access to the process address space and can potentially cause problems. Out of Process To run outside the calling processes address space. OLEDB providers can run in-process or out of process. 
When running out of process, they run under the context of DLLHOST.EXE. Memory Leak To reserve or commit memory and unintentionally not release it when it is no longer being used. A process can leak resources such as process memory, pool memory, user and GDI objects, handles, threads, and so on. Memory Concepts (X86 Address Space) Per Process Address Space Every process has its own private virtual address space. For 32-bit processes, that address space is 4 GB, based on a 32-bit pointer. Each process’s virtual address space is split into user and system partitions based on the underlying operating system. The diagram included at the top represents the address partitioning for the 32-bit version of Windows 2000. Typically, the process address space is evenly divided into two 2-GB regions. Each process has access to 2 GB of the 4 GB address space. The upper 2 GB of address space is reserved for the system. The user address space is where application code, global variables, per-thread stacks, and DLL code would reside. The system address space is where the kernel, executive, HAL, boot drivers, page tables, pool, and system cache reside. For specific information regarding address space layout, refer to Inside Microsoft Windows 2000 Third Edition pages 417-428 by Microsoft Press. Access Modes Each virtual memory address is tagged as to what access mode the processor must be running in. System space can only be accessed while in kernel mode, while user space is accessible in user mode. This protects system space from being tampered with by user mode code. Shared System Space Although every process has its own private memory space, kernel mode code and drivers share system space. Windows 2000 does not provide any protection to private memory being use by components running in kernel mode. As such, it is very important to ensure components running in kernel mode are thoroughly tested. 
3-GB Address Space 3-GB Address Space Although 2 GB of address space may seem like a large amount of memory, application such as SQL Server could leverage more memory if it were available. The boot.ini option /3GB was created for those cases where systems actually support greater than 2 GB of physical memory and an application can make use of it This capability allows memory intensive applications running on Windows 2000 Advanced Server to use up to 50 percent more virtual memory on Intel-based computers. Application memory tuning provides more of the computer's virtual memory to applications by providing less virtual memory to the operating system. Although a system having less than 2 GB of physical memory can be booted using the /3G switch, in most cases this is ill-advised. If you restart with the 3 GB switch, also known as 4-Gig Tuning, the amount of non-paged pool is reduced to 128 MB from 256 MB. For a process to access 3 GB of address space, the executable image must have been linked with the /LARGEADDRESSAWARE flag or modified using Imagecfg.exe. It should be pointed out that SQL Server was linked using the /LAREGEADDRESSAWARE flag and can leverage 3 GB when enabled. Note Even though you can boot Windows 2000 Professional or Windows 2000 Server with the /3GB boot option, users processes are still limited to 2 GB of address space even if the IMAGE_FILE_LARGE_ADDRESS_AWARE flag is set in the image. The only thing accomplished by using the /3G option on these system is the reduction in the amount of address space available to the system (ISW2K Pg. 418). Important If you use /3GB in conjunction with AWE/PAE you are limited to 16 GB of memory. 
For more information, see the following Knowledge Base articles:
Q171793 Information on Application Use of 4GT RAM Tuning
Q126402 PagedPoolSize and NonPagedPoolSize Values in Windows NT
Q247904 How to Configure Paged Pool and System PTE Memory Areas
Q274598 W2K Does Not Enable Complete Memory Dumps Between 2 & 4 GB

AWE Memory Layout
Usually, the operating system is limited to 4 GB of physical memory. However, by leveraging PAE, Windows 2000 Advanced Server can support up to 8 GB of memory, and Datacenter Server 64 GB of memory. However, as stated previously, each 32-bit process normally has access to only 2 GB of address space, or 3 GB if the system was booted with the /3GB option. To allow processes to allocate more physical memory than can be represented in the 2 GB of address space, Microsoft created the Address Windowing Extensions (AWE). These extensions allow for the allocation and use of up to the amount of physical memory supported by the operating system. By leveraging the Address Windowing Extensions API, an application can create a fixed-size window into the physical memory. This allows a process to access any portion of the physical memory by mapping regions of physical memory in and out of the application's window. The allocation and use of AWE memory is accomplished by:
• Creating a window via VirtualAlloc using the MEM_PHYSICAL option
• Allocating the physical pages through AllocateUserPhysicalPages
• Mapping the RAM pages to the window using MapUserPhysicalPages

Note SQL Server 7.0 supports a feature called extended memory in Windows NT® 4 Enterprise Edition by using a PSE36 driver. Currently there are no PSE drivers for Windows 2000. The preferred method of accessing extended memory is via the Physical Address Extension (PAE) using AWE. The AWE mapping feature is much more efficient than the older process of copying buffers from extended memory into the process address space. Unfortunately, SQL Server 7.0 cannot leverage PAE/AWE.
Because there are currently no PSE36 drivers for Windows 2000, this means SQL Server 7.0 cannot support more than 3 GB of memory on Windows 2000. Refer to KB article Q278466.

AWE restrictions
• The process must have Lock Pages In Memory user rights to use AWE.
  Important It is important that you use Enterprise Manager or DMO to change the service account. Enterprise Manager and DMO will grant all of the privileges and Registry and file permissions needed for SQL Server. The Service Control Panel does NOT grant all the rights or permissions needed to run SQL Server.
• Pages are not shareable or page-able.
• Page protection is limited to read/write.
• The same physical page cannot be mapped into two separate AWE regions, even within the same process.
• The use of AWE/PAE in conjunction with /3GB will limit the maximum amount of supported memory to between 12-16 GB of memory.
• Task Manager does not show the correct amount of memory allocated to AWE-enabled applications. You must use Memory Manager: Total Server Memory. It should, however, be noted that this only shows memory in use by the buffer pool.
• Machines that have PAE enabled will not dump user mode memory. If an event occurs in User Mode Memory that causes a blue screen and root cause determination is absolutely necessary, the machine must be booted with the /NOPAE switch, and with /MAXMEM set to a number appropriate for transferring dump files.
• With AWE enabled, SQL Server will, by default, allocate almost all memory during startup, leaving 256 MB or less free. This memory is locked and cannot be paged out. Consuming all available memory may prevent other applications or SQL Server instances from starting.

Note PAE is not required to leverage AWE. However, if you have more than 4 GB of physical memory you will not be able to access it unless you enable PAE.
Caution It is highly recommended that you use the “max server memory” option in combination with “awe enabled” to ensure some memory headroom exists for other applications or instances of SQL Server, because AWE memory cannot be shared or paged.

For more information, see the following Knowledge Base articles:
Q268363 Intel Physical Addressing Extensions (PAE) in Windows 2000
Q241046 Cannot Create a dump File on Computers with over 4 GB RAM
Q255600 Windows 2000 utilities do not display physical memory above 4GB
Q274750 How to configure SQL Server memory more than 2 GB (Idea)
Q266251 Memory dump stalls when PAE option is enabled (Idea)

Tip The KB will return more hits if you query on PAE rather than AWE.

Virtual Address Space Mapping
By default Windows 2000 (on an X86 platform) uses a two-level (three-level when PAE is enabled) page table structure to translate virtual addresses to physical addresses. Each 32-bit address has three components, as shown below. When a process accesses a virtual address the system must first locate the Page Directory for the current process via register CR3 (X86). The first 10 bits of the virtual address act as an index into the Page Directory. The Page Directory Entry then points to the Page Frame Number (PFN) of the appropriate Page Table. The next 10 bits of the virtual address act as an index into the Page Table to locate the appropriate page. If the page is valid, the PTE contains the PFN of the actual page in memory. If the page is not valid, the memory management fault handler locates the page and attempts to make it valid. The final 12 bits act as a byte offset into the page.

Note This multi-step process is expensive. This is why systems have translation lookaside buffers (TLBs) to speed up the process. One of the reasons context switching is so expensive is that the translation buffers must be dumped. Thus, the first few lookups are very expensive. Refer to ISW2K pages 439-440.
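The 10/10/12-bit split described above (without PAE) can be illustrated with a small Python helper. It only decomposes an address into its three fields; the function and constant names are hypothetical, and no real page tables are consulted.

```python
PAGE_OFFSET_BITS = 12   # low 12 bits: byte offset within a 4 KB page
TABLE_INDEX_BITS = 10   # 10 bits each for the directory and table indexes

TABLE_INDEX_MASK = (1 << TABLE_INDEX_BITS) - 1   # 0x3FF
PAGE_OFFSET_MASK = (1 << PAGE_OFFSET_BITS) - 1   # 0xFFF


def split_virtual_address(virtual_address):
    """Split a 32-bit x86 (non-PAE) virtual address into
    (page_directory_index, page_table_index, byte_offset)."""
    if not 0 <= virtual_address < 2**32:
        raise ValueError("expected a 32-bit virtual address")
    directory_shift = PAGE_OFFSET_BITS + TABLE_INDEX_BITS  # 22
    page_directory_index = (virtual_address >> directory_shift) & TABLE_INDEX_MASK
    page_table_index = (virtual_address >> PAGE_OFFSET_BITS) & TABLE_INDEX_MASK
    byte_offset = virtual_address & PAGE_OFFSET_MASK
    return page_directory_index, page_table_index, byte_offset


# Address 0x00403004: directory entry 1, table entry 3, byte 4 of the page.
fields = split_virtual_address(0x00403004)
```

Each of the two indexes selects one of 1,024 entries, and the offset addresses one of 4,096 bytes, which together cover the full 4 GB address space.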
Core System Memory Related Counters
When evaluating memory performance you are looking at a wide variety of counters. The counters listed here are a few of the core counters that give you a quick overall view of the state of memory. The two key counters are Available Bytes and Committed Bytes. If Committed Bytes exceeds the amount of physical memory in the system, you can be assured that there is some level of hard page fault activity happening. The goal of a well-tuned system is to have as little hard paging as possible. If Available Bytes is below 5 MB, you should investigate why. If Available Bytes is below 4 MB, the Working Set Manager will start to aggressively trim the working sets of processes, including the system cache.
• Committed Bytes: Total memory, including physical and page file, currently committed
• Commit Limit
  • Physical memory + page file size
  • Represents the total amount of memory that can be committed without expanding the page file (assuming the page file is allowed to grow)
• Available Bytes: Total physical memory currently available

Note Available Bytes is a key indicator of the amount of memory pressure. Windows 2000 will attempt to keep this above approximately 4 MB by aggressively trimming the working sets, including the system cache. If this value is constantly between 3-4 MB, it is cause for investigation.

One counter you might expect would be for total physical memory. Unfortunately, there is no specific counter for total physical memory. There are, however, many other ways to determine total physical memory. One of the most common is by viewing the Performance tab of Task Manager.

Page File Usage
The only counters that show current page file space usage are Page File: % Usage and Page File: % Peak Usage. These two counters will give you an indication of the amount of space currently used in the page file.
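The rules of thumb in this section can be expressed as a short check over one counter sample. The sketch below is an assumption-laden illustration: the function names are hypothetical, the inputs are values you would read from System Monitor yourself, and the 4 MB and 5 MB thresholds are the ones quoted in the text.

```python
MB = 1024 * 1024


def commit_limit(physical_bytes, page_file_bytes):
    """Commit Limit as described above: physical memory plus page file size."""
    return physical_bytes + page_file_bytes


def assess_memory(physical_bytes, committed_bytes, available_bytes):
    """Return a list of findings for one sample of the core counters."""
    findings = []
    if committed_bytes > physical_bytes:
        findings.append("Committed Bytes exceeds physical memory: expect hard page faults")
    if available_bytes < 4 * MB:
        findings.append("Available Bytes below 4 MB: aggressive working set trimming")
    elif available_bytes < 5 * MB:
        findings.append("Available Bytes below 5 MB: investigate")
    return findings


# Hypothetical sample: 512 MB of RAM, 600 MB committed, 3 MB available.
sample_findings = assess_memory(512 * MB, 600 * MB, 3 * MB)
```

A single sample only flags a condition at one instant; the trend of these values over the length of a log matters more than any one reading.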
Memory Performance

Memory Counters
There are a number of counters that you need to investigate when evaluating memory performance. As stated previously, no single counter provides the entire picture. You will need to consider many different counters to begin to understand the true state of memory.

Note The counters listed are a subset of the counters you should capture.

*Available Bytes
In general, it is desirable to see Available Bytes above 5 MB. SQL Server's goal on Intel platforms, running Windows NT, is to assure there is approximately 5+ MB of free memory. After Available Bytes reaches 4 MB, the Working Set Manager will start to aggressively trim the working sets of processes and, finally, the system cache. This is not to say that working set trimming does not happen before 4 MB, but it does become more pronounced as the number of available bytes decreases below 4 MB.

Page Faults/sec
Page Faults/sec represents the total number of hard and soft page faults. This value includes the System Working Set as well. Keep this in mind when evaluating the amount of paging activity in the system. Because this counter includes paging associated with the System Cache, a server acting as a file server may have a much higher value than a dedicated SQL Server may have. The System Working Set is covered in depth on the next slide. Because Page Faults/sec includes soft faults, this counter is not as useful as Pages/sec, which represents hard page faults. Because of the associated I/O, hard page faults tend to be much more expensive.

*Pages/sec
Pages/sec represents the number of pages written to or read from disk because of hard page faults. It is the sum of Memory: Pages Input/sec and Memory: Pages Output/sec. Because it is counted in numbers of pages, it can be compared to other counts of pages, such as Memory: Page Faults/sec, without conversion. On a well-tuned system, this value should be consistently low.
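As a rough illustration of how Pages/sec relates to its two components and to disk traffic, the sketch below adds the input and output rates and converts pages to bytes using the 4 KB x86 page size. The function names are hypothetical, and the disk throughput value is assumed to be a PhysicalDisk: Disk Bytes/sec reading taken over the same interval.

```python
PAGE_SIZE_BYTES = 4096  # x86 page size; other platforms differ


def pages_per_sec(pages_input_per_sec, pages_output_per_sec):
    """Pages/sec is the sum of Pages Input/sec and Pages Output/sec."""
    return pages_input_per_sec + pages_output_per_sec


def paging_share_of_disk_io(pages_per_sec_value, disk_bytes_per_sec):
    """Approximate fraction of disk throughput attributable to paging."""
    if disk_bytes_per_sec <= 0:
        return 0.0  # no disk traffic recorded in this interval
    return pages_per_sec_value * PAGE_SIZE_BYTES / disk_bytes_per_sec


# Hypothetical sample: 30 pages in, 20 pages out, 409,600 disk bytes per second.
total_pages = pages_per_sec(30, 20)
share = paging_share_of_disk_io(total_pages, 409_600)
```

With these sample numbers, 50 pages per second is 204,800 bytes per second, or half of the observed disk throughput.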
In and of itself, a high value for this counter does not necessarily indicate a problem. You will need to isolate the paging activity to determine if it is associated with in-paging, out-paging, memory mapped file activity or system cache. Any one of these activities will contribute to this counter.

Note Paging in and of itself is not necessarily a bad thing. Paging is only “bad” when a critical process must wait for its pages to be in-paged, or when the amount of read/write paging is causing excessive kernel time or disk I/O, thus interfering with normal user mode processing.

Tip (Memory: Pages/sec * 4096) / (PhysicalDisk: Disk Bytes/sec) yields the approximate fraction of total disk I/O that is due to paging (multiply by 100 for a percentage). Note, this is only relevant on X86 platforms with a 4 KB page size.

Page Reads/sec (Hard Page Fault)
Page Reads/sec is the number of times the disk was accessed to resolve hard page faults. It includes reads to satisfy faults in the file system cache (usually requested by applications) and in non-cached memory mapped files. This counter counts numbers of read operations, without regard to the numbers of pages retrieved by each operation. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval.

Page Writes/sec (Hard Page Fault)
Page Writes/sec is the number of times pages were written to disk to free up space in physical memory. Pages are written to disk only if they are changed while in physical memory, so they are likely to hold data, not code. This counter counts write operations, without regard to the number of pages written in each operation. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval.

*Pages Input/sec (Hard Page Fault)
Pages Input/sec is the number of pages read from disk to resolve hard page faults.
It includes pages retrieved to satisfy faults in the file system cache and in non-cached memory mapped files. This counter counts numbers of pages, and can be compared to other counts of pages, such as Memory: Page Faults/sec, without conversion. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval. This is one of the key counters to monitor for potential performance complaints. Because a process must wait for a read page fault to be resolved, read page faults have a direct impact on the perceived performance of a process.

*Pages Output/sec (Hard Page Fault)
Pages Output/sec is the number of pages written to disk to free up space in physical memory. Pages are written back to disk only if they are changed in physical memory, so they are likely to hold data, not code. A high rate of pages output might indicate a memory shortage. Windows NT writes more pages back to disk to free up space when physical memory is in short supply. This counter counts numbers of pages, and can be compared to other counts of pages, without conversion. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval. Like Pages Input/sec, this is one of the key counters to monitor. Processes will generally not notice write page faults unless the disk I/O begins to interfere with normal data operations.

Demand Zero Faults/sec (Soft Page Fault)
Demand Zero Faults/sec is the number of page faults that require a zeroed page to satisfy the fault. Zeroed pages, pages emptied of previously stored data and filled with zeros, are a security feature of Windows NT. Windows NT maintains a list of zeroed pages to accelerate this process. This counter counts numbers of faults, without regard to the numbers of pages retrieved to satisfy the fault.
This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval.

Transition Faults/sec (Soft Page Fault)
Transition Faults/sec is the number of page faults resolved by recovering pages that were on the modified page list, on the standby list, or being written to disk at the time of the page fault. The pages were recovered without additional disk activity. Transition faults are counted in numbers of faults, without regard for the number of pages faulted in each operation. This counter displays the difference between the values observed in the last two samples, divided by the duration of the sample interval.

System Working Set
Like processes, the system page-able code and data are managed by a working set. For the purpose of this course, that working set is referred to as the System Working Set. This is done to differentiate the system cache portion of the working set from the entire working set. There are five different types of pages that make up the System Working Set. They are: system cache; paged pool; page-able code and data in ntoskrnl.exe; page-able code and data in device drivers; and system-mapped views. Unfortunately, some of the counters that appear to represent the system cache actually represent the entire System Working Set. Where noted, a counter labeled as system cache actually represents the entire System Working Set.

Note The counters listed are a subset of the counters you should capture.

*Memory: Cache Bytes (Represents Total System Working Set)
Represents the total size of the System Working Set including: system cache; paged pool; page-able code and data in ntoskrnl.exe; page-able code and data in device drivers; and system-mapped views. Cache Bytes is the sum of the following counters: System Cache Resident Bytes, System Driver Resident Bytes, System Code Resident Bytes, and Pool Paged Resident Bytes.
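Because Cache Bytes is a roll-up of four other counters, it can help to recompute it from its parts and see how much of it is really file system cache. A minimal sketch, assuming the four values were sampled at the same instant; the function names and the sample figures are hypothetical.

```python
MB = 1024 * 1024


def cache_bytes(system_cache_resident, system_driver_resident,
                system_code_resident, pool_paged_resident):
    """Recompute Memory: Cache Bytes as the sum of its four components."""
    return (system_cache_resident + system_driver_resident
            + system_code_resident + pool_paged_resident)


def system_cache_share(system_cache_resident, total_cache_bytes):
    """Fraction of the System Working Set that is file system cache."""
    if total_cache_bytes <= 0:
        return 0.0
    return system_cache_resident / total_cache_bytes


# Hypothetical sample values, in bytes.
total = cache_bytes(60 * MB, 4 * MB, 1 * MB, 15 * MB)
file_cache_fraction = system_cache_share(60 * MB, total)
```

In this made-up sample the System Working Set is 80 MB, of which three quarters is file system cache; the remainder is paged pool and page-able system and driver code.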
Memory: System Cache Resident Bytes (System Cache)
System Cache Resident Bytes is the number of bytes from the file system cache that are resident in physical memory. Windows 2000 Cache Manager works with the memory manager to provide virtual block stream and file data caching. For more information, see also Inside Windows 2000, Third Edition, pp. 645-650 and p. 656.

Memory: Pool Paged Resident Bytes
Represents the physical memory consumed by Paged Pool. This counter should NOT be monitored by itself. You must also monitor Memory: Pool Paged Bytes. A leak in the pool may not show up in Pool Paged Resident Bytes.

Memory: System Driver Resident Bytes
Represents the physical memory consumed by driver code and data. System Driver Resident Bytes and System Driver Total Bytes do not include code that must remain in physical memory and cannot be written to disk.

Memory: System Code Resident Bytes
Represents the physical memory consumed by page-able system code. System Code Resident Bytes and System Code Total Bytes do not include code that must remain in physical memory and cannot be written to disk.

Working Set Performance Counter
You can measure the number of page faults in the System Working Set by monitoring the Memory: Cache Faults/sec counter. Contrary to the “Explain” shown in System Monitor, this counter measures the total amount of page faults/sec in the System Working Set, not only the System Cache. You cannot measure the performance of the System Cache using this counter alone. For more information, see also Inside Windows 2000, Third Edition, p. 656.

Note You will find that in general the working set manager will usually trim the working sets of normal processes prior to trimming the system working set.

System Cache
The Windows 2000 cache manager provides a write-back cache with lazy writing and intelligent read-ahead. Files are not written to disk immediately but deferred until the cache manager calls the memory manager to flush the cache.
This helps to reduce the total number of I/Os. Once per second, the lazy writer thread queues one-eighth of the dirty pages in the system cache to be written to disk. If this is not sufficient to meet the needs, the lazy writer will calculate a larger value. If the dirty page threshold is exceeded prior to the lazy writer waking, the cache manager will wake the lazy writer.

Important It should be pointed out that mapped files, or files opened with FILE_FLAG_NO_BUFFERING, do not participate in the System Cache. For more information regarding mapped views, see also Inside Windows 2000, Third Edition, p. 669.

For those applications that would like to leverage system cache but cannot tolerate write delays, the cache manager supports write-through operations via the FILE_FLAG_WRITE_THROUGH flag. On the other hand, an application can disable lazy writing by using the FILE_ATTRIBUTE_TEMPORARY flag. If this flag is enabled, the lazy writer will not write the pages to disk unless there is a shortage of memory or the file is closed.

Important Microsoft SQL Server uses both FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH.

Tip The file system cache is not represented by a static amount of memory. The system cache can and will grow. It is not unusual to see the system cache consume a large amount of memory. Like other working sets, it is trimmed under pressure but is generally the last thing to be trimmed.

System Cache Performance Counters
The counters listed are a subset of the counters you should capture.

Cache: Data Flushes/sec
Data Flushes/sec is the rate at which the file system cache has flushed its contents to disk as the result of a request to flush or to satisfy a write-through file write request. More than one page can be transferred on each flush operation.

Cache: Data Flush Pages/sec
Data Flush Pages/sec is the number of pages the file system cache has flushed to disk as a result of a request to flush or to satisfy a write-through file write request.
Cache: Lazy Write Flushes/sec
Represents the rate at which the lazy writer flushes the system cache to disk. More than one page can be transferred on each flush.

Cache: Lazy Write Pages/sec
Lazy Write Pages/sec is the rate, in pages, at which the Lazy Writer thread has written to disk.

Note When looking at Memory: Cache Faults/sec, you can remove cache write activity by subtracting (Cache: Data Flush Pages/sec + Cache: Lazy Write Pages/sec). This will give you a better idea of how much other page faulting activity is associated with the other components of the System Working Set. However, you should note that there is no easy way to remove the page faults associated with file cache read activity.

For more information, see the following Knowledge Base articles:
Q145952 (NT4) Event ID 26 Appears If Large File Transfer Fails
Q163401 (NT4) How to Disable Network Redirector File Caching
Q181073 (SQL 6.5) DUMP May Cause Access Violation on Win2000

System Pool
As documented earlier, there are two types of shared pool memory: non-paged pool and paged pool. Like private memory, pool memory is susceptible to a leak.

Nonpaged Pool
Miscellaneous kernel code and structures, and drivers that need working memory while at or above DPC/dispatch level, use non-paged pool. The primary counter for non-paged pool is Memory: Pool Nonpaged Bytes. This counter will usually be between 3 and 30 MB.

Paged Pool
Drivers that do not need to access memory at or above DPC/dispatch level are one of the primary users of paged pool; however, any process can use paged pool by leveraging the ExAllocatePool calls. Paged pool also contains the Registry and file and printing structures. The primary counter for monitoring paged pool is Memory: Pool Paged Bytes. This counter will usually be between 10 and 30 MB plus the size of the Registry. To determine how much of paged pool is currently resident in physical memory, monitor Memory: Pool Paged Resident Bytes.
Note The paged and non-paged pools are two of the components of the System Working Set. If a suspected leak is clearly visible in the overview and not associated with a process, then it is most likely a pool leak. If the leak is not associated with SQL Server handles, OLE DB providers, XPROCS or SP_OA calls, then most likely this call should be pushed to the Windows NT group.

For more information, see the following Knowledge Base articles:
Q265028 (MS) Pool Tags
Q258793 (MS) How to Find Memory Leaks by Using Pool Bitmap Analysis
Q115280 (MS) Finding Windows NT Kernel Mode Memory Leaks
Q177415 (MS) How to Use Poolmon to Troubleshoot Kernel Mode Memory Leaks
Q126402 PagedPoolSize and NonPagedPoolSize Values in Windows NT
Q247904 How to Configure Paged Pool and System PTE Memory Areas

Tip To isolate pool leaks you will need to isolate all drivers and third-party processes. This should be done by disabling each service or driver one at a time and monitoring the effect. You can also monitor paged and non-paged pool through poolmon. If pool tagging has been enabled via GFLAGS, you may be able to associate the leak with a particular tag. If you suspect a particular tag, you should involve the platform support group.

Process Memory Counters

Process _Total Limitations
Although the rollups of _Total for Process: Private Bytes, Virtual Bytes, Handles and Threads represent the key resources being used across all processes, they can be misleading when evaluating a memory leak. This is because a leak in one process may be masked by a decrease in another process.

Note The counters listed are a subset of the counters you should capture.

Tip When analyzing memory leaks, it is often easier to build either a separate chart or a report showing only one or two key counters for all processes. The primary counter used for leak analysis is private bytes, but processes can leak handles and threads just as easily.
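The way a _Total rollup can hide a leak is easy to demonstrate with made-up numbers. In the sketch below, the process names, sample values, and helper functions are all hypothetical; the samples stand in for Process: Private Bytes readings exported from a log.

```python
MB = 1024 * 1024


def total_series(private_bytes_by_process):
    """Sum the per-process samples point by point, like the _Total instance."""
    return [sum(values) for values in zip(*private_bytes_by_process.values())]


def growing_processes(private_bytes_by_process, min_growth_bytes):
    """Names of processes whose last sample exceeds the first by the threshold."""
    suspects = []
    for name, samples in private_bytes_by_process.items():
        if samples[-1] - samples[0] >= min_growth_bytes:
            suspects.append(name)
    return sorted(suspects)


# Hypothetical log: one process leaks while another releases memory.
samples = {
    "leaky_service": [100 * MB, 120 * MB, 140 * MB, 160 * MB],
    "batch_job": [200 * MB, 180 * MB, 160 * MB, 140 * MB],
}
rollup = total_series(samples)
suspects = growing_processes(samples, 50 * MB)
```

Here the rollup stays flat at 300 MB for every sample even though one process grew by 60 MB, which is why charting each process separately is the safer starting point. Comparing only the first and last samples is a deliberate simplification; a real analysis would look at the whole trend.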
After a suspect process is located, build a separate chart that includes all the counters for that process.

Individual Process Counters
When analyzing individual processes for memory leaks you should include the counters listed.
• Process: % Processor Time
• Process: Working Set (includes shared pages)
• Process: Virtual Bytes
• Process: Private Bytes
• Process: Page Faults/sec
• Process: Handle Count
• Process: Thread Count
• Process: Pool Paged Bytes
• Process: Pool Nonpaged Bytes

Tip WINLOGON, SVCHOST, services, or SPOOLSV are referred to as HELPER processes. They provide core functionality for many operations and as such are often extended by the addition of third-party DLLs. Tlist -s may help identify what services are running under a particular helper.

Helper Processes
Winlogon, Services, Spoolsv and Svchost are examples of what are referred to as HELPER processes. They provide core functionality for many operations and, as such, are often extended by the addition of third-party DLLs. Running every service in its own process can waste system resources. Consequently, some services run in their own processes while others share a process with other services. One problem with sharing a process is that a bug in one service may cause the entire process to fail. The resource kit tool Tlist, when used with the -s qualifier, can help you identify what services are running in what processes.

WINLOGON
Used to support GINAs.

SPOOLSV
SPOOLSV is responsible for printing. You will need to investigate all added printing functionality.

Services
Services is responsible for system services.

Svchost.exe
Svchost.exe is a generic host process name for services that are run from dynamic-link libraries (DLLs). There can be multiple instances of Svchost.exe running at the same time. Each Svchost.exe session can contain a grouping of services, so that separate services can be run depending on how and where Svchost.exe is started.
This allows for better control and debugging.

The Effect of Memory on Other Components

Memory Drives Overall Performance
Processor, cache, bus speeds, I/O: all of these resources play a role in overall perceived performance. Without minimizing the impact of these components, it is important to point out that a shortage of memory can often have a larger perceived impact on performance than a shortage of some other resource. On the other hand, an abundance of memory can often be leveraged to mask bottlenecks. For instance, in certain environments, file system cache can significantly reduce the amount of disk I/O, potentially masking a slow I/O subsystem.

Effect on I/O
I/O can be driven by a number of memory considerations. Page read faults will cause a read I/O when a page is not in memory. If the modified page list becomes too long, the Modified Page Writer and Mapped Page Writer will need to start flushing pages, causing disk writes. However, the one event that can have the greatest impact is running low on available memory. In this case, all of the above events will become more pronounced and have a larger impact on disk activity.

Effect on CPU
The most effective use of a processor from a process perspective is to spend as much time as possible executing user mode code. Kernel mode represents processor time associated with doing work, directly or indirectly, on behalf of a thread. This includes items such as synchronization, scheduling, I/O, memory management, and so on. Although this work is essential, it takes processor cycles, and the cost, in cycles, to transition between user and kernel mode is high. Because all memory management and I/O functions must be done in kernel mode, it follows that the fewer the memory resources, the more cycles are going to be spent managing those resources. A direct result of low memory is that the Working Set Manager, Modified Page Writer and Mapped Page Writer will have to use more cycles attempting to free memory.
Analyzing Memory

Look for Trends and Trend Relationships
Troubleshooting performance is about analyzing trends and trend relationships. Establishing that some event happened is not enough. You must establish the effect of the event. For example, you note that paging activity is high at the same time that SQL Server becomes slow. These two individual facts may or may not be related. If the paging is not associated with SQL Server's working set, or with the disks SQL Server is using, there may be little or no cause/effect relationship.

Look at Physical Memory First
The first item to look at is physical memory. You need to know how much physical and page file space the system has to work with. You should then evaluate how much available memory there is. Just because the system has free memory does not mean that there is not any memory pressure. Available Bytes in combination with Pages Input/sec and Pages Output/sec can be a good indicator as to the amount of pressure. The goal in a perfect world is to have as little hard paging activity as possible with available memory greater than 5 MB. This is not to say that paging is bad. On the contrary, paging is a very effective way to manage a limited resource. Again, we are looking for trends that we can use to establish relationships. After evaluating physical memory, you should be able to answer the following questions:
• How much physical memory do I have?
• What is the commit limit?
• Of that physical memory, how much has the operating system committed?
• Is the operating system over committing physical memory?
• What was the peak commit charge?
• How much available physical memory is there?
• What is the trend associated with committed and available?

Review System Cache and Pool Contribution
After you understand the individual process memory usage, you need to evaluate the System Cache and Pool usage. These can, and often do, represent a significant portion of physical memory.
Be aware that System Cache can grow significantly on a file server. This is usually normal. One thing to consider is that the file system cache tends to be the last thing trimmed when memory becomes low. If you see abrupt decreases in System Cache Resident Bytes when Available Bytes is below 5 MB, you can be assured that the system is experiencing excessive memory pressure. Paged and non-paged pool size is also important to consider. An ever-increasing pool should be an indicator for further research. Non-paged pool growth is usually a driver issue, while paged pool could be driver-related or process-related. If paged pool is steadily growing, you should investigate each process to see if there is a specific process relationship. If not, you will have to use tools such as poolmon to investigate further.

Review Process Memory Usage
After you understand the physical memory limitations and cache and pool contribution, you need to determine what components or processes are creating the pressure on memory, if any. Be careful if you opt to chart the _Total Private Bytes rollup for all processes. This value can be misleading in that it includes shared pages and can therefore exceed the actual amount of memory being used by the processes. The _Total rollup can also mask processes that are leaking memory because other processes may be freeing memory, thus creating a balance between leaked and freed memory. Identify processes that expand their working set over time for further analysis. Also, review handles and threads because both use resources and potentially can be mismanaged. After evaluating the process resource usage, you should be able to answer the following:
• Are any of the processes increasing their private bytes over time?
• Are any processes growing their working set over time?
• Are any processes increasing the number of threads or handles over time?
• Are any processes increasing their use of pool over time?
• Is there a direct relationship between the above named resources and total committed memory or available memory?
• If there is a relationship, is this normal behavior for the process in question? For example, SQL Server does not commit ‘min memory’ on startup; these pages are faulted into the working set as needed. This is not necessarily an indication of a memory leak.
• If there is clearly a leak in the overview and it is not identifiable in the process counters, it is most likely in the pool.
• If the leak in pool is not associated with SQL Server handles, then more often than not, it is not a SQL Server issue. There is, however, the possibility that the leak could be associated with third-party XPROCS, SP_OA* calls or OLE DB providers.

Review Paging Activity and Its Impact on CPU and I/O
As stated earlier, paging is not in and of itself a bad thing. When starting a process the system faults in the pages of an executable as they are needed. This is preferable to loading the entire image at startup. The same can be said for memory mapped files and file system cache. All of these features leverage the ability of the system to fault in pages as needed. The greatest impact of paging on a process is when the process must wait for an in-page fault or when page file activity represents a significant portion of the disk activity on the disk the application is actively using. After evaluating page fault activity, you should be able to answer the following questions:
• What is the relationship between Page Faults/sec and Pages Input/sec + Pages Output/sec?
• What is the relationship, if any, between hard page faults and available memory?
• Does paging activity represent a significant portion of processor or I/O resource usage?

Don’t Prematurely Jump to Any Conclusions
Analyzing memory pressure takes time and patience. An individual counter in and of itself means little.
It is only when you start to explore relationships between cause and effect that you can begin to understand the impact of a particular counter. The key thoughts to remember are:
• With the exception of a swap (when the entire process's working set has been swapped out/in), hard page faults to resolve reads are the most expensive in terms of their effect on a process's perceived performance.
• In general, page writes associated with page faults do not directly affect a process's perceived performance, unless that process is waiting on a free page to be made available. Page file activity can become a problem if that activity competes for a significant percentage of the disk throughput in a heavily I/O-oriented environment. That assumes, of course, that the page file resides on the same disk the application is using.

Lab 3.1: Analyzing System Memory Using System Monitor

Exercise 1 – Troubleshooting the Cardinal1.log File
Students will evaluate an existing System Monitor log and determine if there is a problem and what the problem is. Students should be able to isolate the issue as a memory problem, locate the offending process, and determine whether or not this is a pool issue.

Exercise 2 – LeakyApp Behavior
Students will start LeakyApp and monitor memory, page file and cache counters to better understand the dynamics of these counters.

Exercise 3 – Process Swap Due To Minimizing of the Cmd Window
Students will start SQL Server from the command line while viewing SQL Server process performance counters. Students will then minimize the window and note the effect on the working set.

Overview

What You Will Learn
After completing this lab, you will be able to:
• Use some of the basic functions within System Monitor.
• Troubleshoot one or more common performance scenarios.
Before You Begin

Prerequisites
To complete this lab, you need the following:
• Windows 2000
• SQL Server 2000
• Lab Files Provided
• LeakyApp.exe (Resource Kit)

Estimated time to complete this lab: 45 minutes

Exercise 1: Troubleshooting the Cardinal1.log File

In this exercise, you will analyze a log file from an actual system that was having performance problems. Like an actual support engineer, you will not have much information from which to draw conclusions. The customer has sent you this log file and it is up to you to find the cause of the problem. However, unlike the real world, you have an instructor available to give you hints should you become stuck.

Goal
Review the Cardinal1.log file (this file is from Windows NT 4.0 Performance Monitor, which Windows 2000 can read). Chart the log file and begin to investigate the counters to determine what is causing the performance problems. Your goal should be to isolate the problem to a major area such as pool, virtual address space, etc., and begin to isolate the problem to a specific process or thread. This lab requires access to the log file Cardinal1.log located in C:\LABS\M3\LAB1\EX1

To analyze the log file
1. Using the Performance MMC, select the System Monitor snap-in, and click the View Log File Data button (icon looks like a disk).
2. Under Files of type, choose PERFMON Log Files (*.log)
3. Navigate to the folder containing the Cardinal1.log file and open it.
4. Begin examining counters to find what might be causing the performance problems.

When examining some of these counters, you may notice that some of them go off the top of the chart. It may be necessary to adjust the scale on these. This can be done by right-clicking the rightmost pane and selecting Properties. Select the Data tab. Select the counter that you wish to modify. Under the Scale option, change the scale value, which makes the counter data visible on the chart. You may need to experiment with different scale values before finding the ideal value.
Also, it may sometimes be beneficial to adjust the vertical scale for the entire chart. Selecting the Graph tab on the Properties page can do this. In the Vertical scale area, adjust the Maximum and Minimum values to best fit the data on the chart.

Lab 3.1, Exercise 1: Results

Exercise 2: LeakyApp Behavior

In this lab, you will have an opportunity to work with a partner to monitor a live system that is suffering from a simulated memory leak.

Goal
During this lab, your goal is to observe the system behavior when memory starts to become a limited resource. Specifically, you will want to monitor committed memory, available memory, the system working set including the file system cache, and each process's working set. At the end of the lab, you should be able to provide an answer to the listed questions.

To monitor a live system with a memory leak
1. Choose one of the two systems as a victim on which to run the leakyapp.exe program. It is recommended that you boot using the /MAXMEM=128 option so that this lab goes a little faster. You and your partner should decide which server will play the role of the problematic server and which server is to be used for monitoring purposes.
2. On the problematic server, start the leakyapp program.
3. On the monitoring system, create a counter log that captures all counters needed to troubleshoot a memory problem. This should include PhysicalDisk counters if you think paging is a problem. Because it is likely that you will only need to capture less than five minutes of activity, the suggested interval for capturing is five seconds.
4. After the counters have been started, start the leaky application program.
5. Click Start Leaking. The button will now change to Stop Leaking, which indicates that the system is now leaking memory.
6. After leakyapp shows the page file is 50 percent full, click Stop Leaking. Note that the process has not given back its memory yet. After approximately one minute, exit.
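When reviewing the captured log, "growing over time" can be made concrete with a small check. The following Python sketch is an illustration only and is not part of the lab materials: it assumes the samples of a counter such as Process: Private Bytes have already been exported to a list of numbers, and the function name and the 90 percent threshold are hypothetical choices.

```python
def looks_like_leak(samples, min_growth_fraction=0.9):
    """Return True if a counter rises in most consecutive sample pairs.

    samples: counter values in capture order (for example, Private Bytes
    taken every five seconds). The threshold is an arbitrary assumption.
    """
    if len(samples) < 2:
        return False
    # Count how many consecutive pairs show an increase.
    rising = sum(1 for a, b in zip(samples, samples[1:]) if b > a)
    mostly_rising = rising / (len(samples) - 1) >= min_growth_fraction
    # Also require a net increase over the whole capture.
    return mostly_rising and samples[-1] > samples[0]


leaky = [100, 110, 125, 140, 160, 185]   # steadily growing counter
steady = [100, 104, 99, 101, 103, 100]   # fluctuating counter
print(looks_like_leak(leaky))
print(looks_like_leak(steady))
```

A check like this only flags candidates; a process that legitimately faults in memory as it warms up will look the same over a short capture, so the result still has to be compared against committed and available memory.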
Lab 3.1, Exercise 2: Questions

After analyzing the counter logs you should be able to answer the following:
1. Under which system memory counter does the leak show up clearly?
   Memory: Committed Bytes
2. What process counter looked very similar to the overall system counter that showed the leak?
   Private Bytes
3. Is the leak in Paged Pool, Non-paged pool, or elsewhere?
   Elsewhere
4. At what point did Windows 2000 start to aggressively trim the working sets of all user processes?
   <5 MB Free
5. Was the System Working Set trimmed before or after the working sets of other processes?
   After
6. What counter showed this?
   Memory: Cache Bytes
7. At what point was the File System Cache trimmed?
   After the first pass through all other working sets
8. What was the effect on all the processes' working sets when the application quit leaking?
   None
9. What was the effect on all the working sets when the application exited?
   Nothing, initially; but all grew fairly quickly based on use
10. When the server was running low on memory, which was Windows spending more time doing, paging to disk or in-paging?
   Paging to disk, initially; however, as other applications began to run, in-paging increased

Exercise 3: Minimizing a Command Window

In this exercise, you will have an opportunity to observe the behavior of Windows 2000 when a command window is minimized.

Goal
During this lab, your goal is to observe the behavior of Windows 2000 when a command window becomes minimized. Specifically, you will want to monitor private bytes, virtual bytes, and working set of SQL Server when the command window is minimized. At the end of the lab, you should be able to provide an answer to the listed questions.

To monitor a command window's working set as the window is minimized
1. Using System Monitor, create a counter list that logs all necessary counters needed to troubleshoot a memory problem.
Because it is likely that you will only need to capture less than five minutes of activity, the suggested capturing interval is five seconds.
2. After the counters have been started, start a Command Prompt window on the target system.
3. In the command window, start SQL Server from the command line. Example: sqlservr.exe -c -sINSTANCE1
4. After SQL Server has successfully started, minimize the Command Prompt window.
5. Wait approximately two minutes, and then restore the window.
6. Wait approximately two minutes, and then stop the counter log.

Lab 3.1, Exercise 3: Questions

After analyzing the counter logs you should be able to answer the following questions:
1. What was the effect on SQL Server's private bytes, virtual bytes, and working set when the window was minimized?
   Private Bytes and Virtual Bytes remained the same, while Working Set went to 0
2. What was the effect on SQL Server's private bytes, virtual bytes, and working set when the window was restored?
   None; the Working Set did not grow until SQL Server accessed the pages and faulted them back in on an as-needed basis

SQL Server Memory Overview

Now that you have a better understanding of how Windows 2000 manages memory resources, you can take a closer look at how SQL Server 2000 manages its memory. During the course of the lecture and labs you will have the opportunity to monitor SQL Server's use of memory under varying conditions using both System Monitor counters and SQL Server tools.

SQL Server Memory Management Goals

Because SQL Server has in-depth knowledge about the relationships between data and the pages they reside on, it is in a better position to judge when and what pages should be brought into memory, how many pages should be brought in at a time, and how long they should be resident. SQL Server's primary goals for management of its memory are the following:
• Be able to dynamically adjust for varying amounts of available memory.
• Be able to respond to outside memory pressure from other applications.
• Be able to adjust memory dynamically for internal components.

Items Covered
• SQL Server Memory Definitions
• SQL Server Memory Layout
• SQL Server Memory Counters
• Memory Configuration Options
• Buffer Pool Performance and Counters
• Set Aside Memory and Counters
• General Troubleshooting Process
• Memory Myths and Tips

SQL Server Memory Definitions

Pool
A group of resources, objects, or logical components that can service a resource allocation request.

Cache
The management of a pool or resource, the primary goal of which is to increase performance.

Bpool
The Bpool (Buffer Pool) is a single static class instance. The Bpool is made up of 8-KB buffers and can be used to handle data pages or external memory requests. There are three basic types or categories of committed memory in the Bpool:
• Hashed Data Pages
• Committed Buffers on the Free List
• Buffers known by their owners (refer to the definition of Stolen)

Consumer
A consumer is a subsystem that uses the Bpool. A consumer can also be a provider to other consumers. There are five consumers and two advanced consumers who are responsible for the different categories of memory. The following list represents the consumers and a partial list of their categories:
• Connection: Responsible for PSS and ODS memory allocations
• General: Resource structures, parse headers, lock manager objects
• Utilities: Recovery, Log Manager
• Optimizer: Query Optimization
• Query Plan: Query Plan Storage

Advanced Consumer
Along with the five consumers, there are two advanced consumers. They are:
• Ccache: Procedure cache. Accepts plans from the Optimizer and Query Plan consumers. Is responsible for managing that memory and determines when to release the memory back to the Bpool.
• Log Cache: Managed by the LogMgr, which uses the Utility consumer to coordinate memory requests with the Bpool.
Reservation
Requesting the future use of a resource. A reservation is a reasonable guarantee that the resource will be available in the future.

Committed
Producing the physical resource.

Allocation
The act of providing the resource to a consumer.

Stolen
The act of getting a buffer from the Bpool is referred to as stealing a buffer. If the buffer is stolen and hashed for a data page, it is referred to as, and counted as, a Hashed buffer, not a Stolen buffer. Stolen buffers, on the other hand, are buffers used for things such as procedure cache and SRV_PROC structures.

Target
Target memory is the amount of memory SQL Server would like to maintain as committed memory. Target memory is based on the min and max server configuration values and current available memory as reported by the operating system. The actual target calculation is operating system specific.

Memory to Leave (Set Aside)
The virtual address space set aside to ensure there is sufficient address space for thread stacks, XPROCS, COM objects, etc.

Hashed Page
A page in pool that represents a database page.

SQL Server Memory Layout

Virtual Address Space
When SQL Server is started, the minimum of physical RAM or virtual address space supported by the OS is evaluated. There are many possible combinations of OS versions and memory configurations. For example, you could be running Microsoft Windows 2000 Advanced Server with 2 GB or possibly 4 GB of memory. To avoid page file use, the appropriate memory level is evaluated for each configuration.

Important
Utilities can inject a DLL into the process address space by using HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Windows\AppInit_DLLs. When the USER32.dll library is mapped into the process space, so, too, are the DLLs listed in the Registry key. To determine what DLLs are running in the SQL Server address space you can use tlist.exe. You can also use a tool such as Depends from Microsoft or HandleEx from http://ww.sysinternals.com.
Memory to Leave
As stated earlier, there are many possible configurations of physical memory and address space. It is possible for physical memory to be greater than virtual address space. To ensure that some virtual address space is always available for things such as thread stacks and external needs such as XPROCS, SQL Server reserves a small portion of virtual address space prior to determining the size of the buffer pool. This address space is referred to as Memory To Leave. Its size is based on the number of anticipated thread stacks and a default value for external needs referred to as cmbAddressSave. After reserving the buffer pool space, the Memory To Leave reservation is released.

Buffer Pool Space
During startup, SQL Server must determine the maximum size of the buffer pool so that the BUF, BUFHASH and COMMIT BITMAP structures that are used to manage the Bpool can be created. It is important to understand that SQL Server does not take 'max memory' or existing memory pressure into consideration. The reserved address space of the buffer pool remains static for the life of the SQL Server process. However, the committed space varies as necessary to provide dynamic scaling. Remember that only the committed memory affects the overall memory usage on the machine. This ensures that the max memory configuration setting can be dynamically changed with minimal changes needed to the Bpool. The reserved space does not need to be adjusted and is maximized for the current machine configuration. Only the committed buffers need to be limited to maintain a specified max server memory (MB) setting.

SQL Server Startup Pseudo Code
The following pseudo code represents the process SQL Server goes through on startup.

Warning
This example does not represent a completely accurate portrayal of the steps SQL Server takes when initializing the buffer pool. Several details have been left out or glossed over.
The intent of this example is to help you understand the general process, not the specific details.
• Determine the size of cmbAddressSave (-g)
• Determine Total Physical Memory
• Determine Available Physical Memory
• Determine Total Virtual Memory
• Calculate MemToLeave
  MemToLeave = max worker threads * (stacksize = 512 KB) + (cmbAddressSave = 256 MB)
• Reserve MemToLeave and set PAGE_NOACCESS
• Check for AWE, test to see if it makes sense to use it, and log the results
  • Min(Available Memory, Max Server Memory) > Virtual Memory
  • Supports Read Scatter
  • SQL Server not started with -f
  • AWE Enabled via sp_configure
  • Enterprise Edition
  • Lock Pages In Memory user right enabled
• Calculate Virtual Address Limit
  VA Limit = Min(Physical Memory, Virtual Memory - MemToLeave)
• Calculate the number of physical and virtual buffers that can be supported
  AWE Present:
    Physical Buffers = (RAM / (PAGESIZE + Physical Overhead))
    Virtual Buffers = (VA Limit / (PAGESIZE + Virtual Overhead))
  AWE Not Present:
    Physical Buffers = Virtual Buffers = VA Limit / (PAGESIZE + Physical Overhead + Virtual Overhead)
• Make sure we have the minimum number of buffers
  Physical Buffers = Max(Physical Buffers, MIN_BUFFERS)
• Allocate and commit the buffer management structures
• Reserve the address space required to support the Bpool buffers
• Release the MemToLeave

SQL Server Startup Pseudo Code Example
The following is an example based on the pseudo code presented above. This example is based on a machine with 384 MB of physical memory, not using AWE or /3GB.

Note
cmbAddressSave was changed between SQL Server 7.0 and SQL Server 2000. For SQL Server 7.0, cmbAddressSave was 128 MB.
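The arithmetic in the pseudo code can also be sketched in Python. This is a simplified illustration under the same caveats as the pseudo code, not SQL Server's actual startup logic: the function names are hypothetical, the AWE path is omitted, and the per-buffer overhead is left as a parameter because the text does not state its value.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def mem_to_leave(max_worker_threads, stack_size=512 * KB, cmb_address_save=256 * MB):
    """Address space set aside for thread stacks plus external needs (-g)."""
    return max_worker_threads * stack_size + cmb_address_save


def va_limit(physical_memory, virtual_memory, reserved):
    """VA Limit = Min(Physical Memory, Virtual Memory - MemToLeave)."""
    return min(physical_memory, virtual_memory - reserved)


def buffer_count(limit, page_size=8 * KB, overhead=0, min_buffers=1024):
    """Buffers supported without AWE; 'overhead' is an assumed per-buffer cost."""
    return max(limit // (page_size + overhead), min_buffers)


# Machine from the example: 384 MB RAM, 2 GB of virtual address space, 255 threads.
reserved = mem_to_leave(255)
limit = va_limit(384 * MB, 2 * GB, reserved)
print(reserved / MB)        # MemToLeave in MB
print(limit / MB)           # VA Limit in MB
print(buffer_count(limit))  # upper bound on buffers when overhead is ignored
```

With the overhead set to zero the buffer count is only an upper bound; any real per-buffer management overhead lowers it, which is why the figure in the worked example is somewhat smaller.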
• Determine the size of cmbAddressSave (no -g, so 256 MB)
• Determine Total Physical Memory (384 MB)
• Determine Available Physical Memory (384 MB)
• Determine Total Virtual Memory (2 GB)
• Calculate MemToLeave
  MemToLeave = max worker threads * (stacksize = 512 KB) + (cmbAddressSave = 256 MB)
  (255 * 0.5 MB + 256 MB = 383.5 MB, approximately 384 MB)
• Reserve MemToLeave and set PAGE_NOACCESS
• Check for AWE, test to see if it makes sense to use it, and log the results (AWE Not Enabled)
• Calculate Virtual Address Limit
  VA Limit = Min(Physical Memory, Virtual Memory - MemToLeave)
  384 MB = Min(384 MB, 2 GB - 384 MB)
• Calculate the number of physical and virtual buffers that can be supported
  AWE Not Present: 48664 (approx) = 384 MB / (8 KB + Overhead)
• Make sure we have the minimum number of buffers
  Physical Buffers = Max(Physical Buffers, MIN_BUFFERS)
  48664 = Max(48664, 1024)
• Allocate and commit the buffer management structures
• Reserve the address space required to support the Bpool buffers
• Release the MemToLeave

Tip
Trace Flag 1604 can be used to view memory allocations on startup. The cmbAddressSave can be adjusted using the -g XXX startup parameter.

SQL Server Memory Counters

The two primary tools for monitoring and analyzing SQL Server memory usage are System Monitor and DBCC MEMORYSTATUS. For detailed information on DBCC MEMORYSTATUS, refer to Q271624, Interpreting the Output of the DBCC MEMORYSTATUS Command.

Important
These are the SQL Server 2000 counters. The counters presented are not the same as the counters for SQL Server 7.0. The SQL Server 7.0 counters are listed in the appendix.

Determining Memory Usage for OS and BPOOL
Memory Manager: Total Server Memory (KB) - Represents all of SQL Server's usage
Buffer Manager: Total Pages - Represents total Bpool usage
To determine how much of Total Server Memory (KB) represents MemToLeave space, subtract Buffer Manager: Total Pages (converted from 8-KB pages to KB). The result can be verified against DBCC MEMORYSTATUS, specifically Dynamic Memory Manager: OS In Use.
It should, however, be noted that this value only represents requests that went through the Bpool. Memory reserved outside of the Bpool by components such as COM objects will not show up here, although it will count against SQL Server's private byte count.

Buffer Counts: Target (Buffer Manager: Target Pages)
The size the buffer pool would like to be. If this value is larger than committed, the buffer pool is growing.

Buffer Counts: Committed (Buffer Manager: Total Pages)
The total number of buffers committed in the OS. This is the current size of the buffer pool.

Buffer Counts: Min Free
This is the number of pages that the buffer pool tries to keep on the free list. If the free list falls below this value, the buffer pool will attempt to populate it by discarding old pages from the data or procedure cache.

Buffer Distribution: Free (Buffer Manager / Buffer Partition: Free Pages)
This value represents the buffers currently not in use. These are available for data or may be requested by other components and mar
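Contents:
1. Bootloader
2. Kernel boot entry
3. Core data structure initialization (kernel boot, part one)
4. Peripheral initialization (kernel boot, part two)
5. The init process and inittab boot directives
6. rc startup scripts
7. getty and login
8. bash
Appendix: XDM login

This article uses Redhat 6.0 with Linux 2.2.19 for Alpha/AXP as its platform and describes the whole Linux startup process from power-on to login. It applies equally to the i386 platform.

1. Bootloader

On the Alpha/AXP platform there are usually two ways to boot Linux: through MILO or a similar boot program, or directly from the firmware. MILO is similar in function to LILO on i386, but it has built-in basic disk drivers (such as IDE and SCSI) and drivers for common file systems (such as ext2 and iso9660). The firmware comes in two forms, ARC and SRM. ARC has a BIOS-like interface and even supports multi-boot settings, while SRM has a powerful command-line interface in which the user can boot the system from the console with commands such as boot. ARC has the concept of partitions, so it can access the first sector of a partition; SRM can only pass control to the first sector of the disk. Both kinds of firmware can boot Linux by booting MILO, or can boot the Linux boot code directly.

"arch/alpha/boot" contains the files used to build the Linux bootloader. "head.S" provides the call entry points for OSF PAL/1; once compiled it is placed in the boot sector (the first sector of the partition for ARC, or sector 0 of the disk for SRM). After gaining control it initializes some data structures and then passes control to start_kernel() in "main.c". start_kernel() prints some messages to the console, calls pal_init() to initialize the PAL code, calls openboot() to open the boot device (by reading the firmware environment), calls load() to load the kernel code to START_ADDR (see "include/asm-alpha/system.h"), loads the kernel boot parameters from the firmware into ZERO_PAGE(0), and finally calls runkernel() to pass control to the kernel at 0x100000. The bootloader part ends here.

"arch/alpha/boot/bootp.c" is based on "main.c" and can replace it, together with "head.S", to produce a bootloader for network booting over the BOOTP protocol.

All of the "srm_" functions used in the bootloader are defined in "arch/alpha/lib/".

The boot method described above is the simplest one: the kernel can be booted without any other tools, provided that you follow the Makefile to generate the bootimage file, which contains the bootloader described above plus vmlinux, and then write bootimage to the disk starting at the boot sector.

When a boot program such as MILO is used to boot Linux, the bootloader described above is not needed; only vmlinux or vmlinux.gz is required. The boot program decompresses and loads the kernel to 0x1000 (small kernel) or 0x100000 (big kernel) and enters the kernel boot code directly, which is section 2 of this article.

On the i386 platform

i386 systems generally have a BIOS that does the initial boot work: it loads the first sector of the first bootable partition among the four primary partition table entries to real-mode address 0x7c00 and then passes control to it.

In the "arch/i386/boot" directory, bootsect.S is the assembly source that produces the boot sector. It first copies itself to 0x90000, then copies the setup part that immediately follows it (the second sector) to 0x90200, and copies the actual kernel code to 0x100000. All of these copies assume that bootsect.S, setup.S and vmlinux are stored contiguously on disk. In other words, the bzImage or zImage file is organized in the order bootsect, setup, vmlinux, and is stored in contiguous disk sectors starting at the first sector of the boot partition.

After loading is complete, bootsect.S jumps directly to 0x90200, which is the entry point of setup.S.

The main job of setup.S is to copy the system parameters (memory, disk and so on, as returned by the BIOS) into memory at 0x90000-0x901FF. This is where bootsect.S was placed, so it is now overwritten by the system parameters. These parameters are later read by protected-mode code.

In addition, setup.S includes the code in video.S, which detects and sets up the display and the display mode. Finally, setup.S switches the system to protected mode and jumps to the kernel boot code at 0x100000 (0x100000 for a big kernel in bzImage format, 0x1000 for zImage). The bootloader phase ends here.

On 2.4.x kernels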
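There are no significant changes.

2. Kernel boot entry

Under the control of the linker script arch/alpha/vmlinux.lds, the linker places the entry point of vmlinux at __start in "arch/alpha/kernel/head.S", so when the bootloader jumps to 0x100000, the code at __start begins to execute. The code at __start is very simple: it sets up a few global variables and then jumps to start_kernel. start_kernel() is an asmlinkage function in "init/main.c"; at this point the boot process moves into architecture-independent, generic C code.

On the i386 platform

On the i386 architecture, because of peculiarities of i386 itself, more setup is needed in "arch/i386/kernel/head.S", but in the end it also transfers to the architecture-independent start_kernel() function through call SYMBOL_NAME(start_kernel).

The difference is that on i386, when the kernel is compressed as a bzImage, that is, in big-kernel mode (__BIG_KERNEL__), bootsect.S and setup.S must be preprocessed first: $(CPP) is used to generate bbootsect.S and bsetup.S for big-kernel mode, which are then compiled into the corresponding .o files. The build tool generated from "arch/i386/boot/compressed/build.c" then combines the actual kernel (uncompressed, including the head.S code in kernel) with head.S and misc.c under "arch/i386/boot/compressed". This head.S takes the place of "arch/i386/kernel/head.S" and is the code the bootloader executes (at the startup_32 entry). It then calls the decompress_kernel() function defined in misc.c, which uses gunzip() defined in "lib/inflate.c" to decompress the kernel to 0x100000, and then transfers to the startup_32 code in "arch/i386/kernel/head.S".

On 2.4.x kernels

No change.

3. Core data structure initialization (kernel boot, part one)

start_kernel() calls a series of initialization functions to complete the setup of the kernel itself. Some of these actions are common to all configurations, while others are executed only when configured.

In start_kernel(), the following steps are performed:
• Print the Linux version information (printk(linux_banner))
• Set up the architecture-specific environment (setup_arch())
• Initialize the page table structures (paging_init())
• Set up the system trap entries using the entry points in "arch/alpha/kernel/entry.S" (trap_init())
• Initialize the system IRQs using the alpha_mv structure and the entry.S entry points (init_IRQ())
• Initialize the kernel process scheduler (including several default bottom halves, sched_init())
• Initialize time and timers (including reading the CMOS clock, estimating the CPU frequency, and initializing the timer interrupt, time_init())
• Extract and parse the kernel boot parameters (read the parameters from the environment variables and set the corresponding flags for later processing, parse_options())
• Initialize the console (done before PCI initialization so that messages can be printed, console_init())
• Initialize the profiler data structures (the prof_buffer and prof_len variables)
• Initialize the kernel cache (the cache that describes cache information, kmem_cache_init())
• Calibrate the delay loop (obtain the delay relationship between clock jiffies and CPU ticks, calibrate_delay())
• Initialize memory (set the upper and lower memory bounds and the initial values of page table entries, mem_init())
• Create and set up the internal and general caches ("slab_cache", kmem_cache_sizes_init())
• Create the uid taskcount SLAB cache ("uid_cache", uidcache_init())
• Create the file cache ("files_cache", filescache_init())
• Create the directory cache ("dentry_cache", dcache_init())
• Create the caches related to virtual memory ("vm_area_struct", "mm_struct", vma_init())
• Initialize the block device read/write buffers (also creating the "buffer_head" cache to speed up access, buffer_init())
• Create the page cache (initialize the memory page hash table, page_cache_init())
• Create the signal queue cache ("signal_queue", signals_init())
• Initialize the in-memory inode table (inode_init())
• Create the in-memory file descriptor table ("filp_cache", file_table_init())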
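• Check for architecture bugs (on alpha this function is empty, check_bugs())
• Initialize the remaining CPUs on SMP machines, that is, all except the current boot CPU (for kernels without SMP configured this function is empty, smp_init())
• Start the init process (create the first kernel thread, which calls the init() function, while the original execution sequence calls cpu_idle() and waits to be scheduled, init())

At this point start_kernel() ends, and the basic kernel environment has been established.

On the i386 platform

The kernel startup process on i386 is basically the same; the main differences are in the implementation.

On 2.4.x kernels

There are considerable changes in 2.4.x, but the basic process is unchanged. What changed is the concrete implementation of individual data structures, such as the caches.

4. Peripheral initialization (kernel boot, part two)

The init() function runs as a kernel thread. It first locks the kernel (effective only on SMP machines) and then calls do_basic_setup() to load and initialize the peripherals and their drivers. The process is as follows:
• Bus initialization (for example pci_init())
• Network initialization (initialize the network data structures, in three parts: sk_init(), skb_init() and proto_init(); proto_init() calls the initialization routine of every protocol contained in the protocols structure, sock_init())
• Create the bdflush kernel thread (bdflush() stays resident in kernel space and is woken by the kernel to clean memory buffers that have been written to; after bdflush() is started by kernel_thread(), it names itself kflushd)
• Create the kupdate kernel thread (kupdate() stays resident in kernel space and is scheduled periodically by the kernel to write the information in memory buffers to disk, including the superblock and the inode table)
• Set up and start the kernel paging thread kswapd (to keep kswapd from printing its version information in the middle of other messages at startup, the kernel first calls kswapd_setup() to set up the environment kswapd requires, and then creates the kswapd kernel thread)
• Create the event management kernel thread (start_context_thread() starts context_thread() and renames it keventd)
• Device initialization (including the parallel port parport_init(), character devices chr_dev_init(), block devices blk_dev_init(), SCSI devices scsi_dev_init(), network devices net_dev_init(), disk initialization and partition checks, and so on, device_setup())
• Set up executable file formats (binfmt_setup())
• Run every function marked with __initcall (a convenient way for kernel developers to add startup functions, do_initcalls())
• File system initialization (filesystem_setup())
• Mount the root file system (mount_root())

At this point do_basic_setup() returns to init(). After freeing the boot-time memory segment (free_initmem()) and unlocking the kernel, init() opens the /dev/console device, redirects stdin, stdout and stderr to the console, and finally searches the file system for the init program (or the program specified by the init= command-line parameter) and loads and executes it with the execve() system call.

The init() function ends here, and so does the kernel part of the boot. The first thread created by start_kernel() has now become a process running in user mode. At this point there are six running entities in the system:
• The execution context of start_kernel() itself. This is in effect a "hand-made" thread; after creating the init() thread it enters the cpu_idle() loop, and it does not appear in the process (thread) list
• The init thread, created by start_kernel(), currently in user mode with the init program loaded
• The kflushd kernel thread, created by the init thread, running the bdflush() function in kernel mode
• The kupdate kernel thread, created by the init thread, running the kupdate() function in kernel mode
• The kswapd kernel thread, created by the init thread, running the kswapd() function in kernel mode
• The keventd kernel thread, created by the init thread, running the context_thread() function in kernel mode

On the i386 platform

Basically the same.

On 2.4.x kernels

This part of the startup process has been simplified considerably in 2.4.x kernels. The only standalone initialization steps left by default are the network (sock_init()) and the creation of the event management kernel thread; all other required initialization is wrapped with the __initcall() macro and started from the do_initcalls() function.

5. The init process and inittab boot directives

The init process is the starting point of all processes in the system. After the kernel finishes its internal boot, it loads the init program within the address space of this thread (process); its process ID is 1.

The init program reads the /etc/inittab file as the guide for its behavior. inittab is a line-oriented, descriptive (non-executable) text file, and every directive line has the following format: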
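id:runlevel:action:process

Here id is the entry identifier, runlevel is the run level, action is the action code, and process is the program to execute.

id is generally required to be at most 4 characters. For getty or other login program entries, id must match the tty number, otherwise the getty program will not work properly.

runlevel identifies the run level that init is in, generally 0-6 plus S or s. Run levels 0, 1 and 6 are reserved by the system: 0 is used for shutdown, 1 for restarting into single-user mode, and 6 for reboot. S and s mean the same thing, single-user mode, which requires no inittab file and therefore does not appear in inittab. In fact, when entering single-user mode, init runs /sbin/sulogin directly on the console (/dev/console).

Typical system implementations use levels 2, 3, 4 and 5. On Redhat systems, 2 means multi-user mode without NFS support, 3 means full multi-user mode (also the most commonly used level), 4 is reserved for user definition, and 5 means XDM graphical login. Levels 7-9 can also be used; traditional Unix systems do not define them. runlevel can be several values written side by side in order to match multiple run levels. For most actions, the entry is executed only when runlevel matches the current run level.

initdefault is a special action value that marks the default startup level. When init is activated by the kernel, it reads the initdefault entry in inittab, takes its runlevel, and uses it as the current run level. If there is no inittab file, or it contains no initdefault entry, init asks for a runlevel on the console.

The sysinit, boot and bootwait actions run unconditionally at system startup and ignore their runlevel field; the remaining actions (except initdefault) are all associated with some runlevel. The definition of each action is described in detail in the inittab man page.

On a Redhat system, inittab normally contains the following entries:

id:3:initdefault:
# the default run level is 3, full multi-user mode
si::sysinit:/etc/rc.d/rc.sysinit
# run the /etc/rc.d/rc.sysinit script automatically at startup
l3:3:wait:/etc/rc.d/rc 3
# at run level 3, run the /etc/rc.d/rc script with 3 as its argument; init waits for it to return
0:12345:respawn:/sbin/mingetty tty0
# at each of levels 1-5, run /sbin/mingetty with tty0 as its argument to open the tty0 terminal
# for user login; if the process exits, run mingetty again
x:5:respawn:/usr/bin/X11/xdm -nodaemon
# at level 5, run xdm to provide the xdm graphical login screen, and rerun it when it exits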
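The id:runlevel:action:process format described above can be illustrated with a short Python sketch. It is only an illustration of the field layout and of the matching rule stated in the text, not a reimplementation of init; the names InittabEntry, parse_inittab_line and runs_at are hypothetical.

```python
from typing import NamedTuple, Optional


class InittabEntry(NamedTuple):
    id: str         # entry identifier
    runlevels: str  # e.g. "12345"; may be empty
    action: str     # e.g. "respawn", "wait", "sysinit"
    process: str    # command line to execute; may be empty


def parse_inittab_line(line: str) -> Optional[InittabEntry]:
    """Parse one inittab line; return None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # Split on the first three colons only, so the process field may contain ':'.
    parts = line.split(":", 3)
    if len(parts) != 4:
        raise ValueError("expected id:runlevel:action:process, got %r" % line)
    return InittabEntry(*parts)


def runs_at(entry: InittabEntry, runlevel: str) -> bool:
    """Apply the matching rule described in the text."""
    if entry.action in ("sysinit", "boot", "bootwait"):
        return True   # run at startup regardless of the runlevel field
    if entry.action == "initdefault":
        return False  # only names the default level, nothing is executed
    return runlevel in entry.runlevels


getty = parse_inittab_line("0:12345:respawn:/sbin/mingetty tty0")
xdm = parse_inittab_line("x:5:respawn:/usr/bin/X11/xdm -nodaemon")
print(getty.action, runs_at(getty, "3"), runs_at(xdm, "3"))
```

Splitting with a maximum of three separators is a deliberate choice here: only the first three colons delimit fields, so a command line that itself contains a colon stays intact.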
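6. rc startup scripts

The previous section mentioned that the init process starts the rc scripts; this section describes what the rc scripts actually do.

The rc startup scripts are normally located in the /etc/rc.d directory. The most common actions in rc.sysinit are activating swap partitions, checking disks, and loading hardware modules; these must run first regardless of the run level. Only after rc.sysinit has finished does init execute the other boot or bootwait actions.

If there are no other boot or bootwait actions, then at run level 3 /etc/rc.d/rc is executed with the command-line argument 3, which means running all the files in the /etc/rc.d/rc3.d/ directory. The files in rc3.d are all symbolic links to shell scripts in the /etc/rc.d/init.d/ directory, and these scripts generally accept arguments such as start, stop, restart and status. The rc script starts every script whose name begins with S using the start argument. Before that, if the corresponding script also has a link beginning with K and is already running (indicated by a file under /var/lock/subsys/), the K script is run first with stop as its argument to stop the service that was already started, and the service is then run again. Clearly, the direct purpose of this is that whenever init changes the run level, all related services are restarted, even if the level is the same.

When the rc program finishes, the system environment is set up, and the next step is for a user to log in.

7. getty and login

After rc returns, init regains control and starts mingetty (see section 5). mingetty is a simplified getty that cannot handle serial lines. The functions of getty generally include:
• Open the terminal line and set its mode
• Print the login screen and prompt, and accept the user name
• Load the login program with that user name as its argument

The default login prompt is stored in the /etc/issue file, but at each boot it is usually regenerated by the rc.local script according to the system environment.

Note: the prompt used for remote logins is in /etc/issue.net.

The login program runs in the same process space as getty and takes the user name passed by getty as the login user name.

If the user name is not root and the /etc/nologin file exists, login prints the contents of the nologin file and exits. This is typically used to keep non-root users from logging in during system maintenance.

root is allowed to log in only on terminals registered in /etc/securetty; if this file does not exist, root can log in on any terminal. The /etc/usertty file imposes additional access restrictions on users; if this file does not exist, there are no other restrictions.

Once the login has passed these checks, login searches the /etc/passwd file (and /etc/shadow when necessary) to match the password, set the home directory and load the shell. If no home directory is specified, it defaults to the root directory; if no shell is specified, it defaults to /bin/sh. Before passing control to the shell, login prints the information about the last login recorded in /var/log/lastlog and then checks whether the user has new mail (/usr/spool/mail/{username}). After setting the shell's uid and gid and environment variables such as TERM and PATH, the process loads the shell, and login's job is done.

8. bash

After a user logs in at run level 3, a shell specified for that user is started. The rest of this walkthrough uses /bin/bash as the example.

bash is the GNU extension of the Bourne Shell; besides inheriting all the features of sh, it adds many features and functions. The bash started by login is started as a login shell. It inherits the TERM, PATH and other environment variables set by getty; PATH is "/bin:/usr/bin:/usr/local/bin" for ordinary users and "/sbin:/bin:/usr/sbin:/usr/bin" for root. As a login shell, it first looks for the /etc/profile script and executes it. Then, if ~/.bash_profile exists, it executes that; otherwise it executes ~/.bash_login, and if that file does not exist either, it executes ~/.profile. ~/.bashrc, the interactive-shell startup file, is then executed as well if it exists (a login shell does not read it automatically; on Redhat it is sourced from ~/.bash_profile). On many systems ~/.bashrc in turn runs /etc/bashrc as the system-wide configuration file.

When the command-line prompt appears, the whole startup process is finished. The system is now running the kernel, several kernel threads, the init process, a set of daemons activated by the rc startup scripts (such as inetd), and one bash acting as the user's command interpreter.

Appendix: XDM login

If the default run level is set to 5, then in addition to the 1-6 gettys listening on the text terminals, an XDM graphical login window is also started. The login process is much the same as in text mode, and a user name and password are also required. The XDM configuration file defaults to /usr/X11R6/lib/X11/xdm/xdm-config, which specifies /usr/X11R6/lib/X11/xdm/xsession as the XDM session description script. After a successful login, XDM executes this script to run a session manager, such as gnome-session.

Besides XDM, different desktop environments (such as KDE and GNOME) each provide a replacement for XDM, such as gdm and kdm; these programs offer much the same functionality as XDM.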
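As a recap of the bash section (section 8), the startup-file selection for a login shell can be sketched in Python. This is only a model of the selection order described there, assuming the set of existing files is already known; the function name is hypothetical and no files are actually read or executed.

```python
def login_startup_files(existing):
    """Return the startup files a bash login shell would read, in order.

    existing: set of paths that exist, written as in the text
    (for example "/etc/profile", "~/.bash_profile").
    """
    order = []
    if "/etc/profile" in existing:
        order.append("/etc/profile")
    # Only the first of these three that exists is used.
    for candidate in ("~/.bash_profile", "~/.bash_login", "~/.profile"):
        if candidate in existing:
            order.append(candidate)
            break
    return order


print(login_startup_files({"/etc/profile", "~/.bash_login", "~/.profile"}))
```

~/.bashrc is intentionally left out of the model, because whether it runs depends on whether one of the files above sources it.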
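In the Linux kernel, the Page Cache and the Buffer Cache are both mechanisms for caching file system data, but they differ in what they cache and how they cache it.

The Page Cache caches file data in units of whole pages (usually 4 KB). When an application requests access to a file, the kernel first looks for the corresponding page in the Page Cache. If the page is found, it is returned to the application directly, which avoids the cost of a disk access. If it is not found, the kernel reads the corresponding data blocks from disk and caches them in the Page Cache. The advantage of the Page Cache is that it speeds up file access and improves system performance. However, it also consumes memory: when a large amount of file data is accessed, the Page Cache may occupy a large amount of memory.

The Buffer Cache caches block device data in units of blocks (usually 512 bytes or 4 KB). When an application requests access to a block device, the kernel first looks for the corresponding block in the Buffer Cache. If the block is found, it is returned to the application directly, which avoids the cost of accessing the block device. If it is not found, the kernel reads the corresponding block from the block device and caches it in the Buffer Cache. The advantage of the Buffer Cache is that it speeds up block device access and improves system performance. However, it also consumes memory: when a large number of blocks are accessed, the Buffer Cache may occupy a large amount of memory.

In short, the Page Cache and the Buffer Cache both speed up access to the file system or to block devices, and they differ in the object cached and in how the cache is organized. The Page Cache caches whole pages of file data and is organized around files; the Buffer Cache caches blocks of block device data and is organized around block devices.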
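On a running Linux system, the sizes of the two caches are reported in /proc/meminfo as the Buffers and Cached fields. The following Python sketch shows one way to read them; it is a minimal illustration, the function name is hypothetical, and the sample text contains made-up numbers so that the code can run on any platform.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


# Made-up sample; on Linux use: text = open("/proc/meminfo").read()
sample = """MemTotal:       16384000 kB
Buffers:          204800 kB
Cached:          4096000 kB
"""
info = parse_meminfo(sample)
print("Buffers:", info["Buffers"], "kB")
print("Cached: ", info["Cached"], "kB")
```

Comparing these two fields before and after a large file read is a simple way to watch the Page Cache grow.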
