Systems Performance (性能之巅): Preface

张某人的胡思乱想 (Zhang's Random Musings)

Article link: https://blog.csdn.net/hb_zxl/article/details/122532650

Preface
"There are known knowns; there are things we know we know.
We also know there are known unknowns; that is to say we know there are some things we do not know.
But there are also unknown unknowns -- there are things we do not know we don't know."
-- U.S. Secretary of Defense Donald Rumsfeld, February 12, 2002
While the previous statement was met with chuckles from those
attending the press briefing, it summarizes an important principle that
is as relevant in complex technical systems as it is in geopolitics:
performance issues can originate from anywhere, including areas of
the system that you know nothing about and you are therefore not
checking (the unknown unknowns). This book may reveal many of
these areas, while providing methodologies and tools for their analysis.

About This Edition
I wrote the first edition eight years ago and designed it to have a
long shelf life. Chapters are structured to first cover durable skills
(models, architecture, and methodologies) and then faster-changing
skills (tools and tuning) as example implementations. While the example
tools and tuning will go out of date, the durable skills show you
how to stay updated.
(Blogger's note: principles versus techniques. The principles stay constant, while the techniques keep changing.)

There has been a large addition to Linux in the past eight years:
Extended BPF, a kernel technology that powers a new generation of
performance analysis tools, which is used by companies including
Netflix and Facebook. I have included a BPF chapter and BPF tools
in this new edition, and I have also published a deeper reference on
the topic. The Linux perf and Ftrace tools have also seen
many developments, and I have added separate chapters for them
as well. The Linux kernel has gained many performance features and
technologies, also covered. The hypervisors that drive cloud computing
virtual machines, and container technologies, have also changed
considerably; that content has been updated.

The first edition covered both Linux and Solaris equally. Solaris
market share has shrunk considerably in the meantime,
so the Solaris content has been largely removed from this edition,
making room for more Linux content to be included. However,
your understanding of an operating system or kernel can be enhanced
by considering an alternative, for perspective. For that reason,
some mentions of Solaris and other operating systems are included
in this edition.

For the past six years I have been a senior performance engineer
at Netflix, applying the field of systems performance to the Netflix
microservices environment. I've worked on the performance of hypervisors,
containers, runtimes, kernels, databases, and applications. I've
developed new methodologies and tools as needed, and worked with
experts in cloud performance and Linux kernel engineering. These
experiences have contributed to improving this edition.

About This Book
Welcome to Systems Performance: Enterprise and the Cloud, 2nd
Edition! This book is about the performance of operating systems
and of applications from the operating system context, and it is written
for both enterprise server and cloud computing environments.
Much of the material in this book can also aid your analysis of client
devices and desktop operating systems. My aim is to help you get
the most out of your systems, whatever they are.

When working with application software that is under constant development,
you may be tempted to think of operating system performance --
where the kernel has been developed and tuned for
decades -- as a solved problem. It isn't! The operating system is a
complex body of software, managing a variety of ever-changing
physical devices with new and different application workloads. The
kernels are also in constant development, with features being added
to improve the performance of particular workloads, and newly encountered
bottlenecks being removed as systems continue to scale.
Kernel changes such as the mitigations for the Meltdown vulnerability
that were introduced in 2018 can also hurt performance. Analyzing
and working to improve the performance of the operating system is
an ongoing task that should lead to continual performance improvements.
Application performance can also be analyzed from the operating
system context to find more clues that might be missed using
application-specific tools alone; I'll cover that here as well.

Operating System Coverage
The main focus of this book is the study of systems performance,
using Linux-based operating systems on Intel processors as the primary
example. The content is structured to help you study other kernels
and processors as well.

Unless otherwise noted, the specific Linux distribution is not important
in the examples used. The examples are mostly from the Ubuntu
distribution and, when necessary, notes are included to explain differences
for other distributions. The examples are also taken from a variety
of system types: bare metal and virtualized, production and test,
servers and client devices.

Across my career I've worked with a variety of different operating
systems and kernels, and this has deepened my understanding of
their design. To deepen your understanding as well, this book includes
some mentions of Unix, BSD, Solaris, and Windows.

Other Content
Example screenshots from performance tools are included, not just
for the data shown, but also to illustrate the types of data available.
The tools often present the data in intuitive and self-explanatory
ways, many in the familiar style of earlier Unix tools. This means that
screenshots can be a powerful way to convey the purpose of these
tools, some requiring little additional description. (If a tool does require
laborious explanation, that may be a failure of design!)

Where it provides useful insight to deepen your understanding, I
touch upon the history of certain technologies. It is also useful to
learn a bit about the key people in this industry: you're likely to come
across them or their work in performance and other contexts. A
"who's who" list has been provided in Appendix E.

A handful of topics in this book were also covered in my prior book,
BPF Performance Tools: in particular, BPF, BCC, bpftrace,
tracepoints, kprobes, uprobes, and various BPF-based tools.
You can refer to that book for more information. The summaries of
these topics in this book are often based on that earlier book, and
sometimes use the same text and examples.

What Isn't Covered
This book focuses on performance. To undertake all the example
tasks given will require, at times, some system administration activities,
including the installation or compilation of software (which is not
covered here).

The content also summarizes operating system internals, which
are covered in more detail in separate dedicated texts. Advanced
performance analysis topics are summarized so that you are aware
of their existence and can study them as needed from additional
sources. See the Supplemental Material section at the end of this
Preface.

How This Book is Structured
Chapter 1, Introduction, is an introduction to systems performance
analysis, summarizing key concepts and providing examples
of performance activities.
Chapter 2, Methodologies, provides the background for performance
analysis and tuning, including terminology, concepts, models,
methodologies for observation and experimentation, capacity planning,
analysis, and statistics.
Chapter 3, Operating Systems, summarizes kernel internals for
the performance analyst. This is necessary background for interpreting
and understanding what the operating system is doing.
Chapter 4, Observability Tools, introduces the types of system
observability tools available, and the interfaces and frameworks upon
which they are built.
Chapter 5, Applications, discusses application performance topics
and observing them from the operating system.
Chapter 6, CPUs, covers processors, cores, hardware threads,
CPU caches, CPU interconnects, device interconnects, and kernel
scheduling.
Chapter 7, Memory, is about virtual memory, paging, swapping,
memory architectures, buses, address spaces, and allocators.
Chapter 8, File Systems, is about file system I/O performance, including
the different caches involved.
Chapter 9, Disks, covers storage devices, disk I/O workloads,
storage controllers, RAID, and the kernel I/O subsystem.
Chapter 10, Network, is about network protocols, sockets, interfaces,
and physical connections.
Chapter 11, Cloud Computing, introduces operating system- and
hardware-based virtualization methods in common use for cloud
computing, along with their performance overhead, isolation, and observability
characteristics. This chapter covers hypervisors and containers.
Chapter 12, Benchmarking, shows how to benchmark accurately,
and how to interpret others' benchmark results. This is a surprisingly
tricky topic, and this chapter shows how you can avoid common mistakes
and try to make sense of it.
Chapter 13, perf, summarizes the standard Linux profiler, perf(1),
and its many capabilities. This is a reference to support perf(1)'s use
throughout the book.
Chapter 14, Ftrace, summarizes the standard Linux tracer, Ftrace,
which is especially suited for exploring kernel code execution.
Chapter 15, BPF, summarizes the standard BPF front ends: BCC
and bpftrace.
Chapter 16, Case Study, contains a systems performance case
study from Netflix, showing how a production performance puzzle
was analyzed from beginning to end.

Chapters 1 to 4 provide essential background. After reading them,
you can reference the remainder of the book as needed, in particular
Chapters 5 to 12, which cover specific targets for analysis. Chapters
13 to 15 cover advanced profiling and tracing, and are optional reading
for those who wish to learn one or more tracers in more detail.
Chapter 16 uses a storytelling approach to paint a bigger picture of
a performance engineer's work. If you're new to performance analysis,
you might want to read this first as an example of performance analysis
using a variety of different tools, and then return to it when
you've read the other chapters.

As a Future Reference
This book has been written to provide value for many years, by focusing
on background and methodologies for the systems performance
analyst.
To support this, many chapters have been separated into two
parts. The first part consists of terms, concepts, and methodologies
(often with those headings), which should stay relevant many years
from now. The second provides examples of how the first part is implemented:
architecture, analysis tools, and tunables, which, while
they will become out-of-date, will still be useful as examples.

Tracing Examples
We frequently need to explore the operating system in depth,
which can be done using tracing tools.
Since the first edition of this book, extended BPF has been developed
and merged into the Linux kernel, powering a new generation
of tracing tools that use the BCC and bpftrace front ends. This book
focuses on BCC and bpftrace, and also the Linux kernel's built-in
Ftrace tracer. BPF, BCC, and bpftrace are covered in more depth in
my prior book.

Linux perf is also included in this book and is another tool that can
do tracing. However, perf is usually included in chapters for its sampling
and PMC analysis capabilities, rather than for tracing.
You may need or wish to use different tracing tools, which is fine.
The tracing tools in this book are used to show the questions that
you can ask of the system. It is often these questions, and the
methodologies that pose them, that are the most difficult to know.

Intended Audience
The intended audience for this book is primarily systems administrators
and operators of enterprise and cloud computing environments.
It is also a reference for developers, database administrators,
and web server administrators who need to understand operating
system and application performance.

As a performance engineer at a company with a large compute environment
(Netflix), I frequently work with SREs (site reliability engineers)
and developers who are under enormous time pressure to
solve multiple simultaneous performance issues. I have also been on
the Netflix CORE SRE on-call rotation and have experienced this
pressure firsthand. For many people, performance is not their primary
job, and they need to know just enough to solve the current issues.
Knowing that your time may be limited has encouraged me to
keep this book as short as possible, and structure it to facilitate jumping
ahead to specific chapters.