Reading notes: The Practice of Programming

The Practice of Programming

    Simplicity, clarity and generality form the bedrock of good software.

Chp 1 Style

    The purpose of style is to make the code easy to read for yourself and others.

1.1 Names

    A name should be informative, concise, memorable, and pronounceable if possible.

    Much information comes from context and scope; the broader the scope of a variable, the more information should be conveyed by its name.

    Use descriptive names for globals, short names for locals.

    Programmers are often encouraged to use long variable names regardless of context. That is a mistake: clarity is often achieved through brevity.

    The longer the program, the more important is the choice of good, descriptive, systematic names

    Function names should be based on active verbs, perhaps followed by nouns

    Functions that return a boolean value should be named so that the return value is unambiguous.
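
    A minimal sketch of these naming rules in C, not taken from the book. The names is_octal and count_octal_digits are hypothetical; they only illustrate a boolean function whose name says what a true result means, a verb-based function name, and short names for short-lived locals.

```c
#include <stddef.h>

/* Boolean function: "is_octal" makes the true case unambiguous,
   unlike a name such as check_octal. */
int is_octal(int c)
{
    return c >= '0' && c <= '7';
}

/* Active verb plus noun. The locals i and n stay short because
   their scope is only a few lines. */
size_t count_octal_digits(const char *s)
{
    size_t i, n = 0;

    for (i = 0; s[i] != '\0'; i++)
        if (is_octal(s[i]))
            n++;
    return n;
}
```

    A call such as count_octal_digits("0789x1") reads naturally at the call site, and is_octal(c) can be used directly in an if without a comment.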

1.2 Expressions and Statements

    Write expressions as you might speak them aloud. Conditional expressions that include negations are always hard to understand.
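
    As an illustration (a sketch with hypothetical names, not the book's own code), the two functions below compute the same result. The second states the test positively, the way one would say it aloud: "id is below lo, or at least hi".

```c
/* Hard to read: two negations the reader has to untangle. */
int out_of_range_negated(int id, int lo, int hi)
{
    return !(id >= lo) || !(id < hi);
}

/* Same test with the negations removed. */
int out_of_range(int id, int lo, int hi)
{
    return id < lo || id >= hi;
}
```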

1.3 Consistency and Idioms

    Specific style is much less important than its consistent application. Pick one style, preferably ours, use it consistently, and don't waste time arguing.

    The program's consistency is more important than your own, because it makes life easier for those who follow.

    A central part of learning any language is developing a familiarity with its idioms.

1.5 Magic Numbers

    As a guideline, any number other than 0 or 1 is likely to be magic and should have a name of its own
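
    A small sketch of the guideline, assuming a hypothetical 24 by 80 text screen. The constants MAXROW and MAXCOL, the NELEMS macro, and clear_screen are illustrative names: the numbers get names once, and array sizes are computed by the compiler rather than repeated as literals.

```c
#include <stddef.h>

/* Named constants instead of bare 24 and 80. An enum constant is a
   real name to the compiler, unlike a preprocessor macro. */
enum {
    MAXROW = 24,   /* hypothetical screen height */
    MAXCOL = 80    /* hypothetical screen width  */
};

/* Let the compiler compute array sizes instead of repeating a literal. */
#define NELEMS(array) (sizeof(array) / sizeof((array)[0]))

static char screen[MAXROW][MAXCOL];

/* Fill the whole screen buffer with one character.
   Returns the number of cells written. */
size_t clear_screen(char fill)
{
    size_t r, c, n = 0;

    for (r = 0; r < NELEMS(screen); r++)
        for (c = 0; c < NELEMS(screen[0]); c++) {
            screen[r][c] = fill;
            n++;
        }
    return n;
}
```

    If the screen size changes, only the enum needs editing; the loops follow automatically.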


1.6 Comments

    Comments are meant to help the reader of a program. They do not help by saying things the code already plainly says, or by contradicting the code, or by distracting the reader with elaborate typographical displays.

    Comments shouldn't report self-evident information.

    Global variables have a tendency to crop up intermittently throughout a program; a comment serves as a reminder to be referred to as needed.

    When you change code, make sure the comments are still accurate.

    Good code needs fewer comments than bad code.

1.7 Why Bother?

    The key observation is that good style should be a matter of habit.

    If you think about style as you write code originally, and if you take the time to revise and improve it, you will develop good habits.

Chp 2 Algorithms and Data Structures

    Even within an intricate program like a compiler or a web browser, most of the data structures are arrays, lists, trees, and hash tables.

Chp 3 Design and Implementation

    As the quotation from Brooks's classic book suggests, the design of the data structures is the central decision in the creation of a program. Once the data structures are laid out, the algorithms tend to fall into place, and the coding is comparatively easy.

    The design of a program is rooted in the layout of its data. The data structures don't define every detail, but they do shape the overall solution.

    This point of view is oversimplified but not misleading.

    As a rule, try to handle irregularities and exceptions and special cases in data. Code is harder to get right, so the control flow should be as simple and regular as possible.
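
    One way to read this rule is as table-driven code. The sketch below is an illustration, not an example from the book: the struct escape, the escapes table, and escape_for are hypothetical names. Each special character is a row of data, and the control flow is a single plain loop.

```c
#include <stddef.h>

/* Each special case is a row of data, not another branch of code. */
struct escape {
    char        ch;
    const char *text;
};

static const struct escape escapes[] = {
    { '<', "&lt;"   },
    { '>', "&gt;"   },
    { '&', "&amp;"  },
    { '"', "&quot;" },
};

/* Return the replacement text for c, or NULL if c needs no escaping. */
const char *escape_for(char c)
{
    size_t i;

    for (i = 0; i < sizeof(escapes) / sizeof(escapes[0]); i++)
        if (escapes[i].ch == c)
            return escapes[i].text;
    return NULL;
}
```

    Supporting another character means adding one row to the table; the loop does not change.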

    The great strengths of C are that it gives the programmer complete control over implementation, and programs written in it tend to be fast. The cost, however, is that the C programmer must do more of the work, allocating and reclaiming memory, creating hash tables and linked lists, and the like. C is a razor-sharp tool, with which one can create an elegant and efficient program or a bloody mess.

    Less clear, however, is how to assess the loss of control and insight when the pile of system-supplied code gets so big that one no longer knows what's going on underneath. This is the case with the STL version; its performance is unpredictable and there is no easy way to address that.

Chp 4 Interfaces

    It's not usually until you've built and used a version of the program that you understand the issues well enough to get the design right.

    As a principle, library routines should not just die when an error occurs; error status should be returned to the caller for appropriate action.
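
    A minimal sketch of a routine that reports failure instead of dying. The function parse_port and its 0 / -1 return convention are assumptions made for illustration; only strtol and errno are standard C.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical library routine: parse a TCP port number.
   Returns 0 and stores the port in *port on success, -1 on bad input.
   It never prints a message or exits; the caller decides what to do. */
int parse_port(const char *s, int *port)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')      /* no digits, or trailing junk */
        return -1;
    if (errno == ERANGE || v < 1 || v > 65535)
        return -1;
    *port = (int)v;
    return 0;
}
```

    A command-line tool might print a message and exit when this returns -1, while a server might log the problem and keep its old setting. The routine itself does not need to know which.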

    Expansion of size and complexity is a typical result of moving from prototype to production.

4.5 Interface Principles

    Hide implementation details

    Choose a small orthogonal set of primitives.

    Having lots of functions may make the library easier to use: whatever one needs is there for the taking. But a large interface is harder to write and maintain, and sheer size may make it hard to learn and use as well.

    In the interest of convenience, some interfaces provide multiple ways of doing the same thing, a tendency that should be resisted.

    Narrow interfaces are to be preferred to wide ones, at least until one has strong evidence that more functions are needed.

    Don't reach behind the user's back

    Do the same thing the same way everywhere.

    The basic strxxx functions in the C library are easy to use without documentation because they all behave about the same: data flows from right to left, the same direction as in an assignment statement, and they all return the resulting string
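
    A short sketch of that consistency, using the standard strcpy and strcat. The wrapper join2 is a hypothetical name, and it assumes the caller supplies a destination large enough for both strings, since neither library function checks.

```c
#include <string.h>

/* Builds a followed by b in dst and returns dst.
   Assumes dst has room for both strings plus the terminating '\0'. */
char *join2(char *dst, const char *a, const char *b)
{
    /* Data flows right to left, like an assignment. Both calls
       return their destination, so they can be nested. */
    return strcat(strcpy(dst, a), b);
}
```

    With char buf[16], the call join2(buf, "foo", "bar") leaves "foobar" in buf and returns buf itself.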

4.6 Resource Management

    Free a resource in the same layer that allocated it.
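
    A minimal sketch of that pairing, with a hypothetical Buffer type. The module that calls malloc in buffer_new is the same one that calls free in buffer_free, so callers never need to know how the memory was obtained.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type owned by this module. */
typedef struct {
    char  *data;
    size_t len;
} Buffer;

/* Allocate a Buffer holding a copy of s; returns NULL if out of memory. */
Buffer *buffer_new(const char *s)
{
    Buffer *b = malloc(sizeof *b);

    if (b == NULL)
        return NULL;
    b->len = strlen(s);
    b->data = malloc(b->len + 1);
    if (b->data == NULL) {
        free(b);                 /* undo the partial allocation */
        return NULL;
    }
    memcpy(b->data, s, b->len + 1);
    return b;
}

/* Release everything buffer_new allocated; NULL is accepted. */
void buffer_free(Buffer *b)
{
    if (b != NULL) {
        free(b->data);
        free(b);
    }
}
```

    Callers pair every buffer_new with one buffer_free and never call free on the struct or its data directly.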

    To avoid problems, it is necessary to write code that is reentrant

    Detect errors at a low level, handle them at a high level.

    In most cases, the caller should determine how to handle an error, not the callee.

    Use exceptions only for exceptional situations.

Chp 5 Debugging

5.1 Debuggers

    As a personal choice, we tend not to use debuggers beyond getting a stack trace or the value of a variable or two.

    Debuggers can be arcane and difficult programs, and especially for beginners may provide more confusion than help.

5.2 Good Clues, Easy Bugs

    Debugging involves backwards reasoning, like solving murder mysteries.

    Look for familiar patterns  

    Examine the most recent change

    Debug it now, not later

    Get a stack trace:
    The source line number of the failure, often part of a stack trace, is the most useful single piece of debugging information.

    Read before typing:
    Resist the urge to start typing; thinking is a worthwhile alternative.

    Explain your code to someone else

5.3 No Clues, Hard Bugs

    Make the bug reproducible

    Divide and conquer

    Study the numerology of failures

    Display output to localize your search.

    Write self-checking code

    Write a logfile

    Draw a picture

    Use tools

    Keep records.

5.4 Last Resorts

    These "mental model" bugs are among the hardest to find; the mechanical aid of a debugger is invaluable.

    A debugger is a help, since it forces you to go in a different direction, to follow what the program is doing, not what you think it is doing

5.5 Non-reproducible Bugs

    The very fact that the behavior is nondeterministic is itself information, however; it means that the error is not likely to be a flaw in your algorithm but that in some way your code is using information that changes each time the program runs.

5.8 Summary

    Once a bug has been seen, the first thing to do is to think hard about the clues it presents.

    If there aren't good clues, hard thinking is still the best first step, to be followed by systematic attempts to narrow down the location of the problem.

Chp 6 Testing

    Edsger Dijkstra made the famous observation that testing can demonstrate the presence of bugs, but not their absence.

    One way to write bug-free code is to generate it by a program. If some programming task is understood so well that writing the code seems mechanical, then it should be mechanized.

6.1 Test as You Write the Code

    Test code at its boundaries

    The idea is that most bugs occur at boundaries. If a piece of code is going to fail, it will likely fail at a boundary. Conversely, if it works at its boundaries, it's likely to work elsewhere too.

    Test pre- and post-conditions

    Use assertions

    Assertions are particularly helpful for validating properties of interfaces because they draw attention to inconsistencies between caller and callee and may even indicate who's at fault.

    Program defensively

    Check error returns
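
    The points in this section can be combined in one small function. This is a sketch under assumptions, not the book's code: avg handles the boundary case of an empty array defensively, and an assertion documents the pre-condition that a non-empty request comes with a real array.

```c
#include <assert.h>
#include <stddef.h>

/* Average of the first n elements of a.
   Defensive: the boundary input n == 0 gets a defined answer
   instead of a division by zero. */
double avg(const double a[], size_t n)
{
    size_t i;
    double sum = 0.0;

    if (n == 0)
        return 0.0;
    assert(a != NULL);      /* pre-condition: the caller passes a real array */
    for (i = 0; i < n; i++)
        sum += a[i];
    return sum / n;
}
```

    The natural boundary tests are the empty array, a single element, and two elements; if those work, longer arrays are likely to work too.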

6.2 Systematic Testing

    Test incrementally

    Test simple parts first

    Know what output to expect

    Compare independent implementations.

    Measure test coverage

    Complete coverage is often quite difficult to achieve
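
    A sketch of comparing independent implementations, with hypothetical names. An obvious linear search serves as the reference for a binary search, and count_mismatches runs both over every array length up to n, including the empty array, and over keys below, between, inside, and above the stored values.

```c
/* Obvious but slow reference implementation. */
int linear_search(const int a[], int n, int key)
{
    int i;

    for (i = 0; i < n; i++)
        if (a[i] == key)
            return i;
    return -1;
}

/* Implementation under test. a[] must be sorted ascending with no
   duplicates for the two functions to agree on the index. */
int binary_search(const int a[], int n, int key)
{
    int lo = 0, hi = n - 1;

    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;

        if (a[mid] < key)
            lo = mid + 1;
        else if (a[mid] > key)
            hi = mid - 1;
        else
            return mid;
    }
    return -1;
}

/* Count disagreements between the two searches for every array
   length from 0 to n and every key from -1 to 2*n. */
int count_mismatches(int n)
{
    enum { MAXN = 64 };
    int a[MAXN];
    int len, key, i, bad = 0;

    if (n > MAXN)
        n = MAXN;
    for (i = 0; i < n; i++)
        a[i] = 2 * i;           /* sorted even numbers: 0, 2, 4, ... */
    for (len = 0; len <= n; len++)
        for (key = -1; key <= 2 * n; key++)
            if (linear_search(a, len, key) != binary_search(a, len, key))
                bad++;
    return bad;
}
```

    A result of zero from count_mismatches does not prove the binary search correct, but any nonzero result means at least one of the two is wrong.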

6.3 Test Automation

    Automate regression testing

    The most basic form of automation is regression testing, which performs a sequence of tests that compare the new version of something with the previous version.

    It's easy to overlook the possibility that the fix broke something else.

    Create self-contained tests.

    What should you do when you discover an error? If it was not found by an existing test, create a new test that does uncover the problem and verify the test by running it with the broken version of the code.

    Keep a record of bugs, changes, and fixes; it will help you identify old problems and fix new ones

6.5 Stress Tests

    Higher volume of machine-generated input in itself tends to break things because very large inputs cause overflow of input buffers, arrays, and counters, and are effective at finding unchecked fixed-size storage within a program.

    Some testing is based on explicitly malicious inputs.

    Any routine that might receive values from outside the program, directly or indirectly, should validate its input values before using them.

6.6 Tips for Testing

    Test on multiple machines, compilers, and operating systems. Each combination potentially reveals errors that won't be seen on others

6.7 Who Does the Testing?

    It is important to test your own code: don't assume that some testing organization or user will find things for you.

    The reason for testing is to find bugs, not to declare the program working.

    It's hard to test interactive programs, especially if they involve mouse input.

    Interactive programs should be controllable from scripts that simulate user behaviors so they can be tested by programs

6.9 Summary

    The single most important rule of testing is to do it.

Chp 7 Performance

    The first principle of optimization is don't.

7.1 A Bottleneck

    When solving problems, it's important to ask the right question.

7.2 Timing and Profiling

    Knuth's guideline is right: a small part of the program consumes most of the run-time

    When a single function is so overwhelmingly the bottleneck, there are only two ways to go: improve the function to use a better algorithm, or eliminate the function altogether by rewriting the surrounding program.

7.3 Strategies for Speed

    Use a better algorithm or data structure.

    Enable compiler optimizations

    One thing to be aware of is that the more aggressively the compiler optimizes, the more likely it is to introduce bugs into the compiled program. After enabling the optimizer, re-run your regression test suite, as you should for any other modification.

    It is typical of tuning: some things help, some things don't, and one must measure to find out which.

    Don't optimize what doesn't matter

    Optimizing public services like the spam filter or a library is almost always worthwhile; speeding up test programs is almost never worthwhile.

7.4 Tuning the Code

    Bear in mind that good compilers will do some of these for you, and in fact you may impede their efforts by complicating the program.

    Collect common subexpressions

    Replace expensive operations by cheap ones

    Unroll or eliminate loops.

    Cache frequently-used values.

    Write a special-purpose allocator.

    Buffer input and output

    When a C program calls printf, for example, the characters are stored in a buffer but not passed to the operating system until the buffer is full or flushed explicitly. The operating system itself may in turn delay writing the data to disk.

    Handle special cases separately

    Precompute results, trading space for time

    Use approximate values

    Rewrite in a lower-level language
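
    As one illustration of precomputing results, the sketch below counts the 1 bits in the low 32 bits of a value with four table lookups instead of a bit-by-bit loop. The names bits_in_byte, init_table, and count_bits are hypothetical, the lazy initialization is not thread-safe, and whether the table is actually faster on a given machine has to be measured.

```c
static unsigned char bits_in_byte[256];
static int table_ready = 0;

/* Fill the table once. Each entry reuses the already computed entry
   for the same value shifted right by one bit. */
static void init_table(void)
{
    int i;

    bits_in_byte[0] = 0;
    for (i = 1; i < 256; i++)
        bits_in_byte[i] = (unsigned char)(bits_in_byte[i >> 1] + (i & 1));
    table_ready = 1;
}

/* Count the 1 bits in the low 32 bits of x: space (256 bytes)
   traded for time (four lookups instead of a 32-step loop). */
int count_bits(unsigned long x)
{
    if (!table_ready)
        init_table();
    return bits_in_byte[x & 0xFF]
         + bits_in_byte[(x >> 8) & 0xFF]
         + bits_in_byte[(x >> 16) & 0xFF]
         + bits_in_byte[(x >> 24) & 0xFF];
}
```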

7.5 Space Efficiency

    In general, it is best to store information as text wherever feasible rather than in some binary representation. Text is portable, easy to read, and amenable to processing by all kinds of tools; binary representations have none of these advantages.

7.7 Summary

    By the way, it's extremely difficult to do good benchmarking, and it is not unknown for companies to tune their products to show up well on benchmarks, so it is wise to take all benchmark results with a grain of salt.

Chp 8 Portability

8.1 Language

    Binaries don't port well, but source code does.

    Program in the mainstream.

    It's hard to know just where the mainstream is, but it's easy to recognize constructions that are well outside it.

    By definition, all side effects and function calls must be completed at each semicolon, or when a function is called.
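
    The practical consequence is that an expression which both modifies and reads the same variable between those points has no defined order. A small sketch (fill_with_index is a hypothetical name) that keeps the problematic form in a comment and shows a portable rewrite with one side effect per statement:

```c
/* Undefined in C: i is modified and read with no sequence point in
   between, so different compilers may produce different results.

       a[i++] = i;

   Portable form: each statement has a single side effect on i or a[]. */
void fill_with_index(int a[], int n)
{
    int i = 0;

    while (i < n) {
        a[i] = i;
        i++;
    }
}
```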

    Bitfields are so machine-dependent that no one should use them.

8.2 Headers and Libraries

    Use standard libraries

8.3 Program Organization

    There are two major approaches to portability, which we will call union and intersection. The approach we recommend is intersection: use only those features that exist in all target systems.

    Union code is by design unportable.

    Avoid conditional compilation.

    Conditional compilation with #ifdef and similar preprocessor directives is hard to manage, because information tends to get sprinkled throughout the source.

    Mixing compile-time control flow (determined by #ifdef statements) with run-time control flow is much worse, since it is very difficult to read.

    The nastiest problem with conditional compilation is one we haven't mentioned: it is almost impossible to test

8.4 Isolation

    Localize system dependencies in separate files.

    When different code is needed for different systems, the differences should be localized in separate files, one file for each system.

    Hide system dependencies behind interfaces

8.5 Data Exchange

    Textual data moves readily from one system to another and is the simplest portable way to exchange arbitrary information between systems.

    Good example: mail sent over SMTP uses MIME encoding to transfer binary data as text in mail messages.

8.6 Byte Order

    Still, the best solution is often to convert information to text format, which (except for the CRLF problem) is completely portable.
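
    Where a binary format cannot be avoided, a common alternative is to fix the byte order of the exchanged data and build it with shifts and masks, so the bytes written do not depend on the host. A minimal sketch, assuming 32-bit values stored most significant byte first; put_u32_be and get_u32_be are hypothetical names.

```c
/* Store the low 32 bits of x in buf, most significant byte first,
   regardless of the host's own byte order. */
void put_u32_be(unsigned char buf[4], unsigned long x)
{
    buf[0] = (unsigned char)((x >> 24) & 0xFF);
    buf[1] = (unsigned char)((x >> 16) & 0xFF);
    buf[2] = (unsigned char)((x >> 8) & 0xFF);
    buf[3] = (unsigned char)(x & 0xFF);
}

/* Rebuild the value from four bytes written by put_u32_be. */
unsigned long get_u32_be(const unsigned char buf[4])
{
    return ((unsigned long)buf[0] << 24)
         | ((unsigned long)buf[1] << 16)
         | ((unsigned long)buf[2] << 8)
         |  (unsigned long)buf[3];
}
```

    Because the code works on values rather than on memory layout, it needs no #ifdef for big-endian or little-endian machines.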

8.7 Portability and Upgrade

    Change the name if you change the specification (behavior).

8.8 Internationalization

    Unicode documents are usually translated into a byte-stream encoding called UTF-8 before being sent between programs or over a network.
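
    The encoding itself is small enough to sketch. The function below is an illustrative encoder, not a library API: utf8_encode is a hypothetical name, and it assumes the caller provides a 4-byte output buffer. It rejects surrogates and values above U+10FFFF.

```c
/* Encode Unicode code point cp as UTF-8 into out.
   Returns the number of bytes written (1 to 4), or 0 if cp is a
   surrogate or lies above U+10FFFF. */
int utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp < 0x80) {                    /* 7 bits: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                   /* 11 bits: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                 /* 16 bits: 1110xxxx + 2 more */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                   /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {               /* 21 bits: 11110xxx + 3 more */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

    ASCII text passes through unchanged as single bytes, which is a large part of why UTF-8 works well as a byte stream between existing programs.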

Chp 9 Notation

    "Perhaps of all the creations of man language is the most astonishing"

    The right language can make all the difference in how easy it is to write a program. This is why a practicing programmer's arsenal holds not only general purpose languages like C and its relatives, but also programmable shells, scripting languages, and lots of application-specific languages.

9.1 Formatting Data

    There is always a gap between what we want to say to the computer ("solve my problem") and what we are required to say to get a job done.
    The narrower this gap, the better.

    Good notation makes it easier to say what we want and harder to say the wrong thing by mistake

    Little languages are specialized notations for narrow domains.The printf control sequences are a good example.

9.3 Programmable Tools

    Programmable tools often originate in little languages designed for natural expression of solutions to problems within a narrow domain

9.4 Interpreters, Compilers, and Virtual Machines

    A virtual machine combines many of the advantages of conventional interpretation and compilation.

    Parsers are often written with the aid of an automatic parser generator, also called a compiler-compiler, such as yacc or bison.

    Virtual machines are a lovely old idea.

9.5 Programs that Write Programs

    The most common program-writing program is a compiler that translates high-level language into machine code.

    In spite of the power of program generators, and in spite of the existence of many good examples, the notion is not appreciated as much as it should be and is infrequently used by individual programmers.

    The large-scale version of self-documenting code is literate programming, which integrates a program and its documentation so one process prints it in a natural order for reading, and another arranges it in the right order for compilation.

    As tasks become so focused and well understood that programming them feels almost mechanical, it may be time to create a notation that naturally expresses the tasks and a language that implements it. Regular expressions are a good example.


    Simplicity and clarity are first and most important

