Most experienced C++ programmers have a habit that may seem weird at first: Their programs invariably begin counting from 0 rather than from 1. For example, if we reduce the outer for loop of the program above to its essentials, we get
    for (int r = 0; r != rows; ++r) {
        // write a row
    }
We could have written this loop as
    for (int r = 1; r <= rows; ++r) {
        // write a row
    }
One version counts from 0 and uses != as its comparison; the other counts from 1 and uses <= as its comparison. The number of iterations is the same in each case. Is there any reason to prefer one form over the other?
One reason to count from 0 is that doing so encourages us to use asymmetric ranges to express intervals. For example, it is natural to use the range [0, rows) to describe the first for statement, as it is to use the range [1, rows] to describe the second one.
Asymmetric ranges are usually easier to use than symmetric ones because of an important property: A range of the form [m, n) has n - m elements, and a range of the form [m, n] has n - m + 1 elements. So, for example, the number of elements in [0, rows) is obvious (i.e., rows - 0, or rows) but the number in [1, rows] is less so.
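The element counts above can be checked with a small sketch. The function names half_open_size, closed_size, and count_iterations are ours, chosen only for illustration, and we assume plain int bounds with the beginning no greater than the end.

```cpp
#include <cassert>

// Number of integers in the asymmetric (half-open) range [m, n).
// Assumes m <= n.
int half_open_size(int m, int n)
{
    assert(m <= n);
    return n - m;
}

// Number of integers in the symmetric (closed) range [m, n].
// Assumes m <= n + 1, so that the empty range [m, m - 1] is allowed.
int closed_size(int m, int n)
{
    assert(m <= n + 1);
    return n - m + 1;
}

// Counts, one step at a time, how often a loop over [m, n) executes.
// Assumes m <= n; otherwise the != test would never become false.
int count_iterations(int m, int n)
{
    assert(m <= n);
    int count = 0;
    for (int i = m; i != n; ++i)
        ++count;
    return count;
}
```

For instance, half_open_size(0, rows) is simply rows, whereas closed_size(1, rows) needs the extra + 1 to arrive at the same answer.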
This behavioral difference between asymmetric and symmetric ranges is particularly evident in the case of empty ranges: If we use asymmetric ranges, we can express an empty range as [n, n), in contrast to [n, n-1] for symmetric ranges. The possibility that the end of a range could ever be less than the beginning can cause no end of trouble in designing programs.
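As a rough illustration of the difference, here is how a test for emptiness might look under each convention. Both helper names are hypothetical, and the assertions record the assumptions each form has to make about its bounds.

```cpp
#include <cassert>

// Asymmetric range [begin, end): empty exactly when the bounds are equal.
// The end is never allowed to precede the beginning.
bool is_empty_half_open(int begin, int end)
{
    assert(begin <= end);
    return begin == end;
}

// Symmetric range [first, last]: empty exactly when last == first - 1.
// Here the end must be allowed to fall one short of the beginning.
bool is_empty_closed(int first, int last)
{
    assert(first <= last + 1);
    return last == first - 1;
}
```

Note that the symmetric version cannot state its precondition without admitting a last element that is less than the first.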
Another reason to count from 0 is that doing so makes loop invariants easier to express. In our example, counting from 0 makes the invariant straightforward: We have written r rows of output so far. What would be the invariant if we counted from 1?
One would be tempted to say that the invariant is that we are about to write the rth row, but that statement does not qualify as an invariant. The reason is that the last time the loop tests its condition, r is equal to rows + 1, and we intend to write only rows rows. Therefore, we are not about to write the rth row, so the invariant is not true!
Our invariant could be that we have written r - 1 rows so far. However, if that's our invariant, why not simplify it by starting r at 0?
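One way to see the two invariants side by side is to assert them inside the loops. This is only a sketch: the function names and the separate tally named written are our own additions, and incrementing the tally stands in for actually writing a row.

```cpp
#include <cassert>

// Counting from 0. Invariant: we have written r rows so far.
// Assumes rows >= 0.
int write_rows_from_zero(int rows)
{
    assert(rows >= 0);
    int written = 0;                // independent tally of rows written
    for (int r = 0; r != rows; ++r) {
        assert(written == r);       // the invariant, checked before each row
        ++written;                  // stands in for writing a row
    }
    return written;
}

// Counting from 1. Invariant: we have written r - 1 rows so far.
// Assumes rows >= 0.
int write_rows_from_one(int rows)
{
    assert(rows >= 0);
    int written = 0;
    for (int r = 1; r <= rows; ++r) {
        assert(written == r - 1);   // the same idea, with an extra - 1
        ++written;
    }
    return written;
}
```

Both functions return rows, but only the first states its invariant without an adjustment.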
Another reason to count from 0 is that we have the option of using != as our comparison instead of <=. This distinction may seem trivial, but it affects what we know about the state of the program when a loop finishes. For example, if the condition is r != rows, then when the loop finishes, we know that r == rows. Because the invariant says that we have written r rows of output, we know that we have written exactly rows rows all told. On the other hand, if the condition is r <= rows, then all we can prove is that we have written at least rows rows of output. For all we know, we might have written more.
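To make the difference visible, the sketch below returns the value of r at the moment each kind of loop finishes. The step parameter is purely our assumption, introduced so that the counter can overshoot; the loops in the text always advance r by 1.

```cpp
#include <cassert>

// Loop controlled by r != rows. Assumes rows >= 0.
// When the loop finishes, the condition is false, so r == rows exactly.
int exit_value_not_equal(int rows)
{
    assert(rows >= 0);
    int r = 0;
    while (r != rows)
        ++r;
    return r;
}

// Loop controlled by r <= rows, advancing r by a hypothetical step >= 1.
// When the loop finishes, all the condition tells us is that r > rows.
int exit_value_less_equal(int rows, int step)
{
    assert(step >= 1);
    int r = 1;
    while (r <= rows)
        r += step;
    return r;
}
```

With rows equal to 5, the first function can only return 5. The second returns 6 when step is 1 but 7 when step is 2, so knowing that the loop has finished does not by itself pin down the final value of r.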
If we count from 0, then we can use r != rows as a condition when we want to ensure that there are exactly rows iterations, or we can use r < rows if we care only that the number of iterations is rows or more. If we count from 1, we can use r <= rows if we want at least rows iterations, but what if we want to ensure that rows is the exact number? Then we must test a more complicated condition, such as r == rows + 1. This extra complexity offers no compensating advantage.