Part I: The Basics
Chapter 3. Strings, Vectors, and Arrays
3.5 Arrays
Defining and Initializing Built-in Arrays
The dimension of an array must be a constant expression.
unsigned cnt = 42; // not a constant expression
constexpr unsigned sz = 42; // constant expression
int arr[10]; // array of ten ints
int *parr[sz]; // array of 42 pointers to int
string bad[cnt]; // error: cnt is not a constant expression
string strs[get_size()]; // ok if get_size is constexpr, error otherwise
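The constexpr case in the last line can be sketched as follows. This is a minimal example under the assumption that get_size is a constexpr function; the definition of get_size and the helper strs_count are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical constexpr function: its result can be used as an array dimension.
constexpr std::size_t get_size() { return 4; }

// Hypothetical helper: returns the number of elements in an array
// whose dimension is supplied by get_size().
std::size_t strs_count() {
    std::string strs[get_size()];            // ok: get_size() is a constant expression
    return sizeof(strs) / sizeof(strs[0]);   // total bytes / bytes per element
}
```

If get_size were an ordinary (non-constexpr) function, the declaration of strs would not be valid standard C++.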
Explicitly Initializing Array Elements
const unsigned sz = 3;
int ia1[sz] = {0,1,2}; // array of three ints with values 0, 1, 2
int a2[] = {0, 1, 2}; // an array of dimension 3
int a3[5] = {0, 1, 2}; // equivalent to a3[] = {0, 1, 2, 0, 0}
string a4[3] = {"hi", "bye"}; // same as a4[] = {"hi", "bye", ""}
int a5[2] = {0,1,2}; // error: too many initializers
Character Arrays Are Special
char a1[] = {'C', '+', '+'}; // list initialization, no null
char a2[] = {'C', '+', '+', '\0'}; // list initialization, explicit null
char a3[] = "C++"; // null terminator added automatically
const char a4[6] = "Daniel"; // error: no space for the null!
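The difference between list initialization and initialization from a string literal shows up in the size of the array. A small sketch, with hypothetical helper names, that reports the sizes:

```cpp
#include <cstddef>

// Size of a char array that is list initialized without a null.
std::size_t list_init_size() {
    char a1[] = {'C', '+', '+'};   // three elements, no null terminator
    return sizeof(a1);
}

// Size of a char array initialized from a string literal.
std::size_t literal_init_size() {
    char a3[] = "C++";             // four elements: the null is added automatically
    return sizeof(a3);
}
```

This is also why "Daniel" does not fit in an array of six chars: the literal needs seven.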
No Copy or Assignment
int a[] = {0, 1, 2}; // array of three ints
int a2[] = a; // error: cannot initialize one array with another
a2 = a; // error: cannot assign one array to another
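Compiler extension: a feature added to the language by a particular compiler. Programs that rely on compiler extensions cannot easily be moved to other compilers. Some compilers allow array assignment as a compiler extension. It is best to avoid nonstandard features.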
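A portable alternative is to copy the elements explicitly. The sketch below is not from the original notes: it uses std::copy for a built-in array and, as a swapped-in alternative, std::array, which is a class type and therefore does support copy and assignment. The function names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <iterator>

// Copy a built-in array element by element with std::copy.
int copy_builtin_and_get_last() {
    int a[] = {0, 1, 2};
    int a2[3];
    std::copy(std::begin(a), std::end(a), std::begin(a2));
    return a2[2];
}

// std::array wraps a fixed-size array in a class, so it can be copied and assigned.
int copy_std_array_and_get_last() {
    std::array<int, 3> a = {0, 1, 2};
    std::array<int, 3> a2 = a;   // copy initialization is allowed
    a2 = a;                      // assignment is allowed
    return a2[2];
}
```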
Understanding Complicated Array Declarations
int *ptrs[10]; // ptrs is an array of ten pointers to int
int &refs[10] = /* ? */; // error: no arrays of references
int (*Parray)[10] = &arr; // Parray points to an array of ten ints
int (&arrRef)[10] = arr; // arrRef refers to an array of ten ints
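By default, type modifiers bind from right to left.
To understand an array declaration, start with the array's name and read from the inside out: first read the part inside the parentheses, then look to the right, and finally look to the left.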
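Reading from the inside out matters most when such declarators appear as function parameters. A minimal sketch, with hypothetical function names, that uses a reference to an array and a pointer to an array:

```cpp
#include <cstddef>

// arrRef refers to an array of exactly ten ints; the dimension is part of the type.
int sum_by_ref(int (&arrRef)[10]) {
    int total = 0;
    for (std::size_t i = 0; i != 10; ++i)
        total += arrRef[i];
    return total;
}

// Parray points to an array of ten ints; dereference it before subscripting.
int sum_by_ptr(int (*Parray)[10]) {
    int total = 0;
    for (std::size_t i = 0; i != 10; ++i)
        total += (*Parray)[i];
    return total;
}
```

With int arr[10], the calls would be sum_by_ref(arr) and sum_by_ptr(&arr); an array of any other dimension is rejected at compile time.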
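Accessing the Elements of an Array
When a variable is used as an array subscript, it should normally have type size_t. size_t is a machine-specific unsigned type that is guaranteed to be large enough to hold the size of any object in memory. The size_t type is defined in the cstddef header, which is the C++ version of the stddef.h header from the C library.
Using the Subscript Operator to Access Array Elements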
// count the number of grades by clusters of ten: 0--9, 10--19, ... 90--99, 100
unsigned scores[11] = {}; // 11 buckets, all value initialized to 0
unsigned grade;
while (cin >> grade) {
if (grade <= 100)
++scores[grade/10]; // increment the counter for the current cluster
}
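The same counting logic can be packaged so that it does not depend on cin. This is a sketch only: the function name count_clusters and the choice to return a vector are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Count grades by clusters of ten: 0--9, 10--19, ... 90--99, 100.
// Grades above 100 are ignored, as in the loop above.
std::vector<unsigned> count_clusters(const std::vector<unsigned> &grades) {
    unsigned scores[11] = {};                 // 11 buckets, all value initialized to 0
    for (unsigned grade : grades) {
        if (grade <= 100) {
            std::size_t bucket = grade / 10;  // size_t subscript, always in 0..10
            ++scores[bucket];
        }
    }
    // built-in arrays cannot be returned by value, so copy the counters into a vector
    return std::vector<unsigned>(scores, scores + 11);
}
```

The check grade <= 100 is what keeps the subscript inside the array; the language itself does no bounds checking.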
Using a Range for to Traverse Array Elements
for (auto i : scores) // for each counter in scores
cout << i << " "; // print the value of that counter
cout << endl;
Pointers and Arrays
string nums[] = {"one", "two", "three"}; // array of strings
string *p = &nums[0]; // p points to the first element in nums
One characteristic of arrays: in most places where an array is used, the compiler automatically substitutes a pointer to the first element:
string *p2 = nums; // equivalent to p2 = &nums[0]
Operations on arrays are often really operations on pointers.
int ia[] = {0,1,2,3,4,5,6,7,8,9}; // ia is an array of ten ints
auto ia2(ia); // ia2 is an int* that points to the first element in ia
ia2 = 42; // error: ia2 is a pointer, and we can't assign an int to a pointer
auto ia2(&ia[0]); // now it's clear that ia2 has type int*
Note: this conversion does not happen when decltype is used.
// ia3 is an array of ten ints
decltype(ia) ia3 = {0,1,2,3,4,5,6,7,8,9};
ia3 = p; // error: can't assign an int* to an array
ia3[4] = i; // ok: assigns the value of i to an element in ia3
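The difference between auto and decltype can be checked with std::is_same from the type_traits header. A minimal sketch; the two function names are hypothetical.

```cpp
#include <type_traits>

// auto deduces int* because ia is converted to a pointer to its first element.
bool auto_gives_pointer() {
    int ia[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    auto ia2(ia);
    return std::is_same<decltype(ia2), int*>::value;
}

// decltype(ia) keeps the array type, including its dimension.
bool decltype_gives_array() {
    int ia[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    return std::is_same<decltype(ia), int[10]>::value;
}
```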
Pointers Are Iterators
int arr[] = {0,1,2,3,4,5,6,7,8,9};
int *p = arr; // p points to the first element in arr
++p; // p points to arr[1]
int *e = &arr[10]; // pointer just past the last element in arr
for (int *b = arr; b != e; ++b)
cout << *b << endl; // print the elements in arr
C++11: The Library begin and end Functions
int ia[] = {0,1,2,3,4,5,6,7,8,9}; // ia is an array of ten ints
int *beg = begin(ia); // pointer to the first element in ia
int *last = end(ia); // pointer one past the last element in ia
// pbeg points to the first and pend points just past the last element in arr
int *pbeg = begin(arr), *pend = end(arr);
// find the first negative element, stopping if we've seen all the elements
while (pbeg != pend && *pbeg >= 0)
++pbeg;
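The search loop can be wrapped in a function that takes the pointer pair produced by begin and end. A sketch, with first_negative as a hypothetical name:

```cpp
#include <iterator>

// Returns a pointer to the first negative element in [pbeg, pend),
// or pend if every element is non-negative.
const int *first_negative(const int *pbeg, const int *pend) {
    // test pbeg != pend first so that pend is never dereferenced
    while (pbeg != pend && *pbeg >= 0)
        ++pbeg;
    return pbeg;
}
```

A caller would pass std::begin(arr) and std::end(arr) and compare the result with std::end(arr) before dereferencing it.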
Pointer Arithmetic
constexpr size_t sz = 5;
int arr[sz] = {1,2,3,4,5};
int *ip = arr; // equivalent to int *ip = &arr[0]
int *ip2 = ip + 4; // ip2 points to arr[4], the last element in arr
// ok: arr is converted to a pointer to its first element; p points one past the end of arr
int *p = arr + sz; // use caution -- do not dereference!
int *p2 = arr + 10; // error: arr has only 5 elements; p2 has undefined value
The result of subtracting two pointers is a library type named ptrdiff_t, a signed integral type defined in the cstddef header.
auto n = end(arr) - begin(arr); // n is 5, the number of elements in arr
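Because ptrdiff_t is signed, the difference can be negative when the pointers are subtracted in the other order. A small sketch; distance_between is a hypothetical helper, and both pointers are assumed to point into (or one past the end of) the same array.

```cpp
#include <cstddef>
#include <type_traits>

// Number of elements from `from` to `to`; negative if `to` comes before `from`.
std::ptrdiff_t distance_between(const int *from, const int *to) {
    auto n = to - from;
    static_assert(std::is_same<decltype(n), std::ptrdiff_t>::value,
                  "pointer subtraction yields std::ptrdiff_t");
    return n;
}
```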
Comparing two pointers:
int *b = arr, *e = arr + sz;
while (b < e) {
// use *b
++b;
}
int i = 0, sz = 42;
int *p = &i, *e = &sz;
// undefined: p and e are unrelated; comparison is meaningless!
while (p < e)
Interaction between Dereference and Pointer Arithmetic
int ia[] = {0,2,4,6,8}; // array with 5 elements of type int
int last = *(ia + 4); // ok: initializes last to 8, the value of ia[4]
last = *ia + 4; // ok: last = 4, equivalent to ia[0] + 4
Subscripts and Pointers
int ia[] = {0,2,4,6,8}; // array with 5 elements of type int
int *p = &ia[2]; // p points to the element indexed by 2
int j = p[1]; // p[1] is equivalent to *(p + 1), p[1] is the same element as ia[3]
int k = p[-2]; // p[-2] is the same element as ia[0]
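C-Style Character Strings
Although C++ supports C-style strings, C++ programs are better off not using them. C-style strings are hard to use, are a rich source of bugs, and are the root cause of many security problems.
Table 3.8 C-Style Character String Functions
Function | Meaning |
---|---|
strlen(p) | Returns the length of p, not counting the terminating null |
strcmp(p1, p2) | Compares p1 and p2 for equality. Returns 0 if p1 == p2, a positive value if p1 > p2, and a negative value if p1 < p2 |
strcat(p1, p2) | Appends p2 to p1; returns p1 |
strcpy(p1, p2) | Copies p2 into p1; returns p1 |
These functions are defined in the cstring header, the C++ version of the C header string.h.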
char ca[] = {'C', '+', '+'}; // not null terminated
cout << strlen(ca) << endl; // disaster: ca isn't null terminated
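The result of this program is undefined. The most likely effect of this call is that strlen keeps looking through the memory that follows ca until it encounters a null character.
Comparing Strings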
string s1 = "A string example";
string s2 = "A different string";
if (s1 < s2) // false: s2 is less than s1
Using these operators on C-style strings compares the pointer values, not the strings themselves:
const char ca1[] = "A string example";
const char ca2[] = "A different string";
if (ca1 < ca2) // undefined: compares two unrelated addresses
To compare the strings rather than the pointer values, call strcmp.
if (strcmp(ca1, ca2) < 0) // same effect as string comparison s1 < s2
Concatenating and Copying
// disastrous if we miscalculated the size of largeStr
strcpy(largeStr, ca1); // copies ca1 into largeStr
strcat(largeStr, " "); // adds a space at the end of largeStr
strcat(largeStr, ca2); // concatenates ca2 onto largeStr
Code like this is error-prone and often leads to serious security vulnerabilities. The problem is miscalculating the space needed for largeStr.
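With the library string type the size calculation disappears, because the string grows as needed. A minimal sketch of the same concatenation; join_with_space is a hypothetical name, and both arguments are assumed to be null-terminated.

```cpp
#include <string>

// Build "first second" and let std::string manage the storage.
std::string join_with_space(const char *first, const char *second) {
    std::string largeStr = first;   // copies the characters of first
    largeStr += " ";                // adds a space at the end
    largeStr += second;             // concatenates second
    return largeStr;
}
```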
Interfacing to Older Code
Mixing Library strings and C-Style Strings
string s("Hello World"); // s holds Hello World
Where the program requires a C-style string, a string object cannot be used directly in its place.
char *str = s; // error: can't initialize a char* from a string
const char *str = s.c_str(); // ok
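There is no guarantee that the array returned by c_str remains valid. Any later operation that changes the value of s can invalidate that array.
If the program needs continued access to the contents of the array returned by c_str, it must copy that array.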
Using an Array to Initialize a vector
int int_arr[] = {0, 1, 2, 3, 4, 5};
// ivec has six elements; each is a copy of the corresponding element in int_arr
vector<int> ivec(begin(int_arr), end(int_arr));
// copies three elements: int_arr[1], int_arr[2], int_arr[3]
vector<int> subVec(int_arr + 1, int_arr + 4);
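Advice: modern C++ programs should use vectors and iterators instead of built-in arrays and pointers, and library strings rather than C-style array-based character strings.
3.6 Multidimensional Arrays
Strictly speaking, there are no multidimensional arrays in C++. What is commonly called a multidimensional array is actually an array of arrays.
Initializing the Elements of a Multidimensional Array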
int ia[3][4] = { // three elements; each element is an array of size 4
{0, 1, 2, 3}, // initializers for the row indexed by 0
{4, 5, 6, 7}, // initializers for the row indexed by 1
{8, 9, 10, 11} // initializers for the row indexed by 2
};
// equivalent initialization without the optional nested braces for each row
int ia[3][4] = {0,1,2,3,4,5,6,7,8,9,10,11};
// explicitly initialize only element 0 in each row
int ia[3][4] = {{ 0 }, { 4 }, { 8 }};
// explicitly initialize row 0; the remaining elements are value initialized
int ix[3][4] = {0, 3, 6, 9};
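The last two forms are easy to confuse, so it helps to look at individual elements. A sketch with hypothetical accessor functions:

```cpp
// Nested braces: each inner list initializes element 0 of its own row.
int first_col_at(int row, int col) {
    int ia[3][4] = {{ 0 }, { 4 }, { 8 }};
    return ia[row][col];
}

// No nested braces: the four values fill row 0; the rest are value initialized.
int ix_at(int row, int col) {
    int ix[3][4] = {0, 3, 6, 9};
    return ix[row][col];
}
```

For example, first_col_at(1, 0) is 4 while ix_at(1, 0) is 0, and ix_at(0, 3) is 9.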
Using a Range for with Multidimensional Arrays
size_t cnt = 0;
for (auto &row : ia) // for every element in the outer array
for (auto &col : row) { // for every element in the inner array
col = cnt; // give this element the next value
++cnt; // increment cnt
}
for (const auto &row : ia) // for every element in the outer array
for (auto col : row) // for every element in the inner array
cout << col << endl;
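The second loop does not write to the elements, yet the control variable of the outer loop is still defined as a reference. This is done to avoid the array-to-pointer conversion. If the reference is omitted and the loop is written as: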
for (auto row : ia)
for (auto col : row)
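the program does not compile: row then has type int*, and the inner loop cannot iterate over a pointer.
Note: to use a multidimensional array in a range for, the loop control variable for all but the innermost array must be a reference.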
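Both loops can be combined into one self-contained function. A minimal sketch; fill_and_sum is a hypothetical name, and it sums the elements instead of printing them.

```cpp
// Fill a 3 x 4 array with 0, 1, ..., 11 and return the sum of its elements.
int fill_and_sum() {
    int ia[3][4];
    int cnt = 0;
    for (auto &row : ia)          // row is a reference to an array of four ints
        for (auto &col : row)     // col is a reference, so the element can be written
            col = cnt++;
    int total = 0;
    for (const auto &row : ia)    // still a reference: avoids array-to-pointer conversion
        for (auto col : row)      // the innermost variable may be a plain copy
            total += col;
    return total;
}
```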
Pointers and Multidimensional Arrays
int ia[3][4]; // array of size 3; each element is an array of ints of size 4
int (*p)[4] = ia; // p points to an array of four ints
p = &ia[2]; // p now points to the last element in ia
// print the value of each element in ia, with each inner array on its own line
// p points to an array of four ints
for (auto p = ia; p != ia + 3; ++p) {
// q points to the first element of an array of four ints; that is, q points to an int
for (auto q = *p; q != *p + 4; ++q)
cout << *q << ' ';
cout << endl;
}
// p points to the first array in ia
for (auto p = begin(ia); p != end(ia); ++p) {
// q points to the first element in an inner array
for (auto q = begin(*p); q != end(*p); ++q)
cout << *q << ' '; // prints the int value to which q points
cout << endl;
}
Type Aliases Simplify Pointers to Multidimensional Arrays
using int_array = int[4]; // new style type alias declaration
typedef int int_array[4]; // equivalent typedef declaration
// print the value of each element in ia, with each inner array on its own line
for (int_array *p = ia; p != ia + 3; ++p) {
for (int *q = *p; q != *p + 4; ++q)
cout << *q << ' ';
cout << endl;
}