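When I started learning references in C++, I kept mixing up references and pointers, partly because I came from Java. If C++ already has pointers, why does it need references as well? This post records my thinking.
What is a reference
First, what exactly is a reference? The isocpp.org site gives this one-line description: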
An alias (an alternate name) for an object.
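In other words, a reference is an alias for a variable that has already been defined. References are most commonly used as function parameters: when a parameter is a reference, the function works on the original data rather than a copy, which gives the same effect as passing a pointer.
A reference variable can be created like this: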
int x;
int & i = x;
The same source explains references as follows.
Underneath it all, a reference i to object x is typically the machine address of the object x. But when the programmer says i++, the compiler generates code that increments x. In particular, the address bits that the compiler uses to find x are not changed. A C programmer will think of this as if you used the C style pass-by-pointer, with the syntactic variant of (1) moving the & from the caller into the callee, and (2) eliminating the *s. In other words, a C programmer will think of i as a macro for (*p), where p is a pointer to x (e.g., the compiler automatically dereferences the underlying pointer; i++ is changed to (*p)++; i = 7 is automatically changed to *p = 7).
Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object, just with another name. It is neither a pointer to the object, nor a copy of the object. It is the object. There is no C++ syntax that lets you operate on the reference itself separate from the object to which it refers.
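To summarize: under the hood a reference can be implemented with the object's address, though implementations differ between compilers. For everyday purposes it is enough to treat a reference as another name for the object, so any operation on the reference is an operation on the original object. The object is still the same one, just wearing a different outfit.
Differences between references and pointers
A pointer can be declared first and assigned later, but a reference must be initialized where it is declared
This one is straightforward: the following code does not compile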
int a = 1;
int & b; // error
b = a;
A pointer, on the other hand, can be given its value in either of these two ways
int a = 1;
int *b = &a; // ok
int *c;
c = &a; // ok
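A pointer can be reassigned, but a reference cannot
A reference behaves much like a const pointer: once bound to a variable, it stays loyal to that variable. That is, int &b = a; is a disguised form of int * const pr = &a; with b playing the role of *pr. The following code tests what happens when we try to change a reference.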
#include <iostream>
int main() {
using namespace std;
int rats = 101;
// rodents is a reference
int &rodents = rats;
int * pt = &rats;
cout << "Before reassign: \n";
cout << "rats = " << rats;
cout << ", rodents = " << rodents;
cout << ", pt = " << *pt << endl;
cout << "rats address = " << &rats;
cout << ", rodents address = " << &rodents;
cout << ", pt address = " << pt << endl;
int bunnies = 50;
// can we change the reference?
// A reference can be set by initialization but not by assignment; the statement below assigns the value of bunnies to rats
rodents = bunnies;
pt = &bunnies;
cout << "After reassign:\n";
cout << "bunnies = " << bunnies;
cout << ", rats = " << rats;
cout << ", rodents = " << rodents;
cout << ", pt = " << *pt << endl;
cout << "bunnies address = " << &bunnies;
cout << ", rats address = " << &rats;
cout << ", rodents address = " << &rodents;
cout << ", pt address = " << pt << endl;
return 0;
}
output
Before reassign:
rats = 101, rodents = 101, pt = 101
rats address = 0x7ffee9045818, rodents address = 0x7ffee9045818, pt address = 0x7ffee9045818
After reassign:
bunnies = 50, rats = 50, rodents = 50, pt = 50
bunnies address = 0x7ffee9045804, rats address = 0x7ffee9045818, rodents address = 0x7ffee9045818, pt address = 0x7ffee9045804
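Looking at the output, the value of the reference rodents changed from 101 to 50, so at first glance it seems to have been rebound. But rats also became 50, and rats and rodents still share the same address, which differs from the address of bunnies. Given that a reference is an alias, rodents = bunnies; is equivalent to rats = bunnies; so the statement changes the value of the variable the reference is bound to, not what the reference refers to. rodents is an alias for rats, and it can never become an alias for bunnies. Put simply, the two are together for life.
This makes it clearer what "a reference cannot be reassigned" really means. The next example reinforces the point.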
int a = 10;
int *ptr = &a;
int & rf = *ptr;
cout << "Before:\n";
cout << "a = " << a << ", (" << &a << ")\n";
cout << "ptr = " << *ptr << ", (" << ptr << ")\n";
cout << "rf = " << rf << ", (" << &rf << ")\n";
int b = 20;
ptr = &b;
cout << "After:\n";
cout << "a = " << a << ", (" << &a << ")\n";
cout << "b = " << b << ", (" << &b << ")\n";
cout << "ptr = " << *ptr << ", (" << ptr << ")\n";
cout << "rf = " << rf << ", (" << &rf << ")\n";
output:
Before:
a = 10, (0x7ffeede77800)
ptr = 10, (0x7ffeede77800)
rf = 10, (0x7ffeede77800)
After:
a = 10, (0x7ffeede77800)
b = 20, (0x7ffeede777ec)
ptr = 20, (0x7ffeede777ec)
rf = 10, (0x7ffeede77800)
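After ptr is redirected to b, rf still yields 10 and still has the address of a. The reference was bound at initialization to the object that *ptr designated, not to the pointer itself, so this output also shows that a reference cannot be rebound.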
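A pointer can be null, a reference cannot
This is also easy to understand: a pointer can be assigned nullptr, but there is no way to assign null to a reference. Be aware, though, that the following code manufactures a "null reference" by dereferencing a null pointer, which is undefined behavior.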
#include <iostream>
bool f(int &r);
void t(int &r);
int main() {
int &r = *static_cast<int *>(nullptr);
// null
std::cout
<< (&r != nullptr
? "not null" : "null")
<< std::endl;
// null
std::cout
<< (f(*static_cast<int *>(nullptr))
? "not null" : "null")
<< std::endl;
// error segmentation fault
t(r);
return 0;
}
bool f(int &r) { return &r != nullptr; }
void t(int &r) {
std::cout << r;
}
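The following code is just as dangerous. A compiler may accept it, but it is not allowed: the C++ standard requires a reference to be bound to a valid object, so the code has undefined behavior and the rule must be respected.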
int *pt = nullptr;
int & rf = *pt;
// error segmentation fault
std::cout << rf;
Pointers can be nested, references cannot
Pointers support arithmetic, references do not
Using references
Basic types
Let's start with references to basic types.
#include <iostream>
int main() {
using namespace std;
int rats = 101;
// rodents is a reference
int & rodents = rats;
cout << "rats = " << rats;
cout << ", rodents = " << rodents << endl;
rodents++;
cout << "rats = " << rats;
cout << ", rodents = " << rodents << endl;
// some implementations require type casting the following
// addresses to type unsigned
cout << "rats address = " << &rats;
cout << ", rodents address = " << &rodents << endl;
return 0;
}
output
rats = 101, rodents = 101
rats = 102, rodents = 102
rats address = 0x7ffeea12f818, rodents address = 0x7ffeea12f818
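The output confirms two properties of references:
- Changing the value through the reference changes the original variable
- The reference and the original variable have the same address
As function parameters
A reference can be used as a function parameter, which is called passing by reference. The effect is much the same as passing a pointer, as the following swap functions show.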
#include <iostream>
void swapr(int &a, int &b);
void swapp(int *a, int *b);
void swapv(int a, int b);
int main() {
using namespace std;
int wallet1 = 1;
int wallet2 = 2;
cout << "wallet1 = $" << wallet1;
cout << " wallet2 = $" << wallet2 << endl;
cout << "Using references to swap contents:\n";
swapr(wallet1, wallet2); // pass variable
cout << "wallet1 = $" << wallet1;
cout << " wallet2 = $" << wallet2 << endl;
cout << "Using pointers to swap contents again:\n";
swapp(&wallet1, &wallet2); // pass addresses of variables
cout << "wallet1 = $" << wallet1;
cout << " wallet2 = $" << wallet2 << endl;
cout << "Trying to use passing by value:\n";
swapv(wallet1, wallet2); // pass values of variables
cout << "wallet1 = $" << wallet1;
cout << " wallet2 = $" << wallet2 << endl;
return 0;
}
void swapr(int &a, int &b) {
int temp;
temp = a;
a = b;
b = temp;
}
void swapp(int *a, int *b) {
int temp;
temp = *a;
*a = *b;
*b = temp;
}
void swapv(int a, int b) {
int temp;
temp = a;
a = b;
b = temp;
}
output:
wallet1 = $1 wallet2 = $2
Using references to swap contents:
wallet1 = $2 wallet2 = $1
Using pointers to swap contents again:
wallet1 = $1 wallet2 = $2
Trying to use passing by value:
wallet1 = $1 wallet2 = $2
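The results match what we expected: swapr and swapp exchange the caller's variables, while swapv leaves them unchanged. When a function should not modify its argument, add const to the parameter, for example void show(const int &)
Temporary variables
If the argument does not match a reference parameter, C++ can generate a temporary variable. Currently it does so only when the parameter is a const reference. For a const reference parameter, the compiler creates a temporary in the following two cases:
- The argument has the right type but is not an lvalue
- The argument has the wrong type but can be converted to the right type
So what is an lvalue? An lvalue argument is a data object that can be referenced: variables, array elements, structure members, references, and dereferenced pointers are all lvalues. Non-lvalues include literal constants (except quoted string literals, which are represented by their addresses) and expressions with multiple terms.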
// assumed body (not in the original): returns the cube of ra
double refcube(const double &ra) { return ra * ra * ra; }
int main() {
double side = 3.0;
double *pd = &side;
double &rd = side;
long edge = 1L;
double lens[4] = {2.0, 5.0, 1.0, 1.0};
double c1 = refcube(side);
double c2 = refcube(lens[2]);
double c3 = refcube(rd);
double c4 = refcube(*pd);
double c5 = refcube(edge); // ra is temporary variable
double c6 = refcube(7.0); // ra is temporary variable
double c7 = refcube(side + 10); // ra is temporary variable
return 0;
}
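The example makes the distinction between lvalues and non-lvalues concrete. 7.0 and side + 10 have the right type but no name, so they are not lvalues and a temporary is created. edge is an lvalue of the wrong type, so a temporary is created for it as well.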
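Returning a reference
Returning a reference is different from returning a value. The traditional return mechanism evaluates the expression after the return keyword and hands the result back to the calling function. Conceptually, the value is copied to a temporary location and the calling program then uses it. In double m = sqrt(16.0); the return value 4.0 is first copied to a temporary location and then copied to m.
Now consider returning a reference: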
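struct ball {};
ball g_ball;                   // exists before f() is called and outlives the call
ball & f() { return g_ball; }  // returns a reference to g_ball, not a copy
ball a = f();                  // the referred-to object is copied straight into a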
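Here f() returns a reference directly, so the referred-to object is copied straight into a without the intermediate temporary. A call to a function that returns a reference is effectively an alias for the referred-to variable.
Points to note:
- The object whose reference is returned must exist before the call and outlive it, so never return a reference to a local variable
- Add const when the returned object should not be modified.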
The second point deserves a closer look. Consider this code first.
#include <iostream>
struct ball {
int a;
};
ball & f(ball & b) {
b.a = 3;
return b;
}
const ball & f2(ball & b) {
b.a = 3;
return b;
}
int main() {
ball ball_0 = {1};
ball ball_1 = {2};
f(ball_0) = ball_1; // the reference returned by f(ball_0) designates a modifiable block of memory, so it is an lvalue and can appear on the left of an assignment
std::cout << ball_0.a << std::endl; // 2
std::cout << ball_1.a << std::endl; // 2
// compiler error
// f2(ball_0) = ball_1; // f2 returns a non-modifiable lvalue, so this fails to compile
return 0;
}
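As the code shows, an ordinary function returns an rvalue: the return value lives in a temporary memory location that may no longer exist by the next statement.
A returned non-const reference, by contrast, identifies a modifiable block of memory, so it is an lvalue.
Common questions
Why add references to the language
Pointers in C++ were inherited from C, and they cannot be removed without breaking compatibility. References were added mainly for elegance: some jobs can be done with pointers too, but references make them more convenient and safer, because a reference comes with more restrictions.
Why is this not a reference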
Because this was introduced into C++ (really into C with Classes) before references were added. Also, Stroustrup chose this to follow Simula usage, rather than the (later) Smalltalk use of self.
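How to choose between reference parameters, pointer parameters, and pass by value
For a function that uses the passed data without modifying it:
- If the data object is small, such as a built-in type or a small structure, pass it by value
- If the data object is an array, a pointer is the only choice; make it a pointer to const
- If it is a structure, use a pointer or a reference, qualified with const
- If it is a class object, use a reference, qualified with const
For a function that modifies data in the calling function:
- Built-in type: use a pointer
- Array: use a pointer
- Structure: use a reference or a pointer
- Class object: use a reference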
Performance of references and pointers
Most compilers implement references with pointers. We can check this by disassembling the following code.
void test1(int* p) {
*p = 3;
return;
}
void test2(int & r) {
r = 3;
return;
}
int main() {
}
Disassembly of test1, which takes a pointer
(gdb) disassemble test1
Dump of assembler code for function _Z5test1Pi:
0x0000000100003f70 <+0>: push %rbp
0x0000000100003f71 <+1>: mov %rsp,%rbp
0x0000000100003f74 <+4>: mov %rdi,-0x8(%rbp)
0x0000000100003f78 <+8>: mov -0x8(%rbp),%rax
0x0000000100003f7c <+12>: movl $0x3,(%rax)
0x0000000100003f82 <+18>: pop %rbp
0x0000000100003f83 <+19>: ret
End of assembler dump.
Disassembly of test2, which takes a reference
Dump of assembler code for function _Z5test2Ri:
0x0000000100003f90 <+0>: push %rbp
0x0000000100003f91 <+1>: mov %rsp,%rbp
0x0000000100003f94 <+4>: mov %rdi,-0x8(%rbp)
0x0000000100003f98 <+8>: mov -0x8(%rbp),%rax
0x0000000100003f9c <+12>: movl $0x3,(%rax)
0x0000000100003fa2 <+18>: pop %rbp
0x0000000100003fa3 <+19>: ret
End of assembler dump.
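The two listings are identical instruction for instruction, which confirms our guess for this compiler and build: to some extent, a reference is syntactic sugar for a pointer.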