Rust Learning Journey: Smart Pointers

柴猫°

Published 2023-12-17 20:54:33

Tags: rust, learning, programming languages

Article link: https://blog.csdn.net/weixin_55334398/article/details/135045273

Preface

This article covers the data structures in Rust that work with memory at the lowest level.

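References Revisited

A reference is fundamentally just a number that marks the start position of some bytes in memory. Its only purpose is to represent where data of a specific type is stored. What sets a reference apart from a plain number is that Rust validates that the reference's lifetime does not outlast whatever it refers to (otherwise the compiler reports an error).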

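Raw Pointers

A reference can be converted into a more primitive type, the raw pointer. Like a number, it can be copied and passed around with few restrictions, but Rust makes no guarantee that the memory location it points to is valid. There are two kinds of raw pointer:

*const T - a raw pointer to data of type T that should never change.
*mut T - a raw pointer to data of type T that can change.

Raw pointers can be converted to and from numbers (e.g. usize).
Raw pointers can access data using unsafe code (shown later).
Memory details:

References in Rust are used much like pointers in C, but with far more compile-time restrictions on how they can be stored and passed to other functions.
Raw pointers in Rust resemble pointers in C: each represents a number that can be copied and passed around, and it can even be converted to a numeric type and modified as a number to do pointer math.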

fn main() {
	let a = 42;
	let memory_location = &a as *const i32 as usize;
	println!("Data is here {}", memory_location);
}
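
To make the two raw pointer kinds concrete, here is a minimal sketch. The helper names (read_through_const, bump_through_mut, address_of) are hypothetical and exist only for this illustration. Creating a raw pointer and turning it into a number are both safe; only the dereference needs unsafe.

```rust
// Hypothetical helpers for illustration; they are not part of the original article.

// Read through a *const i32. The dereference needs `unsafe`, even though
// this pointer was just created from a valid reference.
fn read_through_const(x: &i32) -> i32 {
    let p: *const i32 = x; // a reference coerces to a raw pointer
    unsafe { *p }
}

// Write through a *mut i32.
fn bump_through_mut(x: &mut i32) {
    let p: *mut i32 = x;
    unsafe { *p += 1 };
}

// Converting a raw pointer into a number is safe.
fn address_of(x: &i32) -> usize {
    x as *const i32 as usize
}

fn main() {
    let mut a = 42;
    println!("read: {}", read_through_const(&a));
    bump_through_mut(&mut a);
    println!("after bump: {}", a);
    println!("address: {}", address_of(&a));
}
```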

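Dereferencing

The process of accessing or manipulating the data that a reference (e.g. &i32) refers to is called dereferencing.
A reference is used to access or manipulate data in two ways:

Accessing the referred data when assigning variables (the * operator).
Accessing fields or methods of the referred data (the . operator).

Rust provides the two operators listed above for exactly this.

The * Operator

The * operator is an explicit way to dereference a reference.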

let a: i32 = 42;
let ref_ref_ref_a: &&&i32 = &&&a;
let ref_a: &i32 = **ref_ref_ref_a;
let b: i32 = *ref_a;

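Memory details:

Because i32 is a primitive type that implements the Copy trait, the bytes of variable a on the stack are copied into the bytes of variable b (the two do not share an address).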
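
A small sketch of that copy, using a hypothetical function copy_then_mutate: after b is copied out through the reference, changing a through a mutable reference leaves b untouched.

```rust
// Hypothetical example: `b` is a copy of the bytes of `a`, not an alias.
fn copy_then_mutate() -> (i32, i32) {
    let mut a: i32 = 42;
    let ref_a: &i32 = &a;
    let b: i32 = *ref_a; // copies the value out of `a`

    let mut_ref_a: &mut i32 = &mut a;
    *mut_ref_a += 1; // writes through the mutable reference

    (a, b)
}

fn main() {
    let (a, b) = copy_then_mutate();
    println!("a = {}, b = {}", a, b);
}
```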

The . Operator

The . operator is used to access the fields and methods of a reference, and it works a bit more subtly.

let f = Foo { value: 42 };
let ref_ref_ref_f = &&&f;
println!("{}", ref_ref_ref_f.value);

Why didn't we need to write *** in front of ref_ref_ref_f? Because the . operator dereferences automatically. The compiler converts the last line into the following:

println!("{}", (***ref_ref_ref_f).value);

struct Foo {
	value: i32
}

fn main() {
	let f = Foo { value: 42 };
	let ref_ref_ref_f = &&&f;
	println!("{}", ref_ref_ref_f.value);
}
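
The same automatic dereferencing applies to method calls. In the sketch below, the method doubled and the two helper functions are hypothetical additions; both helpers return the same value, one relying on the . operator and one spelling out the dereferences.

```rust
struct Foo {
    value: i32,
}

impl Foo {
    // Hypothetical method, added only for this illustration.
    fn doubled(&self) -> i32 {
        self.value * 2
    }
}

// The . operator peels off the reference layers for us.
fn via_dot(r: &&&Foo) -> i32 {
    r.doubled()
}

// The explicit form of the same call.
fn via_star(r: &&&Foo) -> i32 {
    (***r).doubled()
}

fn main() {
    let f = Foo { value: 42 };
    println!("{} {}", via_dot(&&&f), via_star(&&&f));
}
```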

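Smart Pointers

In addition to creating references to existing typed data with the & operator, Rust lets us create reference-like structs called smart pointers. At a high level, a reference can be thought of as a type that gives us access to another type. Smart pointers behave differently from normal references because they operate on internal logic written by a programmer. We, the programmers, are the smart part.
Typically, smart pointers implement the Deref and DerefMut traits to specify what should happen when the structure is dereferenced with the * and . operators, and the Drop trait to specify the cleanup that runs when the value goes out of scope.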

use std::ops::Deref;
struct TattleTell<T> {
	value: T,
}
impl<T> Deref for TattleTell<T> {
	type Target = T;
	fn deref(&self) -> &T {
		println!("{} was used!", std::any::type_name::<T>());
		&self.value
	}
}
fn main() {
	let foo = TattleTell {
		value: "secret message",
	};
	// foo is auto-referenced for the call to len and then immediately dereferenced, which runs deref and returns &self.value
	println!("{}", foo.len());
}
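
The example above only covers Deref. As a rough sketch of the other two traits, the hypothetical wrapper Tracked below counts how often it is dereferenced mutably (DerefMut) and reports that count when it goes out of scope (Drop). The type and its writes field are invented for this illustration.

```rust
use std::ops::{Deref, DerefMut};

// Hypothetical wrapper: counts mutable dereferences.
struct Tracked<T> {
    value: T,
    writes: u32,
}

impl<T> Deref for Tracked<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.value
    }
}

impl<T> DerefMut for Tracked<T> {
    fn deref_mut(&mut self) -> &mut T {
        self.writes += 1;
        &mut self.value
    }
}

impl<T> Drop for Tracked<T> {
    fn drop(&mut self) {
        println!("dropped after {} mutable uses", self.writes);
    }
}

fn demo() -> (usize, u32) {
    let mut t = Tracked { value: String::from("pie"), writes: 0 };
    t.push_str("!"); // needs &mut String, so deref_mut runs
    let len = t.len(); // needs &String, so deref runs
    (len, t.writes)
    // t goes out of scope here and drop runs
}

fn main() {
    let (len, writes) = demo();
    println!("len = {}, writes = {}", len, writes);
}
```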

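Smart Unsafe Code

Smart pointers tend to use unsafe code fairly often. As mentioned earlier, they are common tools for interacting with the lowest levels of memory in Rust.
What is unsafe code? Unsafe code behaves exactly like normal Rust, except that it has a few abilities the Rust compiler cannot make guarantees about.
A primary ability of unsafe code is dereferencing a raw pointer. That means taking a raw pointer to a position in memory, declaring "a data structure exists here", and turning it into a representation of data we can use (e.g. *const u8 into u8). Rust cannot keep track of what every byte written to memory means, so it requires the dereference to sit inside an unsafe { … } block.

Smart pointers dereference raw pointers extensively, but they are well proven in what they do.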

fn main() {
	let a: [u8; 4] = [86, 14, 73, 64];
	// Getting the memory address of something as a number is totally safe.
	let pointer_a = &a as *const u8 as usize;
	println!("Data memory location: {}", pointer_a);
	// Turning our number into a raw pointer to an f32 is also safe to do.
	let pointer_b = pointer_a as *const f32;
	let b = unsafe {
		// This is unsafe because we are telling the compiler to assume our pointer refers to a valid f32 and to read its value into variable b.
		// Rust has no way to verify this assumption is true.
		// read_unaligned is used because a [u8; 4] is not guaranteed to sit at an f32-aligned address.
		pointer_b.read_unaligned()
	};
	println!("I swear this is a pie! {}", b);
}
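
For comparison, the same reinterpretation of four bytes can be written without unsafe. The sketch below puts a raw-pointer read next to the safe standard library call f32::from_ne_bytes; the two function names are hypothetical.

```rust
// Hypothetical helpers comparing an unsafe read with its safe equivalent.

// Reinterpret the bytes through a raw pointer. read_unaligned avoids
// assuming that the byte array sits at an f32-aligned address.
fn via_raw_pointer(bytes: &[u8; 4]) -> f32 {
    let p = bytes.as_ptr() as *const f32;
    unsafe { p.read_unaligned() }
}

// The safe route: build the f32 from its native-endian bytes.
fn via_safe_api(bytes: &[u8; 4]) -> f32 {
    f32::from_ne_bytes(*bytes)
}

fn main() {
    let a: [u8; 4] = [86, 14, 73, 64];
    println!("{} {}", via_raw_pointer(&a), via_safe_api(&a));
}
```

When a safe API like this exists, it is usually the better choice; the unsafe version is mainly useful for seeing what such APIs do underneath.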

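Familiar Friends

Consider some smart pointers we have already seen, such as Vec and String.

Vec is a smart pointer that simply owns a memory region of bytes. The Rust compiler has no idea what exists in those bytes. The smart pointer interprets what it means to grab items from the memory region it manages, keeps track of where the data structures within those bytes begin and end, and finally dereferences a raw pointer into data structures, exposing a clean, ergonomic interface for us to use (e.g. my_vec[3]).

Similarly, String keeps track of a memory region of bytes, programmatically restricts the content written to it to always be valid UTF-8, and helps dereference that memory region into the type &str.
Both of these data structures use unsafe dereferencing of raw pointers to do their job.
Memory details:

Rust has an equivalent of C's malloc: alloc, used together with Layout, obtains a memory region that we manage ourselves.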

use std::alloc::{alloc, Layout};
use std::ops::Deref;

struct Pie {
	secret_recipe: usize,
}

impl Pie {
	fn new() -> Self {
		// let's ask for 4 bytes, aligned so they can later be read as an f32
		let layout = Layout::new::<f32>();
		unsafe {
			// allocate and save the memory location as a number
			let ptr = alloc(layout) as *mut u8;
			// use pointer math and write a few u8 values to memory
			ptr.write(86);
			ptr.add(1).write(14);
			ptr.add(2).write(73);
			ptr.add(3).write(64);
				
			Pie { secret_recipe: ptr as usize }
		}
	}
}
impl Deref for Pie {
	type Target = f32;
	fn deref(&self) -> &f32 {
		// interpret secret_recipe pointer as a f32 raw pointer
		let pointer = self.secret_recipe as *const f32;
		// dereference it into a return value &f32
		unsafe { &*pointer }
	}	
}
fn main() {
	let p = Pie::new();
	// 'make a pie' by dereferencing our Pie struct smart pointer
	println!("{:?}", *p);
}
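
The Pie example never gives its memory back. As a sketch of how a smart pointer that manages its own region might also release it, the hypothetical type HeapF32 below pairs alloc with dealloc in a Drop implementation. It is an illustration under these assumptions, not how Vec or String are actually implemented.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ops::Deref;

// Hypothetical smart pointer that owns one f32 on the heap.
struct HeapF32 {
    ptr: *mut f32,
}

impl HeapF32 {
    fn new(value: f32) -> Self {
        let layout = Layout::new::<f32>();
        unsafe {
            let ptr = alloc(layout) as *mut f32;
            if ptr.is_null() {
                // allocation failed; abort the standard way
                handle_alloc_error(layout);
            }
            ptr.write(value);
            HeapF32 { ptr }
        }
    }
}

impl Deref for HeapF32 {
    type Target = f32;
    fn deref(&self) -> &f32 {
        // valid because ptr was allocated and initialized in new
        unsafe { &*self.ptr }
    }
}

impl Drop for HeapF32 {
    fn drop(&mut self) {
        // free with the same layout that was used to allocate
        unsafe { dealloc(self.ptr as *mut u8, Layout::new::<f32>()) };
    }
}

fn main() {
    let x = HeapF32::new(3.5);
    println!("{}", *x);
}
```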

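Heap Allocated Memory

Box is a smart pointer that lets us move data from the stack to the heap.

Dereferencing it lets us use the heap-allocated data ergonomically, as if it were the original type.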

struct Pie;

impl Pie {
	fn eat(&self) {
		println!("taste better on the heap!")	
	}
}

fn main() {
	let heap_pie = Box::new(Pie);
	heap_pie.eat(); // heap_pie lives on the heap, but the . operator dereferences it so it behaves like the original type
}
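
A few more ways a Box behaves like the value it holds, sketched with a hypothetical Pie that has a slices field: fields can be changed through the box, a &Box<Pie> can be passed where a &Pie is expected, and * moves the value back out of the heap.

```rust
// Hypothetical variant of Pie with a field, for illustration only.
struct Pie {
    slices: u8,
}

fn slices_left(pie: &Pie) -> u8 {
    pie.slices
}

fn demo() -> (u8, u8) {
    let mut heap_pie = Box::new(Pie { slices: 8 });
    heap_pie.slices -= 1; // the . operator reaches through the Box
    let seen = slices_left(&heap_pie); // &Box<Pie> coerces to &Pie
    let stack_pie: Pie = *heap_pie; // moves the Pie out of the Box
    (seen, stack_pie.slices)
}

fn main() {
    let (seen, moved) = demo();
    println!("{} {}", seen, moved);
}
```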

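Error Handling Revisited

Rust code may have a plethora of ways to represent errors, but the standard library has a universal trait, std::error::Error, for describing them.

Using the smart pointer Box, we can use the type Box<dyn std::error::Error> as a common error return type, because it lets us propagate an error stored on the heap and handle it at a high level without having to know its specific type.

Early in the Tour of Rust we learned that main can return an error. We can now return a type capable of describing almost any kind of error that might occur in our program, as long as the error's data structure implements Rust's common Error trait.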

fn main() -> Result<(), Box<dyn std::error::Error>>

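use std::error::Error;
use std::fmt::Display;

struct Pie;

#[derive(Debug)]
struct NotFreshError;

impl Display for NotFreshError {
	fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
		write!(f, "This pie is not fresh")
	}
}

// an empty impl is enough: Error only requires Debug and Display
impl Error for NotFreshError {}

impl Pie {
	// always fails, so that main has an error to propagate
	fn eat(&self) -> Result<(), Box<dyn Error>> {
		Err(Box::new(NotFreshError))
	}
}

fn main() -> Result<(), Box<dyn Error>> {
	let heap_pie = Box::new(Pie);
	heap_pie.eat()?;
	Ok(())
}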
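
The benefit of Box<dyn Error> shows up when one function can fail in more than one way. In the sketch below, the hypothetical function parse_slices returns either a standard library parse error or a plain message, and the ? operator and into() convert both into the same boxed type.

```rust
use std::error::Error;

// Hypothetical function with two different failure types.
fn parse_slices(text: &str) -> Result<u8, Box<dyn Error>> {
    // a ParseIntError is converted into Box<dyn Error> by ?
    let n: u8 = text.trim().parse()?;
    if n == 0 {
        // a &str can also be turned into Box<dyn Error>
        return Err("no slices left".into());
    }
    Ok(n)
}

fn main() -> Result<(), Box<dyn Error>> {
    println!("{}", parse_slices("8")?);
    if let Err(e) = parse_slices("eight") {
        println!("error: {}", e);
    }
    Ok(())
}
```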

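Reference Counting

Rc is a smart pointer that moves data from the stack onto the heap. It lets us clone other Rc smart pointers, all of which can immutably borrow the data that was placed on the heap.

Only when the last smart pointer is dropped is the data on the heap deallocated.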

use std::rc::Rc;

struct Pie;

impl Pie {
	fn eat(&self) {
		println!("tastes better on the heap!")
	}
}

fn main() {
	let heap_pie = Rc::new(Pie);
	let heap_pie2 = heap_pie.clone();
	let heap_pie3 = heap_pie2.clone();

	heap_pie.eat();
	heap_pie2.eat();
	heap_pie3.eat();
	
	// all reference count smart pointers are dropped now
	// the heap data Pie finally deallocates
}
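
The count itself can be observed with Rc::strong_count. A minimal sketch, with a hypothetical function name, showing the count rising on clone and falling on drop:

```rust
use std::rc::Rc;

// Hypothetical helper: returns the count at three points in time.
fn counts() -> (usize, usize, usize) {
    let a = Rc::new(String::from("pie"));
    let at_start = Rc::strong_count(&a);

    let b = Rc::clone(&a); // same heap data, count goes up
    let after_clone = Rc::strong_count(&a);

    drop(b); // count goes back down
    let after_drop = Rc::strong_count(&a);

    (at_start, after_clone, after_drop)
}

fn main() {
    println!("{:?}", counts());
}
```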

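Sharing Access

RefCell is a container data structure, commonly held by smart pointers, that takes in data and lets us borrow mutable or immutable references to what is inside. It prevents borrowing from being abused by enforcing Rust's memory safety rules at runtime when we ask to borrow the data:
Only one mutable reference or multiple immutable references at a time, but never both.
Violating this rule makes RefCell panic.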

use std::cell::RefCell;

struct Pie {
	slices: u8
}

impl Pie {
	fn eat(&mut self){
		println!("tastes better on the heap");
		self.slices -= 1;	
	}
}

fn main() {
	// RefCell validates memory safety at runtime
	// notice: pie_cell is not mut
	let pie_cell = RefCell::new(Pie{slices:8});
	{
		// but we can borrow mutable references!
		let mut mut_ref_pie = pie_cell.borrow_mut();
		mut_ref_pie.eat();
		mut_ref_pie.eat();
		// mut_ref_pie is dropped at end of scope
	}
	// now we can borrow immutably once our mutable reference drops
	let ref_pie = pie_cell.borrow();
	println!("{} slices left", ref_pie.slices);
}
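
The runtime check can be seen without panicking by using try_borrow_mut, which returns an error instead. The two functions below are hypothetical and only sketch the rule: a second mutable borrow fails while the first is alive, and borrowing works again once it is released.

```rust
use std::cell::RefCell;

// While one mutable borrow is alive, a second one is refused.
fn second_mutable_borrow_fails() -> bool {
    let cell = RefCell::new(8u8);
    let _first = cell.borrow_mut();
    let refused = cell.try_borrow_mut().is_err();
    refused
}

// Once the mutable borrow is dropped, borrowing works again.
fn borrow_after_release() -> u8 {
    let cell = RefCell::new(8u8);
    {
        let mut slices = cell.borrow_mut();
        *slices -= 1;
    } // mutable borrow released here
    let left = *cell.borrow();
    left
}

fn main() {
    println!("{}", second_mutable_borrow_fails());
    println!("{}", borrow_after_release());
}
```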

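Sharing Across Threads

Mutex is a container data structure, commonly held by smart pointers, that takes in data and lets us borrow mutable and immutable references to the data inside (compare RefCell, which does the same within a single thread). It prevents borrowing from being abused because the operating system allows only one CPU thread at a time to access the data, blocking other threads until the original thread is done with its locked borrow.

Multithreading is beyond the scope of the Tour of Rust, but Mutex is a fundamental part of coordinating multiple CPU threads that access the same data.

There is a special smart pointer, Arc, which is identical to Rc except that it increments its reference count in a thread-safe way. It is often used to hold many references to the same Mutex.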

use std::sync::Mutex;

struct Pie;

impl Pie {
	fn eat(&self) {
		println!("Only I eat the pie right now!");
	}
}

fn main() {
	let mutex_pie = Mutex::new(Pie);
	// let's borrow a locked immutable reference of pie
	// we have to unwrap the result of a lock because it might fail
	let ref_pie = mutex_pie.lock().unwrap();
	ref_pie.eat();
	// locked reference drops here, and mutex protected value can be used by someone else
}
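
As a rough sketch of Arc and Mutex working together, the hypothetical function eat_in_parallel below starts several threads that each lock a shared counter and increment it. Each thread gets its own Arc clone, and the Mutex ensures only one thread updates the counter at a time.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical example: each thread records one eaten slice.
fn eat_in_parallel(eaters: u32) -> u32 {
    let eaten = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..eaters)
        .map(|_| {
            // each thread owns its own Arc pointing at the same Mutex
            let eaten = Arc::clone(&eaten);
            thread::spawn(move || {
                let mut guard = eaten.lock().unwrap();
                *guard += 1;
                // guard drops here and the lock is released
            })
        })
        .collect();

    // wait for every thread to finish
    for handle in handles {
        handle.join().unwrap();
    }

    let total = *eaten.lock().unwrap();
    total
}

fn main() {
    println!("{} slices eaten", eat_in_parallel(4));
}
```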

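Combining Smart Pointers

Smart pointers may seem limited on their own, but they can be combined in some very useful ways.

Rc<Vec<Foo>> - allows cloning multiple smart pointers that can borrow the same vector of immutable data structures on the heap.
Rc<RefCell<Foo>> - allows multiple smart pointers to borrow the same struct Foo mutably or immutably.
Arc<Mutex<Foo>> - allows multiple smart pointers to temporarily lock mutable or immutable borrows in a CPU-thread-exclusive manner.

Memory details:

You may notice a theme running through many of these combinations: an immutable data type (possibly owned by multiple smart pointers) is used to modify internal data. In Rust this is called the "interior mutability" pattern. It lets us bend the rules of memory usage at runtime with the same level of safety as Rust's compile-time checks.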

use std::cell::RefCell;
use std::rc::Rc;

struct Pie {
	slices: u8,
}

impl Pie {
	fn eat_slice(&mut self, name: &str) {
		println!("{} took a slice!", name);
		self.slices -=  1;
	}
}

struct SeaCreature {
	name: String,
	pie: Rc<RefCell<Pie>>,
}

impl SeaCreature {
	fn eat(&self) {
		// use smart pointer to pie for a mutable borrow
		let mut p = self.pie.borrow_mut();
		// take a bite!
		p.eat_slice(&self.name);
	}
}

fn main() {
	let pie = Rc::new(RefCell::new(Pie { slices: 8 }));
	// ferris and sarah are given clones of smart pointer to pie
	let ferris = SeaCreature {
		name: String::from("ferris"),
		pie: pie.clone(),
	};
	let sarah  = SeaCreature {
		name: String::from("sarah"),
		pie: pie.clone(),
	};
	ferris.eat();
	sarah.eat();
	
	let p = pie.borrow();
	println!("{} slices left", p.slices);
}