最近发现实现泛型的代码有些难度,虽然rust的基于trait 约束的泛型比起c++ 的模板已经优雅很多了,不会出现打印几页纸的模板实例化错误,但是当泛型参数和生命周期参数纠缠一块时,马上就进入跟rust的borrow checker 抗争的时刻.
rust 标准库已有空间布局连续的双端队列实现VecDeque, 我的实现仅仅是练习,实现队列有两种选择,基于数组和链表, 使用数组实现有最大元素数目限制的boundqueue, 使用链表实现元素数目不受限制的unboundqueue, 提取队列ADT的接口,用trait 表达:
pub trait Queue<T> {
fn push(&mut self, item: T);
fn pop(&mut self) -> Option<T>;
fn is_empty(&self) -> bool;
}
首先是数组的实现, 底层存储可以使用vec, 不用自己做内存管理:
pub trait Queue<T: Clone> {
fn push(&mut self, item: T);
fn pop(&mut self) -> Option<T>;
fn is_empty(&self) -> bool;
}
pub struct FixSizeQueue<T> {
data: Box<[T]>,
head: usize,
tail: usize,
size: usize,
}
impl<T> FixSizeQueue<T> {
pub fn new(size: usize) -> FixSizeQueue<T>
where
T: Default + Clone,
{
FixSizeQueue {
head: 0,
tail: 0,
size: size + 1,
data: vec![T::default(); size + 1].into_boxed_slice(),
}
}
pub fn is_full(&self) -> bool {
return self.tail + 1 == self.head;
}
}
impl<T: Clone> Queue<T> for FixSizeQueue<T> {
fn push(&mut self, item: T) {
let mut next = self.tail + 1;
if next >= self.size {
next = 0
}
if next == self.head {
return;
}
self.data[self.tail] = item;
self.tail = next;
}
fn pop(&mut self) -> Option<T> {
if self.head == self.tail {
return None;
}
let mut next = self.head + 1;
if next >= self.size {
next = 0
}
let i = self.head;
self.head = next;
return Some(self.data[i].clone());
}
fn is_empty(&self) -> bool {
self.head == self.tail
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn is_empty() {
let q = FixSizeQueue::<i32>::new(10);
assert_eq!(q.is_empty(), true);
}
#[test]
fn test_push_pop() {
let mut q = FixSizeQueue::<i32>::new(10);
for i in 1..=10 {
q.push(i);
assert_eq!(q.pop(), Some(i));
}
for i in 1..=10 {
q.push(i);
}
for i in 1..=10 {
assert_eq!(q.pop(), Some(i));
}
assert_eq!(q.is_full(), false);
}
}
当我用boxed slice 作为底层数据存储时,由于不能从slice move out 被弹出的元素, 约束了Queue trait 的定义,要求元素的类型实现Clone Trait:
pub trait Queue<T: Clone> {
fn push(&mut self, item: T);
fn pop(&mut self) -> Option<T>;
fn is_empty(&self) -> bool;
}
在阅读std Vec, VecDeque 部分代码时, 发现它们内部是用RawVec来实现内存管理,结合unsafe 的使用, 估计可以放开对 Queue<T: Clone> 的约束, 不过使用RawVec,需要使用nightly rust, boundqueue 定义:
pub struct BoundQueue<T> {
data: RawVec<T>,
head: usize, // 待取位置
tail: usize, //待插位置
}
impl<T> BoundQueue<T> {
pub fn new(size: usize) -> Self {
let mut buf = RawVec::new();
buf.reserve(0, size + 1);
BoundQueue {
head: 0,
tail: 0,
data: buf,
}
}
pub fn cap(&self) -> usize {
self.data.cap()
}
pub fn is_full(&self) -> bool {
return self.tail + 1 == self.head;
}
}
impl<T> Queue<T> for BoundQueue<T> {
fn push(&mut self, item: T) {
let mut next = self.tail + 1;
if next >= self.cap() {
next = 0
}
if next == self.head {
return;
}
let tail = self.tail;
unsafe {
self.write(tail, item);
}
self.tail = next;
}
fn pop(&mut self) -> Option<T> {
if self.head == self.tail {
return None;
}
let mut next = self.head + 1;
if next >= self.cap() {
next = 0
}
let head = self.head;
let v = unsafe { self.read(head) };
self.head = next;
return Some(v);
}
fn is_empty(&self) -> bool {
self.head == self.tail
}
}
在pop 方法的unsafe block 中 使用ptr::read , 绕开rust 自动move 的语义。 然后添加迭代器的实现,rust 独特的ownership borrow 语义贯彻在所有的api 接口中, 如方法的receiver, 可以是:
receiver 类型 | 语义描述 |
---|---|
Self | 对于可Copy类型Copy,非Copy类型Move |
&mut Self | 读写borrow, 需要修改receiver 内部状态 |
&Self | 只读borrow, 只读取receiver 内部状态 |
相应的迭代器有三种对应实现:
迭代器类型 | 语义描述 |
---|---|
IntoIter | 消费掉被迭代对象的元素,接管值的ownership |
IterMut | 读写borrow, 通过此迭代器可以修改对应元素 |
Iter | 只读borrow, 通过此迭代器观察被迭代对象的状态 |
pub struct Iter<'a, T: 'a> {
pos: usize,
tail: usize,
data: &'a [T],
}
pub struct IterMut<'a, T: 'a> {
pos: usize,
tail: usize,
data: &'a mut [T],
}
pub struct IntoIter<T>(BoundQueue<T>);
impl<T> Iterator for IntoIter<T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
self.0.pop()
}
}
impl<'a, T> Iterator for Iter<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
let c = self.pos;
if c == self.tail {
return None;
}
if c == self.data.len() - 1 {
self.pos = 0;
} else {
self.pos += 1;
}
unsafe { Some(self.data.get_unchecked(c)) }
}
}
impl<'a, T> Iterator for IterMut<'a, T> {
type Item = &'a mut T;
fn next(&mut self) -> Option<Self::Item> {
let c = self.pos;
if c == self.tail {
return None;
}
if c == self.data.len() - 1 {
self.pos = 0;
} else {
self.pos += 1;
}
unsafe {
Some(self.data.get_unchecked_mut(c))
}
}
}
结果编译器告诉我:
一直让我困惑的是为什么几乎同样的写法,为什么Iter 版本就没有问题? 猜测在Iter 的 self.data.get_unchecked(c) 产生了类似: &'tmp mut &'a T 的类型,data.get_unchecked(c) 仅需要只读引用(&'a T), self 本身是(&'tmp mut U), 在receiver self 上发生reborrow (&'tmp mut U -> U -> &'a T), U 就是Iter<'a,T> ; 而 self.data.get_unchecked_mut(c) 产生了: &'tmp mut &'a mut T, 一个二级mutable 引用, rust 不能reborrow, next 方法的返回值同时受到 'tmp 和 'a 的约束, 而iterator trait 的 item 类型必须是 &'a mut T 。在rust 中raw pointer 不受生命周期的约束, 先cast 到 raw pointer ,然后再从中生成带有生命周期的引用:
unsafe {
let item = self.data.get_unchecked_mut(c);
Some(&mut *(item as *mut _))
}
添加drop trait 实现, 不然存储某些包含heap分配数据的类型,如String ,出现内存泄漏:
impl<T> Drop for BoundQueue<T> {
fn drop(&mut self) {
for e in self.iter_mut() {
unsafe {
ptr::drop_in_place(e as *mut _);
}
}
}
}
对于unboundqueue 如何表示, 刚开始的直觉是:
struct Node<T> {
next: Option<Box<Node<T>>>,
data: T,
}
pub struct UnboundQueue<T> {
head: Option<Box<Node<T>>>,
tail: Option<Box<Node<T>>>,
}
但是编写过程中发现,这种表示方法通不过borrow checker的检查,值的ownership 关系要形成树型结构, 才能通过,Box<T> 是owning pointer, ownership 通过Box<T> 往后传递, 假设插入值 1 -> 2 -> 3 -> 4 , ownership 形成以下图:
不是树形,被拒绝,需要non-owning pointer, std lib 提供了*mut T 的协变非空包裹类型NonNull<T>, 组合Option<T>, 就能表达nullable non-owning pointer:
struct Node<T> {
next: Option<NonNull<Node<T>>>,
data: T,
}
pub struct UnboundQueue<T> {
head: Option<NonNull<Node<T>>>,
tail: Option<NonNull<Node<T>>>,
len: usize,
marker: PhantomData<Box<Node<T>>>,//逻辑上,整个UnboundQueue类型 own 所有heap 上分配的元素
}
添加iterator trait, drop trait 实现:
impl<T> Node<T> {
fn new(data: T) -> Self {
Node {
next: None,
data: data,
}
}
}
impl<T> UnboundQueue<T> {
pub fn new() -> Self {
UnboundQueue {
head: None,
tail: None,
len: 0,
marker: PhantomData,
}
}
}
impl<T> Queue<T> for UnboundQueue<T> {
fn push(&mut self, item: T) {
let boxnode = Box::new(Node::new(item));
self.push_node(boxnode);
}
fn pop(&mut self) -> Option<T> {
self.pop_node().map(|node| node.data)
}
fn is_empty(&self) -> bool {
self.head.is_none()
}
}
impl<T> UnboundQueue<T> {
pub fn into_iter(self) -> IntoIter<T> {
IntoIter(self)
}
fn push_node(&mut self, mut node: Box<Node<T>>) {
let node = Some(Box::into_raw_non_null(node));
unsafe {
match self.tail {
None => self.head = node,
Some(mut tail) => tail.as_mut().next = node,
}
self.tail = node;
self.len += 1;
}
}
fn pop_node(&mut self) -> Option<Box<Node<T>>> {
self.head.map(|node| unsafe {
let node = Box::from_raw(node.as_ptr());
self.head = node.next;
if let None = self.head {
self.tail = None;
}
self.len -= 1;
node
})
}
pub fn iter(&self) -> Iter<T> {
Iter {
pos: self.head.as_ref().map(|node| unsafe { node.as_ref() }),
}
}
pub fn iter_mut(&mut self) -> IterMut<T> {
IterMut {
pos: self.head.as_mut().map(|node| unsafe { node.as_mut() }),
}
}
}
impl<T> Drop for UnboundQueue<T> {
fn drop(&mut self) {
while let Some(_) = self.pop_node() {}
}
}
pub struct Iter<'a, T: 'a> {
pos: Option<&'a Node<T>>,
}
pub struct IterMut<'a, T: 'a> {
pos: Option<&'a mut Node<T>>,
}
pub struct IntoIter<T>(UnboundQueue<T>);
impl<T> Iterator for IntoIter<T> {
type Item = T;
fn next(&mut self) -> Option<Self::Item> {
self.0.pop()
}
}
impl<'a, T> Iterator for Iter<'a, T> {
type Item = &'a T;
fn next(&mut self) -> Option<Self::Item> {
self.pos.map(|node| {
self.pos = node.next.as_ref().map(|node| unsafe { node.as_ref() });
&node.data
})
}
}
impl<'a, T> Iterator for IterMut<'a, T> {
type Item = &'a mut T;
fn next(&mut self) -> Option<Self::Item> {
self.pos.take().map(|node| {
self.pos = node.next.as_mut().map(|node| unsafe { node.as_mut() });
&mut node.data
})
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_push_pop() {
let mut q = UnboundQueue::<i32>::new();
for i in 1..=10 {
q.push(i);
assert_eq!(q.pop(), Some(i));
}
for i in 1..=10 {
q.push(i);
}
for i in 1..=10 {
assert_eq!(q.pop(), Some(i));
}
}
#[test]
fn into_iter() {
let mut q = UnboundQueue::<i32>::new();
q.push(1);
q.push(2);
q.push(3);
let mut iter = q.into_iter();
assert_eq!(iter.next(), Some(1));
assert_eq!(iter.next(), Some(2));
assert_eq!(iter.next(), Some(3));
assert_eq!(iter.next(), None);
}
#[test]
fn iter() {
let mut q = UnboundQueue::<String>::new();
q.push(1.to_string());
q.push(2.to_string());
q.push(3.to_string());
let mut iter = q.iter();
assert_eq!(iter.next(), Some(&"1".to_string()));
assert_eq!(iter.next(), Some(&"2".to_string()));
assert_eq!(iter.next(), Some(&"3".to_string()));
assert_eq!(iter.next(), None);
}
#[test]
fn iter_mut() {
let mut q = UnboundQueue::<i32>::new();
q.push(1);
q.push(2);
q.push(3);
let mut iter = q.iter_mut();
assert_eq!(iter.next(), Some(&mut 1));
assert_eq!(iter.next(), Some(&mut 2));
assert_eq!(iter.next(), Some(&mut 3));
assert_eq!(iter.next(), None);
}
}