A half-hour to learn Rust

血_影

Published 2021-04-01 10:28:42

Original article: https://fasterthanli.me/articles/a-half-hour-to-learn-rust

In order to increase fluency in a programming language, one has to read a lot of it. But how can you read a lot of it if you don’t know what it means?

In this article, instead of focusing on one or two concepts, I’ll try to go through as many Rust snippets as I can, and explain what the keywords and symbols they contain mean.

Ready? Go!

let introduces a variable binding:

let x; // declare "x"
x = 42; // assign 42 to "x"

This can also be written as a single line:

let x = 42;

You can specify the variable’s type explicitly with :, that’s a type annotation:

let x: i32; // `i32` is a signed 32-bit integer
x = 42;

// there's i8, i16, i32, i64, i128
//    also u8, u16, u32, u64, u128 for unsigned

This can also be written as a single line:

let x: i32 = 42;

If you declare a name and initialize it later, the compiler will prevent you from using it before it’s initialized.

let x;
foobar(x); // error: borrow of possibly-uninitialized variable: `x`
x = 42;

However, doing this is completely fine:

let x;
x = 42;
foobar(x); // the type of `x` will be inferred from here

The underscore _ is a special name - or rather, a “lack of name”. It basically means to throw away something:

// this does *nothing* because 42 is a constant
let _ = 42;
// this calls `get_thing` but throws away its result
let _ = get_thing();

Names that start with an underscore are regular names, it’s just that the compiler won’t warn about them being unused:

// we may use `_x` eventually, but our code is a work-in-progress
// and we just wanted to get rid of a compiler warning for now.
let _x = 42;

Separate bindings with the same name can be introduced - you can shadow a variable binding:

let x = 13;
let x = x + 3;
// using `x` after that line only refers to the second `x`,
// the first `x` no longer exists.
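
Since every `let` creates a brand-new binding, shadowing can even change the type associated with a name. Here is a minimal sketch of that; the helper name `shadow_demo` is made up for illustration:

```rust
// Hypothetical helper, not part of the original article.
fn shadow_demo() -> (i32, usize) {
    let x = 13;
    let x = x + 3; // a new `x`, computed from the old one
    let first = x; // keep the integer around before shadowing again

    // Shadowing may change the type: this `x` is a `&str`.
    let x = "sixteen";
    (first, x.len())
}

fn main() {
    let (number, length) = shadow_demo();
    println!("{} {}", number, length); // prints "16 7"
}
```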

Rust has tuples, which you can think of as “fixed-length collections of values of different types”.

let pair = ('a', 17);
pair.0; // this is 'a'
pair.1; // this is 17

If we really wanted to annotate the type of pair, we would write:

let pair: (char, i32) = ('a', 17);

Tuples can be destructured when doing an assignment, which means they’re broken down into their individual fields:

let (some_char, some_int) = ('a', 17);
// now, `some_char` is 'a', and `some_int` is 17

This is especially useful when a function returns a tuple:

let (left, right) = slice.split_at(middle);

Of course, when destructuring a tuple, _ can be used to throw away part of it:

let (_, right) = slice.split_at(middle);

The semi-colon marks the end of a statement:

let x = 3;
let y = 5;
let z = y + x;

Which means statements can span multiple lines:

let x = vec![1, 2, 3, 4, 5, 6, 7, 8]
    .iter()
    .map(|x| x + 3)
    .fold(0, |x, y| x + y);

(We’ll go over what those actually mean later).

fn declares a function.

Here’s a void function:

fn greet() {
    println!("Hi there!");
}

And here’s a function that returns a 32-bit signed integer. The arrow indicates its return type:

fn fair_dice_roll() -> i32 {
    4
}

A pair of curly brackets declares a block, which has its own scope:

// This prints "in", then "out"
fn main() {
    let x = "out";
    {
        // this is a different `x`
        let x = "in";
        println!("{}", x);
    }
    println!("{}", x);
}

Blocks are also expressions, which means they evaluate to… a value.

// this:
let x = 42;
// is equivalent to this:
let x = { 42 };

Inside a block, there can be multiple statements:

let x = {
    let y = 1; // first statement
    let z = 2; // second statement
    y + z // this is the *tail* - what the whole block will evaluate to
};
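
The missing semicolon on the tail is what makes the difference. As a rough sketch (the function name `tail_demo` is an assumption for illustration), compare a block with a tail to one where every line ends in a semicolon:

```rust
// Hypothetical helper, not part of the original article.
fn tail_demo() -> i32 {
    let x = {
        let y = 1;
        let z = 2;
        y + z // no semicolon: the block evaluates to 3
    };

    // Here the last line ends with a semicolon, so the block has no tail
    // and evaluates to the unit value `()` instead.
    let unit: () = {
        let _ = x + 1;
    };
    let _ = unit;

    x
}

fn main() {
    println!("{}", tail_demo()); // prints "3"
}
```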

And that’s why "omitting the semicolon at the end of a function" is the same as returning, i.e. these are equivalent:

fn fair_dice_roll() -> i32 {
    return 4;
}

fn fair_dice_roll() -> i32 {
    4
}

if conditionals are also expressions:

fn fair_dice_roll() -> i32 {
    if feeling_lucky {
        6
    } else {
        4
    }
}

A match is also an expression:

fn fair_dice_roll() -> i32 {
    match feeling_lucky {
        true => 6,
        false => 4,
    }
}
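
The two snippets above leave open where `feeling_lucky` comes from. A runnable sketch, assuming it is simply passed in as a `bool` parameter (the function names are made up), could look like this:

```rust
// Assumption: `feeling_lucky` is a parameter. The article's snippets
// do not say where it is defined.
fn roll_with_if(feeling_lucky: bool) -> i32 {
    if feeling_lucky {
        6
    } else {
        4
    }
}

fn roll_with_match(feeling_lucky: bool) -> i32 {
    match feeling_lucky {
        true => 6,
        false => 4,
    }
}

fn main() {
    // Because `if` is an expression, it can sit on the right of a `let`.
    let label = if roll_with_if(true) > 5 { "high" } else { "low" };
    println!("{}", label); // prints "high"
    println!("{}", roll_with_match(false)); // prints "4"
}
```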

Dots are typically used to access fields of a value:

let a = (10, 20);
a.0; // this is 10

let amos = get_some_struct();
amos.nickname; // this is "fasterthanlime"

Or call a method on a value:

let nick = "fasterthanlime";
nick.len(); // this is 14

The double-colon, ::, is similar but it operates on namespaces.

In this example, std is a crate (~ a library), cmp is a module (~ a source file), and min is a function:

let least = std::cmp::min(3, 8); // this is 3

use directives can be used to “bring in scope” names from other namespaces:

use std::cmp::min;
let least = min(7, 1); // this is 1

Within use directives, curly brackets have another meaning: they’re “globs”. If we want to import both min and max, we can do any of these:

// this works:
use std::cmp::min;
use std::cmp::max;
// this also works:
use std::cmp::{min, max};
// this also works!
use std::{cmp::min, cmp::max};

A wildcard (*) lets you import every symbol from a namespace:

// this brings `min` and `max` in scope, and many other things
use std::cmp::*;

Types are namespaces too, and methods can be called as regular functions

let x = "amos".len(); // this is 4
let x = str::len("amos"); // this is also 4
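
One practical consequence is that a method can be passed around by its path, like any other function. The sketch below assumes a made-up helper named `lengths` to show this:

```rust
// Hypothetical helper, not part of the original article.
fn lengths(words: &[&str]) -> Vec<usize> {
    // `str::len` is the function behind the `.len()` method, so it can
    // be handed to `map` by name.
    words.iter().copied().map(str::len).collect()
}

fn main() {
    // Method-call syntax and the path syntax are the same call.
    assert_eq!("amos".len(), str::len("amos"));
    // The same holds for other types, for example `i32`.
    assert_eq!((-3i32).abs(), i32::abs(-3));

    println!("{:?}", lengths(&["a", "bc", "def"])); // prints "[1, 2, 3]"
}
```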

str is a primitive type, but many non-primitive types are also in scope by default.

// `Vec` is a regular struct, not a primitive type
let v = Vec::new();
// this is exactly the same code, but with the *full* path to `Vec`
let v = std::vec::Vec::new();

This works because Rust inserts this at the beginning of every module:

use std::prelude::v1::*;

(Which in turn re-exports a lot of symbols, like Vec, String, Option and Result).

Structs are declared with the struct keyword:

struct Vec2 {
    x: f64, // 64-bit floating point, aka "double precision"
    y: f64,
}

They can be initialized using struct literals:

let v1 = Vec2 { x: 1.0, y: 3.0 };
let v2 = Vec2 { y: 2.0, x: 4.0 };
// the order does not matter, only the names do

There is a shortcut for initializing the rest of the fields from another struct:

let v3 = Vec2 {
    x: 14.0,
    ..v2
};

This is called “struct update syntax”. It can only appear in last position, and cannot be followed by a comma.

Note that the rest of the fields can mean all the fields:

let v4 = Vec2 { ..v3 };
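
To see which fields end up where, here is a small self-contained sketch. The `with_x` helper and the derived `Debug` and `PartialEq` are additions for illustration only:

```rust
// `Debug` and `PartialEq` are derived only so the result can be
// printed and compared in this sketch.
#[derive(Debug, PartialEq)]
struct Vec2 {
    x: f64,
    y: f64,
}

// Hypothetical helper: builds a new `Vec2` with a different `x`.
fn with_x(base: &Vec2, new_x: f64) -> Vec2 {
    // `x` is set explicitly; every remaining field (here only `y`)
    // is taken from `*base`.
    Vec2 { x: new_x, ..*base }
}

fn main() {
    let v2 = Vec2 { y: 2.0, x: 4.0 };
    let v3 = with_x(&v2, 14.0);
    println!("{:?}", v3); // prints "Vec2 { x: 14.0, y: 2.0 }"
}
```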

Structs, like tuples, can be destructured. Just like this is a valid let pattern:

let (left, right) = slice.split_at(middle);

So is this:

let v = Vec2 { x: 3.0, y: 6.0 };
let Vec2 { x, y } = v;
// `x` is now 3.0, `y` is now `6.0`

And this:

let Vec2 { x, .. } = v;
// this throws away `v.y`

let patterns can be used as conditions in if:

struct Number {
    odd: bool,
    value: i32,
}

fn main() {
    let one = Number { odd: true, value: 1 };
    let two = Number { odd: false, value: 2 };
    print_number(one);
    print_number(two);
}

fn print_number(n: Number) {
    if let Number { odd: true, value } = n {
        println!("Odd number: {}", value);
    } else if let Number { odd: false, value } = n {
        println!("Even number: {}", value);
    }
}
// this prints:
// Odd number: 1
// Even number: 2

match arms are also patterns, just like if let:

fn print_number(n: Number) {
    match n {
        Number { odd: true, value } => println!("Odd number: {}", value),
        Number { odd: false, value } => println!("Even number: {}", value),
    }
}
// this prints the same as before

A match has to be exhaustive: at least one arm needs to match.

fn print_number(n: Number) {
    match n {
        Number { value: 1, .. } => println!("One"),
        Number { value: 2, .. } => println!("Two"),
        Number { value, .. } => println!("{}", value),
        // if that last arm didn't exist, we would get a compile-time error
    }
}

If that’s hard, _ can be used as a “catch-all” pattern:

fn print_number(n: Number) {
    match n.value {
        1 => println!("One"),
        2 => println!("Two"),
        _ => println!("{}", n.value),
    }
}
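
Since a match is an expression, its arms can also produce a value instead of printing. In this sketch (the function name `describe` is made up), a plain name is used as the catch-all, which matches anything like `_` does but keeps the value:

```rust
// Hypothetical helper, not part of the original article.
fn describe(value: i32) -> String {
    match value {
        1 => String::from("One"),
        2 => String::from("Two"),
        // a name instead of `_`: matches everything and binds the value
        other => format!("{}", other),
    }
}

fn main() {
    println!("{}", describe(2)); // prints "Two"
    println!("{}", describe(7)); // prints "7"
}
```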

You can declare methods on your own types:

struct Number {
    odd: bool,
    value: i32,
}

impl Number {
    fn is_strictly_positive(self) -> bool {
        self.value > 0
    }
}

And use them like usual:

fn main() {
    let minus_two = Number {
        odd: false,
        value: -2,
    };
    println!("positive? {}", minus_two.is_strictly_positive());
    // this prints "positive? false"
}

Variable bindings are immutable by default, which means their interior can’t be mutated:

fn main() {
    let n = Number {
        odd: true,
        value: 17,
    };
    n.odd = false; // error: cannot assign to `n.odd`,
                   // as `n` is not declared to be mutable
}

And also that they cannot be assigned to:

fn main() {
    let n = Number {
        odd: true,
        value: 17,
    };
    n = Number {
        odd: false,
        value: 22,
    }; // error: cannot assign twice to immutable variable `n`
}

mut makes a variable binding mutable:

fn main() {
    let mut n = Number {
        odd: true,
        value: 17,
    };
    n.value = 19; // all good
}

Traits are something multiple types can have in common:

trait Signed {
    fn is_strictly_negative(self) -> bool;
}

You can implement:

- one of your traits on anyone’s type
- anyone’s trait on one of your types
- but not a foreign trait on a foreign type

These are called the “orphan rules”.

Here’s an implementation of our trait on our type:

impl Signed for Number {
    fn is_strictly_negative(self) -> bool {
        self.value < 0
    }
}

fn main() {
    let n = Number { odd: false, value: -44 };
    println!("{}", n.is_strictly_negative()); // prints "true"
}

Our trait on a foreign type (a primitive type, even):

impl Signed for i32 {
    fn is_strictly_negative(self) -> bool {
        self < 0
    }
}

fn main() {
    let n: i32 = -44;
    println!("{}", n.is_strictly_negative()); // prints "true"
}

A foreign trait on our type:

// the `Neg` trait is used to overload `-`, the
// unary minus operator.
impl std::ops::Neg for Number {
    type Output = Number;

    fn neg(self) -> Number {
        Number {
            value: -self.value,
            odd: self.odd,
        }        
    }
}

fn main() {
    let n = Number { odd: true, value: 987 };
    let m = -n; // this is only possible because we implemented `Neg`
    println!("{}", m.value); // prints "-987"
}

An impl block is always for a type, so, inside that block, Self means that type:

impl std::ops::Neg for Number {
    type Output = Self;

    fn neg(self) -> Self {
        Self {
            value: -self.value,
            odd: self.odd,
        }        
    }
}

Some traits are markers - they don’t say that a type implements some methods, they say that certain things can be done with a type.

For example, i32 implements trait Copy (in short, i32 is Copy), so this works:

fn main() {
    let a: i32 = 15;
    let b = a; // `a` is copied
    let c = a; // `a` is copied again
}

And this also works:

fn print_i32(x: i32) {
    println!("x = {}", x);
}

fn main() {
    let a: i32 = 15;
    print_i32(a); // `a` is copied
    print_i32(a); // `a` is copied again
}

But the Number struct is not Copy, so this doesn’t work:

fn main() {
    let n = Number { odd: true, value: 51 };
    let m = n; // `n` is moved into `m`
    let o = n; // error: use of moved value: `n`
}

And neither does this:

fn print_number(n: Number) {
    println!("{} number {}", if n.odd { "odd" } else { "even" }, n.value);
}

fn main() {
    let n = Number { odd: true, value: 51 };
    print_number(n); // `n` is moved
    print_number(n); // error: use of moved value: `n`
}

But it works if print_number takes an immutable reference instead:

fn print_number(n: &Number) {
    println!("{} number {}", if n.odd { "odd" } else { "even" }, n.value);
}

fn main() {
    let n = Number { odd: true, value: 51 };
    print_number(&n); // `n` is borrowed for the time of the call
    print_number(&n); // `n` is borrowed again
}

It also works if a function takes a mutable reference - but only if our variable binding is also mut.

fn invert(n: &mut Number) {
    n.value = -n.value;
}

fn print_number(n: &Number) {
    println!("{} number {}", if n.odd { "odd" } else { "even" }, n.value);
}

fn main() {
    // this time, `n` is mutable
    let mut n = Number { odd: true, value: 51 };
    print_number(&n);
    invert(&mut n); // `n` is borrowed mutably - everything is explicit
    print_number(&n);
}

Trait methods can also take self by reference or mutable reference:

impl std::clone::Clone for Number {
    fn clone(&self) -> Self {
        Self { ..*self }
    }
}

When invoking trait methods, the receiver is borrowed implicitly:

fn main() {
    let n = Number { odd: true, value: 51 };
    let mut m = n.clone();
    m.value += 100;
    
    print_number(&n);
    print_number(&m);
}

To highlight this: these are equivalent:

let m = n.clone();
let m = std::clone::Clone::clone(&n);

Marker traits like Copy have no methods:

// note: `Copy` requires that `Clone` is implemented too
impl std::clone::Clone for Number {
    fn clone(&self) -> Self {
        Self { ..*self }
    }
}

impl std::marker::Copy for Number {}

Now, Clone can still be used:

fn main() {
    let n = Number { odd: true, value: 51 };
    let m = n.clone();
    let o = n.clone();
}

But Number values will no longer be moved:

fn main() {
    let n = Number { odd: true, value: 51 };
    let m = n; // `m` is a copy of `n`
    let o = n; // same. `n` is neither moved nor borrowed.
}

Some traits are so common, they can be implemented automatically by using the derive attribute:

#[derive(Clone, Copy)]
struct Number {
    odd: bool,
    value: i32,
}

// this expands to `impl Clone for Number` and `impl Copy for Number` blocks.
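
Other standard traits can be derived the same way. The sketch below additionally derives `Debug` (for `{:?}` formatting) and `PartialEq` (for `==`); the `negate` function is a made-up helper:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
struct Number {
    odd: bool,
    value: i32,
}

// Hypothetical helper: takes a `Number` by value and returns its negation.
fn negate(n: Number) -> Number {
    Number {
        value: -n.value,
        ..n
    }
}

fn main() {
    let n = Number { odd: true, value: 51 };
    let m = negate(n); // `n` is copied into the call, not moved
    println!("{:?}", n); // prints "Number { odd: true, value: 51 }"
    println!("{:?}", m); // prints "Number { odd: true, value: -51 }"
    println!("{}", n == m); // prints "false"
}
```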

Functions can be generic:

fn foobar<T>(arg: T) {
    // do something with `arg`
}

They can have multiple type parameters, which can then be used in the function’s declaration and its body, instead of concrete types:

fn foobar<L, R>(left: L, right: R) {
    // do something with `left` and `right`
}

Type parameters usually have constraints, so you can actually do something with them.

The simplest constraints are just trait names:

fn print<T: Display>(value: T) {
    println!("value = {}", value);
}

fn print<T: Debug>(value: T) {
    println!("value = {:?}", value);
}

There’s a longer syntax for type parameter constraints:

fn print<T>(value: T)
where
    T: Display,
{
    println!("value = {}", value);
}

Constraints can be more complicated: they can require a type parameter to implement multiple traits:

use std::fmt::Debug;

fn compare<T>(left: T, right: T)
where
    T: Debug + PartialEq,
{
    println!("{:?} {} {:?}", left, if left == right { "==" } else { "!=" }, right);
}

fn main() {
    compare("tea", "coffee");
    // prints: "tea" != "coffee"
}
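
The same pattern works with other trait combinations. As a sketch, the made-up function `describe_max` below requires `Display` (to format the value) and `PartialOrd` (to compare with `>=`):

```rust
use std::fmt::Display;

// Hypothetical helper, not part of the original article.
fn describe_max<T>(left: T, right: T) -> String
where
    T: Display + PartialOrd,
{
    // `>=` is available because of the `PartialOrd` constraint
    let max = if left >= right { left } else { right };
    // `{}` is available because of the `Display` constraint
    format!("max = {}", max)
}

fn main() {
    println!("{}", describe_max(3, 8)); // prints "max = 8"
    // string slices compare lexicographically, so "tea" > "coffee"
    println!("{}", describe_max("tea", "coffee")); // prints "max = tea"
}
```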

Generic functions can be thought of as namespaces, containing an infinity of functions with different concrete types.

Same as with crates, and modules, and types, generic functions can be “explored” (navigated?) using ::

fn main() {
    use std::any::type_name;
    println!("{}", type_name::<i32>()); // prints "i32"
    println!("{}", type_name::<(f64, char)>()); // prints "(f64, char)"
}

This is lovingly called turbofish syntax, because ::<> looks like a fish.
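
The turbofish shows up most often on generic methods whose result type cannot be inferred otherwise. A small sketch, assuming a made-up helper `parse_all` and the standard `str::parse` and `Iterator::collect` methods:

```rust
// Hypothetical helper, not part of the original article.
fn parse_all(inputs: &[&str]) -> Vec<i32> {
    inputs
        .iter()
        // `parse` is generic over its output type; the turbofish picks `i32`.
        // `unwrap` panics on invalid input, which is acceptable for a sketch.
        .map(|s| s.parse::<i32>().unwrap())
        // `collect` is generic too; here it builds a `Vec<i32>`.
        .collect::<Vec<i32>>()
}

fn main() {
    println!("{:?}", parse_all(&["1", "20", "-3"])); // prints "[1, 20, -3]"
}
```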

Structs can be generic too:

struct Pair<T> {
    a: T,
    b: T,
}

fn print_type_name<T>(_val: &T) {
    println!("{}", std::any::type_name::<T>());
}

fn main() {
    let p1 = Pair { a: 3, b: 9 };
    let p2 = Pair { a: true, b: false };
    print_type_name(&p1); // prints a name ending in "Pair<i32>" (the exact path may vary)
    print_type_name(&p2); // prints a name ending in "Pair<bool>"
}

The standard library type Vec (~ a heap-allocated array), is generic:

fn main() {
    let mut v1 = Vec::new();
    v1.push(1);
    let mut v2 = Vec::new();
    v2.push(false);
    print_type_name(&v1); // prints a name ending in "Vec<i32>" (the exact path may vary)
    print_type_name(&v2); // prints a name ending in "Vec<bool>"
}

Speaking of Vec, it comes with a macro that gives more or less “vec literals”:

fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = vec![true, false, true];
    print_type_name(&v1); // prints a name ending in "Vec<i32>" (the exact path may vary)
    print_type_name(&v2); // prints a name ending in "Vec<bool>"
}

All of name!(), name![] or name!{} invoke a macro. Macros just expand to regular code.

In fact, println is a macro:

fn main() {
    println!("{}", "Hello there!");
}

This expands to something that has the same effect as:

fn main() {
    use std::io::{self, Write};
    io::stdout().lock().write_all(b"Hello there!\n").unwrap();
}

panic is also a macro. It violently stops execution with an error message, and the file name / line number of the error, if enabled:

fn main() {
    panic!("This panics");
}
// output: thread 'main' panicked at 'This panics', src/main.rs:3:5

Some methods also panic. For example, the Option type can contain something, or it can contain nothing. If .unwrap() is called on it, and it contains nothing, it panics:

fn main() {
    let o1: Option<i32> = Some(128);
    o1.unwrap(); // this is fine

    let o2: Option<i32> = None;
    o2.unwrap(); // this panics!
}
// output: thread 'main' panicked at 'called `Option::unwrap()` on a `None` value', src/libcore/option.rs:378:21

Option is not a struct - it’s an enum, with two variants.

enum Option<T> {
    None,
    Some(T),
}

impl<T> Option<T> {
    fn unwrap(self) -> T {
        // enum variants can be used in patterns:
        match self {
            Self::Some(t) => t,
            Self::None => panic!(".unwrap() called on a None option"),
        }
    }
}

use self::Option::{None, Some};

fn main() {
    let o1: Option<i32> = Some(128);
    o1.unwrap(); // this is fine

    let o2: Option<i32> = None;
    o2.unwrap(); // this panics!
}
// output: thread 'main' panicked at '.unwrap() called on a None option', src/main.rs:11:27

Result is also an enum. It can contain either a value or an error:

enum Result<T, E> {
    Ok(T),
    Err(E),
}

It also panics when unwrapped and containing an error.
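
Unwrapping is not the only option: since both types are enums, a match can handle each variant without panicking. A minimal sketch using the standard library's Option and Result (the helper names `parse_or_zero` and `first_char` are made up):

```rust
// Hypothetical helper: falls back to 0 when the input is not a number.
fn parse_or_zero(input: &str) -> i32 {
    // `parse` returns a `Result`; both variants are handled explicitly
    match input.parse::<i32>() {
        Ok(n) => n,
        Err(_) => 0,
    }
}

// Hypothetical helper: an empty string has no first character,
// which `None` expresses.
fn first_char(input: &str) -> Option<char> {
    input.chars().next()
}

fn main() {
    println!("{}", parse_or_zero("42")); // prints "42"
    println!("{}", parse_or_zero("tea")); // prints "0"

    match first_char("amos") {
        Some(c) => println!("starts with {}", c), // prints "starts with a"
        None => println!("empty"),
    }
}
```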

Variable bindings have a “lifetime”:

fn main() {
    // `x` doesn't exist yet
    {
        let x = 42; // `x` starts existing
        println!("x = {}", x);
        // `x` stops existing
    }
    // `x` no longer exists
}

Similarly, references have a lifetime:

fn main() {
    // `x` doesn't exist yet
    {
        let x = 42; // `x` starts existing
        let x_ref = &x; // `x_ref` starts existing - it borrows `x`
        println!("x_ref = {}", x_ref);
        // `x_ref` stops existing
        // `x` stops existing
    }
    // `x` no longer exists
}

The lifetime of a reference cannot exceed the lifetime of the variable binding it borrows:

fn main() {
    let x_ref = {
        let x = 42;
        &x
    };
    println!("x_ref = {}", x_ref);
    // error: `x` does not live long enough
}

A variable binding can be immutably borrowed multiple times:

fn main() {
    let x = 42;
    let x_ref1 = &x;
    let x_ref2 = &x;
    let x_ref3 = &x;
    println!("{} {} {}", x_ref1, x_ref2, x_ref3);
}

While borrowed, a variable binding cannot be mutated:

fn main() {
    let mut x = 42;
    let x_ref = &x;
    x = 13;
    println!("x_ref = {}", x_ref);
    // error: cannot assign to `x` because it is borrowed
}

While immutably borrowed, a variable cannot be mutably borrowed:

fn main() {
    let mut x = 42;
    let x_ref1 = &x;
    let x_ref2 = &mut x;
    // error: cannot borrow `x` as mutable because it is also borrowed as immutable
    println!("x_ref1 = {}", x_ref1);
}
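
In both failing examples, the error appears because the immutable reference is still used after the conflicting operation. With a reasonably recent compiler (this sketch assumes the 2018 edition or later), a borrow ends at the last use of the reference, so the following compiles. The function name `bump` is made up:

```rust
// Hypothetical helper, not part of the original article.
fn bump() -> (i32, i32) {
    let mut x = 42;
    let x_ref = &x;
    let seen = *x_ref; // last use of `x_ref`: the immutable borrow ends here

    let x_mut = &mut x; // fine, no other borrow is still in use
    *x_mut += 1; // last use of `x_mut`

    (seen, x)
}

fn main() {
    let (before, after) = bump();
    println!("{} {}", before, after); // prints "42 43"
}
```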

References in function arguments also have lifetimes:

fn print(x: &i32) {
    // `x` is borrowed (from the outside) for the
    // entire time this function is called.
}

Functions with reference arguments can be called with borrows that have different lifetimes, so:

- All functions that take references are generic
- Lifetimes are generic parameters

Lifetimes’ names start with a single quote, ':

// elided (non-named) lifetimes:
fn print(x: &i32) {}
// named lifetimes:
fn print<'a>(x: &'a i32) {}

This allows returning references whose lifetime depends on the lifetime of the arguments:

struct Number {
    value: i32,
}

fn number_value<'a>(num: &'a Number) -> &'a i32 {
    &num.value
}

fn main() {
    let n = Number { value: 47 };
    let v = number_value(&n);
    // `v` borrows `n` (immutably), thus: `v` cannot outlive `n`.
    // While `v` exists, `n` cannot be mutably borrowed, mutated, moved, etc.
}

When there is a single input lifetime, it doesn’t need to be named, and everything has the same lifetime, so the two functions below are equivalent:

fn number_value<'a>(num: &'a Number) -> &'a i32 {
    &num.value
}

fn number_value(num: &Number) -> &i32 {
    &num.value
}
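
With more than one reference argument (and no `self`), the compiler cannot guess which argument the result borrows from, so the lifetime has to be named. A sketch with a made-up function `longest`:

```rust
// Hypothetical helper: the result may borrow from either argument,
// so both share the named lifetime `'a`.
fn longest<'a>(left: &'a str, right: &'a str) -> &'a str {
    if left.len() >= right.len() {
        left
    } else {
        right
    }
}

fn main() {
    let result = longest("tea", "coffee");
    println!("{}", result); // prints "coffee"
}
```

Removing the `'a` annotations here is expected to produce a "missing lifetime specifier" error, because elision only applies when the choice is unambiguous.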

Structs can also be generic over lifetimes, which allows them to hold references:

struct NumRef<'a> {
    x: &'a i32,
}

fn main() {
    let x: i32 = 99;
    let x_ref = NumRef { x: &x };
    // `x_ref` cannot outlive `x`, etc.
}

The same code, but with an additional function:

struct NumRef<'a> {
    x: &'a i32,
}

fn as_num_ref<'a>(x: &'a i32) -> NumRef<'a> {
    NumRef { x: &x }
}

fn main() {
    let x: i32 = 99;
    let x_ref = as_num_ref(&x);
    // `x_ref` cannot outlive `x`, etc.
}

The same code, but with “elided” lifetimes:

struct NumRef<'a> {
    x: &'a i32,
}

fn as_num_ref(x: &i32) -> NumRef<'_> {
    NumRef { x: &x }
}

fn main() {
    let x: i32 = 99;
    let x_ref = as_num_ref(&x);
    // `x_ref` cannot outlive `x`, etc.
}

impl blocks can be generic over lifetimes too:

impl<'a> NumRef<'a> {
    fn as_i32_ref(&'a self) -> &'a i32 {
        self.x
    }
}

fn main() {
    let x: i32 = 99;
    let x_num_ref = NumRef { x: &x };
    let x_i32_ref = x_num_ref.as_i32_ref();
    // neither ref can outlive `x`
}

But you can do elision (“to elide”) there too:

impl<'a> NumRef<'a> {
    fn as_i32_ref(&self) -> &i32 {
        self.x
    }
}
