Formatter (#51)

Enforce consistent formatting using `dprint`
Luca Palmieri 2024-05-24 17:00:03 +02:00 committed by GitHub
parent 537118574b
commit 99591a715e
157 changed files with 1057 additions and 1044 deletions


@ -9,6 +9,12 @@ on:
- main
jobs:
  formatter:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: dprint/check@v2.2
  check-links:
    runs-on: ubuntu-latest
    steps:


@ -1,15 +1,15 @@
# Learn Rust, one exercise at a time
You've heard about Rust, but you never had the chance to try it out?
You've heard about Rust, but you never had the chance to try it out?\
This course is for you!
You'll learn Rust by solving 100 exercises.
You'll learn Rust by solving 100 exercises.\
You'll go from knowing nothing about Rust to being able to start
writing your own programs, one exercise at a time.
> [!NOTE]
> This course has been written by [Mainmatter](https://mainmatter.com/rust-consulting/).
> It's one of the trainings in [our portfolio of Rust workshops](https://mainmatter.com/services/workshops/rust/).
> This course has been written by [Mainmatter](https://mainmatter.com/rust-consulting/).\
> It's one of the trainings in [our portfolio of Rust workshops](https://mainmatter.com/services/workshops/rust/).\
> Check out our [landing page](https://mainmatter.com/rust-consulting/) if you're looking for Rust consulting or
> training!
@ -20,7 +20,7 @@ to get started with the course.
## Requirements
- **Rust** (follow instructions [here](https://www.rust-lang.org/tools/install)).
- **Rust** (follow instructions [here](https://www.rust-lang.org/tools/install)).\
If `rustup` is already installed on your system, run `rustup update` (or another appropriate command depending on how
you installed Rust on your system)
to make sure you're running on the latest stable version.


@ -1,45 +1,45 @@
# Welcome
Welcome to **"100 Exercises To Learn Rust"**!
Welcome to **"100 Exercises To Learn Rust"**!
This course will teach you Rust's core concepts, one exercise at a time.
This course will teach you Rust's core concepts, one exercise at a time.\
You'll learn about Rust's syntax, its type system, its standard library, and its ecosystem.
We don't assume any prior knowledge of Rust, but we assume you know at least
one other programming language.
one other programming language.\
We also don't assume any prior knowledge of systems programming or memory management. Those
topics will be covered in the course.
In other words, we'll be starting from scratch!
In other words, we'll be starting from scratch!\
You'll build up your Rust knowledge in small, manageable steps.
By the end of the course, you will have solved ~100 exercises, enough to
feel comfortable working on small to medium-sized Rust projects.
## Methodology
This course is based on the "learn by doing" principle.
It has been designed to be interactive and hands-on.
This course is based on the "learn by doing" principle.\
It has been designed to be interactive and hands-on.
[Mainmatter](https://mainmatter.com/rust-consulting/) developed this course
to be delivered in a classroom setting, over 4 days: each attendee advances
through the lessons at their own pace, with an experienced instructor providing
guidance, answering questions and diving deeper into the topics as needed.
to be delivered in a classroom setting, over 4 days: each attendee advances
through the lessons at their own pace, with an experienced instructor providing
guidance, answering questions and diving deeper into the topics as needed.\
If you're interested in attending one of our training sessions, or if you'd like to
bring this course to your company, please [get in touch](https://mainmatter.com/contact/).
You can also follow the course on your own, but we recommend you find a friend or
a mentor to help you along the way should you get stuck. You can
also find solutions to all exercises in the
You can also follow the course on your own, but we recommend you find a friend or
a mentor to help you along the way should you get stuck. You can
also find solutions to all exercises in the
[`solutions` branch of the GitHub repository](https://github.com/mainmatter/100-exercises-to-learn-rust/tree/solutions).
## Structure
On the left side of the screen, you can see that the course is divided into sections.
Each section introduces a new concept or feature of the Rust language.
To verify your understanding, each section is paired with an exercise that you need to solve.
Each section introduces a new concept or feature of the Rust language.\
To verify your understanding, each section is paired with an exercise that you need to solve.
You can find the exercises in the
[companion GitHub repository](https://github.com/mainmatter/100-exercises-to-learn-rust).
You can find the exercises in the
[companion GitHub repository](https://github.com/mainmatter/100-exercises-to-learn-rust).\
Before starting the course, make sure to clone the repository to your local machine:
```bash
@ -60,13 +60,13 @@ git checkout -b my-solutions
All exercises are located in the `exercises` folder.
Each exercise is structured as a Rust package.
The package contains the exercise itself, instructions on what to do (in `src/lib.rs`), and a test suite to
The package contains the exercise itself, instructions on what to do (in `src/lib.rs`), and a test suite to
automatically verify your solution.
### `wr`, the workshop runner
To verify your solutions, we've provided a tool that will guide you through the course.
It is the `wr` CLI (short for "workshop runner").
It is the `wr` CLI (short for "workshop runner").
Install it with:
```bash
@ -80,10 +80,10 @@ Run the `wr` command to start the course:
wr
```
`wr` will verify the solution to the current exercise.
Don't move on to the next section until you've solved the exercise for the current one.
`wr` will verify the solution to the current exercise.\
Don't move on to the next section until you've solved the exercise for the current one.
> We recommend committing your solutions to Git as you progress through the course,
> We recommend committing your solutions to Git as you progress through the course,
> so you can easily track your progress and "restart" from a known point if needed.
Enjoy the course!
@ -95,10 +95,10 @@ Enjoy the course!
## Author
This course was written by [Luca Palmieri](https://www.lpalmieri.com/), Principal Engineering
Consultant at [Mainmatter](https://mainmatter.com/rust-consulting/).
Luca has been working with Rust since 2018, initially at TrueLayer and then at AWS.
Luca is the author of ["Zero to Production in Rust"](https://zero2prod.com),
the go-to resource for learning how to build backend applications in Rust.
Consultant at [Mainmatter](https://mainmatter.com/rust-consulting/).\
Luca has been working with Rust since 2018, initially at TrueLayer and then at AWS.\
Luca is the author of ["Zero to Production in Rust"](https://zero2prod.com),
the go-to resource for learning how to build backend applications in Rust.\
He is also the author and maintainer of a variety of open-source Rust projects, including
[`cargo-chef`](https://github.com/LukeMathWalker/cargo-chef),
[Pavex](https://pavex.dev) and [`wiremock`](https://github.com/LukeMathWalker/wiremock-rs).
[Pavex](https://pavex.dev) and [`wiremock`](https://github.com/LukeMathWalker/wiremock-rs).


@ -2,16 +2,16 @@
<div class="warning">
Don't jump ahead!
Complete the exercise for the previous section before you start this one.
It's located in `exercises/01_intro/00_welcome`, in the [course's GitHub repository](https://github.com/mainmatter/100-exercises-to-learn-rust).
Don't jump ahead!\
Complete the exercise for the previous section before you start this one.\
It's located in `exercises/01_intro/00_welcome`, in the [course's GitHub repository](https://github.com/mainmatter/100-exercises-to-learn-rust).\
Use [`wr`](00_welcome.md#wr-the-workshop-runner) to start the course and verify your solutions.
</div>
The previous task doesn't even qualify as an exercise, but it already exposed you to quite a bit of Rust **syntax**.
We won't cover every single detail of Rust's syntax used in the previous exercise.
Instead, we'll cover _just enough_ to keep going without getting stuck in the details.
Instead, we'll cover _just enough_ to keep going without getting stuck in the details.\
One step at a time!
## Comments
@ -88,7 +88,7 @@ It is considered idiomatic to omit the `return` keyword when possible.
### Input parameters
Input parameters are declared inside the parentheses `()` that follow the function's name.
Input parameters are declared inside the parentheses `()` that follow the function's name.\
Each parameter is declared with its name, followed by a colon `:`, followed by its type.
For example, the `greet` function below takes a `name` parameter of type `&str` (a "string slice"):
@ -105,10 +105,10 @@ If there are multiple input parameters, they must be separated with commas.
### Type annotations
Since we've mentioned "types" a few times, let's state it clearly: Rust is a **statically typed language**.
Since we've mentioned "types" a few times, let's state it clearly: Rust is a **statically typed language**.\
Every single value in Rust has a type and that type must be known to the compiler at compile-time.
Types are a form of **static analysis**.
Types are a form of **static analysis**.\
You can think of a type as a **tag** that the compiler attaches to every value in your program. Depending on the
tag, the compiler can enforce different rules—e.g. you can't add a string to a number, but you can add two numbers
together.


@ -1,6 +1,6 @@
# A Basic Calculator
In this chapter we'll learn how to use Rust as a **calculator**.
In this chapter we'll learn how to use Rust as a **calculator**.\
It might not sound like much, but it'll give us a chance to cover a lot of Rust's basics, such as:
- How to define and call functions


@ -1,6 +1,6 @@
# Types, part 1
In the ["Syntax" section](../01_intro/01_syntax.md) `compute`'s input parameters were of type `u32`.
In the ["Syntax" section](../01_intro/01_syntax.md) `compute`'s input parameters were of type `u32`.\
Let's unpack what that _means_.
## Primitive types
@ -18,25 +18,25 @@ An integer is a number that can be written without a fractional component. E.g.
### Signed vs. unsigned
An integer can be **signed** or **unsigned**.
An integer can be **signed** or **unsigned**.\
An unsigned integer can only represent non-negative numbers (i.e. `0` or greater).
A signed integer can represent both positive and negative numbers (e.g. `-1`, `12`, etc.).
The `u` in `u32` stands for **unsigned**.
The `u` in `u32` stands for **unsigned**.\
The equivalent type for signed integers is `i32`, where the `i` stands for integer (i.e. any integer, positive or
negative).
### Bit width
The `32` in `u32` refers to the **number of bits[^bit]** used to represent the number in memory.
The `32` in `u32` refers to the **number of bits[^bit]** used to represent the number in memory.\
The more bits, the larger the range of numbers that can be represented.
Rust supports multiple bit widths for integers: `8`, `16`, `32`, `64`, `128`.
With 32 bits, `u32` can represent numbers from `0` to `2^32 - 1` (a.k.a. [`u32::MAX`](https://doc.rust-lang.org/std/primitive.u32.html#associatedconstant.MAX)).
With 32 bits, `u32` can represent numbers from `0` to `2^32 - 1` (a.k.a. [`u32::MAX`](https://doc.rust-lang.org/std/primitive.u32.html#associatedconstant.MAX)).\
With the same number of bits, a signed integer (`i32`) can represent numbers from `-2^31` to `2^31 - 1`
(i.e. from [`i32::MIN`](https://doc.rust-lang.org/std/primitive.i32.html#associatedconstant.MIN)
to [`i32::MAX`](https://doc.rust-lang.org/std/primitive.i32.html#associatedconstant.MAX)).
to [`i32::MAX`](https://doc.rust-lang.org/std/primitive.i32.html#associatedconstant.MAX)).\
The maximum value for `i32` is smaller than the maximum value for `u32` because one bit is used to represent
the sign of the number. Check out the [two's complement](https://en.wikipedia.org/wiki/Two%27s_complement)
representation for more details on how signed integers are represented in memory.
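To double-check the ranges above, here is a minimal sketch that evaluates the formulas and lets you compare them with the constants from the standard library. The function names are made up for this illustration, and the arithmetic is done in 64-bit types on purpose: `2^32` itself would not fit into a `u32`.

```rust
/// Computes `2^32 - 1` using a wider type, so that the intermediate `2^32` fits.
fn u32_max_from_formula() -> u64 {
    2u64.pow(32) - 1
}

/// Computes `(-2^31, 2^31 - 1)` using a wider signed type.
fn i32_bounds_from_formula() -> (i64, i64) {
    (-(2i64.pow(31)), 2i64.pow(31) - 1)
}
```

`u32_max_from_formula()` should match `u32::MAX as u64`, while the pair returned by `i32_bounds_from_formula()` should match `i32::MIN as i64` and `i32::MAX as i64`.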
@ -46,7 +46,7 @@ representation for more details on how signed integers are represented in memory
Combining the two variables (signed/unsigned and bit width), we get the following integer types:
| Bit width | Signed | Unsigned |
|-----------|--------|----------|
| --------- | ------ | -------- |
| 8-bit | `i8` | `u8` |
| 16-bit | `i16` | `u16` |
| 32-bit | `i32` | `u32` |
@ -55,21 +55,21 @@ Combining the two variables (signed/unsigned and bit width), we get the followin
## Literals
A **literal** is a notation for representing a fixed value in source code.
A **literal** is a notation for representing a fixed value in source code.\
For example, `42` is a Rust literal for the number forty-two.
### Type annotations for literals
But all values in Rust have a type, so... what's the type of `42`?
The Rust compiler will try to infer the type of a literal based on how it's used.
If you don't provide any context, the compiler will default to `i32` for integer literals.
The Rust compiler will try to infer the type of a literal based on how it's used.\
If you don't provide any context, the compiler will default to `i32` for integer literals.\
If you want to use a different type, you can add the desired integer type as a suffix—e.g. `2u64` is a 2 that's
explicitly typed as a `u64`.
### Underscores in literals
You can use underscores `_` to improve the readability of large numbers.
You can use underscores `_` to improve the readability of large numbers.\
For example, `1_000_000` is the same as `1000000`.
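The rules for literals can be observed with a small sketch. The helper names below are hypothetical; they rely on `std::mem::size_of_val`, which reports how many bytes a value occupies, to reveal the type the compiler picked for each literal.

```rust
/// With no other constraints, an integer literal falls back to `i32` (4 bytes).
fn default_literal_size() -> usize {
    let x = 42;
    std::mem::size_of_val(&x)
}

/// The `u64` suffix forces the literal to be a `u64` (8 bytes).
fn suffixed_literal_size() -> usize {
    let x = 2u64;
    std::mem::size_of_val(&x)
}

/// Underscores are purely visual: both literals denote the same number.
fn underscores_are_ignored() -> bool {
    1_000_000 == 1000000
}
```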
## Arithmetic operators
@ -82,7 +82,7 @@ Rust supports the following arithmetic operators[^traits] for integers:
- `/` for division
- `%` for remainder
Precedence and associativity rules for these operators are the same as in mathematics.
Precedence and associativity rules for these operators are the same as in mathematics.\
You can use parentheses to override the default precedence. E.g. `2 * (3 + 4)`.
> ⚠️ **Warning**
@ -92,7 +92,7 @@ You can use parentheses to override the default precedence. E.g. `2 * (3 + 4)`.
## No automatic type coercion
As we discussed in the previous exercise, Rust is a statically typed language.
As we discussed in the previous exercise, Rust is a statically typed language.\
In particular, Rust is quite strict about type coercion. It won't automatically convert a value from one type to
another[^coercion],
even if the conversion is lossless. You have to do it explicitly.


@ -1,6 +1,6 @@
# Variables
In Rust, you can use the `let` keyword to declare **variables**.
In Rust, you can use the `let` keyword to declare **variables**.\
For example:
```rust
@ -35,20 +35,20 @@ let x = 42;
let y: u32 = x;
```
In the example above, we didn't specify the type of `x`.
In the example above, we didn't specify the type of `x`.\
`x` is later assigned to `y`, which is explicitly typed as `u32`. Since Rust doesn't perform automatic type coercion,
the compiler infers the type of `x` to be `u32`—the same as `y` and the only type that will allow the program to compile
without errors.
### Inference limitations
The compiler sometimes needs a little help to infer the correct variable type based on its usage.
The compiler sometimes needs a little help to infer the correct variable type based on its usage.\
In those cases you'll get a compilation error and the compiler will ask you to provide an explicit type hint to
disambiguate the situation.
## Function arguments are variables
Not all heroes wear capes, not all variables are declared with `let`.
Not all heroes wear capes, not all variables are declared with `let`.\
Function arguments are variables too!
```rust
@ -57,22 +57,22 @@ fn add_one(x: u32) -> u32 {
}
```
In the example above, `x` is a variable of type `u32`.
In the example above, `x` is a variable of type `u32`.\
The only difference between `x` and a variable declared with `let` is that function arguments **must** have their type
explicitly declared. The compiler won't infer it for you.
explicitly declared. The compiler won't infer it for you.\
This constraint allows the Rust compiler (and us humans!) to understand the function's signature without having to look
at its implementation. That's a big boost for compilation speed[^speed]!
## Initialization
You don't have to initialize a variable when you declare it.
You don't have to initialize a variable when you declare it.\
For example
```rust
let x: u32;
```
is a valid variable declaration.
is a valid variable declaration.\
However, you must initialize the variable before using it. The compiler will throw an error if you don't:
```rust
@ -101,4 +101,4 @@ help: consider assigning a value
- The exercise for this section is located in `exercises/02_basic_calculator/02_variables`
[^speed]: The Rust compiler needs all the help it can get when it comes to compilation speed.
[^speed]: The Rust compiler needs all the help it can get when it comes to compilation speed.


@ -1,6 +1,6 @@
# Control flow, part 1
All our programs so far have been pretty straightforward.
All our programs so far have been pretty straightforward.\
A sequence of instructions is executed from top to bottom, and that's it.
It's time to introduce some **branching**.
@ -23,7 +23,7 @@ This program will print `number is smaller than 5` because the condition `number
### `else` clauses
Like most programming languages, Rust supports an optional `else` branch to execute a block of code when the condition in an
`if` expression is false.
`if` expression is false.\
For example:
```rust
@ -38,7 +38,7 @@ if number < 5 {
## Booleans
The condition in an `if` expression must be of type `bool`, a **boolean**.
The condition in an `if` expression must be of type `bool`, a **boolean**.\
Booleans, just like integers, are a primitive type in Rust.
A boolean can have one of two values: `true` or `false`.
@ -67,12 +67,12 @@ error[E0308]: mismatched types
```
This follows from Rust's philosophy around type coercion: there's no automatic conversion from non-boolean types to booleans.
Rust doesn't have the concept of **truthy** or **falsy** values, like JavaScript or Python.
Rust doesn't have the concept of **truthy** or **falsy** values, like JavaScript or Python.\
You have to be explicit about the condition you want to check.
### Comparison operators
It's quite common to use comparison operators to build conditions for `if` expressions.
It's quite common to use comparison operators to build conditions for `if` expressions.\
Here are the comparison operators available in Rust when working with integers:
- `==`: equal to
@ -84,7 +84,7 @@ Here are the comparison operators available in Rust when working with integers:
## `if/else` is an expression
In Rust, `if` expressions are **expressions**, not statements: they return a value.
In Rust, `if` expressions are **expressions**, not statements: they return a value.\
That value can be assigned to a variable or used in other expressions. For example:
```rust
@ -96,11 +96,10 @@ let message = if number < 5 {
};
```
In the example above, each branch of the `if` evaluates to a string literal,
which is then assigned to the `message` variable.
In the example above, each branch of the `if` evaluates to a string literal,
which is then assigned to the `message` variable.\
The only requirement is that both `if` branches return the same type.
## References
- The exercise for this section is located in `exercises/02_basic_calculator/03_if_else`
- The exercise for this section is located in `exercises/02_basic_calculator/03_if_else`


@ -13,7 +13,7 @@ fn speed(start: u32, end: u32, time_elapsed: u32) -> u32 {
If you have a keen eye, you might have spotted one issue[^one]: what happens if `time_elapsed` is zero?
You can try it
out [on the Rust playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=36e5ddbe3b3f741dfa9f74c956622bac)!
out [on the Rust playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=36e5ddbe3b3f741dfa9f74c956622bac)!\
The program will exit with the following error message:
```text
@ -21,7 +21,7 @@ thread 'main' panicked at src/main.rs:3:5:
attempt to divide by zero
```
This is known as a **panic**.
This is known as a **panic**.\
A panic is Rust's way to signal that something went so wrong that
the program can't continue executing: it's an **unrecoverable error**[^catching]. Division by zero classifies as such an
error.


@ -12,4 +12,4 @@ It looks like you're ready to tackle factorials!
## References
- The exercise for this section is located in `exercises/02_basic_calculator/05_factorial`
- The exercise for this section is located in `exercises/02_basic_calculator/05_factorial`


@ -1,6 +1,6 @@
# Loops, part 1: `while`
Your implementation of `factorial` has been forced to use recursion.
Your implementation of `factorial` has been forced to use recursion.\
This may feel natural to you, especially if you're coming from a functional programming background.
Or it may feel strange, if you're used to more imperative languages like C or Python.
@ -8,7 +8,7 @@ Let's see how you can implement the same functionality using a **loop** instead.
## The `while` loop
A `while` loop is a way to execute a block of code as long as a **condition** is true.
A `while` loop is a way to execute a block of code as long as a **condition** is true.\
Here's the general syntax:
```rust
@ -62,7 +62,7 @@ error[E0384]: cannot assign twice to immutable variable `i`
| ^^^^^^ cannot assign twice to immutable variable
```
This is because variables in Rust are **immutable** by default.
This is because variables in Rust are **immutable** by default.\
You can't change their value once it has been assigned.
If you want to allow modifications, you have to declare the variable as **mutable** using the `mut` keyword:


@ -1,6 +1,6 @@
# Loops, part 2: `for`
Having to manually increment a counter variable is somewhat tedious. The pattern is also extremely common!
Having to manually increment a counter variable is somewhat tedious. The pattern is also extremely common!\
To make this easier, Rust provides a more concise way to iterate over a range of values: the `for` loop.
## The `for` loop
@ -62,7 +62,7 @@ for i in 1..(end + 1) {
- [`for` loop documentation](https://doc.rust-lang.org/std/keyword.for.html)
[^iterator]: Later in the course we'll give a precise definition of what counts as an "iterator".
For now, think of it as a sequence of values that you can loop over.
[^weird-ranges]: You can use ranges with other types too (e.g. characters and IP addresses),
but integers are definitely the most common case in day-to-day Rust programming.
[^iterator]: Later in the course we'll give a precise definition of what counts as an "iterator".
For now, think of it as a sequence of values that you can loop over.
[^weird-ranges]: You can use ranges with other types too (e.g. characters and IP addresses),
but integers are definitely the most common case in day-to-day Rust programming.


@ -1,18 +1,18 @@
# Overflow
The factorial of a number grows quite fast.
The factorial of a number grows quite fast.\
For example, the factorial of 20 is 2,432,902,008,176,640,000. That's already bigger than the maximum value for a
32-bit signed integer, 2,147,483,647.
When the result of an arithmetic operation is bigger than the maximum value for a given integer type,
we are talking about **an integer overflow**.
Integer overflows are an issue because they violate the contract for arithmetic operations.
Integer overflows are an issue because they violate the contract for arithmetic operations.\
The result of an arithmetic operation between two integers of a given type should be another integer of the same type.
But the _mathematically correct result_ doesn't fit into that integer type!
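To make the numbers concrete, here is a minimal sketch that computes factorials with 64-bit arithmetic and checks whether the result would still fit into 32 bits. Both function names are hypothetical, and the sketch assumes `n` is at most 20, since `21!` no longer fits into a `u64` either.

```rust
/// Computes `n!` using 64-bit arithmetic, which is wide enough up to `20!`.
fn factorial_u64(n: u64) -> u64 {
    let mut result = 1;
    for i in 1..=n {
        result *= i;
    }
    result
}

/// Returns `true` if `n!` (for `n <= 20`) is too large to be stored in a `u32`.
fn factorial_overflows_u32(n: u64) -> bool {
    factorial_u64(n) > u32::MAX as u64
}
```

With these definitions, `12!` still fits into a `u32`, while `13!` is already too large.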
> If the result is smaller than the minimum value for a given integer type, we refer to the event as **an integer
> underflow**.
> underflow**.\
> For brevity, we'll only talk about integer overflows for the rest of this section, but keep in mind that
> everything we say applies to integer underflows as well.
>
@ -32,7 +32,7 @@ is not Rust's solution to the integer overflow problem.
## Alternatives
Since we ruled out automatic promotion, what can we do when an integer overflow occurs?
Since we ruled out automatic promotion, what can we do when an integer overflow occurs?\
It boils down to two different approaches:
- Reject the operation
@ -40,13 +40,13 @@ It boils down to two different approaches:
### Reject the operation
This is the most conservative approach: we stop the program when an integer overflow occurs.
This is the most conservative approach: we stop the program when an integer overflow occurs.\
That's done via a panic, the mechanism we've already seen in the ["Panics" section](04_panics.md).
### Come up with a "sensible" result
When the result of an arithmetic operation is bigger than the maximum value for a given integer type, you can
choose to **wrap around**.
choose to **wrap around**.\
If you think of all the possible values for a given integer type as a circle, wrapping around means that when you
reach the maximum value, you start again from the minimum value.
@ -69,14 +69,14 @@ You may be wondering—what is a profile setting? Let's get into that!
A [**profile**](https://doc.rust-lang.org/cargo/reference/profiles.html) is a set of configuration options that can be
used to customize the way Rust code is compiled.
Cargo provides two built-in profiles: `dev` and `release`.
Cargo provides two built-in profiles: `dev` and `release`.\
The `dev` profile is used every time you run `cargo build`, `cargo run` or `cargo test`. It's aimed at local
development,
therefore it sacrifices runtime performance in favor of faster compilation times and a better debugging experience.
therefore it sacrifices runtime performance in favor of faster compilation times and a better debugging experience.\
The `release` profile, instead, is optimized for runtime performance but incurs longer compilation times. You need
to explicitly request it via the `--release` flag (e.g. `cargo build --release` or `cargo run --release`).
> "Have you built your project in release mode?" is almost a meme in the Rust community.
> "Have you built your project in release mode?" is almost a meme in the Rust community.\
> It refers to developers who are not familiar with Rust and complain about its performance on
> social media (e.g. Reddit, Twitter, etc.) before realizing they haven't built their project in
> release mode.
@ -90,12 +90,12 @@ By default, `overflow-checks` is set to:
- `true` for the `dev` profile
- `false` for the `release` profile
This is in line with the goals of the two profiles.
`dev` is aimed at local development, so it panics in order to highlight potential issues as early as possible.
This is in line with the goals of the two profiles.\
`dev` is aimed at local development, so it panics in order to highlight potential issues as early as possible.\
`release`, instead, is tuned for runtime performance: checking for overflows would slow down the program, so it
prefers to wrap around.
At the same time, having different behaviours for the two profiles can lead to subtle bugs.
At the same time, having different behaviours for the two profiles can lead to subtle bugs.\
Our recommendation is to enable `overflow-checks` for both profiles: it's better to crash than to silently produce
incorrect results. The runtime performance hit is negligible in most cases; if you're working on a performance-critical
application, you can run benchmarks to decide if it's something you can afford.
@ -107,4 +107,4 @@ application, you can run benchmarks to decide if it's something you can afford.
## Further reading
- Check out ["Myths and legends about integer overflow in Rust"](https://huonw.github.io/blog/2016/04/myths-and-legends-about-integer-overflow-in-rust/)
for an in-depth discussion about integer overflow in Rust.
for an in-depth discussion about integer overflow in Rust.


@ -1,12 +1,12 @@
# Case-by-case behavior
`overflow-checks` is a blunt tool: it's a global setting that affects the whole program.
`overflow-checks` is a blunt tool: it's a global setting that affects the whole program.\
It often happens that you want to handle integer overflows differently depending on the context: sometimes
wrapping is the right choice, other times panicking is preferable.
## `wrapping_` methods
You can opt into wrapping arithmetic on a per-operation basis by using the `wrapping_` methods[^method].
You can opt into wrapping arithmetic on a per-operation basis by using the `wrapping_` methods[^method].\
For example, you can use `wrapping_add` to add two integers with wrapping:
```rust
@ -18,7 +18,7 @@ assert_eq!(sum, 0);
## `saturating_` methods
Alternatively, you can opt into **saturating arithmetic** by using the `saturating_` methods.
Alternatively, you can opt into **saturating arithmetic** by using the `saturating_` methods.\
Instead of wrapping around, saturating arithmetic will return the maximum or minimum value for the integer type.
For example:
@ -29,7 +29,7 @@ let sum = x.saturating_add(y);
assert_eq!(sum, 255);
```
Since `255 + 1` is `256`, which is bigger than `u8::MAX`, the result is `u8::MAX` (255).
Since `255 + 1` is `256`, which is bigger than `u8::MAX`, the result is `u8::MAX` (255).\
The opposite happens for underflows: `0 - 1` is `-1`, which is smaller than `u8::MIN`, so the result is `u8::MIN` (0).
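Both directions can be sketched with two small helpers. `add_clamped` and `sub_clamped` are names invented for this illustration; `saturating_add` and `saturating_sub` are the standard library methods doing the actual work.

```rust
/// Adds two `u8` values; results above `u8::MAX` are clamped to `u8::MAX`.
fn add_clamped(x: u8, y: u8) -> u8 {
    x.saturating_add(y)
}

/// Subtracts `y` from `x`; results below `u8::MIN` are clamped to `u8::MIN`.
fn sub_clamped(x: u8, y: u8) -> u8 {
    x.saturating_sub(y)
}
```

When the result fits into a `u8`, both helpers behave exactly like the plain `+` and `-` operators.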
You can't get saturating arithmetic via the `overflow-checks` profile setting—you have to explicitly opt into it
@ -40,4 +40,4 @@ when performing the arithmetic operation.
- The exercise for this section is located in `exercises/02_basic_calculator/09_saturating`
[^method]: You can think of methods as functions that are "attached" to a specific type.
We'll cover methods (and how to define them) in the next chapter.
We'll cover methods (and how to define them) in the next chapter.


@ -1,12 +1,12 @@
# Conversions, pt. 1
We've repeated over and over again that Rust won't perform
implicit type conversions for integers.
implicit type conversions for integers.\
How do you perform _explicit_ conversions then?
## `as`
You can use the `as` operator to convert between integer types.
You can use the `as` operator to convert between integer types.\
`as` conversions are **infallible**.
For example:
@ -24,7 +24,7 @@ let c: u64 = a as _;
```
The semantics of this conversion are what you expect: all `u32` values are valid `u64`
values.
values.
### Truncation
@ -38,11 +38,11 @@ let b = a as u8;
```
This program will run without issues, because `as` conversions are infallible.
But what is the value of `b`?
But what is the value of `b`?
When going from a larger integer type to a smaller one, the Rust compiler will perform
a **truncation**.
a **truncation**.
To understand what happens, let's start by looking at how `256u16` is
To understand what happens, let's start by looking at how `256u16` is
represented in memory, as a sequence of bits:
```text
@ -59,10 +59,10 @@ memory representation:
0 0 0 0 0 0 0 0
| |
+---------------+
Last 8 bits
Last 8 bits
```
Hence `256 as u8` is equal to `0`. That's... not ideal, in most scenarios.
Hence `256 as u8` is equal to `0`. That's... not ideal, in most scenarios.\
In fact, the Rust compiler will actively try to stop you if it sees you trying
to cast a literal value which will result in a truncation:
@ -79,19 +79,19 @@ error: literal out of range for `i8`
### Recommendation
As a rule of thumb, be quite careful with `as` casting.
Use it _exclusively_ for going from a smaller type to a larger type.
To convert from a larger to smaller integer type, rely on the
[*fallible* conversion machinery](../05_ticket_v2/13_try_from.md) that we'll
As a rule of thumb, be quite careful with `as` casting.\
Use it _exclusively_ for going from a smaller type to a larger type.
To convert from a larger to smaller integer type, rely on the
[_fallible_ conversion machinery](../05_ticket_v2/13_try_from.md) that we'll
explore later in the course.
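To make the recommendation concrete, here is a minimal sketch that contrasts a widening `as` cast, a truncating `as` cast, and the fallible `u8::try_from` from the standard library. The helper names (`widen`, `truncate`, `checked_narrow`) are invented for illustration and don't appear in the course.

```rust
// Needed on older editions; on the 2021 edition `TryFrom` is already in the prelude.
use std::convert::TryFrom;

// Hypothetical helpers, named for illustration only.

/// Widening: every `u32` is a valid `u64`, so `as` is safe here.
fn widen(value: u32) -> u64 {
    value as u64
}

/// Narrowing with `as`: silently keeps only the last 8 bits.
fn truncate(value: u16) -> u8 {
    value as u8
}

/// Narrowing with `TryFrom`: returns `None` when the value doesn't fit in a `u8`.
fn checked_narrow(value: u16) -> Option<u8> {
    u8::try_from(value).ok()
}

fn main() {
    assert_eq!(widen(7), 7);
    assert_eq!(truncate(256), 0);
    assert_eq!(truncate(257), 1);
    assert_eq!(checked_narrow(255), Some(255));
    assert_eq!(checked_narrow(256), None);
}
```

The fallible version forces the caller to decide what to do when the value is out of range, instead of quietly producing a different number.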
### Limitations
Surprising behaviour is not the only downside of `as` casting.
Surprising behaviour is not the only downside of `as` casting.
It is also fairly limited: you can only rely on `as` casting
for primitive types and a few other special cases.
When working with composite types, you'll have to rely on
different conversion mechanisms ([fallible](../05_ticket_v2/13_try_from.md)
for primitive types and a few other special cases.\
When working with composite types, you'll have to rely on
different conversion mechanisms ([fallible](../05_ticket_v2/13_try_from.md)
and [infallible](../04_traits/09_from.md)), which we'll explore later on.
## References
@ -100,6 +100,6 @@ and [infallible](../04_traits/09_from.md)), which we'll explore later on.
## Further reading
- Check out [Rust's official reference](https://doc.rust-lang.org/reference/expressions/operator-expr.html#numeric-cast)
to learn the precise behaviour of `as` casting for each source/target combination,
as well as the exhaustive list of allowed conversions.
- Check out [Rust's official reference](https://doc.rust-lang.org/reference/expressions/operator-expr.html#numeric-cast)
to learn the precise behaviour of `as` casting for each source/target combination,
as well as the exhaustive list of allowed conversions.

@ -1,14 +1,14 @@
# Modelling A Ticket
The first chapter should have given you a good grasp of some of Rust's primitive types, operators and
basic control flow constructs.
In this chapter we'll go one step further and cover what makes Rust truly unique: **ownership**.
The first chapter should have given you a good grasp of some of Rust's primitive types, operators and
basic control flow constructs.\
In this chapter we'll go one step further and cover what makes Rust truly unique: **ownership**.\
Ownership is what enables Rust to be both memory-safe and performant, with no garbage collector.
As our running example, we'll use a (JIRA-like) ticket, the kind you'd use to track bugs, features, or tasks in
a software project.
We'll take a stab at modeling it in Rust. It'll be the first iteration—it won't be perfect nor very idiomatic
by the end of the chapter. It'll be enough of a challenge though!
a software project.\
We'll take a stab at modeling it in Rust. It'll be the first iteration—it won't be perfect nor very idiomatic
by the end of the chapter. It'll be enough of a challenge though!\
To move forward you'll have to pick up several new Rust concepts, such as:
- `struct`s, one of Rust's ways to define custom types
@ -19,4 +19,4 @@ To move forward you'll have to pick up several new Rust concepts, such as:
## References
- The exercise for this section is located in `exercises/03_ticket_v1/00_intro`
- The exercise for this section is located in `exercises/03_ticket_v1/00_intro`

@ -6,9 +6,9 @@ We need to keep track of three pieces of information for each ticket:
- A description
- A status
We can start by using a [`String`](https://doc.rust-lang.org/std/string/struct.String.html)
to represent them. `String` is the type defined in Rust's standard library to represent
[UTF-8 encoded](https://en.wikipedia.org/wiki/UTF-8) text.
We can start by using a [`String`](https://doc.rust-lang.org/std/string/struct.String.html)
to represent them. `String` is the type defined in Rust's standard library to represent
[UTF-8 encoded](https://en.wikipedia.org/wiki/UTF-8) text.
But how do we **combine** these three pieces of information into a single entity?
@ -28,7 +28,7 @@ A struct is quite similar to what you would call a class or an object in other p
## Defining fields
The new type is built by combining other types as **fields**.
The new type is built by combining other types as **fields**.\
Each field must have a name and a type, separated by a colon, `:`. If there are multiple fields, they are separated by a comma, `,`.
Fields don't have to be of the same type, as you can see in the `Configuration` struct below:
@ -64,7 +64,7 @@ let x = ticket.description;
## Methods
We can attach behaviour to our structs by defining **methods**.
We can attach behaviour to our structs by defining **methods**.\
Using the `Ticket` struct as an example:
```rust
@ -140,4 +140,3 @@ but it's definitely more verbose. Prefer the method call syntax when possible.
## References
- The exercise for this section is located in `exercises/03_ticket_v1/01_struct`

@ -10,9 +10,9 @@ struct Ticket {
}
```
We are using "raw" types for the fields of our `Ticket` struct.
This means that users can create a ticket with an empty title, a suuuuuuuper long description or
a nonsensical status (e.g. "Funny").
We are using "raw" types for the fields of our `Ticket` struct.
This means that users can create a ticket with an empty title, a suuuuuuuper long description or
a nonsensical status (e.g. "Funny").\
We can do better than that!
## References
@ -21,5 +21,5 @@ We can do better than that!
## Further reading
- Check out [`String`'s documentation](https://doc.rust-lang.org/std/string/struct.String.html)
for a thorough overview of the methods it provides. You'll need it for the exercise!
- Check out [`String`'s documentation](https://doc.rust-lang.org/std/string/struct.String.html)
for a thorough overview of the methods it provides. You'll need it for the exercise!

@ -1,4 +1,4 @@
# Modules
# Modules
The `new` method you've just defined is trying to enforce some **constraints** on the field values for `Ticket`.
But are those invariants really enforced? What prevents a developer from creating a `Ticket`
@ -9,8 +9,8 @@ Let's start with modules.
## What is a module?
In Rust a **module** is a way to group related code together, under a common namespace (i.e. the module's name).
You've already seen modules in action: the unit tests that verify the correctness of your code are defined in a
In Rust a **module** is a way to group related code together, under a common namespace (i.e. the module's name).\
You've already seen modules in action: the unit tests that verify the correctness of your code are defined in a
different module, named `tests`.
```rust
@ -23,11 +23,11 @@ mod tests {
## Inline modules
The `tests` module above is an example of an **inline module**: the module declaration (`mod tests`) and the module
contents (the stuff inside `{ ... }`) are next to each other.
contents (the stuff inside `{ ... }`) are next to each other.
## Module tree
Modules can be nested, forming a **tree** structure.
Modules can be nested, forming a **tree** structure.\
The root of the tree is the **crate** itself, which is the top-level module that contains all the other modules.
For a library crate, the root module is usually `src/lib.rs` (unless its location has been customized).
The root module is also known as the **crate root**.
@ -43,9 +43,9 @@ multiple files. In the parent module, you declare the existence of a submodule u
mod dog;
```
`cargo`, Rust's build tool, is then in charge of finding the file that contains
the module implementation.
If your module is declared in the root of your crate (e.g. `src/lib.rs` or `src/main.rs`),
`cargo`, Rust's build tool, is then in charge of finding the file that contains
the module implementation.\
If your module is declared in the root of your crate (e.g. `src/lib.rs` or `src/main.rs`),
`cargo` expects the file to be named either:
- `src/<module_name>.rs`
@ -76,7 +76,7 @@ fn mark_ticket_as_done(ticket: Ticket) {
}
```
That's not the case if you want to access an entity from a different module.
That's not the case if you want to access an entity from a different module.\
You have to use a **path** pointing to the entity you want to access.
You can compose the path in various ways:
@ -106,13 +106,13 @@ You can also import all the items from a module with a single `use` statement.
use crate::module_1::module_2::*;
```
This is known as a **star import**.
This is known as a **star import**.\
It is generally discouraged because it can pollute the current namespace, making it hard to understand
where each name comes from and potentially introducing name conflicts.
where each name comes from and potentially introducing name conflicts.\
Nonetheless, it can be useful in some cases, like when writing unit tests. You might have noticed
that most of our test modules start with a `use super::*;` statement to bring all the items from the parent module
(the one being tested) into scope.
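As a rough sketch of how a star import behaves, the snippet below uses `use super::*;` inside a nested module. The module and function names (`shapes`, `area`, `perimeter`, `checks`, `is_square`) are made up for this example.

```rust
// Hypothetical module layout, invented for illustration.
mod shapes {
    pub fn area(width: u32, height: u32) -> u32 {
        width * height
    }

    pub fn perimeter(width: u32, height: u32) -> u32 {
        2 * (width + height)
    }

    pub mod checks {
        // Star import: every item of the parent module is now in scope here.
        use super::*;

        pub fn is_square(width: u32, height: u32) -> bool {
            // `area` and `perimeter` are reachable thanks to the star import,
            // but nothing in this function tells you where they come from.
            area(width, height) == width * width && perimeter(width, height) == 4 * width
        }
    }
}

fn main() {
    assert!(shapes::checks::is_square(3, 3));
    assert!(!shapes::checks::is_square(3, 4));
}
```

Writing `use super::{area, perimeter};` instead would make the origin of each name explicit, which is why explicit imports are usually preferred outside of tests.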
## References
- The exercise for this section is located in `exercises/03_ticket_v1/03_modules`
- The exercise for this section is located in `exercises/03_ticket_v1/03_modules`

@ -1,12 +1,12 @@
# Visibility
When you start breaking down your code into multiple modules, you need to start thinking about **visibility**.
When you start breaking down your code into multiple modules, you need to start thinking about **visibility**.
Visibility determines which regions of your code (or other people's code) can access a given entity,
be it a struct, a function, a field, etc.
## Private by default
By default, everything in Rust is **private**.
By default, everything in Rust is **private**.\
A private entity can only be accessed:
1. within the same module where it's defined, or
@ -16,16 +16,16 @@ We've used this extensively in the previous exercises:
- `create_todo_ticket` worked (once you added a `use` statement) because `helpers` is a submodule of the crate root,
where `Ticket` is defined. Therefore, `create_todo_ticket` can access `Ticket` without any issues even
though `Ticket` is private.
- All our unit tests are defined in a submodule of the code they're testing, so they can access everything without
restrictions.
though `Ticket` is private.
- All our unit tests are defined in a submodule of the code they're testing, so they can access everything without
restrictions.
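A minimal sketch of the submodule rule, loosely modelled on the `helpers` example above. The `default_status` and `describe` functions are hypothetical and not part of the exercises.

```rust
// Private by default: there's no visibility modifier in front of this function.
fn default_status() -> String {
    "To-Do".to_string()
}

mod helpers {
    // `helpers` is a submodule of the module where `default_status` is defined,
    // so it can call it even though it's private.
    // The reverse doesn't hold: `describe` needs `pub` to be callable from the parent.
    pub fn describe(title: &str) -> String {
        format!("{} [{}]", title, super::default_status())
    }
}

fn main() {
    assert_eq!(helpers::describe("Fix login"), "Fix login [To-Do]");
}
```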
## Visibility modifiers
You can modify the default visibility of an entity using a **visibility modifier**.
You can modify the default visibility of an entity using a **visibility modifier**.\
Some common visibility modifiers are:
- `pub`: makes the entity **public**, i.e. accessible from outside the module where it's defined, potentially from
- `pub`: makes the entity **public**, i.e. accessible from outside the module where it's defined, potentially from
other crates.
- `pub(crate)`: makes the entity public within the same **crate**, but not outside of it.
- `pub(super)`: makes the entity public within the parent module.
@ -46,4 +46,4 @@ The `active` field, instead, is private and can only be accessed from within the
## References
- The exercise for this section is located in `exercises/03_ticket_v1/04_visibility`
- The exercise for this section is located in `exercises/03_ticket_v1/04_visibility`

@ -1,6 +1,6 @@
# Encapsulation
Now that we have a basic understanding of modules and visibility, let's circle back to **encapsulation**.
Now that we have a basic understanding of modules and visibility, let's circle back to **encapsulation**.\
Encapsulation is the practice of hiding the internal representation of an object. It is most commonly
used to enforce some **invariants** on the object's state.
@ -14,16 +14,16 @@ struct Ticket {
}
```
If all fields are made public, there is no encapsulation.
If all fields are made public, there is no encapsulation.\
You must assume that the fields can be modified at any time, set to any value that's allowed by
their type. You can't rule out that a ticket might have an empty title or a status
their type. You can't rule out that a ticket might have an empty title or a status
that doesn't make sense.
To enforce stricter rules, we must keep the fields private[^newtype].
We can then provide public methods to interact with a `Ticket` instance.
To enforce stricter rules, we must keep the fields private[^newtype].
We can then provide public methods to interact with a `Ticket` instance.
Those public methods will have the responsibility of upholding our invariants (e.g. a title must not be empty).
If all fields are private, it is no longer possible to create a `Ticket` instance directly using the struct
If all fields are private, it is no longer possible to create a `Ticket` instance directly using the struct
instantiation syntax:
```rust
@ -35,9 +35,9 @@ let ticket = Ticket {
};
```
You've seen this in action in the previous exercise on visibility.
You've seen this in action in the previous exercise on visibility.\
We now need to provide one or more public **constructors**—i.e. static methods or functions that can be used
from outside the module to create a new instance of the struct.
from outside the module to create a new instance of the struct.\
Luckily enough we already have one: `Ticket::new`, as implemented in [a previous exercise](02_validation.md).
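The pattern can be sketched as follows. This is a simplified stand-in, not the course's actual `Ticket`: the `ticket` module, the two fields, the single validation rule and the `summary` method are assumptions made for this example.

```rust
mod ticket {
    pub struct Ticket {
        // Private fields: code outside this module can't read or set them directly.
        title: String,
        status: String,
    }

    impl Ticket {
        /// Public constructor: the only way to build a `Ticket` from outside the module.
        /// The rule enforced here (non-empty title) is chosen for illustration.
        pub fn new(title: String, status: String) -> Ticket {
            if title.is_empty() {
                panic!("Title cannot be empty");
            }
            Ticket { title, status }
        }

        /// A public method that exposes the private state in a controlled way.
        pub fn summary(&self) -> String {
            format!("{} ({})", self.title, self.status)
        }
    }
}

fn main() {
    let ticket = ticket::Ticket::new("Fix login".into(), "To-Do".into());
    assert_eq!(ticket.summary(), "Fix login (To-Do)");

    // This would not compile here, since the fields are private:
    // let invalid = ticket::Ticket { title: "".into(), status: "To-Do".into() };
}
```

Because every `Ticket` has to go through `Ticket::new`, the non-empty title rule holds for all instances created outside the module.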
## Accessor methods
@ -47,10 +47,10 @@ In summary:
- All `Ticket` fields are private
- We provide a public constructor, `Ticket::new`, that enforces our validation rules on creation
That's a good start, but it's not enough: apart from creating a `Ticket`, we also need to interact with it.
That's a good start, but it's not enough: apart from creating a `Ticket`, we also need to interact with it.
But how can we access the fields if they're private?
We need to provide **accessor methods**.
We need to provide **accessor methods**.\
Accessor methods are public methods that allow you to read the value of a private field (or fields) of a struct.
Rust doesn't have a built-in way to generate accessor methods for you, like some other languages do.
@ -60,4 +60,4 @@ You have to write them yourself—they're just regular methods.
- The exercise for this section is located in `exercises/03_ticket_v1/05_encapsulation`
[^newtype]: Or refine their type, a technique we'll explore [later on](../05_ticket_v2/15_outro.md).
[^newtype]: Or refine their type, a technique we'll explore [later on](../05_ticket_v2/15_outro.md).

@ -1,6 +1,6 @@
# Ownership
If you solved the previous exercise using what this course has taught you so far,
If you solved the previous exercise using what this course has taught you so far,
your accessor methods probably look like this:
```rust
@ -74,11 +74,11 @@ All these things are true at the same time for Rust:
2. As a developer, you rarely have to manage memory directly
3. You can't cause dangling pointers, double frees, and other memory-related bugs
Languages like Python, JavaScript, and Java give you 2. and 3., but not 1.
Languages like C or C++ give you 1., but neither 2. nor 3.
Languages like Python, JavaScript, and Java give you 2. and 3., but not 1.\
Languages like C or C++ give you 1., but neither 2. nor 3.
Depending on your background, 3. might sound a bit arcane: what is a "dangling pointer"?
What is a "double free"? Why are they dangerous?
Depending on your background, 3. might sound a bit arcane: what is a "dangling pointer"?
What is a "double free"? Why are they dangerous?\
Don't worry: we'll cover these concepts in more detail during the rest of the course.
For now, though, let's focus on learning how to work within Rust's ownership system.
@ -89,8 +89,8 @@ In Rust, each value has an **owner**, statically determined at compile-time.
There is only one owner for each value at any given time.
## Move semantics
Ownership can be transferred.
Ownership can be transferred.
If you own a value, for example, you can transfer ownership to another variable:
@ -113,9 +113,9 @@ impl Ticket {
}
```
`Ticket::description` takes ownership of the `Ticket` instance it's called on.
`Ticket::description` takes ownership of the `Ticket` instance it's called on.\
This is known as **move semantics**: ownership of the value (`self`) is **moved** from the caller to
the callee, and the caller can't use it anymore.
the callee, and the caller can't use it anymore.
That's exactly the language used by the compiler in the error message we saw earlier:
@ -152,10 +152,10 @@ To build _useful_ accessor methods we need to start working with **references**.
## Borrowing
It is desirable to have methods that can read the value of a variable without taking ownership of it.
It is desirable to have methods that can read the value of a variable without taking ownership of it.\
Programming would be quite limited otherwise. In Rust, that's done via **borrowing**.
Whenever you borrow a value, you get a **reference** to it.
Whenever you borrow a value, you get a **reference** to it.\
References are tagged with their privileges[^refine]:
- Immutable references (`&`) allow you to read the value, but not to mutate it
@ -173,20 +173,20 @@ To ensure these two properties, Rust has to introduce some restrictions on refer
- The owner can't mutate the value while it's being borrowed
- You can have as many immutable references as you want, as long as there are no mutable references
In a way, you can think of an immutable reference as a "read-only" lock on the value,
while a mutable reference is like a "read-write" lock.
In a way, you can think of an immutable reference as a "read-only" lock on the value,
while a mutable reference is like a "read-write" lock.
All these restrictions are enforced at compile-time by the borrow checker.
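Here is a small sketch of those rules in action. The `length` and `shout` functions are hypothetical helpers written for this example.

```rust
/// Reads the value through an immutable reference.
fn length(text: &String) -> usize {
    text.len()
}

/// Mutates the value through a mutable reference.
fn shout(text: &mut String) {
    text.push('!');
}

fn main() {
    let mut greeting = String::from("Hello");

    // Several immutable references can coexist.
    let first = &greeting;
    let second = &greeting;
    assert_eq!(length(first) + length(second), 10);

    // `first` and `second` are not used past this point,
    // so a mutable borrow is now allowed.
    shout(&mut greeting);
    assert_eq!(greeting, "Hello!");

    // The borrow checker would reject the following, because `reader`
    // would still be in use while `greeting` is mutably borrowed:
    // let reader = &greeting;
    // shout(&mut greeting);
    // println!("{}", reader);
}
```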
### Syntax
How do you borrow a value, in practice?
How do you borrow a value, in practice?\
By adding `&` or `&mut` **in front of a variable**, you're borrowing its value.
Careful though! The same symbols (`&` and `&mut`) in **front of a type** have a different meaning:
Careful though! The same symbols (`&` and `&mut`) in **front of a type** have a different meaning:
they denote a different type, a reference to the original type.
For example:
```rust
struct Configuration {
version: u32,
@ -220,18 +220,18 @@ fn f(number: &mut u32) -> &u32 {
## Breathe in, breathe out
Rust's ownership system can be a bit overwhelming at first.
But don't worry: it'll become second nature with practice.
Rust's ownership system can be a bit overwhelming at first.\
But don't worry: it'll become second nature with practice.\
And you're going to get a lot of practice over the rest of this chapter, as well as the rest of the course!
We'll revisit each concept multiple times to make sure you get familiar with them
and truly understand how they work.
Towards the end of this chapter we'll explain *why* Rust's ownership system is designed the way it is.
For the time being, focus on understanding the *how*. Take each compiler error as a learning opportunity!
Towards the end of this chapter we'll explain _why_ Rust's ownership system is designed the way it is.
For the time being, focus on understanding the _how_. Take each compiler error as a learning opportunity!
## References
- The exercise for this section is located in `exercises/03_ticket_v1/06_ownership`
[^refine]: This is a great mental model to start out, but it doesn't capture the _full_ picture.
We'll refine our understanding of references [later in the course](../07_threads/06_interior_mutability.md).
We'll refine our understanding of references [later in the course](../07_threads/06_interior_mutability.md).

@ -18,7 +18,7 @@ impl Ticket {
}
```
A sprinkle of `&` here and there did the trick!
A sprinkle of `&` here and there did the trick!\
We now have a way to access the fields of a `Ticket` instance without consuming it in the process.
Let's see how we can enhance our `Ticket` struct with **setter methods** next.
@ -46,7 +46,7 @@ impl Ticket {
}
```
It takes ownership of `self`, changes the title, and returns the modified `Ticket` instance.
It takes ownership of `self`, changes the title, and returns the modified `Ticket` instance.\
This is how you'd use it:
```rust
@ -55,8 +55,8 @@ let ticket = ticket.set_title("New title".into());
```
Since `set_title` takes ownership of `self` (i.e. it **consumes it**), we need to reassign the result to a variable.
In the example above we take advantage of **variable shadowing** to reuse the same variable name: when
you declare a new variable with the same name as an existing one, the new variable **shadows** the old one. This
In the example above we take advantage of **variable shadowing** to reuse the same variable name: when
you declare a new variable with the same name as an existing one, the new variable **shadows** the old one. This
is a common pattern in Rust code.
`self`-setters work quite nicely when you need to change multiple fields at once: you can chain multiple calls together!
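A minimal sketch of such a chain, assuming a simplified `Ticket` with three `String` fields and no validation. The `set_description` and `set_status` methods are assumed to follow the same shape as `set_title`.

```rust
struct Ticket {
    title: String,
    description: String,
    status: String,
}

impl Ticket {
    // Each setter consumes `self`, updates one field, and hands the ticket back.
    fn set_title(mut self, new_title: String) -> Self {
        self.title = new_title;
        self
    }

    fn set_description(mut self, new_description: String) -> Self {
        self.description = new_description;
        self
    }

    fn set_status(mut self, new_status: String) -> Self {
        self.status = new_status;
        self
    }
}

fn main() {
    let ticket = Ticket {
        title: "Old title".into(),
        description: "Old description".into(),
        status: "To-Do".into(),
    };

    // Every call returns the updated `Ticket`, so the next call can use it right away.
    let ticket = ticket
        .set_title("New title".into())
        .set_description("New description".into())
        .set_status("In Progress".into());

    assert_eq!(ticket.title, "New title");
    assert_eq!(ticket.description, "New description");
    assert_eq!(ticket.status, "In Progress");
}
```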
@ -82,8 +82,8 @@ impl Ticket {
}
```
This time the method takes a mutable reference to `self` as input, changes the title, and that's it.
Nothing is returned.
This time the method takes a mutable reference to `self` as input, changes the title, and that's it.
Nothing is returned.
You'd use it like this:

@ -5,16 +5,16 @@ Now it's a good time to take a look under the hood: let's talk about **memory**.
## Stack and heap
When discussing memory, you'll often hear people talk about the **stack** and the **heap**.
When discussing memory, you'll often hear people talk about the **stack** and the **heap**.\
These are two different memory regions used by programs to store data.
Let's start with the stack.
## Stack
The **stack** is a **LIFO** (Last In, First Out) data structure.
The **stack** is a **LIFO** (Last In, First Out) data structure.\
When you call a function, a new **stack frame** is added on top of the stack. That stack frame stores
the function's arguments, local variables and a few "bookkeeping" values.
the function's arguments, local variables and a few "bookkeeping" values.\
When the function returns, the stack frame is popped off the stack[^stack-overflow].
```text
@ -25,22 +25,22 @@ When the function returns, the stack frame is popped off the stack[^stack-overfl
+-----------------+ +-----------------+ +-----------------+
```
From an operational point of view, stack allocation/de-allocation is **very fast**.
From an operational point of view, stack allocation/de-allocation is **very fast**.\
We are always pushing and popping data from the top of the stack, so we don't need to search for free memory.
We also don't have to worry about fragmentation: the stack is a single contiguous block of memory.
### Rust
Rust will often allocate data on the stack.
You have a `u32` input argument in a function? Those 32 bits will be on the stack.
You define a local variable of type `i64`? Those 64 bits will be on the stack.
Rust will often allocate data on the stack.\
You have a `u32` input argument in a function? Those 32 bits will be on the stack.\
You define a local variable of type `i64`? Those 64 bits will be on the stack.\
It all works quite nicely because the size of those integers is known at compile time, therefore
the compiled program knows how much space it needs to reserve on the stack for them.
### `std::mem::size_of`
You can verify how much space a type would take on the stack
using the [`std::mem::size_of`](https://doc.rust-lang.org/std/mem/fn.size_of.html) function.
You can verify how much space a type would take on the stack
using the [`std::mem::size_of`](https://doc.rust-lang.org/std/mem/fn.size_of.html) function.
For a `u8`, for example:
@ -57,6 +57,6 @@ assert_eq!(std::mem::size_of::<u8>(), 1);
- The exercise for this section is located in `exercises/03_ticket_v1/08_stack`
[^stack-overflow]: If you have nested function calls, each function pushes its data onto the stack when it's called but
it doesn't pop it off until the innermost function returns.
If you have too many nested function calls, you can run out of stack space—the stack is not infinite!
That's called a [**stack overflow**](https://en.wikipedia.org/wiki/Stack_overflow).
it doesn't pop it off until the innermost function returns.
If you have too many nested function calls, you can run out of stack space—the stack is not infinite!
That's called a [**stack overflow**](https://en.wikipedia.org/wiki/Stack_overflow).

@ -6,14 +6,14 @@ That's where the **heap** comes in.
## Heap allocations
You can visualize the heap as a big chunk of memory—a huge array, if you will.
You can visualize the heap as a big chunk of memory—a huge array, if you will.\
Whenever you need to store data on the heap, you ask a special program, the **allocator**, to reserve for you
a subset of the heap. We call this interaction (and the memory you reserved) a **heap allocation**.
If the allocation succeeds, the allocator will give you a **pointer** to the start of the reserved block.
## No automatic de-allocation
The heap is structured quite differently from the stack.
The heap is structured quite differently from the stack.\
Heap allocations are not contiguous, they can be located anywhere inside the heap.
```
@ -29,15 +29,15 @@ calling the allocator again to **free** the memory you no longer need.
## Performance
The heap's flexibility comes at a cost: heap allocations are **slower** than stack allocations.
There's a lot more bookkeeping involved!
If you read articles about performance optimization you'll often be advised to minimize heap allocations
There's a lot more bookkeeping involved!\
If you read articles about performance optimization you'll often be advised to minimize heap allocations
and prefer stack-allocated data whenever possible.
## `String`'s memory layout
When you create a local variable of type `String`,
Rust is forced to allocate on the heap[^empty]: it doesn't know in advance how much text you're going to put in it,
so it can't reserve the right amount of space on the stack.
When you create a local variable of type `String`,
Rust is forced to allocate on the heap[^empty]: it doesn't know in advance how much text you're going to put in it,
so it can't reserve the right amount of space on the stack.\
But a `String` is not _entirely_ heap-allocated, it also keeps some data on the stack. In particular:
- The **pointer** to the heap region you reserved.
@ -65,11 +65,11 @@ Heap: | ? | ? | ? | ? | ? |
+---+---+---+---+---+
```
We asked for a `String` that can hold up to 5 bytes of text.
`String::with_capacity` goes to the allocator and asks for 5 bytes of heap memory. The allocator returns
a pointer to the start of that memory block.
The `String` is empty, though. On the stack, we keep track of this information by distinguishing between
the length and the capacity: this `String` can hold up to 5 bytes, but it currently holds 0 bytes of
We asked for a `String` that can hold up to 5 bytes of text.\
`String::with_capacity` goes to the allocator and asks for 5 bytes of heap memory. The allocator returns
a pointer to the start of that memory block.\
The `String` is empty, though. On the stack, we keep track of this information by distinguishing between
the length and the capacity: this `String` can hold up to 5 bytes, but it currently holds 0 bytes of
actual text.
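You can observe the difference between length and capacity with the `len` and `capacity` methods of `String`. Note one detail in this sketch: the standard library only promises a capacity of *at least* the requested size, so the check uses `>=` rather than `==`.

```rust
fn main() {
    // Ask for a `String` with room for (at least) 5 bytes of text.
    let s = String::with_capacity(5);

    // No text has been stored yet.
    assert_eq!(s.len(), 0);
    assert!(s.is_empty());

    // The heap buffer has already been reserved, though.
    assert!(s.capacity() >= 5);
}
```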
If you push some text into the `String`, the situation will change:
@ -96,40 +96,40 @@ Three of the five bytes on the heap are used to store the characters `H`, `e`, a
### `usize`
How much space do we need to store pointer, length and capacity on the stack?
How much space do we need to store pointer, length and capacity on the stack?\
It depends on the **architecture** of the machine you're running on.
Every memory location on your machine has an [**address**](https://en.wikipedia.org/wiki/Memory_address), commonly
represented as an unsigned integer.
Depending on the maximum size of the address space (i.e. how much memory your machine can address),
this integer can have a different size. Most modern machines use either a 32-bit or a 64-bit address space.
Depending on the maximum size of the address space (i.e. how much memory your machine can address),
this integer can have a different size. Most modern machines use either a 32-bit or a 64-bit address space.
Rust abstracts away these architecture-specific details by providing the `usize` type:
an unsigned integer that's as big as the number of bytes needed to address memory on your machine.
On a 32-bit machine, `usize` is equivalent to `u32`. On a 64-bit machine, it matches `u64`.
Capacity, length and pointers are all represented as `usize`s in Rust[^equivalence].
Capacity, length and pointers are all represented as `usize`s in Rust[^equivalence].
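These claims can be checked with `std::mem::size_of`. In the sketch below, the last assertion relies on `String` being made of exactly three pointer-sized values; that matches current toolchains, but treat it as an observation about the implementation rather than a documented guarantee.

```rust
use std::mem::size_of;

fn main() {
    // A pointer to sized data is as wide as `usize`.
    assert_eq!(size_of::<*const u8>(), size_of::<usize>());

    // `usize` is 8 bytes on 64-bit targets and 4 bytes on 32-bit targets.
    if cfg!(target_pointer_width = "64") {
        assert_eq!(size_of::<usize>(), size_of::<u64>());
    } else if cfg!(target_pointer_width = "32") {
        assert_eq!(size_of::<usize>(), size_of::<u32>());
    }

    // Pointer + length + capacity: three `usize`-sized values on the stack.
    // (Implementation detail of the standard library, see the note above.)
    assert_eq!(size_of::<String>(), 3 * size_of::<usize>());
}
```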
### No `std::mem::size_of` for the heap
`std::mem::size_of` returns the amount of space a type would take on the stack,
which is also known as the **size of the type**.
`std::mem::size_of` returns the amount of space a type would take on the stack,
which is also known as the **size of the type**.
> What about the memory buffer that `String` is managing on the heap? Isn't that
> part of the size of `String`?
No!
That heap allocation is a **resource** that `String` is managing.
It's not considered to be part of the `String` type by the compiler.
No!\
That heap allocation is a **resource** that `String` is managing.
It's not considered to be part of the `String` type by the compiler.
`std::mem::size_of` doesn't know (or care) about additional heap-allocated data
`std::mem::size_of` doesn't know (or care) about additional heap-allocated data
that a type might manage or refer to via pointers, as is the case with `String`,
therefore it doesn't track its size.
Unfortunately there is no equivalent of `std::mem::size_of` to measure the amount of
heap memory that a certain value is allocating at runtime. Some types might
provide methods to inspect their heap usage (e.g. `String`'s `capacity` method),
but there is no general-purpose "API" to retrieve runtime heap usage in Rust.
but there is no general-purpose "API" to retrieve runtime heap usage in Rust.\
You can, however, use a memory profiler tool (e.g. [DHAT](https://valgrind.org/docs/manual/dh-manual.html)
or [a custom allocator](https://docs.rs/dhat/latest/dhat/)) to inspect the heap usage of your program.
@ -138,9 +138,9 @@ or [a custom allocator](https://docs.rs/dhat/latest/dhat/)) to inspect the heap
- The exercise for this section is located in `exercises/03_ticket_v1/09_heap`
[^empty]: `std` doesn't allocate if you create an **empty** `String` (i.e. `String::new()`).
Heap memory will be reserved when you push data into it for the first time.
Heap memory will be reserved when you push data into it for the first time.
[^equivalence]: The size of a pointer depends on the operating system too.
In certain environments, a pointer is **larger** than a memory address (e.g. [CHERI](https://blog.acolyer.org/2019/05/28/cheri-abi/)).
Rust makes the simplifying assumption that pointers are the same size as memory addresses,
which is true for most modern systems you're likely to encounter.
In certain environments, a pointer is **larger** than a memory address (e.g. [CHERI](https://blog.acolyer.org/2019/05/28/cheri-abi/)).
Rust makes the simplifying assumption that pointers are the same size as memory addresses,
which is true for most modern systems you're likely to encounter.

@ -2,7 +2,7 @@
What about references, like `&String` or `&mut String`? How are they represented in memory?
Most references[^fat] in Rust are represented, in memory, as a pointer to a memory location.
Most references[^fat] in Rust are represented, in memory, as a pointer to a memory location.\
It follows that their size is the same as the size of a pointer, a `usize`.
You can verify this using `std::mem::size_of`:
@ -12,7 +12,7 @@ assert_eq!(std::mem::size_of::<&String>(), 8);
assert_eq!(std::mem::size_of::<&mut String>(), 8);
```
A `&String`, in particular, is a pointer to the memory location where the `String`'s metadata is stored.
A `&String`, in particular, is a pointer to the memory location where the `String`'s metadata is stored.\
If you run this snippet:
```rust
@ -38,11 +38,11 @@ Heap | H | e | y | ? | ? |
```
It's a pointer to a pointer to the heap-allocated data, if you will.
The same goes for `&mut String`.
The same goes for `&mut String`.
## Not all pointers point to the heap
The example above should clarify one thing: not all pointers point to the heap.
The example above should clarify one thing: not all pointers point to the heap.\
They just point to a memory location, which _may_ be on the heap, but doesn't have to be.
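
As a minimal sketch of this point (the helper names `read_through` and `reference_sizes` are made up for illustration), a reference to a plain `u32` living on the stack behaves like any other reference and is just as wide as a `&String`:

```rust
use std::mem::size_of;

// Hypothetical helper: reads a value through a reference.
// The reference doesn't care whether the value lives on the stack or the heap.
fn read_through(r: &u32) -> u32 {
    *r
}

// Sizes, in bytes, of `&u32`, `&String` and `usize` on the current target.
fn reference_sizes() -> (usize, usize, usize) {
    (size_of::<&u32>(), size_of::<&String>(), size_of::<usize>())
}
```

Calling `read_through(&n)` with a local `let n = 42;` involves no heap allocation at all: the pointer targets `n`'s slot on the stack.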
## References

View file

@ -1,6 +1,6 @@
# Destructors
When introducing the heap, we mentioned that you're responsible for freeing the memory you allocate.
When introducing the heap, we mentioned that you're responsible for freeing the memory you allocate.\
When introducing the borrow-checker, we also stated that you rarely have to manage memory directly in Rust.
These two statements might seem contradictory at first.
@ -13,7 +13,7 @@ The **scope** of a variable is the region of Rust code where that variable is va
The scope of a variable starts with its declaration.
It ends when one of the following happens:
1. the block (i.e. the code between `{}`) where the variable was declared ends
1. the block (i.e. the code between `{}`) where the variable was declared ends
```rust
fn main() {
// `x` is not yet in scope here
@ -21,28 +21,28 @@ It ends when one of the following happens:
let x = "World".to_string(); // <-- x's scope starts here...
let h = "!".to_string(); // |
} // <-------------- ...and ends here
```
```
2. ownership of the variable is transferred to someone else (e.g. a function or another variable)
```rust
fn compute(t: String) {
// Do something [...]
}
fn main() {
let s = "Hello".to_string(); // <-- s's scope starts here...
// |
compute(s); // <------------------- ...and ends here
// because `s` is moved into `compute`
}
```
}
```
## Destructors
When the owner of a value goes out of scope, Rust invokes its **destructor**.
When the owner of a value goes out of scope, Rust invokes its **destructor**.\
The destructor tries to clean up the resources used by that value—in particular, whatever memory it allocated.
You can manually invoke the destructor of a value by passing it to `std::mem::drop`.
That's why you'll often hear Rust developers saying "that value has been **dropped**" as a way to state that a value
You can manually invoke the destructor of a value by passing it to `std::mem::drop`.\
That's why you'll often hear Rust developers saying "that value has been **dropped**" as a way to state that a value
has gone out of scope and its destructor has been invoked.
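
Here is a small sketch that makes drops observable. It assumes a hypothetical `Tracked` type whose `Drop` implementation (a trait we'll meet later in the course) bumps a shared counter; `Rc` and `Cell` are only used to share that counter and are not part of the mechanism itself:

```rust
use std::cell::Cell;
use std::rc::Rc;

// Hypothetical type: every instance bumps a shared counter when dropped.
struct Tracked {
    drops: Rc<Cell<u32>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

// Returns the counter value at three points:
// before any drop, after an inner block ends, after an explicit `drop`.
fn drop_counts() -> (u32, u32, u32) {
    let drops = Rc::new(Cell::new(0));
    let a = Tracked { drops: Rc::clone(&drops) };
    let before = drops.get();
    {
        let _b = Tracked { drops: Rc::clone(&drops) };
    } // `_b` goes out of scope here: its destructor runs
    let after_block = drops.get();
    drop(a); // `a` is moved into `drop`: its destructor runs
    let after_explicit = drops.get();
    (before, after_block, after_explicit)
}
```

The counter goes up once when the inner block ends and once more when `a` is passed to `drop`.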
### Visualizing drop points
@ -55,7 +55,7 @@ fn main() {
let x = "World".to_string();
let h = "!".to_string();
}
```
```
It's equivalent to:
@ -100,11 +100,11 @@ fn main() {
}
```
Notice the difference: even though `s` is no longer valid after `compute` is called in `main`, there is no `drop(s)`
Notice the difference: even though `s` is no longer valid after `compute` is called in `main`, there is no `drop(s)`
in `main`.
When you transfer ownership of a value to a function, you're also **transferring the responsibility of cleaning it up**.
When you transfer ownership of a value to a function, you're also **transferring the responsibility of cleaning it up**.
This ensures that the destructor for a value is called **at most[^leak] once**, preventing
This ensures that the destructor for a value is called **at most[^leak] once**, preventing
[double free bugs](https://owasp.org/www-community/vulnerabilities/Doubly_freeing_memory) by design.
### Use after drop
@ -129,12 +129,12 @@ error[E0382]: use of moved value: `x`
| ^ value used here after move
```
Drop **consumes** the value it's called on, meaning that the value is no longer valid after the call.
Drop **consumes** the value it's called on, meaning that the value is no longer valid after the call.\
The compiler will therefore prevent you from using it, avoiding [use-after-free bugs](https://owasp.org/www-community/vulnerabilities/Using_freed_memory).
### Dropping references
What if a variable contains a reference?
What if a variable contains a reference?\
For example:
```rust
@ -143,7 +143,7 @@ let y = &x;
drop(y);
```
When you call `drop(y)`... nothing happens.
When you call `drop(y)`... nothing happens.\
If you actually try to compile this code, you'll get a warning:
```text
@ -158,7 +158,7 @@ warning: calls to `std::mem::drop` with a reference
|
```
It goes back to what we said earlier: we only want to call the destructor once.
It goes back to what we said earlier: we only want to call the destructor once.\
You can have multiple references to the same value—if we called the destructor for the value they point at
when one of them goes out of scope, what would happen to the others?
They would refer to a memory location that's no longer valid: a so-called [**dangling pointer**](https://en.wikipedia.org/wiki/Dangling_pointer),
@ -170,4 +170,4 @@ Rust's ownership system rules out these kinds of bugs by design.
- The exercise for this section is located in `exercises/03_ticket_v1/11_destructor`
[^leak]: Rust doesn't guarantee that destructors will run. They won't, for example, if
you explicitly choose to [leak memory](../07_threads/03_leak.md).
you explicitly choose to [leak memory](../07_threads/03_leak.md).

View file

@ -1,7 +1,7 @@
# Wrapping up
We've covered a lot of foundational Rust concepts in this chapter.
Before moving on, let's go through one last exercise to consolidate what we've learned.
We've covered a lot of foundational Rust concepts in this chapter.\
Before moving on, let's go through one last exercise to consolidate what we've learned.
You'll have minimal guidance this time—just the exercise description and the tests to guide you.
## References

View file

@ -1,9 +1,9 @@
# Traits
In the previous chapter we covered the basics of Rust's type and ownership system.
In the previous chapter we covered the basics of Rust's type and ownership system.\
It's time to dig deeper: we'll explore **traits**, Rust's take on interfaces.
Once you learn about traits, you'll start seeing their fingerprints all over the place.
Once you learn about traits, you'll start seeing their fingerprints all over the place.\
In fact, you've already seen traits in action throughout the previous chapter, e.g. `.into()` invocations as well
as operators like `==` and `+`.
@ -16,9 +16,9 @@ On top of traits as a concept, we'll also cover some of the key traits that are
- `Sized`, to mark types with a known size
- `Drop`, for custom cleanup logic
Since we'll be talking about conversions, we'll seize the opportunity to plug some of the "knowledge gaps"
Since we'll be talking about conversions, we'll seize the opportunity to plug some of the "knowledge gaps"
from the previous chapter—e.g. what is `"A title"`, exactly? Time to learn more about slices too!
## References
- The exercise for this section is located in `exercises/04_traits/00_intro`
- The exercise for this section is located in `exercises/04_traits/00_intro`

View file

@ -10,7 +10,7 @@ pub struct Ticket {
}
```
All our tests, so far, have been making assertions using `Ticket`'s fields.
All our tests, so far, have been making assertions using `Ticket`'s fields.
```rust
assert_eq!(ticket.title(), "A new title");
@ -38,15 +38,15 @@ error[E0369]: binary operation `==` cannot be applied to type `Ticket`
note: an implementation of `PartialEq` might be missing for `Ticket`
```
`Ticket` is a new type. Out of the box, there is **no behavior attached to it**.
`Ticket` is a new type. Out of the box, there is **no behavior attached to it**.\
Rust doesn't magically infer how to compare two `Ticket` instances just because they contain `String`s.
The Rust compiler is nudging us in the right direction though: it's suggesting that we might be missing an implementation
The Rust compiler is nudging us in the right direction though: it's suggesting that we might be missing an implementation
of `PartialEq`. `PartialEq` is a **trait**!
## What are traits?
Traits are Rust's way of defining **interfaces**.
Traits are Rust's way of defining **interfaces**.\
A trait defines a set of methods that a type must implement to satisfy the trait's contract.
### Defining a trait
@ -69,7 +69,7 @@ trait MaybeZero {
### Implementing a trait
To implement a trait for a type we use the `impl` keyword, just like we do for regular[^inherent] methods,
To implement a trait for a type we use the `impl` keyword, just like we do for regular[^inherent] methods,
but the syntax is a bit different:
```rust
@ -117,15 +117,15 @@ use crate::MaybeZero;
This is not necessary if:
- The trait is defined in the same module where the invocation occurs.
- The trait is defined in the standard library's **prelude**.
The prelude is a set of traits and types that are automatically imported into every Rust program.
- The trait is defined in the standard library's **prelude**.
The prelude is a set of traits and types that are automatically imported into every Rust program.
It's as if `use std::prelude::*;` was added at the beginning of every Rust module.
You can find the list of traits and types in the prelude in the
You can find the list of traits and types in the prelude in the
[Rust documentation](https://doc.rust-lang.org/std/prelude/index.html).
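
A minimal sketch of the import rule, using a made-up `shapes` module with a hypothetical `Area` trait (neither belongs to the course exercises):

```rust
mod shapes {
    // Hypothetical trait, defined in a different module than its caller.
    pub trait Area {
        fn area(&self) -> f64;
    }

    pub struct Square(pub f64);

    impl Area for Square {
        fn area(&self) -> f64 {
            self.0 * self.0
        }
    }
}

// Without this import, `.area()` below would not resolve:
// the trait must be in scope to call its methods.
use shapes::Area;

fn square_area(side: f64) -> f64 {
    shapes::Square(side).area()
}
```

No `use` statement is needed for prelude traits such as `Into` or `PartialEq`, which is why `.into()` and `==` worked earlier without any imports.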
## References
- The exercise for this section is located in `exercises/04_traits/01_trait`
[^inherent]: A method defined directly on a type, without using a trait, is also known as an **inherent method**.
[^inherent]: A method defined directly on a type, without using a trait, is also known as an **inherent method**.

View file

@ -44,7 +44,7 @@ fn main() {
## One implementation
There are limitations to the trait implementations you can write.
There are limitations to the trait implementations you can write.\
The simplest and most straightforward one: you can't implement the same trait twice,
in a crate, for the same type.
@ -101,7 +101,7 @@ Imagine the following situation:
- Crate `C` provides a (different) implementation of the `IsEven` trait for `u32`
- Crate `D` depends on both `B` and `C` and calls `1.is_even()`
Which implementation should be used? The one defined in `B`? Or the one defined in `C`?
Which implementation should be used? The one defined in `B`? Or the one defined in `C`?\
There's no good answer, therefore the orphan rule was defined to prevent this scenario.
Thanks to the orphan rule, neither crate `B` nor crate `C` would compile.
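
To make the rule concrete, here is a sketch of what is allowed inside a single crate. The `Meters` type is a made-up example; the commented-out `impl` is the kind of code the orphan rule rejects:

```rust
use std::fmt;

// Local trait, foreign type: allowed, because the trait is defined here.
trait IsEven {
    fn is_even(&self) -> bool;
}

impl IsEven for u32 {
    fn is_even(&self) -> bool {
        self % 2 == 0
    }
}

// Foreign trait, local type: allowed, because the type is defined here.
struct Meters(u32);

impl fmt::Display for Meters {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} m", self.0)
    }
}

// Foreign trait, foreign type: rejected by the orphan rule.
// impl fmt::Display for Vec<u32> { /* ... */ }
```

In both accepted cases, at least one of the trait and the type is local to the crate, so no other crate can provide a competing implementation.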
@ -111,6 +111,6 @@ Thanks to the orphan rule, neither crate `B` nor crate `C` would compile.
## Further reading
- There are some caveats and exceptions to the orphan rule as stated above.
Check out [the reference](https://doc.rust-lang.org/reference/items/implementations.html#trait-implementation-coherence)
if you want to get familiar with its nuances.
- There are some caveats and exceptions to the orphan rule as stated above.
Check out [the reference](https://doc.rust-lang.org/reference/items/implementations.html#trait-implementation-coherence)
if you want to get familiar with its nuances.

View file

@ -5,11 +5,11 @@ Operator overloading is the ability to define custom behavior for operators like
## Operators are traits
In Rust, operators are traits.
In Rust, operators are traits.\
For each operator, there is a corresponding trait that defines the behavior of that operator.
By implementing that trait for your type, you **unlock** the usage of the corresponding operators.
By implementing that trait for your type, you **unlock** the usage of the corresponding operators.
For example, the [`PartialEq` trait](https://doc.rust-lang.org/std/cmp/trait.PartialEq.html) defines the behavior of
For example, the [`PartialEq` trait](https://doc.rust-lang.org/std/cmp/trait.PartialEq.html) defines the behavior of
the `==` and `!=` operators:
```rust
@ -33,9 +33,9 @@ and replace `x == y` with `x.eq(y)`. It's syntactic sugar!
This is the correspondence for the main operators:
| Operator | Trait |
|--------------------------|-------------------------------------------------------------------------|
| ------------------------ | ----------------------------------------------------------------------- |
| `+` | [`Add`](https://doc.rust-lang.org/std/ops/trait.Add.html) |
| `-` | [`Sub`](https://doc.rust-lang.org/std/ops/trait.Sub.html) |
| `-` | [`Sub`](https://doc.rust-lang.org/std/ops/trait.Sub.html) |
| `*` | [`Mul`](https://doc.rust-lang.org/std/ops/trait.Mul.html) |
| `/` | [`Div`](https://doc.rust-lang.org/std/ops/trait.Div.html) |
| `%` | [`Rem`](https://doc.rust-lang.org/std/ops/trait.Rem.html) |
@ -47,9 +47,9 @@ while comparison ones live in the [`std::cmp`](https://doc.rust-lang.org/std/cmp
## Default implementations
The comment on `PartialEq::ne` states that "`ne` is a provided method".
It means that `PartialEq` provides a **default implementation** for `ne` in the trait definition—the `{ ... }` elided
block in the definition snippet.
The comment on `PartialEq::ne` states that "`ne` is a provided method".\
It means that `PartialEq` provides a **default implementation** for `ne` in the trait definition—the `{ ... }` elided
block in the definition snippet.\
If we expand the elided block, it looks like this:
```rust
@ -62,7 +62,7 @@ pub trait PartialEq {
}
```
It's what you expect: `ne` is the negation of `eq`.
It's what you expect: `ne` is the negation of `eq`.\
Since a default implementation is provided, you can skip implementing `ne` when you implement `PartialEq` for your type.
It's enough to implement `eq`:
@ -80,7 +80,7 @@ impl PartialEq for WrappingU8 {
}
```
You are not forced to use the default implementation though.
You are not forced to use the default implementation though.
You can choose to override it when you implement the trait:
```rust

View file

@ -1,7 +1,7 @@
# Derive macros
Implementing `PartialEq` for `Ticket` was a bit tedious, wasn't it?
You had to manually compare each field of the struct.
You had to manually compare each field of the struct.
## Destructuring syntax
@ -24,7 +24,7 @@ impl PartialEq for Ticket {
```
If the definition of `Ticket` changes, the compiler will error out, complaining that your
destructuring is no longer exhaustive.
destructuring is no longer exhaustive.\
You can also rename struct fields, to avoid variable shadowing:
```rust
@ -55,13 +55,13 @@ You've already encountered a few macros in past exercises:
- `assert_eq!` and `assert!`, in the test cases
- `println!`, to print to the console
Rust macros are **code generators**.
Rust macros are **code generators**.\
They generate new Rust code based on the input you provide, and that generated code is then compiled alongside
the rest of your program. Some macros are built into Rust's standard library, but you can also
write your own. We won't be creating our own macros in this course, but you can find some useful
pointers in the ["Further reading" section](#further-reading).
### Inspection
### Inspection
Some IDEs let you expand a macro to inspect the generated code. If that's not possible, you can use
[`cargo-expand`](https://github.com/dtolnay/cargo-expand).
@ -81,7 +81,7 @@ struct Ticket {
Derive macros are used to automate the implementation of common (and "obvious") traits for custom types.
In the example above, the `PartialEq` trait is automatically implemented for `Ticket`.
If you expand the macro, you'll see that the generated code is functionally equivalent to the one you wrote manually,
If you expand the macro, you'll see that the generated code is functionally equivalent to the one you wrote manually,
although a bit more cumbersome to read:
```rust
@ -104,4 +104,4 @@ The compiler will nudge you to derive traits when possible.
## Further reading
- [The little book of Rust macros](https://veykril.github.io/tlborm/)
- [Proc macro workshop](https://github.com/dtolnay/proc-macro-workshop)
- [Proc macro workshop](https://github.com/dtolnay/proc-macro-workshop)

View file

@ -1,19 +1,19 @@
# Trait bounds
# Trait bounds
We've seen two use cases for traits so far:
- Unlocking "built-in" behaviour (e.g. operator overloading)
- Adding new behaviour to existing types (i.e. extension traits)
There's a third use case: **generic programming**.
There's a third use case: **generic programming**.
## The problem
All our functions and methods, so far, have been working with **concrete types**.
All our functions and methods, so far, have been working with **concrete types**.\
Code that operates on concrete types is usually straightforward to write and understand. But it's also
limited in its reusability.
limited in its reusability.\
Let's imagine, for example, that we want to write a function that returns `true` if an integer is even.
Working with concrete types, we'd have to write a separate function for each integer type we want to
Working with concrete types, we'd have to write a separate function for each integer type we want to
support:
```rust
@ -54,7 +54,7 @@ The duplication remains.
## Generic programming
We can do better using **generics**.
We can do better using **generics**.\
Generics allow us to write code that works with a **type parameter** instead of a concrete type:
```rust
@ -68,19 +68,19 @@ where
}
```
`print_if_even` is a **generic function**.
`print_if_even` is a **generic function**.\
It isn't tied to a specific input type. Instead, it works with any type `T` that:
- Implements the `IsEven` trait.
- Implements the `Debug` trait.
This contract is expressed with a **trait bound**: `T: IsEven + Debug`.
This contract is expressed with a **trait bound**: `T: IsEven + Debug`.\
The `+` symbol is used to require that `T` implements multiple traits. `T: IsEven + Debug` is equivalent to
"where `T` implements `IsEven` **and** `Debug`".
## Trait bounds
What purpose do trait bounds serve in `print_if_even`?
What purpose do trait bounds serve in `print_if_even`?\
To find out, let's try to remove them:
```rust
@ -114,9 +114,9 @@ help: consider restricting type parameter `T`
| +++++++++++++++++
```
Without trait bounds, the compiler doesn't know what `T` **can do**.
It doesn't know that `T` has an `is_even` method, and it doesn't know how to format `T` for printing.
From the compiler's point of view, a bare `T` has no behaviour at all.
Without trait bounds, the compiler doesn't know what `T` **can do**.\
It doesn't know that `T` has an `is_even` method, and it doesn't know how to format `T` for printing.
From the compiler's point of view, a bare `T` has no behaviour at all.\
Trait bounds restrict the set of types that can be used by ensuring that the behaviour required by the function
body is present.
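
A self-contained sketch of the idea, assuming a simplified `IsEven` trait and a made-up `describe` function that returns a `String` instead of printing:

```rust
use std::fmt::Debug;

trait IsEven {
    fn is_even(&self) -> bool;
}

impl IsEven for u32 {
    fn is_even(&self) -> bool {
        self % 2 == 0
    }
}

impl IsEven for i64 {
    fn is_even(&self) -> bool {
        self % 2 == 0
    }
}

// `T: IsEven` unlocks `.is_even()`, `T: Debug` unlocks `{:?}` formatting.
// Remove either bound and the body stops compiling.
fn describe<T: IsEven + Debug>(n: T) -> String {
    if n.is_even() {
        format!("{:?} is even", n)
    } else {
        format!("{:?} is odd", n)
    }
}
```

The same function body serves both `u32` and `i64`. A type that lacks either trait, such as `f64` here, is rejected at the call site rather than inside `describe`.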
@ -147,8 +147,8 @@ fn print_if_even<T: IsEven + Debug>(n: T) {
## Syntax: meaningful names
In the examples above, we used `T` as the type parameter name. This is a common convention when a function has
only one type parameter.
In the examples above, we used `T` as the type parameter name. This is a common convention when a function has
only one type parameter.\
Nothing stops you from using a more meaningful name, though:
```rust
@ -158,17 +158,17 @@ fn print_if_even<Number: IsEven + Debug>(n: Number) {
```
It is actually **desirable** to use meaningful names when there are multiple type parameters at play or when the name
`T` doesn't convey enough information about the type's role in the function.
`T` doesn't convey enough information about the type's role in the function.
Maximize clarity and readability when naming type parameters, just as you would with variables or function parameters.
Follow Rust's conventions though: use upper camel case (e.g. `Number`) for type parameter names.
## The function signature is king
You may wonder why we need trait bounds at all. Can't the compiler infer the required traits from the function's body?
It could, but it won't.
The rationale is the same as for [explicit type annotations on function parameters](../02_basic_calculator/02_variables.md#function-arguments-are-variables):
each function signature is a contract between the caller and the callee, and the terms must be explicitly stated.
This allows for better error messages, better documentation, fewer unintentional breakages across versions,
You may wonder why we need trait bounds at all. Can't the compiler infer the required traits from the function's body?\
It could, but it won't.\
The rationale is the same as for [explicit type annotations on function parameters](../02_basic_calculator/02_variables.md#function-arguments-are-variables):
each function signature is a contract between the caller and the callee, and the terms must be explicitly stated.
This allows for better error messages, better documentation, fewer unintentional breakages across versions,
and faster compilation times.
## References

View file

@ -1,6 +1,6 @@
# String slices
Throughout the previous chapters you've seen quite a few **string literals** being used in the code,
Throughout the previous chapters you've seen quite a few **string literals** being used in the code,
like `"To-Do"` or `"A ticket description"`.
They were always followed by a call to `.to_string()` or `.into()`. It's time to understand why!
@ -12,12 +12,12 @@ You define a string literal by enclosing the raw text in double quotes:
let s = "Hello, world!";
```
The type of `s` is `&str`, a **reference to a string slice**.
The type of `s` is `&str`, a **reference to a string slice**.
## Memory layout
`&str` and `String` are different types—they're not interchangeable.
Let's recall the memory layout of a `String` from our
`&str` and `String` are different types—they're not interchangeable.\
Let's recall the memory layout of a `String` from our
[previous exploration](../03_ticket_v1/09_heap.md).
If we run:
@ -41,25 +41,25 @@ Heap: | H | e | l | l | o |
+---+---+---+---+---+
```
If you remember, we've [also examined](../03_ticket_v1/10_references_in_memory.md)
If you remember, we've [also examined](../03_ticket_v1/10_references_in_memory.md)
how a `&String` is laid out in memory:
```text
--------------------------------------
| |
+----v----+--------+----------+ +----|----+
| pointer | length | capacity | | pointer |
| | | 5 | 5 | | |
+----|----+--------+----------+ +---------+
| s &s
|
v
+---+---+---+---+---+
| H | e | l | l | o |
+---+---+---+---+---+
--------------------------------------
| |
+----v----+--------+----------+ +----|----+
| pointer | length | capacity | | pointer |
| | | 5 | 5 | | |
+----|----+--------+----------+ +---------+
| s &s
|
v
+---+---+---+---+---+
| H | e | l | l | o |
+---+---+---+---+---+
```
`&String` points to the memory location where the `String`'s metadata is stored.
`&String` points to the memory location where the `String`'s metadata is stored.\
If we follow the pointer, we get to the heap-allocated data. In particular, we get to the first byte of the string, `H`.
What if we wanted a type that represents a **substring** of `s`? E.g. `ello` in `Hello`?
@ -100,19 +100,19 @@ Heap: | H | e | l | l | o | |
- A pointer to the first byte of the slice.
- The length of the slice.
`slice` doesn't own the data, it just points to it: it's a **reference** to the `String`'s heap-allocated data.
`slice` doesn't own the data, it just points to it: it's a **reference** to the `String`'s heap-allocated data.\
When `slice` is dropped, the heap-allocated data won't be deallocated, because it's still owned by `s`.
That's why `slice` doesn't have a `capacity` field: it doesn't own the data, so it doesn't need to know how much
That's why `slice` doesn't have a `capacity` field: it doesn't own the data, so it doesn't need to know how much
space was allocated for it; it only cares about the data it references.
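
You can check the two fields of a slice yourself. The sketch below (the function name `slice_facts` is made up) measures the slice's length and how far its pointer sits from the start of the `String`'s buffer:

```rust
// Returns (length of the slice, byte offset of the slice inside `s`).
fn slice_facts() -> (usize, usize) {
    let s = String::from("Hello");
    let slice: &str = &s[1..]; // "ello"

    // Both pointers refer to the same heap buffer, owned by `s`.
    let offset = slice.as_ptr() as usize - s.as_ptr() as usize;

    (slice.len(), offset)
}
```

The slice starts one byte into the buffer and spans four bytes. No new allocation takes place and `s` remains the owner of the data.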
## `&str` vs `&String`
As a rule of thumb, use `&str` rather than `&String` whenever you need a reference to textual data.
As a rule of thumb, use `&str` rather than `&String` whenever you need a reference to textual data.\
`&str` is more flexible and generally considered more idiomatic in Rust code.
If a method returns a `&String`, you're promising that there is heap-allocated UTF-8 text somewhere that
**matches exactly** the one you're returning a reference to.
If a method returns a `&str`, instead, you have a lot more freedom: you're just saying that *somewhere* there's a
If a method returns a `&String`, you're promising that there is heap-allocated UTF-8 text somewhere that
**matches exactly** the one you're returning a reference to.\
If a method returns a `&str`, instead, you have a lot more freedom: you're just saying that _somewhere_ there's a
bunch of text data and that a subset of it matches what you need, therefore you're returning a reference to it.
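
The same flexibility applies to function parameters. In this sketch, the hypothetical `word_count` function accepts a `&str`, so callers can pass a borrowed `String`, a string literal or a substring:

```rust
// Hypothetical helper: works with any borrowed text.
fn word_count(text: &str) -> usize {
    text.split_whitespace().count()
}

fn count_all() -> [usize; 3] {
    let owned = String::from("one two three");
    [
        word_count(&owned),         // `&String`, converted to `&str` automatically
        word_count("literal text"), // string literal, already a `&str`
        word_count(&owned[4..]),    // substring: "two three"
    ]
}
```

Had `word_count` taken a `&String`, the last two calls would have required allocating a new `String` first.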
## References

View file

@ -1,6 +1,6 @@
# `Deref` trait
In the previous exercise you didn't have to do much, did you?
In the previous exercise you didn't have to do much, did you?
Changing
@ -22,8 +22,8 @@ impl Ticket {
}
```
was all you needed to do to get the code to compile and the tests to pass.
Some alarm bells should be ringing in your head though.
was all you needed to do to get the code to compile and the tests to pass.
Some alarm bells should be ringing in your head though.
## It shouldn't work, but it does
@ -38,7 +38,7 @@ Instead, it just works. **Why**?
## `Deref` to the rescue
The `Deref` trait is the mechanism behind the language feature known as [**deref coercion**](https://doc.rust-lang.org/std/ops/trait.Deref.html#deref-coercion).
The `Deref` trait is the mechanism behind the language feature known as [**deref coercion**](https://doc.rust-lang.org/std/ops/trait.Deref.html#deref-coercion).\
The trait is defined in the standard library, in the `std::ops` module:
```rust
@ -51,13 +51,13 @@ pub trait Deref {
}
```
`type Target` is an **associated type**.
`type Target` is an **associated type**.\
It's a placeholder for a concrete type that must be specified when the trait is implemented.
## Deref coercion
By implementing `Deref<Target = U>` for a type `T` you're telling the compiler that `&T` and `&U` are
somewhat interchangeable.
By implementing `Deref<Target = U>` for a type `T` you're telling the compiler that `&T` and `&U` are
somewhat interchangeable.\
In particular, you get the following behavior:
- References to `T` are implicitly converted into references to `U` (i.e. `&T` becomes `&U`)
@ -84,11 +84,11 @@ Thanks to this implementation and deref coercion, a `&String` is automatically c
## Don't abuse deref coercion
Deref coercion is a powerful feature, but it can lead to confusion.
Deref coercion is a powerful feature, but it can lead to confusion.\
Automatically converting types can make the code harder to read and understand. If a method with the same name
is defined on both `T` and `U`, which one will be called?
is defined on both `T` and `U`, which one will be called?
We'll examine later in the course the "safest" use cases for deref coercion: smart pointers.
We'll examine later in the course the "safest" use cases for deref coercion: smart pointers.
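
To see why name clashes are confusing, consider this sketch. `Wrapper` is a made-up type that derefs to `String` and also defines its own, deliberately misleading, `len` method:

```rust
use std::ops::Deref;

struct Wrapper(String);

impl Deref for Wrapper {
    type Target = String;

    fn deref(&self) -> &String {
        &self.0
    }
}

impl Wrapper {
    // Same name as `String::len`, but a different behaviour.
    fn len(&self) -> usize {
        0
    }
}

fn resolution_demo() -> (usize, usize, bool) {
    let w = Wrapper(String::from("Hey"));
    (
        w.len(),      // method defined on `Wrapper` wins
        (*w).len(),   // explicit deref: `String::len`
        w.is_empty(), // not defined on `Wrapper`: found on `String` via deref
    )
}
```

Methods defined on the outer type take precedence, while everything else falls through to the target type. A reader can't tell which one is being invoked without checking both definitions.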
## References

View file

@ -1,11 +1,11 @@
# `Sized`
There's more to `&str` than meets the eye, even after having
investigated deref coercion.
There's more to `&str` than meets the eye, even after having
investigated deref coercion.\
From our previous [discussion on memory layouts](../03_ticket_v1/10_references_in_memory.md),
it would have been reasonable to expect `&str` to be represented as a single `usize` on
the stack, a pointer. That's not the case though. `&str` stores some **metadata** next
to the pointer: the length of the slice it points to. Going back to the example from
the stack, a pointer. That's not the case though. `&str` stores some **metadata** next
to the pointer: the length of the slice it points to. Going back to the example from
[a previous section](06_str_slice.md):
```rust
@ -38,16 +38,16 @@ What's going on?
## Dynamically sized types
`str` is a **dynamically sized type** (DST).
A DST is a type whose size is not known at compile time. Whenever you have a
`str` is a **dynamically sized type** (DST).\
A DST is a type whose size is not known at compile time. Whenever you have a
reference to a DST, like `&str`, it has to include additional
information about the data it points to. It is a **fat pointer**.
In the case of `&str`, it stores the length of the slice it points to.
information about the data it points to. It is a **fat pointer**.\
In the case of `&str`, it stores the length of the slice it points to.
We'll see more examples of DSTs in the rest of the course.
## The `Sized` trait
Rust's `std` library defines a trait called `Sized`.
Rust's `std` library defines a trait called `Sized`.
```rust
pub trait Sized {
@ -59,14 +59,14 @@ A type is `Sized` if its size is known at compile time. In other words, it's not
### Marker traits
`Sized` is your first example of a **marker trait**.
`Sized` is your first example of a **marker trait**.\
A marker trait is a trait that doesn't require any methods to be implemented. It doesn't define any behavior.
It only serves to **mark** a type as having certain properties.
The mark is then leveraged by the compiler to enable certain behaviors or optimizations.
The mark is then leveraged by the compiler to enable certain behaviors or optimizations.
### Auto traits
In particular, `Sized` is also an **auto trait**.
In particular, `Sized` is also an **auto trait**.\
You don't need to implement it explicitly; the compiler implements it automatically for you
based on the type's definition.
@ -74,8 +74,8 @@ based on the type's definition.
All the types we've seen so far are `Sized`: `u32`, `String`, `bool`, etc.
`str`, as we just saw, is not `Sized`.
`&str` is `Sized` though! We know its size at compile time: two `usize`s, one for the pointer
`str`, as we just saw, is not `Sized`.\
`&str` is `Sized` though! We know its size at compile time: two `usize`s, one for the pointer
and one for the length.
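
A quick way to confirm this, sketched with made-up helper names and expressed in multiples of `usize` so that it doesn't depend on the target's pointer width:

```rust
use std::mem::{size_of, size_of_val};

// Width of `&String` and `&str`, measured in `usize`-sized words.
fn widths_in_words() -> (usize, usize) {
    let word = size_of::<usize>();
    (size_of::<&String>() / word, size_of::<&str>() / word)
}

// The size of the `str` itself is only known at runtime:
// it is read from the length stored in the fat pointer.
fn str_size_in_bytes(s: &str) -> usize {
    size_of_val(s)
}
```

`&String` is a thin pointer (one word), while `&str` is a fat pointer (two words).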
## References

View file

@ -20,13 +20,13 @@ impl Ticket {
}
```
We've also seen that string literals (such as `"A title"`) are of type `&str`.
We have a type mismatch here: a `String` is expected, but we have a `&str`.
We've also seen that string literals (such as `"A title"`) are of type `&str`.\
We have a type mismatch here: a `String` is expected, but we have a `&str`.
No magical coercion will come to save us this time; we need **to perform a conversion**.
## `From` and `Into`
The Rust standard library defines two traits for **infallible conversions**: `From` and `Into`,
The Rust standard library defines two traits for **infallible conversions**: `From` and `Into`,
in the `std::convert` module.
```rust
@ -39,7 +39,7 @@ pub trait Into<T>: Sized {
}
```
These trait definitions showcase a few concepts that we haven't seen before: **supertraits** and **implicit trait bounds**.
These trait definitions showcase a few concepts that we haven't seen before: **supertraits** and **implicit trait bounds**.
Let's unpack those first.
### Supertrait / Subtrait
@ -78,7 +78,7 @@ pub trait From<T: Sized>: Sized {
```
In other words, _both_ `T` and the type implementing `From<T>` must be `Sized`, even
though the former bound is implicit.
though the former bound is implicit.
### Negative trait bounds
@ -94,23 +94,23 @@ pub struct Foo<T: ?Sized> {
This syntax reads as "`T` may or may not be `Sized`", and it allows you to
bind `T` to a DST (e.g. `Foo<str>`). It is a special case, though: negative trait bounds are exclusive to `Sized`;
you can't use them with other traits.
you can't use them with other traits.
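
The same opt-out works on functions. In this sketch, the hypothetical `size_in_bytes` function accepts references to both sized and dynamically sized types:

```rust
use std::mem::size_of_val;

// Without `?Sized`, `T` would carry an implicit `Sized` bound
// and `size_in_bytes("Hey")` (where `T = str`) would not compile.
fn size_in_bytes<T: ?Sized>(value: &T) -> usize {
    size_of_val(value)
}
```

`size_in_bytes("Hey")` binds `T` to `str` and returns `3`, while `size_in_bytes(&7u32)` binds `T` to `u32` and returns `4`.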
## `&str` to `String`
In [`std`'s documentation](https://doc.rust-lang.org/std/convert/trait.From.html#implementors)
you can see which `std` types implement the `From` trait.
In [`std`'s documentation](https://doc.rust-lang.org/std/convert/trait.From.html#implementors)
you can see which `std` types implement the `From` trait.\
You'll find that `String` implements `From<&str> for String`. Thus, we can write:
```rust
let title = String::from("A title");
```
We've been primarily using `.into()`, though.
We've been primarily using `.into()`, though.\
If you check out the [implementors of `Into`](https://doc.rust-lang.org/std/convert/trait.Into.html#implementors)
you won't find `Into<&str> for String`. What's going on?
`From` and `Into` are **dual traits**.
`From` and `Into` are **dual traits**.\
In particular, `Into` is implemented for any type that implements `From` using a **blanket implementation**:
```rust
@ -129,7 +129,7 @@ we can write `let title = "A title".into();`.
## `.into()`
Every time you see `.into()`, you're witnessing a conversion between types.
Every time you see `.into()`, you're witnessing a conversion between types.\
What's the target type, though?
In most cases, the target type is either:

View file

@ -14,20 +14,20 @@ pub trait Deref {
}
```
They both feature type parameters.
In the case of `From`, it's a generic parameter, `T`.
They both feature type parameters.\
In the case of `From`, it's a generic parameter, `T`.\
In the case of `Deref`, it's an associated type, `Target`.
What's the difference? Why use one over the other?
## At most one implementation
Due to how deref coercion works, there can only be one "target" type for a given type. E.g. `String` can
only deref to `str`.
Due to how deref coercion works, there can only be one "target" type for a given type. E.g. `String` can
only deref to `str`.
It's about avoiding ambiguity: if you could implement `Deref` multiple times for a type,
which `Target` type should the compiler choose when you call a `&self` method?
That's why `Deref` uses an associated type, `Target`.
That's why `Deref` uses an associated type, `Target`.\
An associated type is uniquely determined **by the trait implementation**.
Since you can't implement `Deref` more than once, you'll only be able to specify one `Target` for a given type
and there won't be any ambiguity.
@ -51,7 +51,7 @@ impl From<u16> for WrappingU32 {
}
```
This works because `From<u16>` and `From<u32>` are considered **different traits**.
This works because `From<u16>` and `From<u32>` are considered **different traits**.\
There is no ambiguity: the compiler can determine which implementation to use based on the type of the value being converted.
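As a minimal sketch of the point above, here are two `From` implementations living side by side on the same type. The field name `inner` and the bodies of the implementations are assumptions made for illustration; only the type name and the two `From` impls are taken from the text.

```rust
// Illustrative wrapper type; the `inner` field name is an assumption.
#[derive(Debug, PartialEq)]
struct WrappingU32 {
    inner: u32,
}

// `From<u32>` and `From<u16>` are distinct traits, so both impls can coexist.
impl From<u32> for WrappingU32 {
    fn from(value: u32) -> Self {
        WrappingU32 { inner: value }
    }
}

impl From<u16> for WrappingU32 {
    fn from(value: u16) -> Self {
        // Widening a `u16` into a `u32` is lossless.
        WrappingU32 { inner: u32::from(value) }
    }
}

fn main() {
    // The compiler picks the impl based on the type of the argument.
    let from_u32 = WrappingU32::from(42u32);
    let from_u16 = WrappingU32::from(42u16);
    assert_eq!(from_u32, from_u16);
}
```

The type suffix on each literal (`42u32`, `42u16`) is what selects the implementation here.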
## Case study: `Add`
@ -73,7 +73,7 @@ It uses both mechanisms:
### `RHS`
`RHS` is a generic parameter to allow for different types to be added together.
`RHS` is a generic parameter to allow for different types to be added together.\
For example, you'll find these two implementations in the standard library:
```rust
@ -109,10 +109,10 @@ because `u32` implements `Add<&u32>` _as well as_ `Add<u32>`.
### `Output`
`Output` represents the type of the result of the addition.
`Output` represents the type of the result of the addition.
Why do we need `Output` in the first place? Can't we just use `Self` as output, the type implementing `Add`?
We could, but it would limit the flexibility of the trait. In the standard library, for example, you'll find
Why do we need `Output` in the first place? Can't we just use `Self` as output, the type implementing `Add`?
We could, but it would limit the flexibility of the trait. In the standard library, for example, you'll find
this implementation:
```rust
@ -125,9 +125,9 @@ impl Add<&u32> for &u32 {
}
```
The type they're implementing the trait for is `&u32`, but the result of the addition is `u32`.
It would be impossible[^flexible] to provide this implementation if `add` had to return `Self`, i.e. `&u32` in this case.
`Output` lets `std` decouple the implementor from the return type, thus supporting this case.
The type they're implementing the trait for is `&u32`, but the result of the addition is `u32`.\
It would be impossible[^flexible] to provide this implementation if `add` had to return `Self`, i.e. `&u32` in this case.
`Output` lets `std` decouple the implementor from the return type, thus supporting this case.
On the other hand, `Output` can't be a generic parameter. The output type of the operation **must** be uniquely determined
once the types of the operands are known. That's why it's an associated type: for a given combination of implementor
@ -138,7 +138,7 @@ and generic parameters, there is only one `Output` type.
To recap:
- Use an **associated type** when the type must be uniquely determined for a given trait implementation.
- Use a **generic parameter** when you want to allow multiple implementations of the trait for the same type,
- Use a **generic parameter** when you want to allow multiple implementations of the trait for the same type,
with different input types.
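The recap can be sketched with a small custom type that implements `Add` twice. `Meters` is a hypothetical type introduced only for this illustration: the generic parameter varies across the two implementations, while each implementation pins down exactly one `Output`.

```rust
use std::ops::Add;

// Hypothetical newtype, used only for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Meters(u32);

// Generic parameter: `Add` is implemented once per right-hand side type.
impl Add<Meters> for Meters {
    // Associated type: fixed once for this implementation.
    type Output = Meters;

    fn add(self, rhs: Meters) -> Meters {
        Meters(self.0 + rhs.0)
    }
}

impl Add<u32> for Meters {
    type Output = Meters;

    fn add(self, rhs: u32) -> Meters {
        Meters(self.0 + rhs)
    }
}

fn main() {
    assert_eq!(Meters(1) + Meters(2), Meters(3));
    assert_eq!(Meters(1) + 2u32, Meters(3));
}
```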
## References
@ -146,6 +146,5 @@ To recap:
- The exercise for this section is located in `exercises/04_traits/10_assoc_vs_generic`
[^flexible]: Flexibility is rarely free: the trait definition is more complex due to `Output`, and implementors have to reason about
what they want to return. The trade-off is only justified if that flexibility is actually needed. Keep that in mind
when designing your own traits.
what they want to return. The trade-off is only justified if that flexibility is actually needed. Keep that in mind
when designing your own traits.

View file

@ -1,13 +1,13 @@
# Copying values, pt. 1
In the previous chapter we introduced ownership and borrowing.
In the previous chapter we introduced ownership and borrowing.\
We stated, in particular, that:
- Every value in Rust has a single owner at any given time.
- When a function takes ownership of a value ("it consumes it"), the caller can't use that value anymore.
These restrictions can be somewhat limiting.
Sometimes we might have to call a function that takes ownership of a value, but we still need to use
These restrictions can be somewhat limiting.\
Sometimes we might have to call a function that takes ownership of a value, but we still need to use
that value afterward.
```rust
@ -49,8 +49,8 @@ fn example() {
}
```
Instead of giving ownership of `s` to `consumer`, we create a new `String` (by cloning `s`) and give
that to `consumer` instead.
Instead of giving ownership of `s` to `consumer`, we create a new `String` (by cloning `s`) and give
that to `consumer` instead.\
`s` remains valid and usable after the call to `consumer`.
## In memory
@ -92,7 +92,7 @@ If you're coming from a language like Java, you can think of `clone` as a way to
## Implementing `Clone`
To make a type `Clone`-able, we have to implement the `Clone` trait for it.
To make a type `Clone`-able, we have to implement the `Clone` trait for it.\
You almost always implement `Clone` by deriving it:
```rust
@ -103,7 +103,7 @@ struct MyType {
```
The compiler implements `Clone` for `MyType` as you would expect: it clones each field of `MyType` individually and
then constructs a new `MyType` instance using the cloned fields.
then constructs a new `MyType` instance using the cloned fields.\
Remember that you can use `cargo expand` (or your IDE) to explore the code generated by `derive` macros.
## References

View file

@ -12,7 +12,7 @@ fn example() {
}
```
It'll compile without errors! What's going on here? What's the difference between `String` and `u32`
It'll compile without errors! What's going on here? What's the difference between `String` and `u32`
that makes the latter work without `.clone()`?
## `Copy`
@ -26,65 +26,65 @@ pub trait Copy: Clone { }
It is a marker trait, just like `Sized`.
If a type implements `Copy`, there's no need to call `.clone()` to create a new instance of the type:
Rust does it **implicitly** for you.
Rust does it **implicitly** for you.\
`u32` is an example of a type that implements `Copy`, which is why the example above compiles without errors:
when `consumer(s)` is called, Rust creates a new `u32` instance by performing a **bitwise copy** of `s`,
when `consumer(s)` is called, Rust creates a new `u32` instance by performing a **bitwise copy** of `s`,
and then passes that new instance to `consumer`. It all happens behind the scenes, without you having to do anything.
## What can be `Copy`?
`Copy` is not equivalent to "automatic cloning", although it implies it.
`Copy` is not equivalent to "automatic cloning", although it implies it.\
Types must meet a few requirements in order to be allowed to implement `Copy`.
First of all, the type must implement `Clone`, since `Copy` is a subtrait of `Clone`.
This makes sense: if Rust can create a new instance of a type _implicitly_, it should
This makes sense: if Rust can create a new instance of a type _implicitly_, it should
also be able to create a new instance _explicitly_ by calling `.clone()`.
That's not all, though. A few more conditions must be met:
1. The type doesn't manage any _additional_ resources (e.g. heap memory, file handles, etc.) beyond the `std::mem::size_of`
bytes that it occupies in memory.
bytes that it occupies in memory.
2. The type is not a mutable reference (`&mut T`).
If both conditions are met, then Rust can safely create a new instance of the type by performing a **bitwise copy**
If both conditions are met, then Rust can safely create a new instance of the type by performing a **bitwise copy**
of the original instance—this is often referred to as a `memcpy` operation, after the C standard library function
that performs the bitwise copy.
### Case study 1: `String`
`String` is a type that doesn't implement `Copy`.
`String` is a type that doesn't implement `Copy`.\
Why? Because it manages an additional resource: the heap-allocated memory buffer that stores the string's data.
Let's imagine that Rust allowed `String` to implement `Copy`.
Let's imagine that Rust allowed `String` to implement `Copy`.\
Then, when a new `String` instance is created by performing a bitwise copy of the original instance, both the original
and the new instance would point to the same memory buffer:
and the new instance would point to the same memory buffer:
```text
s copied_s
+---------+--------+----------+ +---------+--------+----------+
| pointer | length | capacity | | pointer | length | capacity |
| | | 5 | 5 | | | | 5 | 5 |
+--|------+--------+----------+ +--|------+--------+----------+
| |
| |
v |
+---+---+---+---+---+ |
| H | e | l | l | o | |
+---+---+---+---+---+ |
^ |
| |
+------------------------------------+
s copied_s
+---------+--------+----------+ +---------+--------+----------+
| pointer | length | capacity | | pointer | length | capacity |
| | | 5 | 5 | | | | 5 | 5 |
+--|------+--------+----------+ +--|------+--------+----------+
| |
| |
v |
+---+---+---+---+---+ |
| H | e | l | l | o | |
+---+---+---+---+---+ |
^ |
| |
+------------------------------------+
```
This is bad!
Both `String` instances would try to free the memory buffer when they go out of scope,
Both `String` instances would try to free the memory buffer when they go out of scope,
leading to a double-free error.
You could also create two distinct `&mut String` references that point to the same memory buffer,
violating Rust's borrowing rules.
### Case study 2: `u32`
`u32` implements `Copy`. All integer types do, in fact.
`u32` implements `Copy`. All integer types do, in fact.\
An integer is "just" the bytes that represent the number in memory. There's nothing more!
If you copy those bytes, you get another perfectly valid integer instance.
Nothing bad can happen, so Rust allows it.
@ -92,12 +92,12 @@ Nothing bad can happen, so Rust allows it.
### Case study 3: `&mut u32`
When we introduced ownership and mutable borrows, we stated one rule quite clearly: there
can only ever be *one* mutable borrow of a value at any given time.
can only ever be _one_ mutable borrow of a value at any given time.\
That's why `&mut u32` doesn't implement `Copy`, even though `u32` does.
If `&mut u32` implemented `Copy`, you could create multiple mutable references to
If `&mut u32` implemented `Copy`, you could create multiple mutable references to
the same value and modify it in multiple places at the same time.
That'd be a violation of Rust's borrowing rules!
That'd be a violation of Rust's borrowing rules!
It follows that `&mut T` never implements `Copy`, no matter what `T` is.
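The three case studies can be summarised in a short sketch. The function names `consume_number` and `consume_text` are hypothetical, chosen for this illustration: a `u32` stays usable after being passed by value, while a `String` has to be cloned explicitly.

```rust
// Takes a `u32` by value: the caller's variable is copied, not moved.
fn consume_number(n: u32) -> u32 {
    n + 1
}

// Takes a `String` by value: the caller gives up ownership of the argument.
fn consume_text(s: String) -> usize {
    s.len()
}

fn main() {
    let n: u32 = 5;
    let bumped = consume_number(n); // Implicit bitwise copy of `n`.
    assert_eq!(n, 5); // `n` is still usable.
    assert_eq!(bumped, 6);

    let s = String::from("Hello");
    let len = consume_text(s.clone()); // Explicit clone: a new heap buffer.
    assert_eq!(len, 5);
    assert_eq!(s, "Hello"); // `s` is still usable because we passed a clone.
}
```

Replacing `s.clone()` with `s` would move the string into `consume_text`, and the final assertion would no longer compile.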
## Implementing `Copy`

View file

@ -15,7 +15,7 @@ pub trait Drop {
```
The `Drop` trait is a mechanism for you to define _additional_ cleanup logic for your types,
beyond what the compiler does for you automatically.
beyond what the compiler does for you automatically.\
Whatever you put in the `drop` method will be executed when the value goes out of scope.
## `Drop` and `Copy`
@ -24,7 +24,7 @@ When talking about the `Copy` trait, we said that a type can't implement `Copy`
manages additional resources beyond the `std::mem::size_of` bytes that it occupies in memory.
You might wonder: how does the compiler know if a type manages additional resources?
That's right: `Drop` trait implementations!
That's right: `Drop` trait implementations!\
If your type has an explicit `Drop` implementation, the compiler will assume
that your type has additional resources attached to it and won't allow you to implement `Copy`.
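To make the behaviour of `drop` concrete, here is a minimal sketch under a few assumptions: `Guard` and `count_drops` are hypothetical names, and the "resource" is just a shared counter that gets bumped every time a value is dropped.

```rust
use std::cell::Cell;

// Hypothetical guard type: it bumps a shared counter when dropped.
struct Guard<'a> {
    drops: &'a Cell<u32>,
}

impl Drop for Guard<'_> {
    fn drop(&mut self) {
        self.drops.set(self.drops.get() + 1);
    }
}

/// Creates and drops `n` guards, returning how many times `drop` ran.
fn count_drops(n: u32) -> u32 {
    let drops = Cell::new(0);
    for _ in 0..n {
        let _guard = Guard { drops: &drops };
        // `_guard` goes out of scope at the end of each iteration,
        // so its `drop` method is invoked here.
    }
    drops.get()
}

fn main() {
    assert_eq!(count_drops(3), 3);
}
```

Since `Guard` has an explicit `Drop` implementation, adding `#[derive(Clone, Copy)]` to it should be rejected by the compiler, in line with the rule described above.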

View file

@ -6,26 +6,26 @@ so often when writing Rust code that they'll soon become second nature.
## Closing thoughts
Traits are powerful, but don't overuse them.
Traits are powerful, but don't overuse them.\
A few guidelines to keep in mind:
- Don't make a function generic if it is always invoked with a single type. It introduces indirection in your
codebase, making it harder to understand and maintain.
- Don't make a function generic if it is always invoked with a single type. It introduces indirection in your
codebase, making it harder to understand and maintain.
- Don't create a trait if you only have one implementation. It's a sign that the trait is not needed.
- Implement standard traits for your types (`Debug`, `PartialEq`, etc.) whenever it makes sense.
It will make your types more idiomatic and easier to work with, unlocking a lot of functionality provided
It will make your types more idiomatic and easier to work with, unlocking a lot of functionality provided
by the standard library and ecosystem crates.
- Implement traits from third-party crates if you need the functionality they unlock within their ecosystem.
- Beware of making code generic solely to use mocks in your tests. The maintainability cost of this approach
can be high, and it's often better to use a different testing strategy. Check out the
[testing masterclass](https://github.com/mainmatter/rust-advanced-testing-workshop)
- Implement traits from third-party crates if you need the functionality they unlock within their ecosystem.
- Beware of making code generic solely to use mocks in your tests. The maintainability cost of this approach
can be high, and it's often better to use a different testing strategy. Check out the
[testing masterclass](https://github.com/mainmatter/rust-advanced-testing-workshop)
for details on high-fidelity testing.
## Testing your knowledge
Before moving on, let's go through one last exercise to consolidate what we've learned.
Before moving on, let's go through one last exercise to consolidate what we've learned.
You'll have minimal guidance this time—just the exercise description and the tests to guide you.
## References
- The exercise for this section is located in `exercises/04_traits/14_outro`
- The exercise for this section is located in `exercises/04_traits/14_outro`

View file

@ -1,7 +1,7 @@
# Modelling A Ticket, pt. 2
The `Ticket` struct we worked on in the previous chapters is a good start,
but it still screams "I'm a beginner Rustacean!".
The `Ticket` struct we worked on in the previous chapters is a good start,
but it still screams "I'm a beginner Rustacean!".
We'll use this chapter to refine our Rust domain modelling skills.
We'll need to introduce a few more concepts along the way:

View file

@ -1,8 +1,8 @@
# Enumerations
Based on the validation logic you wrote [in a previous chapter](../03_ticket_v1/02_validation.md),
there are only a few valid statuses for a ticket: `To-Do`, `InProgress` and `Done`.
This is not obvious if we look at the `status` field in the `Ticket` struct or at the type of the `status`
Based on the validation logic you wrote [in a previous chapter](../03_ticket_v1/02_validation.md),
there are only a few valid statuses for a ticket: `To-Do`, `InProgress` and `Done`.\
This is not obvious if we look at the `status` field in the `Ticket` struct or at the type of the `status`
parameter in the `new` method:
```rust
@ -29,7 +29,7 @@ We can do better than that with **enumerations**.
## `enum`
An enumeration is a type that can have a fixed set of values, called **variants**.
An enumeration is a type that can have a fixed set of values, called **variants**.\
In Rust, you define an enumeration using the `enum` keyword:
```rust

View file

@ -1,6 +1,6 @@
# `match`
You may be wondering—what can you actually **do** with an enum?
You may be wondering—what can you actually **do** with an enum?\
The most common operation is to **match** on it.
```rust
@ -22,13 +22,13 @@ impl Status {
}
```
A `match` statement lets you compare a Rust value against a series of **patterns**.
A `match` statement lets you compare a Rust value against a series of **patterns**.\
You can think of it as a type-level `if`. If `status` is a `Done` variant, execute the first block;
if it's an `InProgress` or `ToDo` variant, execute the second block.
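A self-contained sketch of the shape described above might look like the following. The method name `is_done` is an assumption made for illustration; the variants are the ones named in the text.

```rust
enum Status {
    ToDo,
    InProgress,
    Done,
}

impl Status {
    // Hypothetical helper: returns `true` only for the `Done` variant.
    fn is_done(&self) -> bool {
        match self {
            // First arm: the value is a `Done` variant.
            Status::Done => true,
            // Second arm: `|` lets one arm cover several variants.
            Status::InProgress | Status::ToDo => false,
        }
    }
}

fn main() {
    assert!(Status::Done.is_done());
    assert!(!Status::InProgress.is_done());
    assert!(!Status::ToDo.is_done());
}
```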
## Exhaustiveness
There's one key detail here: `match` is **exhaustive**. You must handle all enum variants.
There's one key detail here: `match` is **exhaustive**. You must handle all enum variants.\
If you forget to handle a variant, Rust will stop you **at compile-time** with an error.
E.g. if we forget to handle the `ToDo` variant:
@ -50,7 +50,7 @@ error[E0004]: non-exhaustive patterns: `ToDo` not covered
| ^^^^^^^^^^^^ pattern `ToDo` not covered
```
This is a big deal!
This is a big deal!\
Codebases evolve over time—you might add a new status down the line, e.g. `Blocked`. The Rust compiler
will emit an error for every single `match` statement that's missing logic for the new variant.
That's why Rust developers often sing the praises of "compiler-driven refactoring"—the compiler tells you

View file

@ -1,4 +1,4 @@
# Variants can hold data
# Variants can hold data
```rust
enum Status {
@ -8,17 +8,17 @@ enum Status {
}
```
Our `Status` enum is what's usually called a **C-style enum**.
Each variant is a simple label, a bit like a named constant. You can find this kind of enum in many programming
Our `Status` enum is what's usually called a **C-style enum**.\
Each variant is a simple label, a bit like a named constant. You can find this kind of enum in many programming
languages, like C, C++, Java, C#, Python, etc.
Rust enums can go further though. We can **attach data to each variant**.
## Variants
Let's say that we want to store the name of the person who's currently working on a ticket.
We would only have this information if the ticket is in progress. It wouldn't be there for a to-do ticket or
a done ticket.
Let's say that we want to store the name of the person who's currently working on a ticket.\
We would only have this information if the ticket is in progress. It wouldn't be there for a to-do ticket or
a done ticket.
We can model this by attaching a `String` field to the `InProgress` variant:
```rust
@ -31,7 +31,7 @@ enum Status {
}
```
`InProgress` is now a **struct-like variant**.
`InProgress` is now a **struct-like variant**.\
The syntax mirrors, in fact, the one we used to define a struct—it's just "inlined" inside the enum, as a variant.
## Accessing variant data
@ -55,7 +55,7 @@ error[E0609]: no field `assigned_to` on type `Status`
| ^^^^^^^^^^^ unknown field
```
`assigned_to` is **variant-specific**, it's not available on all `Status` instances.
`assigned_to` is **variant-specific**, it's not available on all `Status` instances.\
To access `assigned_to`, we need to use **pattern matching**:
```rust
@ -71,9 +71,9 @@ match status {
## Bindings
In the match pattern `Status::InProgress { assigned_to }`, `assigned_to` is a **binding**.
We're **destructuring** the `Status::InProgress` variant and binding the `assigned_to` field to
a new variable, also named `assigned_to`.
In the match pattern `Status::InProgress { assigned_to }`, `assigned_to` is a **binding**.\
We're **destructuring** the `Status::InProgress` variant and binding the `assigned_to` field to
a new variable, also named `assigned_to`.\
If we wanted, we could bind the field to a different variable name:
```rust

View file

@ -15,14 +15,14 @@ impl Ticket {
}
```
You only care about the `Status::InProgress` variant.
You only care about the `Status::InProgress` variant.
Do you really need to match on all the other variants?
New constructs to the rescue!
## `if let`
The `if let` construct allows you to match on a single variant of an enum,
The `if let` construct allows you to match on a single variant of an enum,
without having to handle all the other variants.
Here's how you can use `if let` to simplify the `assigned_to` method:
@ -61,8 +61,8 @@ as the code that precedes it.
## Style
Both `if let` and `let/else` are idiomatic Rust constructs.
Use them as you see fit to improve the readability of your code,
Both `if let` and `let/else` are idiomatic Rust constructs.\
Use them as you see fit to improve the readability of your code,
but don't overdo it: `match` is always there when you need it.
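For comparison, here is a rough sketch of the two constructs side by side. The function names `describe` and `assignee_or_nobody`, as well as the fallback strings, are hypothetical and only serve this illustration.

```rust
enum Status {
    ToDo,
    InProgress { assigned_to: String },
    Done,
}

// `if let`: run a block only when the pattern matches.
fn describe(status: &Status) -> String {
    if let Status::InProgress { assigned_to } = status {
        format!("In progress, assigned to {assigned_to}")
    } else {
        String::from("Not in progress")
    }
}

// `let/else`: bind on a match, bail out early otherwise.
fn assignee_or_nobody(status: &Status) -> &str {
    let Status::InProgress { assigned_to } = status else {
        return "nobody";
    };
    assigned_to.as_str()
}

fn main() {
    let busy = Status::InProgress {
        assigned_to: String::from("Alice"),
    };
    assert_eq!(describe(&busy), "In progress, assigned to Alice");
    assert_eq!(assignee_or_nobody(&busy), "Alice");
    assert_eq!(describe(&Status::ToDo), "Not in progress");
    assert_eq!(assignee_or_nobody(&Status::Done), "nobody");
}
```

With `let/else`, the binding stays available for the rest of the function, which keeps the "happy path" at the same indentation level.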
## References

View file

@ -1,11 +1,11 @@
# Nullability
Our implementation of the `assigned_to` method is fairly blunt: panicking for to-do and done tickets is far from ideal.
Our implementation of the `assigned_to` method is fairly blunt: panicking for to-do and done tickets is far from ideal.\
We can do better using **Rust's `Option` type**.
## `Option`
`Option` is a Rust type that represents **nullable values**.
`Option` is a Rust type that represents **nullable values**.\
It is an enum, defined in Rust's standard library:
```rust
@ -15,10 +15,10 @@ enum Option<T> {
}
```
`Option` encodes the idea that a value might be present (`Some(T)`) or absent (`None`).
`Option` encodes the idea that a value might be present (`Some(T)`) or absent (`None`).\
It also forces you to **explicitly handle both cases**. You'll get a compiler error if you are working with
a nullable value and you forget to handle the `None` case.
This is a significant improvement over "implicit" nullability in other languages, where you can forget to check
a nullable value and you forget to handle the `None` case.\
This is a significant improvement over "implicit" nullability in other languages, where you can forget to check
for `null` and thus trigger a runtime error.
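As a small sketch of what "explicitly handling both cases" looks like in practice, consider the function below. The name `assignee_label` and the strings it returns are assumptions made for this example.

```rust
// Hypothetical helper: turns an optional assignee into a display label.
fn assignee_label(assigned_to: Option<&str>) -> String {
    match assigned_to {
        // A value is present: we can use it.
        Some(name) => format!("Assigned to {name}"),
        // No value: the compiler won't let us forget this arm.
        None => String::from("Unassigned"),
    }
}

fn main() {
    assert_eq!(assignee_label(Some("Alice")), "Assigned to Alice");
    assert_eq!(assignee_label(None), "Unassigned");
}
```

Removing the `None` arm would turn into a compile-time error about non-exhaustive patterns, rather than a surprise at runtime.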
## `Option`'s definition
@ -27,11 +27,11 @@ for `null` and thus trigger a runtime error.
### Tuple-like variants
`Option` has two variants: `Some(T)` and `None`.
`Some` is a **tuple-like variant**: it's a variant that holds **unnamed fields**.
`Option` has two variants: `Some(T)` and `None`.\
`Some` is a **tuple-like variant**: it's a variant that holds **unnamed fields**.
Tuple-like variants are often used when there is a single field to store, especially when we're looking at a
"wrapper" type like `Option`.
Tuple-like variants are often used when there is a single field to store, especially when we're looking at a
"wrapper" type like `Option`.
### Tuple-like structs
@ -51,7 +51,7 @@ let y = point.1;
### Tuples
It's weird to say that something is tuple-like when we haven't seen tuples yet!
It's weird to say that something is tuple-like when we haven't seen tuples yet!\
Tuples are another example of a primitive Rust type.
They group together a fixed number of values with (potentially different) types:

View file

@ -27,8 +27,8 @@ impl Ticket {
}
```
As soon as one of the checks fails, the function panics.
This is not ideal, as it doesn't give the caller a chance to **handle the error**.
As soon as one of the checks fails, the function panics.
This is not ideal, as it doesn't give the caller a chance to **handle the error**.
It's time to introduce the `Result` type, Rust's primary mechanism for error handling.
@ -52,22 +52,22 @@ Both `Ok` and `Err` are generic, allowing you to specify your own types for the
## No exceptions
Recoverable errors in Rust are **represented as values**.
Recoverable errors in Rust are **represented as values**.\
They're just an instance of a type, being passed around and manipulated like any other value.
This is a significant difference from other languages, such as Python or C#, where **exceptions** are used to signal errors.
Exceptions create a separate control flow path that can be hard to reason about.
Exceptions create a separate control flow path that can be hard to reason about.\
You don't know, just by looking at a function's signature, if it can throw an exception or not.
You don't know, just by looking at a function's signature, **which** exception types it can throw.
You don't know, just by looking at a function's signature, **which** exception types it can throw.\
You must either read the function's documentation or look at its implementation to find out.
Exception handling logic has very poor locality: the code that throws the exception is far removed from the code
Exception handling logic has very poor locality: the code that throws the exception is far removed from the code
that catches it, and there's no direct link between the two.
## Fallibility is encoded in the type system
Rust, with `Result`, forces you to **encode fallibility in the function's signature**.
If a function can fail (and you want the caller to have a shot at handling the error), it must return a `Result`.
Rust, with `Result`, forces you to **encode fallibility in the function's signature**.\
If a function can fail (and you want the caller to have a shot at handling the error), it must return a `Result`.
```rust
// Just by looking at the signature, you know that this function can fail.
@ -77,7 +77,7 @@ fn parse_int(s: &str) -> Result<i32, ParseIntError> {
}
```
That's the big advantage of `Result`: it makes fallibility explicit.
That's the big advantage of `Result`: it makes fallibility explicit.
Keep in mind, though, that panics exist. They aren't tracked by the type system, just like exceptions in other languages.
But they're meant for **unrecoverable errors** and should be used sparingly.
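Here is one way the `parse_int` signature shown above could be fleshed out and consumed. The body of `parse_int` and the `parse_or_zero` helper are assumptions made for illustration, not the course's own implementation.

```rust
use std::num::ParseIntError;

// Just by looking at the signature, you know that this function can fail.
// The body is an assumption: it delegates to `str::parse`.
fn parse_int(s: &str) -> Result<i32, ParseIntError> {
    s.parse::<i32>()
}

// Hypothetical caller: handles the error by falling back to zero.
fn parse_or_zero(s: &str) -> i32 {
    match parse_int(s) {
        Ok(n) => n,
        Err(_) => 0,
    }
}

fn main() {
    assert_eq!(parse_int("42"), Ok(42));
    assert!(parse_int("forty-two").is_err());
    assert_eq!(parse_or_zero("forty-two"), 0);
}
```

The error is an ordinary value: the caller decides, right at the call site, whether to recover, propagate, or give up.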

View file

@ -1,11 +1,11 @@
# Unwrapping
`Ticket::new` now returns a `Result` instead of panicking on invalid inputs.
`Ticket::new` now returns a `Result` instead of panicking on invalid inputs.\
What does this mean for the caller?
## Failures can't be (implicitly) ignored
Unlike exceptions, Rust's `Result` forces you to **handle errors at the call site**.
Unlike exceptions, Rust's `Result` forces you to **handle errors at the call site**.\
If you call a function that returns a `Result`, Rust won't allow you to implicitly ignore the error case.
```rust
@ -30,7 +30,7 @@ When you call a function that returns a `Result`, you have two key options:
let number = parse_int("42").unwrap();
// `expect` lets you specify a custom panic message.
let number = parse_int("42").expect("Failed to parse integer");
```
```
- Destructure the `Result` using a `match` expression to deal with the error case explicitly.
```rust
match parse_int("42") {
@ -41,4 +41,4 @@ When you call a function that returns a `Result`, you have two key options:
## References
- The exercise for this section is located in `exercises/05_ticket_v2/07_unwrap`
- The exercise for this section is located in `exercises/05_ticket_v2/07_unwrap`

View file

@ -1,6 +1,6 @@
# Error enums
Your solution to the previous exercise may have felt awkward: matching on strings is not ideal!
Your solution to the previous exercise may have felt awkward: matching on strings is not ideal!\
A colleague might rework the error messages returned by `Ticket::new` (e.g. to improve readability) and,
all of a sudden, your calling code would break.
@ -22,7 +22,7 @@ enum U32ParseError {
```
Using an error enum, you're encoding the different error cases in the type system—they become part of the
signature of the fallible function.
signature of the fallible function.\
This simplifies error handling for the caller, as they can use a `match` expression to react to the different
error cases:

View file

@ -2,8 +2,8 @@
## Error reporting
In the previous exercise you had to destructure the `InvalidTitle` variant to extract the error message and
pass it to the `panic!` macro.
In the previous exercise you had to destructure the `InvalidTitle` variant to extract the error message and
pass it to the `panic!` macro.\
This is a (rudimentary) example of **error reporting**: transforming an error type into a representation that can be
shown to a user, a service operator, or a developer.
@ -13,7 +13,7 @@ That's why Rust provides the `std::error::Error` trait.
## The `Error` trait
There are no constraints on the type of the `Err` variant in a `Result`, but it's a good practice to use a type
There are no constraints on the type of the `Err` variant in a `Result`, but it's a good practice to use a type
that implements the `Error` trait.
`Error` is the cornerstone of Rust's error handling story:
@ -31,7 +31,7 @@ implement `Debug` and `Display`.
We've already encountered the `Debug` trait in [a previous exercise](../04_traits/04_derive.md)—it's the trait used by
`assert_eq!` to display the values of the variables it's comparing when the assertion fails.
From a "mechanical" perspective, `Display` and `Debug` are identical—they encode how a type should be converted
From a "mechanical" perspective, `Display` and `Debug` are identical—they encode how a type should be converted
into a string-like representation:
```rust
@ -46,8 +46,8 @@ pub trait Display {
}
```
The difference is in their *purpose*: `Display` returns a representation that's meant for "end-users",
while `Debug` provides a low-level representation that's more suitable to developers and service operators.
The difference is in their _purpose_: `Display` returns a representation that's meant for "end-users",
while `Debug` provides a low-level representation that's more suitable to developers and service operators.\
That's why `Debug` can be automatically implemented using the `#[derive(Debug)]` attribute, while `Display`
**requires** a manual implementation.
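The split between the two traits can be sketched as follows. `TitleError`, its variants, and the messages are hypothetical, chosen only to show a derived `Debug` next to a hand-written `Display`.

```rust
use std::error::Error;
use std::fmt;

// Hypothetical error type. `Debug` is derived: a developer-facing representation.
#[derive(Debug)]
enum TitleError {
    Empty,
    TooLong { max: usize },
}

// `Display` is written by hand: a message meant for end users.
impl fmt::Display for TitleError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            TitleError::Empty => write!(f, "the title cannot be empty"),
            TitleError::TooLong { max } => {
                write!(f, "the title cannot exceed {max} characters")
            }
        }
    }
}

// With `Debug` and `Display` in place, `Error` needs no extra methods.
impl Error for TitleError {}

fn main() {
    let err = TitleError::TooLong { max: 50 };
    assert_eq!(err.to_string(), "the title cannot exceed 50 characters");
    assert_eq!(format!("{err:?}"), "TooLong { max: 50 }");
    assert_eq!(TitleError::Empty.to_string(), "the title cannot be empty");
}
```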

View file

@ -1,36 +1,36 @@
# Libraries and binaries
It took a bit of code to implement the `Error` trait for `TicketNewError`, didn't it?
It took a bit of code to implement the `Error` trait for `TicketNewError`, didn't it?\
A manual `Display` implementation, plus an `Error` impl block.
We can remove some of the boilerplate by using [`thiserror`](https://docs.rs/thiserror/latest/thiserror/),
a Rust crate that provides a **procedural macro** to simplify the creation of custom error types.
But we're getting ahead of ourselves: `thiserror` is a third-party crate, it'd be our first dependency!
We can remove some of the boilerplate by using [`thiserror`](https://docs.rs/thiserror/latest/thiserror/),
a Rust crate that provides a **procedural macro** to simplify the creation of custom error types.\
But we're getting ahead of ourselves: `thiserror` is a third-party crate, it'd be our first dependency!
Let's take a step back to talk about Rust's packaging system before we dive into dependencies.
## What is a package?
A Rust package is defined by the `[package]` section in a `Cargo.toml` file, also known as its **manifest**.
A Rust package is defined by the `[package]` section in a `Cargo.toml` file, also known as its **manifest**.
Within `[package]` you can set the package's metadata, such as its name and version.
Go check the `Cargo.toml` file in the directory of this section's exercise!
## What is a crate?
Inside a package, you can have one or more **crates**, also known as **targets**.
Inside a package, you can have one or more **crates**, also known as **targets**.\
The two most common crate types are **binary crates** and **library crates**.
### Binaries
A binary is a program that can be compiled to an **executable file**.
A binary is a program that can be compiled to an **executable file**.\
It must include a function named `main`—the program's entry point. `main` is invoked when the program is executed.
### Libraries
Libraries, on the other hand, are not executable on their own. You can't _run_ a library,
but you can _import its code_ from another package that depends on it.
A library groups together code (i.e. functions, types, etc.) that can be leveraged by other packages as a **dependency**.
Libraries, on the other hand, are not executable on their own. You can't _run_ a library,
but you can _import its code_ from another package that depends on it.\
A library groups together code (i.e. functions, types, etc.) that can be leveraged by other packages as a **dependency**.
All the exercises you've solved so far have been structured as libraries, with a test suite attached to them.
@ -55,7 +55,7 @@ You can use `cargo` to scaffold a new package:
cargo new my-binary
```
This will create a new folder, `my-binary`, containing a new Rust package with the same name and a single
This will create a new folder, `my-binary`, containing a new Rust package with the same name and a single
binary crate inside. If you want to create a library crate instead, you can use the `--lib` flag:
```bash


@ -1,6 +1,6 @@
# Dependencies
A package can depend on other packages by listing them in the `[dependencies]` section of its `Cargo.toml` file.
A package can depend on other packages by listing them in the `[dependencies]` section of its `Cargo.toml` file.\
The most common way to specify a dependency is by providing its name and version:
```toml
@ -8,7 +8,7 @@ The most common way to specify a dependency is by providing its name and version
thiserror = "1"
```
This will add `thiserror` as a dependency to your package, with a **minimum** version of `1.0.0`.
This will add `thiserror` as a dependency to your package, with a **minimum** version of `1.0.0`.
`thiserror` will be pulled from [crates.io](https://crates.io), Rust's official package registry.
When you run `cargo build`, `cargo` will go through a few stages:
@ -17,10 +17,10 @@ When you run `cargo build`, `cargo` will go through a few stages:
- Compiling your project (your own code and the dependencies)
Dependency resolution is skipped if your project has a `Cargo.lock` file and your manifest files are unchanged.
A lockfile is automatically generated by `cargo` after a successful round of dependency resolution: it contains
the exact versions of all dependencies used in your project, and is used to ensure that the same versions are
A lockfile is automatically generated by `cargo` after a successful round of dependency resolution: it contains
the exact versions of all dependencies used in your project, and is used to ensure that the same versions are
consistently used across different builds (e.g. in CI). If you're working on a project with multiple developers,
you should commit the `Cargo.lock` file to your version control system.
you should commit the `Cargo.lock` file to your version control system.
You can use `cargo update` to update the `Cargo.lock` file with the latest (compatible) versions of all your dependencies.
@ -43,7 +43,7 @@ details on where you can get dependencies from and how to specify them in your `
## Dev dependencies
You can also specify dependencies that are only needed for development—i.e. they only get pulled in when you're
running `cargo test`.
running `cargo test`.\
They go in the `[dev-dependencies]` section of your `Cargo.toml` file:
```toml


@ -1,11 +1,11 @@
# `thiserror`
That was a bit of a detour, wasn't it? But a necessary one!
That was a bit of a detour, wasn't it? But a necessary one!\
Let's get back on track now: custom error types and `thiserror`.
## Custom error types
We've seen how to implement the `Error` trait "manually" for a custom error type.
We've seen how to implement the `Error` trait "manually" for a custom error type.\
Imagine that you have to do this for most error types in your codebase. That's a lot of boilerplate, isn't it?
We can remove some of the boilerplate by using [`thiserror`](https://docs.rs/thiserror/latest/thiserror/),
@ -23,12 +23,12 @@ enum TicketNewError {
## You can write your own macros
All the `derive` macros we've seen so far were provided by the Rust standard library.
All the `derive` macros we've seen so far were provided by the Rust standard library.\
`thiserror::Error` is the first example of a **third-party** `derive` macro.
`derive` macros are a subset of **procedural macros**, a way to generate Rust code at compile time.
`derive` macros are a subset of **procedural macros**, a way to generate Rust code at compile time.
We won't get into the details of how to write a procedural macro in this course, but it's important
to know that you can write your own!
to know that you can write your own!\
A topic to approach in a more advanced Rust course.
## Custom syntax
@ -39,7 +39,7 @@ In the case of `thiserror`, we have:
- `#[derive(thiserror::Error)]`: this is the syntax to derive the `Error` trait for a custom error type, helped by `thiserror`.
- `#[error("{0}")]`: this is the syntax to define a `Display` implementation for each variant of the custom error type.
`{0}` is replaced by the zero-th field of the variant (`String`, in this case) when the error is displayed.
## References
- The exercise for this section is located in `exercises/05_ticket_v2/12_thiserror`


@ -1,10 +1,10 @@
# `TryFrom` and `TryInto`
In the previous chapter we looked at the [`From` and `Into` traits](../04_traits/09_from.md),
Rust's idiomatic interfaces for **infallible** type conversions.
In the previous chapter we looked at the [`From` and `Into` traits](../04_traits/09_from.md),
Rust's idiomatic interfaces for **infallible** type conversions.\
But what if the conversion is not guaranteed to succeed?
We now know enough about errors to discuss the **fallible** counterparts of `From` and `Into`:
We now know enough about errors to discuss the **fallible** counterparts of `From` and `Into`:
`TryFrom` and `TryInto`.
## `TryFrom` and `TryInto`
@ -23,7 +23,7 @@ pub trait TryInto<T>: Sized {
}
```
The main difference between `From`/`Into` and `TryFrom`/`TryInto` is that the latter return a `Result` type.
The main difference between `From`/`Into` and `TryFrom`/`TryInto` is that the latter return a `Result` type.\
This allows the conversion to fail, returning an error instead of panicking.
## `Self::Error`
@ -36,7 +36,7 @@ being attempted.
## Duality
Just like `From` and `Into`, `TryFrom` and `TryInto` are dual traits.
Just like `From` and `Into`, `TryFrom` and `TryInto` are dual traits.\
If you implement `TryFrom` for a type, you get `TryInto` for free.
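Here's a minimal sketch of that duality. The `Even` wrapper and the `NotEven` error are hypothetical types, made up for this illustration: we only implement `TryFrom<u32>`, yet `try_into` works as well.

```rust
use std::convert::{TryFrom, TryInto};

/// Hypothetical wrapper, for illustration only: it can only hold even numbers.
#[derive(Debug, PartialEq)]
pub struct Even(u32);

/// Hypothetical error returned when the conversion fails.
#[derive(Debug, PartialEq)]
pub struct NotEven(u32);

impl TryFrom<u32> for Even {
    type Error = NotEven;

    fn try_from(value: u32) -> Result<Self, Self::Error> {
        if value % 2 == 0 {
            Ok(Even(value))
        } else {
            Err(NotEven(value))
        }
    }
}

/// We never implemented `TryInto<Even>` for `u32`:
/// it comes for free from the `TryFrom` implementation above.
pub fn parse_even(value: u32) -> Result<Even, NotEven> {
    value.try_into()
}
```

The explicit `use` is only needed on older editions; from the 2021 edition onwards both traits are in the prelude.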
## References


@ -11,15 +11,15 @@ pub trait Error: Debug + Display {
}
```
The `source` method is a way to access the **error cause**, if any.
The `source` method is a way to access the **error cause**, if any.\
Errors are often chained, meaning that one error is the cause of another: you have a high-level error (e.g.
cannot connect to the database) that is caused by a lower-level error (e.g. can't resolve the database hostname).
The `source` method allows you to "walk" the full chain of errors, often used when capturing error context in logs.
## Implementing `source`
The `Error` trait provides a default implementation that always returns `None` (i.e. no underlying cause). That's why
you didn't have to care about `source` in the previous exercises.
The `Error` trait provides a default implementation that always returns `None` (i.e. no underlying cause). That's why
you didn't have to care about `source` in the previous exercises.\
You can override this default implementation to provide a cause for your error type.
```rust
@ -48,14 +48,14 @@ We then override the `source` method to return this source when called.
## `&(dyn Error + 'static)`
What's this `&(dyn Error + 'static)` type?
What's this `&(dyn Error + 'static)` type?\
Let's unpack it:
- `dyn Error` is a **trait object**. It's a way to refer to any type that implements the `Error` trait.
- `'static` is a special **lifetime specifier**.
`'static` implies that the reference is valid for "as long as we need it", i.e. the entire program execution.
Combined: `&(dyn Error + 'static)` is a reference to a trait object that implements the `Error` trait
Combined: `&(dyn Error + 'static)` is a reference to a trait object that implements the `Error` trait
and is valid for the entire program execution.
Don't worry too much about either of these concepts for now. We'll cover them in more detail in future chapters.
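To make the idea of "walking" the error chain more concrete, here's a small sketch. `DnsError`, `ConnectionError` and `error_chain` are hypothetical names, invented for this example and modelled after the database scenario described earlier.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical low-level error, for illustration only.
#[derive(Debug)]
struct DnsError;

impl fmt::Display for DnsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "can't resolve the database hostname")
    }
}

// No `source` override: the default implementation returns `None`.
impl Error for DnsError {}

/// Hypothetical high-level error that stores its cause.
#[derive(Debug)]
struct ConnectionError {
    cause: DnsError,
}

impl fmt::Display for ConnectionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "cannot connect to the database")
    }
}

impl Error for ConnectionError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

/// Collect the message of `error` and of every error in its `source` chain,
/// starting from the outermost one.
fn error_chain(error: &(dyn Error + 'static)) -> Vec<String> {
    let mut messages = Vec::new();
    let mut current = Some(error);
    while let Some(e) = current {
        messages.push(e.to_string());
        current = e.source();
    }
    messages
}
```

A logging layer could use a helper along these lines to print the high-level message followed by each underlying cause.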
@ -75,7 +75,7 @@ Don't worry too much about either of these concepts for now. We'll cover them in
source: std::io::Error
}
}
```
```
- A field annotated with the `#[source]` attribute will automatically be used as the source of the error.
```rust
use thiserror::Error;
@ -88,8 +88,8 @@ Don't worry too much about either of these concepts for now. We'll cover them in
inner: std::io::Error
}
}
```
- A field annotated with the `#[from]` attribute will automatically be used as the source of the error **and**
```
- A field annotated with the `#[from]` attribute will automatically be used as the source of the error **and**
`thiserror` will automatically generate a `From` implementation to convert the annotated type into your error type.
```rust
use thiserror::Error;
@ -102,11 +102,11 @@ Don't worry too much about either of these concepts for now. We'll cover them in
inner: std::io::Error
}
}
```
```
## The `?` operator
The `?` operator is a shorthand for propagating errors.
The `?` operator is a shorthand for propagating errors.\
When used in a function that returns a `Result`, it will return early with an error if the `Result` is `Err`.
For example:
@ -145,7 +145,7 @@ fn read_file() -> Result<String, std::io::Error> {
}
```
You can use the `?` operator to shorten your error handling code significantly.
You can use the `?` operator to shorten your error handling code significantly.\
In particular, the `?` operator will automatically convert the error type of the fallible operation into the error type
of the function, if a conversion is possible (i.e. if there is a suitable `From` implementation).
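Here's a sketch of that automatic conversion. `ConfigError` and `parse_port` are hypothetical names used only for this example: the fallible operation fails with a `ParseIntError`, while the function returns a `ConfigError`.

```rust
use std::num::ParseIntError;

/// Hypothetical error type, for illustration only.
#[derive(Debug)]
enum ConfigError {
    InvalidPort(ParseIntError),
}

// This is the conversion that `?` relies on.
impl From<ParseIntError> for ConfigError {
    fn from(error: ParseIntError) -> Self {
        ConfigError::InvalidPort(error)
    }
}

fn parse_port(raw: &str) -> Result<u16, ConfigError> {
    // `parse` returns a `Result<u16, ParseIntError>`.
    // On failure, `?` converts the `ParseIntError` into a `ConfigError`
    // using the `From` implementation above, then returns early.
    let port = raw.trim().parse::<u16>()?;
    Ok(port)
}
```

If you removed the `From` implementation, the `?` in `parse_port` would no longer compile: there would be no way to turn a `ParseIntError` into a `ConfigError`.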


@ -1,14 +1,14 @@
# Wrapping up
When it comes to domain modelling, the devil is in the details.
When it comes to domain modelling, the devil is in the details.\
Rust offers a wide range of tools to help you represent the constraints of your domain directly in the type system,
but it takes some practice to get it right and write code that looks idiomatic.
Let's close the chapter with one final refinement of our `Ticket` model.
We'll introduce a new type for each of the fields in `Ticket` to encapsulate the respective constraints.
Every time someone accesses a `Ticket` field, they'll get back a value that's guaranteed to be valid—i.e. a
Let's close the chapter with one final refinement of our `Ticket` model.\
We'll introduce a new type for each of the fields in `Ticket` to encapsulate the respective constraints.\
Every time someone accesses a `Ticket` field, they'll get back a value that's guaranteed to be valid—i.e. a
`TicketTitle` instead of a `String`. They won't have to worry about the title being empty elsewhere in the code:
as long as they have a `TicketTitle`, they know it's valid **by construction**.
as long as they have a `TicketTitle`, they know it's valid **by construction**.
This is just an example of how you can use Rust's type system to make your code safer and more expressive.
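As a rough sketch of the idea (not the exercise's solution), a validated title type could look like this. The `EmptyTitle` error and the `as_str` accessor are assumptions made for the example, and only the "non-empty" constraint is checked.

```rust
use std::convert::TryFrom;

/// A ticket title that is guaranteed to be non-empty.
#[derive(Debug, Clone, PartialEq)]
pub struct TicketTitle(String);

/// Hypothetical error type, for illustration only.
#[derive(Debug, PartialEq)]
pub struct EmptyTitle;

impl TryFrom<String> for TicketTitle {
    type Error = EmptyTitle;

    fn try_from(value: String) -> Result<Self, Self::Error> {
        if value.is_empty() {
            Err(EmptyTitle)
        } else {
            Ok(TicketTitle(value))
        }
    }
}

impl TicketTitle {
    /// Read-only access: the inner `String` can't be replaced
    /// with an invalid value from the outside.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

The inner field is private, so the `TryFrom` implementation is the only way to build a `TicketTitle` from outside the module. That's what makes the "valid by construction" guarantee hold.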
@ -19,4 +19,4 @@ This is just an example of how you can use Rust's type system to make your code
## Further reading
- [Parse, don't validate](https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/)
- [Using types to guarantee domain invariants](https://www.lpalmieri.com/posts/2020-12-11-zero-to-production-6-domain-modelling/)
- [Using types to guarantee domain invariants](https://www.lpalmieri.com/posts/2020-12-11-zero-to-production-6-domain-modelling/)


@ -15,4 +15,4 @@ The task will give us an opportunity to explore new Rust concepts, such as:
- `HashMap` and `BTreeMap`, two key-value data structures
- `Eq` and `Hash`, to compare keys in a `HashMap`
- `Ord` and `PartialOrd`, to work with a `BTreeMap`
- `Index` and `IndexMut`, to access elements in a collection
- `Index` and `IndexMut`, to access elements in a collection


@ -1,14 +1,14 @@
# Arrays
As soon as we start talking about "ticket management" we need to think about a way to store _multiple_ tickets.
In turn, this means we need to think about collections. In particular, homogeneous collections:
In turn, this means we need to think about collections. In particular, homogeneous collections:
we want to store multiple instances of the same type.
What does Rust have to offer in this regard?
## Arrays
A first attempt could be to use an **array**.
A first attempt could be to use an **array**.\
Arrays in Rust are fixed-size collections of elements of the same type.
Here's how you can define an array:
@ -18,7 +18,7 @@ Here's how you can define an array:
let numbers: [u32; 3] = [1, 2, 3];
```
This creates an array of 3 integers, initialized with the values `1`, `2`, and `3`.
This creates an array of 3 integers, initialized with the values `1`, `2`, and `3`.\
The type of the array is `[u32; 3]`, which reads as "an array of `u32`s with a length of 3".
### Accessing elements
@ -31,8 +31,8 @@ let second = numbers[1];
let third = numbers[2];
```
The index must be of type `usize`.
Arrays are **zero-indexed**, like everything in Rust. You've seen this before with string slices and field indexing in
The index must be of type `usize`.\
Arrays are **zero-indexed**, like everything in Rust. You've seen this before with string slices and field indexing in
tuples/tuple-like variants.
### Out-of-bounds access
@ -44,8 +44,8 @@ let numbers: [u32; 3] = [1, 2, 3];
let fourth = numbers[3]; // This will panic
```
This is enforced at runtime using **bounds checking**. It comes with a small performance overhead, but it's how
Rust prevents buffer overflows.
This is enforced at runtime using **bounds checking**. It comes with a small performance overhead, but it's how
Rust prevents buffer overflows.\
In some scenarios the Rust compiler can optimize away bounds checks, especially if iterators are involved—we'll speak
more about this later on.
@ -77,5 +77,5 @@ Stack: | 1 | 2 | 3 |
```
In other words, the size of an array is `std::mem::size_of::<T>() * N`, where `T` is the type of the elements and `N` is
the number of elements.
You can access and replace each element in `O(1)` time.
the number of elements.\
You can access and replace each element in `O(1)` time.
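You can check the size formula yourself with a few lines of code. The helper below is a hypothetical function written for this sketch: it compares the size of a `[u32; 3]` with the size of a `u32` multiplied by the number of elements.

```rust
use std::mem::{size_of, size_of_val};

/// Returns `(size of the array, size of one element * number of elements)`,
/// both expressed in bytes.
fn array_vs_elements() -> (usize, usize) {
    let numbers: [u32; 3] = [1, 2, 3];
    let array_size = size_of_val(&numbers);
    let expected = size_of::<u32>() * numbers.len();
    (array_size, expected)
}
```

Both values are `12`: three `u32`s, four bytes each, with no extra metadata stored alongside the elements.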


@ -18,11 +18,11 @@ error[E0435]: attempt to use a non-constant value in a constant
```
Arrays wouldn't work for our ticket management system—we don't know how many tickets we'll need to store at compile-time.
This is where `Vec` comes in.
This is where `Vec` comes in.
## `Vec`
`Vec` is a growable array type, provided by the standard library.
`Vec` is a growable array type, provided by the standard library.\
You can create an empty vector using the `Vec::new` function:
```rust
@ -37,7 +37,7 @@ numbers.push(2);
numbers.push(3);
```
New values are added to the end of the vector.
New values are added to the end of the vector.\
You can also create an initialized vector using the `vec!` macro, if you know the values at creation time:
```rust
@ -55,7 +55,7 @@ let second = numbers[1];
let third = numbers[2];
```
The index must be of type `usize`.
The index must be of type `usize`.\
You can also use the `get` method, which returns an `Option<&T>`:
```rust
@ -70,7 +70,7 @@ Access is bounds-checked, just element access with arrays. It has O(1) complexit
## Memory layout
`Vec` is a heap-allocated data structure.
`Vec` is a heap-allocated data structure.\
When you create a `Vec`, it allocates memory on the heap to store the elements.
If you run the following code:
@ -102,11 +102,11 @@ Heap: | 1 | 2 | ? |
- The **length** of the vector, i.e. how many elements are in the vector.
- The **capacity** of the vector, i.e. the number of elements that can fit in the space reserved on the heap.
This layout should look familiar: it's exactly the same as `String`!
This layout should look familiar: it's exactly the same as `String`!\
That's not a coincidence: `String` is defined as a vector of bytes, `Vec<u8>`, under the hood:
```rust
pub struct String {
vec: Vec<u8>,
}
```
```


@ -11,15 +11,15 @@ numbers.push(3); // Max capacity reached
numbers.push(4); // What happens here?
```
The `Vec` will **resize** itself.
The `Vec` will **resize** itself.\
It will ask the allocator for a new (larger) chunk of heap memory, copy the elements over, and deallocate the old memory.
This operation can be expensive, as it involves a new memory allocation and copying all existing elements.
This operation can be expensive, as it involves a new memory allocation and copying all existing elements.
## `Vec::with_capacity`
If you have a rough idea of how many elements you'll store in a `Vec`, you can use the `Vec::with_capacity`
method to pre-allocate enough memory upfront.
If you have a rough idea of how many elements you'll store in a `Vec`, you can use the `Vec::with_capacity`
method to pre-allocate enough memory upfront.\
This can avoid a new allocation when the `Vec` grows, but it may waste memory if you overestimate actual usage.
Evaluate on a case-by-case basis.
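Here's a small sketch of the pre-allocation pattern. `first_n` is a hypothetical helper, written for this example, where the number of elements is known before the loop starts.

```rust
/// Build a vector holding `0..n`, reserving space for all elements upfront.
fn first_n(n: usize) -> Vec<usize> {
    let mut numbers = Vec::with_capacity(n);
    for i in 0..n {
        // The capacity is already sufficient for `n` elements,
        // so these pushes don't need to resize the vector.
        numbers.push(i);
    }
    numbers
}
```

`with_capacity(n)` reserves room for _at least_ `n` elements: the length is still `0` until you start pushing.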


@ -1,6 +1,6 @@
# Iteration
During the very first exercises, you learned that Rust lets you iterate over collections using `for` loops.
During the very first exercises, you learned that Rust lets you iterate over collections using `for` loops.
We were looking at ranges at that point (e.g. `0..5`), but the same holds true for collections like arrays and vectors.
```rust
@ -35,13 +35,13 @@ loop {
}
```
`loop` is another looping construct, on top of `for` and `while`.
`loop` is another looping construct, on top of `for` and `while`.\
A `loop` block will run forever, unless you explicitly `break` out of it.
## `Iterator` trait
The `next` method in the previous code snippet comes from the `Iterator` trait.
The `Iterator` trait is defined in Rust's standard library and provides a shared interface for
The `Iterator` trait is defined in Rust's standard library and provides a shared interface for
types that can produce a sequence of values:
```rust
@ -53,16 +53,16 @@ trait Iterator {
The `Item` associated type specifies the type of the values produced by the iterator.
`next` returns the next value in the sequence.
It returns `Some(value)` if there's a value to return, and `None` when there isn't.
`next` returns the next value in the sequence.\
It returns `Some(value)` if there's a value to return, and `None` when there isn't.
Be careful: there is no guarantee that an iterator is exhausted when it returns `None`. That's only
guaranteed if the iterator implements the (more restrictive)
guaranteed if the iterator implements the (more restrictive)
[`FusedIterator`](https://doc.rust-lang.org/std/iter/trait.FusedIterator.html) trait.
## `IntoIterator` trait
Not all types implement `Iterator`, but many can be converted into a type that does.
Not all types implement `Iterator`, but many can be converted into a type that does.\
That's where the `IntoIterator` trait comes in:
```rust
@ -73,15 +73,15 @@ trait IntoIterator {
}
```
The `into_iter` method consumes the original value and returns an iterator over its elements.
The `into_iter` method consumes the original value and returns an iterator over its elements.\
A type can only have one implementation of `IntoIterator`: there can be no ambiguity as to what `for` should desugar to.
One detail: every type that implements `Iterator` automatically implements `IntoIterator` as well.
One detail: every type that implements `Iterator` automatically implements `IntoIterator` as well.
They just return themselves from `into_iter`!
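To see both traits at work, here's a sketch with a hypothetical `Countdown` type, made up for this example. We only implement `Iterator`, but the value can still be used directly in a `for` loop thanks to the automatic `IntoIterator` implementation.

```rust
/// Hypothetical iterator, for illustration only:
/// it yields `remaining`, `remaining - 1`, ..., down to `1`.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            // Nothing left to yield.
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}

fn collect_countdown(from: u32) -> Vec<u32> {
    let countdown = Countdown { remaining: from };
    let mut values = Vec::new();
    // `for` calls `into_iter` on `countdown`, which returns the iterator itself.
    for n in countdown {
        values.push(n);
    }
    values
}
```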
## Bounds checks
Iterating over iterators has a nice side effect: you can't go out of bounds, by design.
Iterating over iterators has a nice side effect: you can't go out of bounds, by design.\
This allows Rust to remove bounds checks from the generated machine code, making iteration faster.
In other words,
@ -103,5 +103,5 @@ for i in 0..v.len() {
```
There are exceptions to this rule: the compiler can sometimes prove that you're not going out of bounds even
with manual indexing, thus removing the bounds checks anyway. But in general, prefer iteration to indexing
with manual indexing, thus removing the bounds checks anyway. But in general, prefer iteration to indexing
where possible.


@ -1,6 +1,6 @@
# `.iter()`
`IntoIterator` **consumes** `self` to create an iterator.
`IntoIterator` **consumes** `self` to create an iterator.
This has its benefits: you get **owned** values from the iterator.
For example: if you call `.into_iter()` on a `Vec<Ticket>` you'll get an iterator that returns `Ticket` values.
@ -21,7 +21,7 @@ for n in numbers.iter() {
```
This pattern can be simplified by implementing `IntoIterator` for a **reference to the collection**.
In our example above, that would be `&Vec<Ticket>`.
In our example above, that would be `&Vec<Ticket>`.\
The standard library does this, that's why the following code works:
```rust
@ -39,4 +39,4 @@ It's idiomatic to provide both options:
- An implementation of `IntoIterator` for a reference to the collection.
- An `.iter()` method that returns an iterator over references to the collection's elements.
The former is convenient in `for` loops, the latter is more explicit and can be used in other contexts.
The former is convenient in `for` loops, the latter is more explicit and can be used in other contexts.


@ -1,6 +1,6 @@
# Lifetimes
Let's try to complete the previous exercise by adding an implementation of `IntoIterator` for `&TicketStore`, for
Let's try to complete the previous exercise by adding an implementation of `IntoIterator` for `&TicketStore`, for
maximum convenience in `for` loops.
Let's start by filling in the most "obvious" parts of the implementation:
@ -16,8 +16,8 @@ impl IntoIterator for &TicketStore {
}
```
What should `type IntoIter` be set to?
Intuitively, it should be the type returned by `self.tickets.iter()`, i.e. the type returned by `Vec::iter()`.
What should `type IntoIter` be set to?\
Intuitively, it should be the type returned by `self.tickets.iter()`, i.e. the type returned by `Vec::iter()`.\
If you check the standard library documentation, you'll find that `Vec::iter()` returns an `std::slice::Iter`.
The definition of `Iter` is:
@ -29,8 +29,8 @@ pub struct Iter<'a, T> { /* fields omitted */ }
## Lifetime parameters
Lifetimes are **labels** used by the Rust compiler to keep track of how long a reference (either mutable or
immutable) is valid.
Lifetimes are **labels** used by the Rust compiler to keep track of how long a reference (either mutable or
immutable) is valid.\
The lifetime of a reference is constrained by the scope of the value it refers to. Rust always makes sure, at compile-time,
that references are not used after the value they refer to has been dropped, to avoid dangling pointers and use-after-free bugs.
@ -49,8 +49,8 @@ impl <T> Vec<T> {
}
```
`Vec::iter()` is generic over a lifetime parameter, named `'a`.
`'a` is used to **tie together** the lifetime of the `Vec` and the lifetime of the `Iter` returned by `iter()`.
`Vec::iter()` is generic over a lifetime parameter, named `'a`.\
`'a` is used to **tie together** the lifetime of the `Vec` and the lifetime of the `Iter` returned by `iter()`.
In plain English: the `Iter` returned by `iter()` cannot outlive the `Vec` reference (`&self`) it was created from.
This is important because `Vec::iter`, as we discussed, returns an iterator over **references** to the `Vec`'s elements.
@ -74,11 +74,11 @@ No explicit lifetime parameter is present in the signature of `Vec::iter()`.
Elision rules imply that the lifetime of the `Iter` returned by `iter()` is tied to the lifetime of the `&self` reference.
You can think of `'_` as a **placeholder** for the lifetime of the `&self` reference.
See the [References](#references) section for a link to the official documentation on lifetime elision.
See the [References](#references) section for a link to the official documentation on lifetime elision.\
In most cases, you can rely on the compiler telling you when you need to add explicit lifetime annotations.
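Elision isn't specific to methods. Here's a sketch with two hypothetical free functions, written for this example, that have exactly the same signature: one spells the lifetime out, the other relies on elision.

```rust
/// Explicit version: the returned `&str` cannot outlive `text`.
fn first_word_explicit<'a>(text: &'a str) -> &'a str {
    text.split_whitespace().next().unwrap_or("")
}

/// Elided version: there is a single reference in input, so the compiler
/// ties the lifetime of the output to it. Same signature as above.
fn first_word(text: &str) -> &str {
    text.split_whitespace().next().unwrap_or("")
}
```

In practice you'd write the second form. The first one is useful as a mental model of what the compiler fills in on your behalf.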
## References
- [std::vec::Vec::iter](https://doc.rust-lang.org/std/vec/struct.Vec.html#method.iter)
- [std::slice::Iter](https://doc.rust-lang.org/std/slice/struct.Iter.html)
- [Lifetime elision rules](https://doc.rust-lang.org/reference/lifetime-elision.html)
- [Lifetime elision rules](https://doc.rust-lang.org/reference/lifetime-elision.html)


@ -1,7 +1,7 @@
# Combinators
Iterators can do so much more than `for` loops!
If you look at the documentation for the `Iterator` trait, you'll find a **vast** collection of
Iterators can do so much more than `for` loops!\
If you look at the documentation for the `Iterator` trait, you'll find a **vast** collection of
methods that you can leverage to transform, filter, and combine iterators in various ways.
Let's mention the most common ones:
@ -15,7 +15,7 @@ Let's mention the most common ones:
- `take` stops the iterator after `n` elements.
- `chain` combines two iterators into one.
These methods are called **combinators**.
These methods are called **combinators**.\
They are usually **chained** together to create complex transformations in a concise and readable way:
```rust
@ -29,10 +29,10 @@ let outcome: u32 = numbers.iter()
## Closures
What's going on with the `filter` and `map` methods above?
What's going on with the `filter` and `map` methods above?\
They take **closures** as arguments.
Closures are **anonymous functions**, i.e. functions that are not defined using the `fn` syntax we are used to.
Closures are **anonymous functions**, i.e. functions that are not defined using the `fn` syntax we are used to.\
They are defined using the `|args| body` syntax, where `args` are the arguments and `body` is the function body.
`body` can be a block of code or a single expression.
For example:
@ -70,10 +70,10 @@ let add_one: fn(i32) -> i32 = |x| x + 1;
## `collect`
What happens when you're done transforming an iterator using combinators?
What happens when you're done transforming an iterator using combinators?\
You either iterate over the transformed values using a `for` loop, or you collect them into a collection.
The latter is done using the `collect` method.
The latter is done using the `collect` method.\
`collect` consumes the iterator and collects its elements into a collection of your choice.
For example, you can collect the squares of the even numbers into a `Vec`:
@ -86,7 +86,7 @@ let squares_of_evens: Vec<u32> = numbers.iter()
.collect();
```
`collect` is generic over its **return type**.
`collect` is generic over its **return type**.\
Therefore you usually need to provide a type hint to help the compiler infer the correct type.
In the example above, we annotated the type of `squares_of_evens` to be `Vec<u32>`.
Alternatively, you can use the **turbofish syntax** to specify the type:
@ -104,4 +104,4 @@ let squares_of_evens = numbers.iter()
- [`Iterator`'s documentation](https://doc.rust-lang.org/std/iter/trait.Iterator.html) gives you an
overview of the methods available for iterators in `std`.
- [The `itertools` crate](https://docs.rs/itertools/) defines even **more** combinators for iterators.
- [The `itertools` crate](https://docs.rs/itertools/) defines even **more** combinators for iterators.


@ -1,6 +1,6 @@
# `impl Trait`
`TicketStore::to_dos` returns a `Vec<&Ticket>`.
`TicketStore::to_dos` returns a `Vec<&Ticket>`.\
That signature introduces a new heap allocation every time `to_dos` is called, which may be unnecessary depending
on what the caller needs to do with the result.
It'd be better if `to_dos` returned an iterator instead of a `Vec`, thus empowering the caller to decide whether to
@ -25,9 +25,9 @@ The `filter` method returns an instance of `std::iter::Filter`, which has the fo
pub struct Filter<I, P> { /* fields omitted */ }
```
where `I` is the type of the iterator being filtered on and `P` is the predicate used to filter the elements.
We know that `I` is `std::slice::Iter<'_, Ticket>` in this case, but what about `P`?
`P` is a closure, an **anonymous function**. As the name suggests, closures don't have a name,
where `I` is the type of the iterator being filtered on and `P` is the predicate used to filter the elements.\
We know that `I` is `std::slice::Iter<'_, Ticket>` in this case, but what about `P`?\
`P` is a closure, an **anonymous function**. As the name suggests, closures don't have a name,
so we can't write them down in our code.
Rust has a solution for this: **impl Trait**.
@ -51,19 +51,19 @@ That's it!
## Generic?
`impl Trait` in return position is **not** a generic parameter.
`impl Trait` in return position is **not** a generic parameter.
Generics are placeholders for types that are filled in by the caller of the function.
A function with a generic parameter is **polymorphic**: it can be called with different types, and the compiler will generate
a different implementation for each type.
That's not the case with `impl Trait`.
The return type of a function with `impl Trait` is **fixed** at compile time, and the compiler will generate
The return type of a function with `impl Trait` is **fixed** at compile time, and the compiler will generate
a single implementation for it.
This is why `impl Trait` is also called **opaque return type**: the caller doesn't know the exact type of the return value,
only that it implements the specified trait(s). But the compiler knows the exact type, there is no polymorphism involved.
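Here's a minimal sketch of an opaque return type. `evens` is a hypothetical function, written for this example and unrelated to `TicketStore`.

```rust
/// The concrete return type is a `Filter` wrapping a `std::vec::IntoIter<u32>`
/// and a closure. We can't name the closure type, so we use `impl Trait`.
fn evens(numbers: Vec<u32>) -> impl Iterator<Item = u32> {
    numbers.into_iter().filter(|n| n % 2 == 0)
}
```

The caller can use every `Iterator` method on the returned value (e.g. `evens(vec![1, 2, 3, 4]).collect::<Vec<_>>()`), but it can't rely on anything else about the underlying type.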
## RPIT
If you read RFCs or deep-dives about Rust, you might come across the acronym **RPIT**.
It stands for **"Return Position Impl Trait"** and refers to the use of `impl Trait` in return position.
If you read RFCs or deep-dives about Rust, you might come across the acronym **RPIT**.\
It stands for **"Return Position Impl Trait"** and refers to the use of `impl Trait` in return position.


@ -1,6 +1,6 @@
# `impl Trait` in argument position
In the previous section, we saw how `impl Trait` can be used to return a type without specifying its name.
In the previous section, we saw how `impl Trait` can be used to return a type without specifying its name.\
The same syntax can also be used in **argument position**:
```rust
@ -11,7 +11,7 @@ fn print_iter(iter: impl Iterator<Item = i32>) {
}
```
`print_iter` takes an iterator of `i32`s and prints each element.
`print_iter` takes an iterator of `i32`s and prints each element.\
When used in **argument position**, `impl Trait` is equivalent to a generic parameter with a trait bound:
```rust
@ -27,6 +27,6 @@ where
## Downsides
As a rule of thumb, prefer generics over `impl Trait` in argument position.
Generics allow the caller to explicitly specify the type of the argument, using the turbofish syntax (`::<>`),
which can be useful for disambiguation. That's not the case with `impl Trait`.
As a rule of thumb, prefer generics over `impl Trait` in argument position.\
Generics allow the caller to explicitly specify the type of the argument, using the turbofish syntax (`::<>`),
which can be useful for disambiguation. That's not the case with `impl Trait`.
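The difference shows up at the call site. In the sketch below, `describe_generic` and `describe_impl` are hypothetical functions with the same behaviour, written for this example.

```rust
use std::fmt::Display;

/// Generic version: the caller is allowed to name `T` explicitly.
fn describe_generic<T: Display>(value: T) -> String {
    format!("value: {}", value)
}

/// `impl Trait` version: same behaviour, but the caller
/// has no way to name the type of `value`.
fn describe_impl(value: impl Display) -> String {
    format!("value: {}", value)
}
```

`describe_generic::<u8>(7)` compiles and forces the literal to be a `u8`. The equivalent `describe_impl::<u8>(7)` is rejected by the compiler: you'd have to annotate the argument instead, e.g. `describe_impl(7u8)`.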


@ -21,12 +21,12 @@ Heap: | 1 | 2 | ? |
+---+---+---+
```
We already remarked how `String` is just a `Vec<u8>` in disguise.
We already remarked how `String` is just a `Vec<u8>` in disguise.\
The similarity should prompt you to ask: "What's the equivalent of `&str` for `Vec`?"
## `&[T]`
`[T]` is a **slice** of a contiguous sequence of elements of type `T`.
`[T]` is a **slice** of a contiguous sequence of elements of type `T`.\
It's most commonly used in its borrowed form, `&[T]`.
There are various ways to create a slice reference from a `Vec`:
@ -54,7 +54,7 @@ let sum: i32 = numbers.iter().sum();
### Memory layout
A `&[T]` is a **fat pointer**, just like `&str`.
A `&[T]` is a **fat pointer**, just like `&str`.\
It consists of a pointer to the first element of the slice and the length of the slice.
If you have a `Vec` with three elements:
@ -90,10 +90,10 @@ Heap: | 1 | 2 | 3 | ? | |
### `&Vec<T>` vs `&[T]`
When you need to pass an immutable reference to a `Vec` to a function, prefer `&[T]` over `&Vec<T>`.
When you need to pass an immutable reference to a `Vec` to a function, prefer `&[T]` over `&Vec<T>`.\
This allows the function to accept any kind of slice, not necessarily one backed by a `Vec`.
For example, you can then pass a subset of the elements in a `Vec`.
For example, you can then pass a subset of the elements in a `Vec`.
But it goes further than that—you could also pass a **slice of an array**:
```rust
@ -102,5 +102,5 @@ let slice: &[i32] = &array;
```
Array slices and `Vec` slices are the same type: they're fat pointers to a contiguous sequence of elements.
In the case of arrays, the pointer points to the stack rather than the heap, but that doesn't matter
when it comes to using the slice.
In the case of arrays, the pointer points to the stack rather than the heap, but that doesn't matter
when it comes to using the slice.
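As a minimal sketch of the point above, a single function taking `&[i32]` can be fed from a whole `Vec`, a sub-range of that `Vec`, or an array. The names `sum` and `slice_sources` are made up for illustration and are not part of the course code.

```rust
/// Sums the elements of any slice, regardless of what backs it.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

/// Calls `sum` with three different sources of `&[i32]`
/// (hypothetical helper, for illustration only).
fn slice_sources() -> (i32, i32, i32) {
    let numbers = vec![1, 2, 3, 4];
    let array = [10, 20, 30];
    (
        sum(&numbers),       // `&Vec<i32>` coerces to `&[i32]`
        sum(&numbers[1..3]), // a subset of the elements in the `Vec`
        sum(&array),         // a slice of a stack-allocated array
    )
}
```

Had `sum` taken `&Vec<i32>` instead, only the first call would compile.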

View file

@ -1,6 +1,6 @@
# Mutable slices
Every time we've talked about slice types (like `str` and `[T]`), we've used their immutable borrow form (`&str` and `&[T]`).
Every time we've talked about slice types (like `str` and `[T]`), we've used their immutable borrow form (`&str` and `&[T]`).\
But slices can also be mutable!
Here's how you create a mutable slice:
@ -21,7 +21,7 @@ This will change the first element of the `Vec` to `42`.
## Limitations
When working with immutable borrows, the recommendation was clear: prefer slice references over references to
the owned type (e.g. `&[T]` over `&Vec<T>`).
the owned type (e.g. `&[T]` over `&Vec<T>`).\
That's **not** the case with mutable borrows.
Consider this scenario:
@ -32,10 +32,10 @@ let mut slice: &mut [i32] = &mut numbers;
slice.push(1);
```
It won't compile!
`push` is a method on `Vec`, not on slices. This is the manifestation of a more general principle: Rust won't
allow you to add or remove elements from a slice. You will only be able to modify/replace the elements that are
It won't compile!\
`push` is a method on `Vec`, not on slices. This is the manifestation of a more general principle: Rust won't
allow you to add or remove elements from a slice. You will only be able to modify/replace the elements that are
already there.
In this regard, a `&mut Vec` or a `&mut String` are strictly more powerful than a `&mut [T]` or a `&mut str`.
Choose the type that best fits based on the operations you need to perform.
In this regard, a `&mut Vec` or a `&mut String` are strictly more powerful than a `&mut [T]` or a `&mut str`.\
Choose the type that best fits based on the operations you need to perform.

View file

@ -1,6 +1,6 @@
# Ticket ids
Let's think again about our ticket management system.
Let's think again about our ticket management system.\
Our ticket model right now looks like this:
```rust
@ -11,13 +11,13 @@ pub struct Ticket {
}
```
One thing is missing here: an **identifier** to uniquely identify a ticket.
That identifier should be unique for each ticket. That can be guaranteed by generating it automatically when
One thing is missing here: an **identifier** to uniquely identify a ticket.\
That identifier should be unique for each ticket. That can be guaranteed by generating it automatically when
a new ticket is created.
## Refining the model
Where should the id be stored?
Where should the id be stored?\
We could add a new field to the `Ticket` struct:
```rust
@ -29,7 +29,7 @@ pub struct Ticket {
}
```
But we don't know the id before creating the ticket. So it can't be there from the get-go.
But we don't know the id before creating the ticket. So it can't be there from the get-go.\
It'd have to be optional:
```rust
@ -61,7 +61,7 @@ pub struct Ticket {
}
```
A `TicketDraft` is a ticket that hasn't been created yet. It doesn't have an id, and it doesn't have a status.
A `Ticket` is a ticket that has been created. It has an id and a status.
A `TicketDraft` is a ticket that hasn't been created yet. It doesn't have an id, and it doesn't have a status.\
A `Ticket` is a ticket that has been created. It has an id and a status.\
Since each field in `TicketDraft` and `Ticket` embeds its own constraints, we don't have to duplicate logic
across the two types.
across the two types.

View file

@ -1,7 +1,7 @@
# Indexing
`TicketStore::get` returns an `Option<&Ticket>` for a given `TicketId`.
We've seen before how to access elements of arrays and vectors using Rust's
`TicketStore::get` returns an `Option<&Ticket>` for a given `TicketId`.\
We've seen before how to access elements of arrays and vectors using Rust's
indexing syntax:
```rust
@ -9,7 +9,7 @@ let v = vec![0, 1, 2];
assert_eq!(v[0], 0);
```
How can we provide the same experience for `TicketStore`?
How can we provide the same experience for `TicketStore`?\
You guessed right: we need to implement a trait, `Index`!
## `Index`
@ -34,4 +34,4 @@ It has:
Notice how the `index` method doesn't return an `Option`. The assumption is that
`index` will panic if you try to access an element that's not there, as it happens
for array and vec indexing.
for array and vec indexing.
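One possible way to wire this up is sketched below, assuming a `Vec`-backed store. The field names and the panic message are assumptions made for the example, not the course's reference solution.

```rust
use std::ops::Index;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TicketId(u64);

#[derive(Debug)]
struct Ticket {
    id: TicketId,
    title: String, // assumed field, for illustration
}

struct TicketStore {
    tickets: Vec<Ticket>, // assumed storage, for illustration
}

impl TicketStore {
    /// Fallible lookup: returns `None` if the id is unknown.
    fn get(&self, id: TicketId) -> Option<&Ticket> {
        self.tickets.iter().find(|ticket| ticket.id == id)
    }
}

impl Index<TicketId> for TicketStore {
    type Output = Ticket;

    /// Infallible-looking lookup: panics if the id is unknown,
    /// mirroring what `Vec` does for out-of-bounds indices.
    fn index(&self, id: TicketId) -> &Self::Output {
        self.get(id).expect("no ticket with this id")
    }
}
```

With this in place, `store[TicketId(7)]` reads like vector indexing, while `get` remains available for callers that want to handle a missing ticket without panicking.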

View file

@ -1,6 +1,6 @@
# Mutable indexing
`Index` allows read-only access. It doesn't let you mutate the value you
`Index` allows read-only access. It doesn't let you mutate the value you
retrieved.
## `IndexMut`
@ -17,4 +17,4 @@ pub trait IndexMut<Idx>: Index<Idx>
```
`IndexMut` can only be implemented if the type already implements `Index`,
since it unlocks an _additional_ capability.
since it unlocks an _additional_ capability.

View file

@ -2,7 +2,7 @@
Our implementation of `Index`/`IndexMut` is not ideal: we need to iterate over the entire
`Vec` to retrieve a ticket by id; the algorithmic complexity is `O(n)`, where
`n` is the number of tickets in the store.
`n` is the number of tickets in the store.
We can do better by using a different data structure for storing tickets: a `HashMap<K, V>`.
@ -20,7 +20,7 @@ book_reviews.insert(
```
`HashMap` works with key-value pairs. It's generic over both: `K` is the generic
parameter for the key type, while `V` is the one for the value type.
parameter for the key type, while `V` is the one for the value type.
The expected cost of insertions, retrievals and removals is **constant**, `O(1)`.
That sounds perfect for our use case, doesn't it?
@ -42,17 +42,17 @@ where
}
```
The key type must implement the `Eq` and `Hash` traits.
The key type must implement the `Eq` and `Hash` traits.\
Let's dig into those two.
## `Hash`
A hashing function (or hasher) maps a potentially infinite set of values (e.g.
all possible strings) to a bounded range (e.g. a `u64` value).
There are many different hashing functions around, each with different properties
all possible strings) to a bounded range (e.g. a `u64` value).\
There are many different hashing functions around, each with different properties
(speed, collision risk, reversibility, etc.).
A `HashMap`, as the name suggests, uses a hashing function behind the scenes.
A `HashMap`, as the name suggests, uses a hashing function behind the scenes.
It hashes your key and then uses that hash to store/retrieve the associated value.
This strategy requires the key type to be hashable, hence the `Hash` trait bound on `K`.
@ -81,10 +81,10 @@ struct Person {
`HashMap` must be able to compare keys for equality. This is particularly important
when dealing with hash collisions—i.e. when two different keys hash to the same value.
You may wonder: isn't that what the `PartialEq` trait is for? Almost!
`PartialEq` is not enough for `HashMap` because it doesn't guarantee reflexivity, i.e. `a == a` is always `true`.
For example, floating point numbers (`f32` and `f64`) implement `PartialEq`,
but they don't satisfy the reflexivity property: `f32::NAN == f32::NAN` is `false`.
You may wonder: isn't that what the `PartialEq` trait is for? Almost!\
`PartialEq` is not enough for `HashMap` because it doesn't guarantee reflexivity, i.e. `a == a` is always `true`.\
For example, floating point numbers (`f32` and `f64`) implement `PartialEq`,
but they don't satisfy the reflexivity property: `f32::NAN == f32::NAN` is `false`.\
Reflexivity is crucial for `HashMap` to work correctly: without it, you wouldn't be able to retrieve a value
from the map using the same key you used to insert it.
@ -97,7 +97,7 @@ pub trait Eq: PartialEq {
```
It's a marker trait: it doesn't add any new methods, it's just a way for you to say to the compiler
that the equality logic implemented in `PartialEq` is reflexive.
that the equality logic implemented in `PartialEq` is reflexive.
You can derive `Eq` automatically when you derive `PartialEq`:
@ -113,4 +113,4 @@ struct Person {
There is an implicit contract between `Eq` and `Hash`: if two keys are equal, their hashes must be equal too.
This is crucial for `HashMap` to work correctly. If you break this contract, you'll get nonsensical results
when using `HashMap`.
when using `HashMap`.
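A small sketch of the contract at work: when `PartialEq`, `Eq` and `Hash` are all derived, they look at the same fields, so equal keys always produce equal hashes. The fields of `Person` and the helper `lookup_demo` are assumptions made for this example.

```rust
use std::collections::HashMap;

// Deriving all three traits keeps equality and hashing consistent.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Person {
    id: u32,      // assumed field, for illustration
    name: String, // assumed field, for illustration
}

/// Inserts a value under one key and retrieves it using a
/// separately constructed, but equal, key.
fn lookup_demo() -> Option<&'static str> {
    let mut roles: HashMap<Person, &'static str> = HashMap::new();
    let key = Person { id: 1, name: "Ada".to_string() };
    roles.insert(key, "admin");

    // A different instance that compares equal to the original key.
    let same = Person { id: 1, name: "Ada".to_string() };
    roles.get(&same).copied()
}
```

If you implement `Hash` by hand, make sure it never takes into account a field that your `PartialEq` implementation ignores.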

View file

@ -1,20 +1,20 @@
# Ordering
By moving from a `Vec` to a `HashMap` we have improved the performance of our ticket management system,
and simplified our code in the process.
It's not all roses, though. When iterating over a `Vec`-backed store, we could be sure that the tickets
would be returned in the order they were added.
and simplified our code in the process.\
It's not all roses, though. When iterating over a `Vec`-backed store, we could be sure that the tickets
would be returned in the order they were added.\
That's not the case with a `HashMap`: you can iterate over the tickets, but the order is random.
We can recover a consistent ordering by switching from a `HashMap` to a `BTreeMap`.
## `BTreeMap`
A `BTreeMap` guarantees that entries are sorted by their keys.
A `BTreeMap` guarantees that entries are sorted by their keys.\
This is useful when you need to iterate over the entries in a specific order, or if you need to
perform range queries (e.g. "give me all tickets with an id between 10 and 20").
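A minimal sketch of both properties, using plain `u64` keys and `String` values as stand-ins for ticket ids and tickets. The helpers `sample_store` and `titles_in_range` are hypothetical.

```rust
use std::collections::BTreeMap;

/// Builds a small map, inserting the keys out of order on purpose.
fn sample_store() -> BTreeMap<u64, String> {
    let mut store = BTreeMap::new();
    for id in [25, 10, 15, 5, 20] {
        store.insert(id, format!("ticket-{id}"));
    }
    store
}

/// Returns the values whose key lies in `from..=to`, in key order.
fn titles_in_range(store: &BTreeMap<u64, String>, from: u64, to: u64) -> Vec<String> {
    store
        .range(from..=to)
        .map(|(_id, title)| title.clone())
        .collect()
}
```

Iterating over `sample_store()` visits the keys in ascending order, no matter the insertion order, and `range` only walks the entries that fall inside the requested interval.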
Just like `HashMap`, you won't find trait bounds on the definition of `BTreeMap`.
Just like `HashMap`, you won't find trait bounds on the definition of `BTreeMap`.
But you'll find trait bounds on its methods. Let's look at `insert`:
```rust
@ -34,7 +34,7 @@ impl<K, V> BTreeMap<K, V> {
## `Ord`
The `Ord` trait is used to compare values.
The `Ord` trait is used to compare values.\
While `PartialEq` is used to compare for equality, `Ord` is used to compare for ordering.
It's defined in `std::cmp`:
@ -45,8 +45,8 @@ pub trait Ord: Eq + PartialOrd {
}
```
The `cmp` method returns an `Ordering` enum, which can be one
of `Less`, `Equal`, or `Greater`.
The `cmp` method returns an `Ordering` enum, which can be one
of `Less`, `Equal`, or `Greater`.\
`Ord` requires that two other traits are implemented: `Eq` and `PartialOrd`.
## `PartialOrd`
@ -60,9 +60,9 @@ pub trait PartialOrd: PartialEq {
}
```
`PartialOrd::partial_cmp` returns an `Option`—it is not guaranteed that two values can
be compared.
For example, `f32` doesn't implement `Ord` because `NaN` values are not comparable,
`PartialOrd::partial_cmp` returns an `Option`—it is not guaranteed that two values can
be compared.\
For example, `f32` doesn't implement `Ord` because `NaN` values are not comparable,
the same reason why `f32` doesn't implement `Eq`.
## Implementing `Ord` and `PartialOrd`
@ -79,4 +79,4 @@ struct TicketId(u64);
If you choose (or need) to implement them manually, be careful:
- `Ord` and `PartialOrd` must be consistent with `Eq` and `PartialEq`.
- `Ord` and `PartialOrd` must be consistent with each other.
- `Ord` and `PartialOrd` must be consistent with each other.

View file

@ -1,10 +1,10 @@
# Intro
One of Rust's big promises is *fearless concurrency*: making it easier to write safe, concurrent programs.
We haven't seen much of that yet. All the work we've done so far has been single-threaded.
One of Rust's big promises is _fearless concurrency_: making it easier to write safe, concurrent programs.
We haven't seen much of that yet. All the work we've done so far has been single-threaded.
Time to change that!
In this chapter we'll make our ticket store multithreaded.
In this chapter we'll make our ticket store multithreaded.\
We'll have the opportunity to touch most of Rust's core concurrency features, including:
- Threads, using the `std::thread` module
@ -12,4 +12,4 @@ We'll have the opportunity to touch most of Rust's core concurrency features, in
- Shared state, using `Arc`, `Mutex` and `RwLock`
- `Send` and `Sync`, the traits that encode Rust's concurrency guarantees
We'll also discuss various design patterns for multithreaded systems and some of their trade-offs.
We'll also discuss various design patterns for multithreaded systems and some of their trade-offs.

View file

@ -1,26 +1,26 @@
# Threads
Before we start writing multithreaded code, let's take a step back and talk about what threads are
Before we start writing multithreaded code, let's take a step back and talk about what threads are
and why we might want to use them.
## What is a thread?
A **thread** is an execution context managed by the underlying operating system.
A **thread** is an execution context managed by the underlying operating system.\
Each thread has its own stack and instruction pointer (also known as the program counter).
A single **process** can manage multiple threads.
These threads share the same memory space, which means they can access the same data.
Threads are a **logical** construct. In the end, you can only run one set of instructions
at a time on a CPU core, the **physical** execution unit.
Threads are a **logical** construct. In the end, you can only run one set of instructions
at a time on a CPU core, the **physical** execution unit.\
Since there can be many more threads than there are CPU cores, the operating system's
**scheduler** is in charge of deciding which thread to run at any given time,
partitioning CPU time among them to maximize throughput and responsiveness.
## `main`
When a Rust program starts, it runs on a single thread, the **main thread**.
This thread is created by the operating system and is responsible for running the `main`
When a Rust program starts, it runs on a single thread, the **main thread**.\
This thread is created by the operating system and is responsible for running the `main`
function.
```rust
@ -37,8 +37,8 @@ fn main() {
## `std::thread`
Rust's standard library provides a module, `std::thread`, that allows you to create
and manage threads.
Rust's standard library provides a module, `std::thread`, that allows you to create
and manage threads.
### `spawn`
@ -66,12 +66,12 @@ fn main() {
```
If you execute this program on the [Rust playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=afedf7062298ca8f5a248bc551062eaa)
you'll see that the main thread and the spawned thread run concurrently.
you'll see that the main thread and the spawned thread run concurrently.\
Each thread makes progress independently of the other.
### Process termination
When the main thread finishes, the overall process will exit.
When the main thread finishes, the overall process will exit.\
A spawned thread will continue running until it finishes or the main thread finishes.
```rust
@ -90,7 +90,7 @@ fn main() {
}
```
In the example above, you can expect to see the message "Hello from a thread!" printed roughly five times.
In the example above, you can expect to see the message "Hello from a thread!" printed roughly five times.\
Then the main thread will finish (when the `sleep` call returns), and the spawned thread will be terminated
since the overall process exits.
@ -109,7 +109,7 @@ fn main() {
}
```
In this example, the main thread will wait for the spawned thread to finish before exiting.
This introduces a form of **synchronization** between the two threads: you're guaranteed to see the message
In this example, the main thread will wait for the spawned thread to finish before exiting.\
This introduces a form of **synchronization** between the two threads: you're guaranteed to see the message
"Hello from a thread!" printed before the program exits, because the main thread won't exit
until the spawned thread has finished.

View file

@ -20,12 +20,12 @@ error[E0597]: `v` does not live long enough
`argument requires that v is borrowed for 'static`, what does that mean?
The `'static` lifetime is a special lifetime in Rust.
The `'static` lifetime is a special lifetime in Rust.\
It means that the value will be valid for the entire duration of the program.
## Detached threads
A thread launched via `thread::spawn` can **outlive** the thread that spawned it.
A thread launched via `thread::spawn` can **outlive** the thread that spawned it.\
For example:
```rust
@ -43,11 +43,11 @@ fn f() {
}
```
In this example, the first spawned thread will in turn spawn
a child thread that prints a message every second.
In this example, the first spawned thread will in turn spawn
a child thread that prints a message every second.\
The first thread will then finish and exit. When that happens,
its child thread will **continue running** for as long as the
overall process is running.
its child thread will **continue running** for as long as the
overall process is running.\
In Rust's lingo, we say that the child thread has **outlived**
its parent.
@ -59,7 +59,7 @@ Since a spawned thread can:
- run until the program exits
it must not borrow any values that might be dropped before the program exits;
violating this constraint would expose us to a use-after-free bug.
violating this constraint would expose us to a use-after-free bug.\
That's why `std::thread::spawn`'s signature requires that the closure passed to it
has the `'static` lifetime:
@ -77,9 +77,9 @@ where
All values in Rust have a lifetime, not just references.
In particular, a type that owns its data (like a `Vec` or a `String`)
In particular, a type that owns its data (like a `Vec` or a `String`)
satisfies the `'static` constraint: if you own it, you can keep working with it
for as long as you want, even after the function that originally created it
for as long as you want, even after the function that originally created it
has returned.
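As a sketch of what this means in practice, an owned `Vec` can be moved into a spawned thread without any lifetime complaints. The function name `sum_in_thread` is made up for the example.

```rust
use std::thread;

/// Moves an owned `Vec` into a spawned thread and returns the sum
/// computed there.
fn sum_in_thread(numbers: Vec<i32>) -> i32 {
    // `move` transfers ownership of `numbers` into the closure.
    // The closure borrows nothing from this function's stack frame,
    // so it satisfies the `'static` bound required by `thread::spawn`.
    let handle = thread::spawn(move || numbers.iter().sum::<i32>());
    handle.join().expect("the spawned thread panicked")
}
```

Dropping the `move` keyword would make the closure borrow `numbers`, and the compiler would reject the code with an error much like the one shown at the top of this section.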
You can thus interpret `'static` as a way to say:
@ -104,9 +104,9 @@ The most common case is a reference to **static data**, such as string literals:
let s: &'static str = "Hello world!";
```
Since string literals are known at compile-time, Rust stores them *inside* your executable,
in a region known as the **read-only data segment**.
All references pointing to that region will therefore be valid for as long as
Since string literals are known at compile-time, Rust stores them _inside_ your executable,
in a region known as the **read-only data segment**.
All references pointing to that region will therefore be valid for as long as
the program runs; they satisfy the `'static` contract.
## Further reading

View file

@ -1,9 +1,9 @@
# Leaking data
The main concern around passing references to spawned threads is use-after-free bugs:
accessing data using a pointer to a memory region that's already been freed/de-allocated.
The main concern around passing references to spawned threads is use-after-free bugs:
accessing data using a pointer to a memory region that's already been freed/de-allocated.\
If you're working with heap-allocated data, you can avoid the issue by
telling Rust that you'll never reclaim that memory: you choose to **leak memory**,
telling Rust that you'll never reclaim that memory: you choose to **leak memory**,
intentionally.
This can be done, for example, using the `Box::leak` method from Rust's standard library:
@ -19,7 +19,7 @@ let static_ref: &'static mut u32 = Box::leak(x);
## Data leakage is process-scoped
Leaking data is dangerous: if you keep leaking memory, you'll eventually
run out and crash with an out-of-memory error.
run out and crash with an out-of-memory error.
```rust
// If you leave this running for a while,
@ -32,14 +32,14 @@ fn oom_trigger() {
}
```
At the same time, memory leaked via `Box::leak` is not truly forgotten.
At the same time, memory leaked via `Box::leak` is not truly forgotten.\
The operating system can map each memory region to the process responsible for it.
When the process exits, the operating system will reclaim that memory.
Keeping this in mind, it can be OK to leak memory when:
- The amount of memory you need to leak is bounded/known upfront, or
- Your process is short-lived and you're confident you won't exhaust
- Your process is short-lived and you're confident you won't exhaust
all the available memory before it exits
"Let the OS deal with it" is a perfectly valid memory management strategy

View file

@ -1,7 +1,7 @@
# Scoped threads
All the lifetime issues we discussed so far have a common source:
the spawned thread can outlive its parent.
All the lifetime issues we discussed so far have a common source:
the spawned thread can outlive its parent.\
We can sidestep this issue by using **scoped threads**.
```rust
@ -26,16 +26,16 @@ Let's unpack what's happening.
## `scope`
The `std::thread::scope` function creates a new **scope**.
`std::thread::scope` takes as input a closure, with a single argument: a `Scope` instance.
The `std::thread::scope` function creates a new **scope**.\
`std::thread::scope` takes as input a closure, with a single argument: a `Scope` instance.
## Scoped spawns
`Scope` exposes a `spawn` method.
Unlike `std::thread::spawn`, all threads spawned using a `Scope` will be
**automatically joined** when the scope ends.
`Scope` exposes a `spawn` method.\
Unlike `std::thread::spawn`, all threads spawned using a `Scope` will be
**automatically joined** when the scope ends.
If we were to "translate" the previous example to `std::thread::spawn`,
If we were to "translate" the previous example to `std::thread::spawn`,
it'd look like this:
```rust
@ -61,13 +61,13 @@ println!("Here's v: {v:?}");
The translated example wouldn't compile, though: the compiler would complain
that `&v` can't be used from our spawned threads since its lifetime isn't
`'static`.
`'static`.
That's not an issue with `std::thread::scope`—you can **safely borrow from the environment**.
In our example, `v` is created before the spawning points.
It will only be dropped _after_ `scope` returns. At the same time,
all threads spawned inside `scope` are guaranteed to finish _before_ `scope` returns,
therefore there is no risk of having dangling references.
therefore there is no risk of having dangling references.
The compiler won't complain!

View file

@ -1,18 +1,18 @@
# Channels
All our spawned threads have been fairly short-lived so far.
All our spawned threads have been fairly short-lived so far.\
Get some input, run a computation, return the result, shut down.
For our ticket management system, we want to do something different:
For our ticket management system, we want to do something different:
a client-server architecture.
We will have **one long-running server thread**, responsible for managing
our state, the stored tickets.
We will have **one long-running server thread**, responsible for managing
our state, the stored tickets.
We will then have **multiple client threads**.
Each client will be able to send **commands** and **queries** to
the stateful thread, in order to change its state (e.g. add a new ticket)
or retrieve information (e.g. get the status of a ticket).
We will then have **multiple client threads**.\
Each client will be able to send **commands** and **queries** to
the stateful thread, in order to change its state (e.g. add a new ticket)
or retrieve information (e.g. get the status of a ticket).\
Client threads will run concurrently.
## Communication
@ -22,16 +22,16 @@ So far we've only had very limited parent-child communication:
- The spawned thread borrowed/consumed data from the parent context
- The spawned thread returned data to the parent when joined
This isn't enough for a client-server design.
Clients need to be able to send and receive data from the server thread
_after_ it has been launched.
This isn't enough for a client-server design.\
Clients need to be able to send and receive data from the server thread
_after_ it has been launched.
We can solve the issue using **channels**.
## Channels
Rust's standard library provides **multi-producer, single-consumer** (mpsc) channels
in its `std::sync::mpsc` module.
in its `std::sync::mpsc` module.\
There are two channel flavours: bounded and unbounded. We'll stick to the unbounded
version for now, but we'll discuss the pros and cons later on.
@ -43,8 +43,8 @@ use std::sync::mpsc::channel;
let (sender, receiver) = channel();
```
You get a sender and a receiver.
You call `send` on the sender to push data into the channel.
You get a sender and a receiver.\
You call `send` on the sender to push data into the channel.\
You call `recv` on the receiver to pull data from the channel.
### Multiple senders
@ -53,21 +53,21 @@ You call `recv` on the receiver to pull data from the channel.
each client thread) and they will all push data into the same channel.
`Receiver`, instead, is not cloneable: there can only be a single receiver
for a given channel.
for a given channel.
That's what **mpsc** (multi-producer single-consumer) stands for!
### Message type
Both `Sender` and `Receiver` are generic over a type parameter `T`.
Both `Sender` and `Receiver` are generic over a type parameter `T`.\
That's the type of the _messages_ that can travel on our channel.
It could be a `u64`, a struct, an enum, etc.
### Errors
Both `send` and `recv` can fail.
`send` returns an error if the receiver has been dropped.
Both `send` and `recv` can fail.\
`send` returns an error if the receiver has been dropped.\
`recv` returns an error if all senders have been dropped and the channel is empty.
In other words, `send` and `recv` error when the channel is effectively closed.
In other words, `send` and `recv` error when the channel is effectively closed.
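Both failure modes can be observed in a short sketch, shown here on a single thread for simplicity. The two helper functions are hypothetical names used for this example.

```rust
use std::sync::mpsc::channel;

/// `send` fails once the receiver is gone.
fn send_after_receiver_dropped() -> bool {
    let (sender, receiver) = channel::<u64>();
    drop(receiver);
    sender.send(1).is_err()
}

/// `recv` fails once every sender is gone *and* the channel is empty.
fn recv_after_senders_dropped() -> (Option<u64>, bool) {
    let (sender, receiver) = channel::<u64>();
    let second_sender = sender.clone();
    sender.send(1).unwrap();
    drop(sender);
    drop(second_sender);

    // The message sent before the senders were dropped is still delivered.
    let first = receiver.recv().ok();
    // Now the channel is empty and has no senders left: `recv` errors.
    let closed = receiver.recv().is_err();
    (first, closed)
}
```

Note that a single surviving clone of the `Sender` is enough to keep the channel open: in that case `recv` blocks, waiting for a message, instead of returning an error.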

View file

@ -10,13 +10,13 @@ impl<T> Sender<T> {
}
```
`send` takes `&self` as its argument.
`send` takes `&self` as its argument.\
But it's clearly causing a mutation: it's adding a new message to the channel.
What's even more interesting is that `Sender` is cloneable: we can have multiple instances of `Sender`
trying to modify the channel state **at the same time**, from different threads.
That's the key property we are using to build this client-server architecture. But why does it work?
Doesn't it violate Rust's rules about borrowing? How are we performing mutations via an _immutable_ reference?
Doesn't it violate Rust's rules about borrowing? How are we performing mutations via an _immutable_ reference?
## Shared rather than immutable references
@ -31,31 +31,31 @@ It would have been more accurate to name them:
- exclusive references (`&mut T`)
Immutable/mutable is a mental model that works for the vast majority of cases, and it's a great one to get started
with Rust. But it's not the whole story, as you've just seen: `&T` doesn't actually guarantee that the data it
points to is immutable.
Don't worry, though: Rust is still keeping its promises.
with Rust. But it's not the whole story, as you've just seen: `&T` doesn't actually guarantee that the data it
points to is immutable.\
Don't worry, though: Rust is still keeping its promises.
It's just that the terms are a bit more nuanced than they might seem at first.
## `UnsafeCell`
Whenever a type allows you to mutate data through a shared reference, you're dealing with **interior mutability**.
Whenever a type allows you to mutate data through a shared reference, you're dealing with **interior mutability**.
By default, the Rust compiler assumes that shared references are immutable. It **optimises your code** based on that assumption.
By default, the Rust compiler assumes that shared references are immutable. It **optimises your code** based on that assumption.\
The compiler can reorder operations, cache values, and do all sorts of magic to make your code faster.
You can tell the compiler "No, this shared reference is actually mutable" by wrapping the data in an `UnsafeCell`.
You can tell the compiler "No, this shared reference is actually mutable" by wrapping the data in an `UnsafeCell`.\
Every time you see a type that allows interior mutability, you can be certain that `UnsafeCell` is involved,
either directly or indirectly.
either directly or indirectly.\
Using `UnsafeCell`, raw pointers and `unsafe` code, you can mutate data through shared references.
Let's be clear, though: `UnsafeCell` isn't a magic wand that allows you to ignore the borrow-checker!
`unsafe` code is still subject to Rust's rules about borrowing and aliasing.
It's an (advanced) tool that you can leverage to build **safe abstractions** whose safety can't be directly expressed
in Rust's type system. Whenever you use the `unsafe` keyword you're telling the compiler:
Let's be clear, though: `UnsafeCell` isn't a magic wand that allows you to ignore the borrow-checker!\
`unsafe` code is still subject to Rust's rules about borrowing and aliasing.
It's an (advanced) tool that you can leverage to build **safe abstractions** whose safety can't be directly expressed
in Rust's type system. Whenever you use the `unsafe` keyword you're telling the compiler:
"I know what I'm doing, I won't violate your invariants, trust me."
Every time you call an `unsafe` function, there will be documentation explaining its **safety preconditions**:
under what circumstances it's safe to execute its `unsafe` block. You can find the ones for `UnsafeCell`
Every time you call an `unsafe` function, there will be documentation explaining its **safety preconditions**:
under what circumstances it's safe to execute its `unsafe` block. You can find the ones for `UnsafeCell`
[in `std`'s documentation](https://doc.rust-lang.org/std/cell/struct.UnsafeCell.html).
We won't be using `UnsafeCell` directly in this course, nor will we be writing `unsafe` code.
@ -64,15 +64,15 @@ every day in Rust.
## Key examples
Let's go through a couple of important `std` types that leverage interior mutability.
These are types that you'll encounter somewhat often in Rust code, especially if you peek under the hood of
Let's go through a couple of important `std` types that leverage interior mutability.\
These are types that you'll encounter somewhat often in Rust code, especially if you peek under the hood of
some of the libraries you use.
### Reference counting
`Rc` is a reference-counted pointer.
It wraps around a value and keeps track of how many references to the value exist.
When the last reference is dropped, the value is deallocated.
`Rc` is a reference-counted pointer.\
It wraps around a value and keeps track of how many references to the value exist.
When the last reference is dropped, the value is deallocated.\
The value wrapped in an `Rc` is immutable: you can only get shared references to it.
```rust
@ -91,7 +91,7 @@ assert_eq!(Rc::strong_count(&b), 2);
// and share the same reference counter.
```
`Rc` uses `UnsafeCell` internally to allow shared references to increment and decrement the reference count.
`Rc` uses `UnsafeCell` internally to allow shared references to increment and decrement the reference count.
### `RefCell`
@ -99,7 +99,7 @@ assert_eq!(Rc::strong_count(&b), 2);
It allows you to mutate the value wrapped in a `RefCell` even if you only have an
immutable reference to the `RefCell` itself.
This is done via **runtime borrow checking**.
This is done via **runtime borrow checking**.
The `RefCell` keeps track of the number (and type) of references to the value it contains at runtime.
If you try to borrow the value mutably while it's already borrowed immutably,
the program will panic, ensuring that Rust's borrowing rules are always enforced.
@ -111,4 +111,4 @@ let x = RefCell::new(42);
let y = x.borrow(); // Immutable borrow
let z = x.borrow_mut(); // Panics! There is an active immutable borrow.
```
```
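If a panic isn't acceptable, `RefCell` also offers fallible variants, `try_borrow` and `try_borrow_mut`, which return a `Result` instead. A minimal sketch, with `refcell_demo` being a made-up helper:

```rust
use std::cell::RefCell;

/// Shows a rejected mutable borrow, followed by a successful one.
fn refcell_demo() -> (bool, i32) {
    let x = RefCell::new(42);

    let reader = x.borrow(); // Immutable borrow
    // A mutable borrow is refused while `reader` is alive,
    // but `try_borrow_mut` reports it as an `Err` rather than panicking.
    let conflict = x.try_borrow_mut().is_err();
    drop(reader);

    // No outstanding borrows: mutating through `&x` now succeeds.
    *x.borrow_mut() += 1;
    let value = *x.borrow();
    (conflict, value)
}
```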

View file

@ -1,6 +1,6 @@
# Two-way communication
In our current client-server implementation, communication flows in one direction: from the client to the server.
In our current client-server implementation, communication flows in one direction: from the client to the server.\
The client has no way of knowing if the server received the message, executed it successfully, or failed.
That's not ideal.
@ -8,9 +8,9 @@ To solve this issue, we can introduce a two-way communication system.
## Response channel
We need a way for the server to send a response back to the client.
We need a way for the server to send a response back to the client.\
There are various ways to do this, but the simplest option is to include a `Sender` channel in
the message that the client sends to the server. After processing the message, the server can use
this channel to send a response back to the client.
This is a fairly common pattern in Rust applications built on top of message-passing primitives.
This is a fairly common pattern in Rust applications built on top of message-passing primitives.
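
Here is a minimal sketch of the pattern using `std::sync::mpsc`. The `Command` enum, its `Double` variant and the helper functions are hypothetical names chosen for illustration; they are not the types used in the exercises.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread;

// Hypothetical command type: each message carries its own reply channel.
enum Command {
    Double {
        value: u32,
        // The server sends its answer through this channel.
        response_sender: Sender<u32>,
    },
}

// Spawns the server thread and returns the handle clients use to reach it.
fn spawn_server() -> Sender<Command> {
    let (command_sender, command_receiver) = channel();
    thread::spawn(move || {
        // The loop ends once every `Sender<Command>` has been dropped.
        for command in command_receiver {
            match command {
                Command::Double { value, response_sender } => {
                    // Ignore the error: the client may have stopped listening.
                    let _ = response_sender.send(value * 2);
                }
            }
        }
    });
    command_sender
}

// Client side: create a one-off response channel, send it along, wait for the reply.
fn double_via_server(server: &Sender<Command>, value: u32) -> u32 {
    let (response_sender, response_receiver) = channel();
    server
        .send(Command::Double { value, response_sender })
        .expect("the server thread is no longer running");
    response_receiver
        .recv()
        .expect("the server dropped the response channel")
}
```

Each request gets a fresh response channel, so replies can never be delivered to the wrong client.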

View file

@ -1,8 +1,8 @@
# A dedicated `Client` type
All the interactions from the client side have been fairly low-level: you have to
All the interactions from the client side have been fairly low-level: you have to
manually create a response channel, build the command, send it to the server, and
then call `recv` on the response channel to get the response.
then call `recv` on the response channel to get the response.
This is a lot of boilerplate code that could be abstracted away, and that's
exactly what we're going to do in this exercise.
This is a lot of boilerplate code that could be abstracted away, and that's
exactly what we're going to do in this exercise.

View file

@ -1,18 +1,18 @@
# Bounded vs unbounded channels
So far we've been using unbounded channels.
You can send as many messages as you want, and the channel will grow to accommodate them.
In a multi-producer single-consumer scenario, this can be problematic: if the producers
So far we've been using unbounded channels.\
You can send as many messages as you want, and the channel will grow to accommodate them.\
In a multi-producer single-consumer scenario, this can be problematic: if the producers
enqueue messages at a faster rate than the consumer can process them, the channel will
keep growing, potentially consuming all available memory.
Our recommendation is to **never** use an unbounded channel in a production system.
You should always enforce an upper limit on the number of messages that can be enqueued using a
Our recommendation is to **never** use an unbounded channel in a production system.\
You should always enforce an upper limit on the number of messages that can be enqueued using a
**bounded channel**.
## Bounded channels
A bounded channel has a fixed capacity.
A bounded channel has a fixed capacity.\
You can create one by calling `sync_channel` with a capacity greater than zero:
```rust
@ -21,23 +21,23 @@ use std::sync::mpsc::sync_channel;
let (sender, receiver) = sync_channel(10);
```
`receiver` has the same type as before, `Receiver<T>`.
`sender`, instead, is an instance of `SyncSender<T>`.
`receiver` has the same type as before, `Receiver<T>`.\
`sender`, instead, is an instance of `SyncSender<T>`.
### Sending messages
You have two different methods to send messages through a `SyncSender`:
- `send`: if there is space in the channel, it will enqueue the message and return `Ok(())`.
- `send`: if there is space in the channel, it will enqueue the message and return `Ok(())`.\
If the channel is full, it will block and wait until there is space available.
- `try_send`: if there is space in the channel, it will enqueue the message and return `Ok(())`.
- `try_send`: if there is space in the channel, it will enqueue the message and return `Ok(())`.\
If the channel is full, it will return `Err(TrySendError::Full(value))`, where `value` is the message that couldn't be sent.
Depending on your use case, you might want to use one or the other.
Depending on your use case, you might want to use one or the other.
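
To make the difference concrete, the sketch below keeps calling `try_send` on a channel that nobody is draining and counts how many messages are accepted before `TrySendError::Full` is returned. The function name `fill_until_full` is made up for this example.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

// Returns how many messages were accepted before the channel reported it was full.
fn fill_until_full(capacity: usize) -> usize {
    // Keep the receiver alive, otherwise `try_send` would report `Disconnected`.
    let (sender, _receiver) = sync_channel::<u32>(capacity);
    let mut accepted = 0;
    loop {
        match sender.try_send(0) {
            Ok(()) => accepted += 1,
            // The rejected message is handed back to the caller inside the error.
            Err(TrySendError::Full(_rejected)) => return accepted,
            Err(TrySendError::Disconnected(_)) => unreachable!("receiver is still alive"),
        }
    }
}
```

Replacing `try_send` with `send` in this loop would block forever once the buffer is full, since no one ever calls `recv`.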
### Backpressure
The main advantage of using bounded channels is that they provide a form of **backpressure**.
The main advantage of using bounded channels is that they provide a form of **backpressure**.\
They force the producers to slow down if the consumer can't keep up.
The backpressure can then propagate through the system, potentially affecting the whole architecture and
preventing end users from overwhelming the system with requests.
preventing end users from overwhelming the system with requests.

View file

@ -1,16 +1,16 @@
# Update operations
So far we've implemented only insertion and retrieval operations.
So far we've implemented only insertion and retrieval operations.\
Let's see how we can expand the system to provide an update operation.
## Legacy updates
In the non-threaded version of the system, updates were fairly straightforward: `TicketStore` exposed a
In the non-threaded version of the system, updates were fairly straightforward: `TicketStore` exposed a
`get_mut` method that allowed the caller to obtain a mutable reference to a ticket, and then modify it.
## Multithreaded updates
The same strategy won't work in the current multi-threaded version,
The same strategy won't work in the current multi-threaded version,
because the mutable reference would have to be sent over a channel. The borrow checker would
stop us, because `&mut Ticket` doesn't satisfy the `'static` lifetime requirement of `SyncSender::send`.
@ -18,7 +18,7 @@ There are a few ways to work around this limitation. We'll explore a few of them
### Patching
We can't send a `&mut Ticket` over a channel, therefore we can't mutate on the client-side.
We can't send a `&mut Ticket` over a channel, therefore we can't mutate on the client-side.\
Can we mutate on the server-side?
We can, if we tell the server what needs to be changed. In other words, if we send a **patch** to the server:
@ -32,8 +32,8 @@ struct TicketPatch {
}
```
The `id` field is mandatory, since it's required to identify the ticket that needs to be updated.
The `id` field is mandatory, since it's required to identify the ticket that needs to be updated.\
All other fields are optional:
- If a field is `None`, it means that the field should not be changed.
- If a field is `Some(value)`, it means that the field should be changed to `value`.
- If a field is `Some(value)`, it means that the field should be changed to `value`.
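
Applying a patch on the server side could look roughly like the sketch below. The types are simplified stand-ins (plain `String` fields, a hypothetical `apply_patch` function) and are assumptions made for this example, not the definitions used in the exercise.

```rust
// Simplified, hypothetical types: the course's `Ticket` has richer field types.
#[derive(Debug, Clone, PartialEq)]
struct Ticket {
    id: u64,
    title: String,
    description: String,
}

struct TicketPatch {
    id: u64,
    title: Option<String>,
    description: Option<String>,
}

// Overwrites only the fields for which the patch carries a value.
fn apply_patch(ticket: &mut Ticket, patch: TicketPatch) {
    debug_assert_eq!(ticket.id, patch.id);
    if let Some(title) = patch.title {
        ticket.title = title;
    }
    if let Some(description) = patch.description {
        ticket.description = description;
    }
}
```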

View file

@ -1,39 +1,39 @@
# Locks, `Send` and `Arc`
The patching strategy you just implemented has a major drawback: it's racy.
The patching strategy you just implemented has a major drawback: it's racy.\
If two clients send patches for the same ticket at roughly the same time, the server will apply them in an arbitrary order.
Whoever enqueues their patch last will overwrite the changes made by the other client.
## Version numbers
We could try to fix this by using a **version number**.
Each ticket gets assigned a version number upon creation, set to `0`.
Whenever a client sends a patch, they must include the current version number of the ticket alongside the
We could try to fix this by using a **version number**.\
Each ticket gets assigned a version number upon creation, set to `0`.\
Whenever a client sends a patch, they must include the current version number of the ticket alongside the
desired changes. The server will only apply the patch if the version number matches the one it has stored.
In the scenario described above, the server would reject the second patch, because the version number would
In the scenario described above, the server would reject the second patch, because the version number would
have been incremented by the first patch and thus wouldn't match the one sent by the second client.
This approach is fairly common in distributed systems (e.g. when clients and servers don't share memory),
and it is known as **optimistic concurrency control**.
and it is known as **optimistic concurrency control**.\
The idea is that most of the time, conflicts won't happen, so we can optimize for the common case.
You know enough about Rust by now to implement this strategy on your own as a bonus exercise, if you want to.
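
If you'd like a starting point, here is one possible shape for the version check. All names (`VersionedTicket`, `VersionedPatch`, `VersionMismatch`, `apply_versioned`) are hypothetical and chosen only for this sketch.

```rust
// Hypothetical, stripped-down types used only to illustrate the version check.
struct VersionedTicket {
    version: u64,
    title: String,
}

struct VersionedPatch {
    // The version the client saw when it last read the ticket.
    expected_version: u64,
    title: String,
}

#[derive(Debug, PartialEq)]
struct VersionMismatch {
    // The version currently stored by the server.
    current: u64,
}

// Applies the patch only if the client based it on the latest version.
fn apply_versioned(
    ticket: &mut VersionedTicket,
    patch: VersionedPatch,
) -> Result<(), VersionMismatch> {
    if ticket.version != patch.expected_version {
        return Err(VersionMismatch { current: ticket.version });
    }
    ticket.title = patch.title;
    // Bump the version so that patches based on the old one get rejected.
    ticket.version += 1;
    Ok(())
}
```

A client whose patch is rejected can read the ticket again, reapply its changes on top of the new version, and retry.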
## Locking
We can also fix the race condition by introducing a **lock**.
We can also fix the race condition by introducing a **lock**.\
Whenever a client wants to update a ticket, they must first acquire a lock on it. While the lock is active,
no other client can modify the ticket.
Rust's standard library provides two different locking primitives: `Mutex<T>` and `RwLock<T>`.
Rust's standard library provides two different locking primitives: `Mutex<T>` and `RwLock<T>`.\
Let's start with `Mutex<T>`. It stands for **mut**ual **ex**clusion, and it's the simplest kind of lock:
it allows only one thread to access the data, no matter if it's for reading or writing.
`Mutex<T>` wraps the data it protects, and it's therefore generic over the type of the data.
You can't access the data directly: the type system forces you to acquire a lock first using either `Mutex::lock` or
`Mutex<T>` wraps the data it protects, and it's therefore generic over the type of the data.\
You can't access the data directly: the type system forces you to acquire a lock first using either `Mutex::lock` or
`Mutex::try_lock`. The former blocks until the lock is acquired, the latter returns immediately with an error if the lock
can't be acquired.
Both methods return a guard object that dereferences to the data, allowing you to modify it. The lock is released when
can't be acquired.\
Both methods return a guard object that dereferences to the data, allowing you to modify it. The lock is released when
the guard is dropped.
```rust
@ -57,10 +57,10 @@ drop(guard)
## Locking granularity
What should our `Mutex` wrap?
The simplest option would be to wrap the entire `TicketStore` in a single `Mutex`.
What should our `Mutex` wrap?\
The simplest option would be to wrap the entire `TicketStore` in a single `Mutex`.\
This would work, but it would severely limit the system's performance: you wouldn't be able to read tickets in parallel,
because every read would have to wait for the lock to be released.
because every read would have to wait for the lock to be released.\
This is known as **coarse-grained locking**.
It would be better to use **fine-grained locking**, where each ticket is protected by its own lock.
@ -73,17 +73,17 @@ struct TicketStore {
}
```
This approach is more efficient, but it has a downside: `TicketStore` has to become **aware** of the multithreaded
nature of the system; up until now, `TicketStore` has blissfully ignored the existence of threads.
This approach is more efficient, but it has a downside: `TicketStore` has to become **aware** of the multithreaded
nature of the system; up until now, `TicketStore` has blissfully ignored the existence of threads.\
Let's go for it anyway.
## Who holds the lock?
For the whole scheme to work, the lock must be passed to the client that wants to modify the ticket.
For the whole scheme to work, the lock must be passed to the client that wants to modify the ticket.\
The client can then directly modify the ticket (as if they had a `&mut Ticket`) and release the lock when they're done.
This is a bit tricky.
We can't send a `Mutex<Ticket>` over a channel, because `Mutex` is not `Clone` and
This is a bit tricky.\
We can't send a `Mutex<Ticket>` over a channel, because `Mutex` is not `Clone` and
we can't move it out of the `TicketStore`. Could we send the `MutexGuard` instead?
Let's test the idea with a small example:
@ -131,22 +131,22 @@ note: required because it's used within this closure
## `Send`
`Send` is a marker trait that indicates that a type can be safely transferred from one thread to another.
`Send` is a marker trait that indicates that a type can be safely transferred from one thread to another.\
`Send` is also an auto-trait, just like `Sized`; it's automatically implemented (or not implemented) for your type
by the compiler, based on its definition.
You can also implement `Send` manually for your types, but it requires `unsafe` since you have to guarantee that the
by the compiler, based on its definition.\
You can also implement `Send` manually for your types, but it requires `unsafe` since you have to guarantee that the
type is indeed safe to send between threads for reasons that the compiler can't automatically verify.
### Channel requirements
`Sender<T>`, `SyncSender<T>` and `Receiver<T>` are `Send` if and only if `T` is `Send`.
`Sender<T>`, `SyncSender<T>` and `Receiver<T>` are `Send` if and only if `T` is `Send`.\
That's because they are used to send values between threads, and if the value itself is not `Send`, it would be
unsafe to send it between threads.
### `MutexGuard`
`MutexGuard` is not `Send` because the underlying operating system primitives that `Mutex` uses to implement
the lock require (on some platforms) that the lock must be released by the same thread that acquired it.
the lock require (on some platforms) that the lock must be released by the same thread that acquired it.\
If we were to send a `MutexGuard` to another thread, the lock would be released by a different thread, which would
lead to undefined behavior.
@ -154,13 +154,13 @@ lead to undefined behavior.
Summing it up:
- We can't send a `MutexGuard` over a channel. So we can't lock on the server-side and then modify the ticket on the
- We can't send a `MutexGuard` over a channel. So we can't lock on the server-side and then modify the ticket on the
client-side.
- We can send a `Mutex` over a channel because it's `Send` as long as the data it protects is `Send`, which is the
case for `Ticket`.
At the same time, we can't move the `Mutex` out of the `TicketStore` nor clone it.
- We can send a `Mutex` over a channel because it's `Send` as long as the data it protects is `Send`, which is the
case for `Ticket`.
At the same time, we can't move the `Mutex` out of the `TicketStore` nor clone it.
How can we solve this conundrum?
How can we solve this conundrum?\
We need to look at the problem from a different angle.
To lock a `Mutex`, we don't need an owned value. A shared reference is enough, since `Mutex` uses interior mutability:
@ -173,15 +173,15 @@ impl<T> Mutex<T> {
}
```
It is therefore enough to send a shared reference to the client.
We can't do that directly, though, because the reference would have to be `'static` and that's not the case.
It is therefore enough to send a shared reference to the client.\
We can't do that directly, though, because the reference would have to be `'static` and that's not the case.\
In a way, we need an "owned shared reference". It turns out that Rust has a type that fits the bill: `Arc`.
## `Arc` to the rescue
`Arc` stands for **atomic reference counting**.
`Arc` stands for **atomic reference counting**.\
`Arc` wraps around a value and keeps track of how many references to the value exist.
When the last reference is dropped, the value is deallocated.
When the last reference is dropped, the value is deallocated.\
The value wrapped in an `Arc` is immutable: you can only get shared references to it.
```rust
@ -196,9 +196,9 @@ let data_ref: &u32 = &data;
```
If you're having a déjà vu moment, you're right: `Arc` sounds very similar to `Rc`, the reference-counted pointer we
introduced when talking about interior mutability. The difference is thread-safety: `Rc` is not `Send`, while `Arc` is.
It boils down to the way the reference count is implemented: `Rc` uses a "normal" integer, while `Arc` uses an
**atomic** integer, which can be safely shared and modified across threads.
introduced when talking about interior mutability. The difference is thread-safety: `Rc` is not `Send`, while `Arc` is.
It boils down to the way the reference count is implemented: `Rc` uses a "normal" integer, while `Arc` uses an
**atomic** integer, which can be safely shared and modified across threads.
## `Arc<Mutex<T>>`
@ -209,9 +209,9 @@ If we pair `Arc` with `Mutex`, we finally get a type that:
- `Mutex` is `Send` if `T` is `Send`.
- `T` is `Ticket`, which is `Send`.
- Can be cloned, because `Arc` is `Clone` no matter what `T` is.
Cloning an `Arc` increments the reference count, the data is not copied.
Cloning an `Arc` increments the reference count, the data is not copied.
- Can be used to modify the data it wraps, because `Arc` lets you get a shared
reference to `Mutex<T>` which can in turn be used to acquire a lock.
reference to `Mutex<T>` which can in turn be used to acquire a lock.
We have all the pieces we need to implement the locking strategy for our ticket store.
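
As a quick illustration, separate from the ticket store, the sketch below shares a counter across several threads via `Arc<Mutex<usize>>`. The function name `increment_from_threads` is made up for this example.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `n_threads` threads, each incrementing the shared counter once.
fn increment_from_threads(n_threads: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..n_threads)
        .map(|_| {
            // Cloning the `Arc` bumps the reference count; the counter is not copied.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // A shared reference to the `Mutex` is enough to acquire the lock.
                let mut guard = counter.lock().unwrap();
                *guard += 1;
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

Every thread owns its own `Arc` handle, which is what satisfies the `'static` requirement of `thread::spawn`.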
@ -219,4 +219,4 @@ We have all the pieces we need to implement the locking strategy for our ticket
- We won't be covering the details of atomic operations in this course, but you can find more information
[in the `std` documentation](https://doc.rust-lang.org/std/sync/atomic/index.html) as well as in the
["Rust atomics and locks" book](https://marabos.nl/atomics/).
["Rust atomics and locks" book](https://marabos.nl/atomics/).

View file

@ -3,11 +3,11 @@
Our new `TicketStore` works, but its read performance is not great: there can only be one client at a time
reading a specific ticket, because `Mutex<T>` doesn't distinguish between readers and writers.
We can solve the issue by using a different locking primitive: `RwLock<T>`.
`RwLock<T>` stands for **read-write lock**. It allows **multiple readers** to access the data simultaneously,
but only one writer at a time.
We can solve the issue by using a different locking primitive: `RwLock<T>`.\
`RwLock<T>` stands for **read-write lock**. It allows **multiple readers** to access the data simultaneously,
but only one writer at a time.
`RwLock<T>` has two methods to acquire a lock: `read` and `write`.
`RwLock<T>` has two methods to acquire a lock: `read` and `write`.\
`read` returns a guard that allows you to read the data, while `write` returns a guard that allows you to modify it.
```rust
@ -26,20 +26,20 @@ let guard2 = lock.read().unwrap();
## Trade-offs
On the surface, `RwLock<T>` seems like a no-brainer: it provides a superset of the functionality of `Mutex<T>`.
On the surface, `RwLock<T>` seems like a no-brainer: it provides a superset of the functionality of `Mutex<T>`.
Why would you ever use `Mutex<T>` if you can use `RwLock<T>` instead?
There are two key reasons:
- Locking a `RwLock<T>` is more expensive than locking a `Mutex<T>`.
This is because `RwLock<T>` has to keep track of the number of active readers and writers, while `Mutex<T>`
- Locking a `RwLock<T>` is more expensive than locking a `Mutex<T>`.\
This is because `RwLock<T>` has to keep track of the number of active readers and writers, while `Mutex<T>`
only has to keep track of whether the lock is held or not.
This performance overhead is not an issue if there are more readers than writers, but if the workload
is write-heavy `Mutex<T>` might be a better choice.
- `RwLock<T>` can cause **writer starvation**.
If there are always readers waiting to acquire the lock, writers might never get a chance to run.
is write-heavy `Mutex<T>` might be a better choice.
- `RwLock<T>` can cause **writer starvation**.\
If there are always readers waiting to acquire the lock, writers might never get a chance to run.\
`RwLock<T>` doesn't provide any guarantees about the order in which readers and writers are granted access to the lock.
It depends on the policy implemented by the underlying OS, which might not be fair to writers.
In our case, we can expect the workload to be read-heavy (since most clients will be reading tickets, not modifying them),
so `RwLock<T>` is a good choice.
so `RwLock<T>` is a good choice.

View file

@ -1,30 +1,30 @@
# Design review
Let's take a moment to review the journey we've been through.
Let's take a moment to review the journey we've been through.
## Lockless with channel serialization
Our first implementation of a multithreaded ticket store used:
- a single long-lived thread (server), to hold the shared state
- multiple clients sending requests to it via channels from their own threads.
- multiple clients sending requests to it via channels from their own threads.
No locking of the state was necessary, since the server was the only one modifying the state. That's because
the "inbox" channel naturally **serialized** incoming requests: the server would process them one by one.
We've already discussed the limitations of this approach when it comes to patching behaviour, but we didn't
No locking of the state was necessary, since the server was the only one modifying the state. That's because
the "inbox" channel naturally **serialized** incoming requests: the server would process them one by one.\
We've already discussed the limitations of this approach when it comes to patching behaviour, but we didn't
discuss the performance implications of the original design: the server could only process one request at a time,
including reads.
## Fine-grained locking
We then moved to a more sophisticated design, where each ticket was protected by its own lock and
clients could independently decide if they wanted to read or atomically modify a ticket, acquiring the appropriate lock.
clients could independently decide if they wanted to read or atomically modify a ticket, acquiring the appropriate lock.
This design allows for better parallelism (i.e. multiple clients can read tickets at the same time), but it is
This design allows for better parallelism (i.e. multiple clients can read tickets at the same time), but it is
still fundamentally **serial**: the server processes commands one by one. In particular, it hands out locks to clients
one by one.
Could we remove the channels entirely and allow clients to directly access the `TicketStore`, relying exclusively on
Could we remove the channels entirely and allow clients to directly access the `TicketStore`, relying exclusively on
locks to synchronize access?
## Removing channels
@ -37,18 +37,18 @@ We have two problems to solve:
### Sharing `TicketStore` across threads
We want all threads to refer to the same state, otherwise we don't really have a multithreaded system—we're just
running multiple single-threaded systems in parallel.
running multiple single-threaded systems in parallel.\
We've already encountered this problem when we tried to share a lock across threads: we can use an `Arc`.
### Synchronizing access to the store
There is one interaction that's still lockless thanks to the serialization provided by the channels: inserting
(or removing) a ticket from the store.
(or removing) a ticket from the store.\
If we remove the channels, we need to introduce (another) lock to synchronize access to the `TicketStore` itself.
If we use a `Mutex`, then it makes no sense to use an additional `RwLock` for each ticket: the `Mutex` will
already serialize access to the entire store, so we wouldn't be able to read tickets in parallel anyway.
already serialize access to the entire store, so we wouldn't be able to read tickets in parallel anyway.\
If we use a `RwLock`, instead, we can read tickets in parallel. We just need to pause all reads while inserting
or removing a ticket.
Let's go down this path and see where it leads us.
Let's go down this path and see where it leads us.
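
To give a rough idea of the resulting shape, here is a sketch with an outer `RwLock` around the store and one `RwLock` per ticket. The `Store` type, its `String` tickets, and the `insert`/`read_title` helpers are simplifying assumptions made for this example, not the exercise's actual API.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

// Hypothetical, simplified store: tickets are plain strings here.
#[derive(Default)]
struct Store {
    tickets: BTreeMap<u64, Arc<RwLock<String>>>,
}

fn insert(store: &RwLock<Store>, id: u64, title: &str) {
    // Write lock on the store: all readers are paused while the map changes.
    let mut guard = store.write().unwrap();
    guard.tickets.insert(id, Arc::new(RwLock::new(title.to_string())));
}

fn read_title(store: &RwLock<Store>, id: u64) -> Option<String> {
    // Read lock on the store, held only long enough to clone the ticket handle.
    let ticket = store.read().unwrap().tickets.get(&id).cloned()?;
    // Read lock on the individual ticket; other readers can proceed in parallel.
    let title = ticket.read().unwrap().clone();
    Some(title)
}
```

To share the store between threads, wrap it as `Arc<RwLock<Store>>` and give each thread its own clone of the `Arc`.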

View file

@ -1,28 +1,28 @@
# `Sync`
Before we wrap up this chapter, let's talk about another key trait in Rust's standard library: `Sync`.
Before we wrap up this chapter, let's talk about another key trait in Rust's standard library: `Sync`.
`Sync` is an auto trait, just like `Send`.
`Sync` is an auto trait, just like `Send`.\
It is automatically implemented by all types that can be safely **shared** between threads.
In other words: `T: Sync` means that `&T` is `Send`.
## `Sync` doesn't imply `Send`
It's important to note that `Sync` doesn't imply `Send`.
For example: `MutexGuard` is not `Send`, but it is `Sync`.
It's important to note that `Sync` doesn't imply `Send`.\
For example: `MutexGuard` is not `Send`, but it is `Sync`.
It isn't `Send` because the lock must be released on the same thread that acquired it, therefore we don't
want `MutexGuard` to be dropped on a different thread.
It isn't `Send` because the lock must be released on the same thread that acquired it, therefore we don't
want `MutexGuard` to be dropped on a different thread.\
But it is `Sync`, because giving a `&MutexGuard` to another thread has no impact on where the lock is released.
## `Send` doesn't imply `Sync`
The opposite is also true: `Send` doesn't imply `Sync`.
For example: `RefCell<T>` is `Send` (if `T` is `Send`), but it is not `Sync`.
The opposite is also true: `Send` doesn't imply `Sync`.\
For example: `RefCell<T>` is `Send` (if `T` is `Send`), but it is not `Sync`.
`RefCell<T>` performs runtime borrow checking, but the counters it uses to track borrows are not thread-safe.
Therefore, having multiple threads holding a `&RefCell` would lead to a data race, with potentially
multiple threads obtaining mutable references to the same data. Hence `RefCell` is not `Sync`.
multiple threads obtaining mutable references to the same data. Hence `RefCell` is not `Sync`.\
`Send` is fine, instead, because when we send a `RefCell` to another thread we're not
leaving behind any references to the data it contains, hence no risk of concurrent mutable access.
leaving behind any references to the data it contains, hence no risk of concurrent mutable access.
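
You can ask the compiler to confirm these claims with a small trick: generic helper functions that only compile when the bound holds. The helpers `assert_send`, `assert_sync` and `check_auto_traits` are defined here for illustration; they are not part of the standard library.

```rust
use std::cell::RefCell;
use std::sync::{Mutex, MutexGuard};

// These helpers do nothing at runtime: they only compile if the bound holds for `T`.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

// Returns `true` unconditionally; the real checks happen at compile time.
fn check_auto_traits() -> bool {
    // `RefCell<T>` is `Send` when `T` is `Send`.
    assert_send::<RefCell<u32>>();
    // `MutexGuard` is `Sync` when the protected data is `Sync`.
    assert_sync::<MutexGuard<'static, u32>>();
    // `Mutex<T>` is both `Send` and `Sync` when `T` is `Send`.
    assert_send::<Mutex<u32>>();
    assert_sync::<Mutex<u32>>();
    // Uncommenting either line below produces a compile error:
    // assert_sync::<RefCell<u32>>();
    // assert_send::<MutexGuard<'static, u32>>();
    true
}
```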

View file

@ -1,11 +1,11 @@
# Async Rust
Threads are not the only way to write concurrent programs in Rust.
In this chapter we'll explore another approach: **asynchronous programming**.
Threads are not the only way to write concurrent programs in Rust.\
In this chapter we'll explore another approach: **asynchronous programming**.
In particular, you'll get an introduction to:
- The `async`/`.await` keywords, to write asynchronous code effortlessly
- The `Future` trait, to represent computations that may not be complete yet
- `tokio`, the most popular runtime for running asynchronous code
- The cooperative nature of Rust's asynchronous model, and how this affects your code
- The cooperative nature of Rust's asynchronous model, and how this affects your code

View file

@ -1,16 +1,16 @@
# Asynchronous functions
All the functions and methods you've written so far were eager.
All the functions and methods you've written so far were eager.\
Nothing happened until you invoked them. But once you did, they ran to
completion: they did **all** their work, and then returned their output.
Sometimes that's undesirable.
Sometimes that's undesirable.\
For example, if you're writing an HTTP server, there might be a lot of
**waiting**: waiting for the request body to arrive, waiting for the
database to respond, waiting for a downstream service to reply, etc.
What if you could do something else while you're waiting?
What if you could choose to give up midway through a computation?
What if you could do something else while you're waiting?\
What if you could choose to give up midway through a computation?\
What if you could choose to prioritise another task over the current one?
That's where **asynchronous functions** come in.
@ -38,7 +38,7 @@ fn run() {
}
```
Nothing happens!
Nothing happens!\
Rust doesn't start executing `bind_random` when you call it,
not even as a background task (as you might expect based on your experience
with other languages).
@ -68,18 +68,18 @@ async fn run() {
}
```
`.await` doesn't return control to the caller until the asynchronous function
`.await` doesn't return control to the caller until the asynchronous function
has run to completion—e.g. until the `TcpListener` has been created in the example above.
## Runtimes
## Runtimes
If you're puzzled, you're right to be!
If you're puzzled, you're right to be!\
We've just said that the perk of asynchronous functions
is that they don't do **all** their work at once. We then introduced `.await`, which
doesn't return until the asynchronous function has run to completion. Haven't we
just re-introduced the problem we were trying to solve? What's the point?
Not quite! A lot happens behind the scenes when you call `.await`!
Not quite! A lot happens behind the scenes when you call `.await`!\
You're yielding control to an **async runtime**, also known as an **async executor**.
Executors are where the magic happens: they are in charge of managing all your
ongoing asynchronous **tasks**. In particular, they balance two different goals:
@ -95,8 +95,8 @@ no default runtime. The standard library doesn't ship with one. You need to
bring your own!
In most cases, you'll choose one of the options available in the ecosystem.
Some runtimes are designed to be broadly applicable, a solid option for most applications.
`tokio` and `async-std` belong to this category. Other runtimes are optimised for
Some runtimes are designed to be broadly applicable, a solid option for most applications.
`tokio` and `async-std` belong to this category. Other runtimes are optimised for
specific use cases—e.g. `embassy` for embedded systems.
Throughout this course we'll rely on `tokio`, the most popular runtime for general-purpose
@ -130,10 +130,10 @@ fn main() {
### `#[tokio::test]`
The same goes for tests: they must be synchronous functions.
The same goes for tests: they must be synchronous functions.\
Each test function is run in its own thread, and you're responsible for
setting up and launching an async runtime if you need to run async code
in your tests.
in your tests.\
`tokio` provides a `#[tokio::test]` macro to make this easier:
```rust
@ -141,4 +141,4 @@ in your tests.
async fn my_test() {
// Your async test code goes here
}
```
```

View file

@ -12,12 +12,12 @@ pub async fn echo(listener: TcpListener) -> Result<(), anyhow::Error> {
}
```
This is not bad!
This is not bad!\
If a long time passes between two incoming connections, the `echo` function will be idle
(since `TcpListener::accept` is an asynchronous function), thus allowing the executor
to run other tasks in the meantime.
But how can we actually have multiple tasks running concurrently?
But how can we actually have multiple tasks running concurrently?\
If we always run our asynchronous functions until completion (by using `.await`), we'll never
have more than one task running at a time.
@ -25,7 +25,7 @@ This is where the `tokio::spawn` function comes in.
## `tokio::spawn`
`tokio::spawn` allows you to hand off a task to the executor, **without waiting for it to complete**.
`tokio::spawn` allows you to hand off a task to the executor, **without waiting for it to complete**.\
Whenever you invoke `tokio::spawn`, you're telling `tokio` to continue running
the spawned task, in the background, **concurrently** with the task that spawned it.
@ -51,12 +51,12 @@ pub async fn echo(listener: TcpListener) -> Result<(), anyhow::Error> {
### Asynchronous blocks
In this example, we've passed an **asynchronous block** to `tokio::spawn`: `async move { /* */ }`
Asynchronous blocks are a quick way to mark a region of code as asynchronous without having
Asynchronous blocks are a quick way to mark a region of code as asynchronous without having
to define a separate async function.
### `JoinHandle`
`tokio::spawn` returns a `JoinHandle`.
`tokio::spawn` returns a `JoinHandle`.\
You can use `JoinHandle` to `.await` the background task, in the same way
we used `join` for spawned threads.
@ -83,10 +83,10 @@ pub async fn do_work() {
### Panic boundary
If a task spawned with `tokio::spawn` panics, the panic will be caught by the executor.
If a task spawned with `tokio::spawn` panics, the panic will be caught by the executor.\
If you don't `.await` the corresponding `JoinHandle`, the panic won't be propagated to the spawner.
Even if you do `.await` the `JoinHandle`, the panic won't be propagated automatically.
Awaiting a `JoinHandle` returns a `Result`, with [`JoinError`](https://docs.rs/tokio/latest/tokio/task/struct.JoinError.html)
Even if you do `.await` the `JoinHandle`, the panic won't be propagated automatically.
Awaiting a `JoinHandle` returns a `Result`, with [`JoinError`](https://docs.rs/tokio/latest/tokio/task/struct.JoinError.html)
as its error type. You can then check if the task panicked by calling `JoinError::is_panic` and
choose what to do with the panic—either log it, ignore it, or propagate it.
@ -112,11 +112,11 @@ pub async fn work() {
### `std::thread::spawn` vs `tokio::spawn`
You can think of `tokio::spawn` as the asynchronous sibling of `std::thread::spawn`.
You can think of `tokio::spawn` as the asynchronous sibling of `std::thread::spawn`.
Notice a key difference: with `std::thread::spawn`, you're delegating control to the OS scheduler.
You're not in control of how threads are scheduled.
With `tokio::spawn`, you're delegating to an async executor that runs entirely in
user space. The underlying OS scheduler is not involved in the decision of which task
to run next. We're in charge of that decision now, via the executor we chose to use.
user space. The underlying OS scheduler is not involved in the decision of which task
to run next. We're in charge of that decision now, via the executor we chose to use.

View file

@ -6,9 +6,9 @@ it has an impact on our code.
## Flavors
`tokio` ships two different runtime _flavors_.
`tokio` ships two different runtime _flavors_.
You can configure your runtime via `tokio::runtime::Builder`:
You can configure your runtime via `tokio::runtime::Builder`:
- `Builder::new_multi_thread` gives you a **multithreaded `tokio` runtime**
- `Builder::new_current_thread` will instead rely on the **current thread** for execution.
@ -19,29 +19,29 @@ You can configure your runtime via `tokio::runtime::Builder`:
### Current thread runtime
The current-thread runtime, as the name implies, relies exclusively on the OS thread
it was launched on to schedule and execute tasks.
it was launched on to schedule and execute tasks.\
When using the current-thread runtime, you have **concurrency** but no **parallelism**:
asynchronous tasks will be interleaved, but there will always be at most one task running
at any given time.
### Multithreaded runtime
When using the multithreaded runtime, instead, there can be up to `N` tasks running
_in parallel_ at any given time, where `N` is the number of threads used by the
runtime. By default, `N` matches the number of available CPU cores.
When using the multithreaded runtime, instead, there can be up to `N` tasks running
_in parallel_ at any given time, where `N` is the number of threads used by the
runtime. By default, `N` matches the number of available CPU cores.
There's more: `tokio` performs **work-stealing**.
There's more: `tokio` performs **work-stealing**.\
If a thread is idle, it won't wait around: it'll try to find a new task that's ready for
execution, either from a global queue or by stealing it from the local queue of another
thread.
Work-stealing can have significant performance benefits, especially on tail latencies,
thread.\
Work-stealing can have significant performance benefits, especially on tail latencies,
whenever your application is dealing with workloads that are not perfectly balanced
across threads.
## Implications
`tokio::spawn` is flavor-agnostic: it'll work no matter if you're running on the multithreaded
or current-thread runtime. The downside is that the signature assumes the worst case
`tokio::spawn` is flavor-agnostic: it'll work no matter if you're running on the multithreaded
or current-thread runtime. The downside is that the signature assumes the worst case
(i.e. multithreaded) and is constrained accordingly:
```rust
@ -52,7 +52,7 @@ where
{ /* */ }
```
Let's ignore the `Future` trait for now to focus on the rest.
Let's ignore the `Future` trait for now to focus on the rest.\
`spawn` is asking all its inputs to be `Send` and have a `'static` lifetime.
The `'static` constraint follows the same rationale of the `'static` constraint
@ -85,4 +85,4 @@ fn spawner(input: Rc<u64>) {
println!("{}", input);
})
}
```
```

View file

@ -12,11 +12,11 @@ pub fn spawn<F>(future: F) -> JoinHandle<F::Output>
{ /* */ }
```
What does it _actually_ mean for `F` to be `Send`?
What does it _actually_ mean for `F` to be `Send`?\
It implies, as we saw in the previous section, that whatever value it captures from the
spawning environment has to be `Send`. But it goes further than that.
Any value that's _held across a .await point_ has to be `Send`.
Any value that's _held across a .await point_ has to be `Send`.\
Let's look at an example:
```rust
@ -65,13 +65,13 @@ note: required by a bound in `tokio::spawn`
| ^^^^ required by this bound in `spawn`
```
To understand why that's the case, we need to refine our understanding of
To understand why that's the case, we need to refine our understanding of
Rust's asynchronous model.
## The `Future` trait
We stated early on that `async` functions return **futures**, types that implement
the `Future` trait. You can think of a future as a **state machine**.
the `Future` trait. You can think of a future as a **state machine**.
It's in one of two states:
- **pending**: the computation has not finished yet.
@ -90,27 +90,27 @@ trait Future {
### `poll`
The `poll` method is the heart of the `Future` trait.
A future on its own doesn't do anything. It needs to be **polled** to make progress.
The `poll` method is the heart of the `Future` trait.\
A future on its own doesn't do anything. It needs to be **polled** to make progress.\
When you call `poll`, you're asking the future to do some work.
`poll` tries to make progress, and then returns one of the following:
- `Poll::Pending`: the future is not ready yet. You need to call `poll` again later.
- `Poll::Ready(value)`: the future has finished. `value` is the result of the computation,
of type `Self::Output`.
of type `Self::Output`.
Once `Future::poll` returns `Poll::Ready`, it should not be polled again: the future has
completed, there's nothing left to do.
completed, there's nothing left to do.
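To make the polling contract concrete, here is a minimal sketch that drives a hand-written future to completion by calling `poll` in a loop. `Countdown`, `NoopWaker` and `poll_until_ready` are names invented for this illustration and rely only on `std`; a real runtime would wait to be woken rather than poll in a busy loop.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future: reports `Pending` `remaining` times, then completes.
struct Countdown {
    remaining: u32,
}

impl Future for Countdown {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.remaining == 0 {
            // The computation has finished: hand back the output.
            Poll::Ready("done")
        } else {
            self.remaining -= 1;
            // Signal that polling again is worthwhile, then yield.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// A waker that does nothing; enough for a loop that polls unconditionally.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a `Countdown` until it is ready.
/// Returns how many times it was pending, together with its output.
pub fn poll_until_ready() -> (u32, &'static str) {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(Countdown { remaining: 2 });
    let mut pending_polls = 0;
    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Pending => pending_polls += 1,
            // We return right away: the future is never polled after `Ready`.
            Poll::Ready(value) => return (pending_polls, value),
        }
    }
}
```

Calling `poll_until_ready()` should yield two `Pending` results followed by the final output, mirroring the two outcomes listed above.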
### The role of the runtime
You'll rarely, if ever, be calling `poll` directly.
You'll rarely, if ever, be calling `poll` directly.\
That's the job of your async runtime: it has all the required information (the `Context`
in `poll`'s signature) to ensure that your futures are making progress whenever they can.
## `async fn` and futures
We've worked with the high-level interface, asynchronous functions.
We've worked with the high-level interface, asynchronous functions.\
We've now looked at the low-level primitive, the `Future` trait.
How are they related?
@ -143,23 +143,23 @@ pub enum ExampleFuture {
```
When `example` is called, it returns `ExampleFuture::NotStarted`. The future has never
been polled yet, so nothing has happened.
been polled yet, so nothing has happened.\
When the runtime polls it the first time, `ExampleFuture` will advance until the next
`.await` point: it'll stop at the `ExampleFuture::YieldNow(Rc<i32>)` stage of the state
machine, returning `Poll::Pending`.
When it's polled again, it'll execute the remaining code (`println!`) and
return `Poll::Ready(())`.
machine, returning `Poll::Pending`.\
When it's polled again, it'll execute the remaining code (`println!`) and
return `Poll::Ready(())`.
When you look at its state machine representation, `ExampleFuture`,
When you look at its state machine representation, `ExampleFuture`,
it is now clear why `example` is not `Send`: it holds an `Rc`, therefore
it cannot be `Send`.
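One possible way to act on this, assuming the `Rc` is only needed _before_ the `.await`, is to confine it to an inner scope so that it never becomes part of the state machine. The sketch below uses only `std`; `YieldOnce`, `example_send`, `require_send`, `run_to_completion` and `demo` are hypothetical helpers written for this illustration, not part of `tokio`.

```rust
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that yields to its poller exactly once.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

async fn example_send() -> i32 {
    let doubled = {
        // The `Rc` lives only inside this block...
        let non_send = Rc::new(21);
        *non_send * 2
    };
    // ...so only the plain `i32` is held across this yield point.
    YieldOnce { yielded: false }.await;
    doubled
}

/// Compiles only if the future is `Send`, like the bound on `tokio::spawn`.
fn require_send<F: Future + Send>(future: F) -> F {
    future
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy driver: polls in a busy loop until the future completes.
fn run_to_completion<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = Box::pin(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

pub fn demo() -> i32 {
    run_to_completion(require_send(example_send()))
}
```

If the `Rc` were instead kept alive past the `.await`, the call to `require_send` would be expected to fail with an error similar to the one shown earlier.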
## Yield points
As you've just seen with `example`, every `.await` point creates a new intermediate
state in the lifecycle of a future.
state in the lifecycle of a future.\
That's why `.await` points are also known as **yield points**: your future _yields control_
back to the runtime that was polling it, allowing the runtime to pause it and (if necessary)
schedule another task for execution, thus making progress on multiple fronts concurrently.
We'll come back to the importance of yielding in a later section.
We'll come back to the importance of yielding in a later section.

View file

@ -1,6 +1,6 @@
# Don't block the runtime
Let's circle back to yield points.
Let's circle back to yield points.\
Unlike threads, **Rust tasks cannot be preempted**.
`tokio` cannot, on its own, decide to pause a task and run another one in its place.
@ -11,13 +11,13 @@ you `.await` a future.
This exposes the runtime to a risk: if a task never yields, the runtime will never
be able to run another task. This is called **blocking the runtime**.
## What is blocking?
## What is blocking?
How long is too long? How much time can a task spend without yielding before it
becomes a problem?
It depends on the runtime, the application, the number of in-flight tasks, and
many other factors. But, as a general rule of thumb, try to spend less than 100
many other factors. But, as a general rule of thumb, try to spend less than 100
microseconds between yield points.
## Consequences
@ -27,7 +27,7 @@ Blocking the runtime can lead to:
- **Deadlocks**: if the task that's not yielding is waiting for another task to
complete, and that task is waiting for the first one to yield, you have a deadlock.
No progress can be made, unless the runtime is able to schedule the other task on
a different thread.
a different thread.
- **Starvation**: other tasks might not be able to run, or might run after a long
delay, which can lead to poor performance (e.g. high tail latencies).
@ -46,12 +46,12 @@ of entries.
## How to avoid blocking
OK, so how do you avoid blocking the runtime assuming you _must_ perform an operation
that qualifies or risks qualifying as blocking?
that qualifies or risks qualifying as blocking?\
You need to move the work to a different thread. You don't want to use the so-called
runtime threads, the ones used by `tokio` to run tasks.
`tokio` provides a dedicated threadpool for this purpose, called the **blocking pool**.
You can spawn a synchronous operation on the blocking pool using the
You can spawn a synchronous operation on the blocking pool using the
`tokio::task::spawn_blocking` function. `spawn_blocking` returns a future that resolves
to the result of the operation when it completes.
@ -76,4 +76,4 @@ because the cost of thread initialization is amortized over multiple calls.
## Further reading
- Check out [Alice Ryhl's blog post](https://ryhl.io/blog/async-what-is-blocking/)
on the topic.
on the topic.

View file

@ -32,9 +32,9 @@ async fn http_call(v: &[u64]) {
### `std::sync::MutexGuard` and yield points
This code will compile, but it's dangerous.
This code will compile, but it's dangerous.
We try to acquire a lock over a `Mutex` from `std` in an asynchronous context.
We try to acquire a lock over a `Mutex` from `std` in an asynchronous context.
We then hold on to the resulting `MutexGuard` across a yield point (the `.await` on
`http_call`).
@ -42,18 +42,18 @@ Let's imagine that there are two tasks executing `run`, concurrently, on a singl
runtime. We observe the following sequence of scheduling events:
```text
Task A Task B
|
Acquire lock
Yields to runtime
|
+--------------+
|
Tries to acquire lock
Task A Task B
|
Acquire lock
Yields to runtime
|
+--------------+
|
Tries to acquire lock
```
We have a deadlock. Task B will never manage to acquire the lock, because the lock
is currently held by task A, which has yielded to the runtime before releasing the
is currently held by task A, which has yielded to the runtime before releasing the
lock and won't be scheduled again because the runtime cannot preempt task B.
### `tokio::sync::Mutex`
@ -73,32 +73,32 @@ async fn run(m: Arc<Mutex<Vec<u64>>>) {
```
Acquiring the lock is now an asynchronous operation, which yields back to the runtime
if it can't make progress.
if it can't make progress.\
Going back to the previous scenario, the following would happen:
```text
Task A Task B
|
Acquires the lock
Starts `http_call`
Yields to runtime
|
+--------------+
|
Tries to acquire the lock
Cannot acquire the lock
Yields to runtime
|
+--------------+
|
`http_call` completes
Releases the lock
Yields to runtime
|
+--------------+
|
Acquires the lock
[...]
Task A Task B
|
Acquires the lock
Starts `http_call`
Yields to runtime
|
+--------------+
|
Tries to acquire the lock
Cannot acquire the lock
Yields to runtime
|
+--------------+
|
`http_call` completes
Releases the lock
Yields to runtime
|
+--------------+
|
Acquires the lock
[...]
```
All good!
@ -107,14 +107,14 @@ All good!
We've used a single-threaded runtime as the execution context in our
previous example, but the same risk persists even when using a multithreaded
runtime.
runtime.\
The only difference is in the number of concurrent tasks required to create the deadlock:
in a single-threaded runtime, 2 are enough; in a multithreaded runtime, we
would need `N+1` tasks, where `N` is the number of runtime threads.
would need `N+1` tasks, where `N` is the number of runtime threads.
### Downsides
Having an async-aware `Mutex` comes with a performance penalty.
Having an async-aware `Mutex` comes with a performance penalty.\
If you're confident that the lock isn't under significant contention
_and_ you're careful to never hold it across a yield point, you can
still use `std::sync::Mutex` in an asynchronous context.
@ -124,6 +124,6 @@ will incur.
## Other primitives
We used `Mutex` as an example, but the same applies to `RwLock`, semaphores, etc.
We used `Mutex` as an example, but the same applies to `RwLock`, semaphores, etc.\
Prefer async-aware versions when working in an asynchronous context to minimise
the risk of issues.
the risk of issues.
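As a rough illustration of the "never hold it across a yield point" rule for `std::sync::Mutex`, the sketch below interleaves two tasks on a toy single-threaded poller. Each lock guard is a temporary that is released at the end of its statement, before the task yields. `YieldOnce`, `push_twice`, `NoopWaker` and `interleave_two_tasks` are names made up for this example; it uses only `std` and is not how `tokio` schedules tasks.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that yields to its poller exactly once.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

async fn push_twice(shared: Arc<Mutex<Vec<u64>>>, value: u64) {
    // The guard is a temporary: it is dropped at the end of this statement.
    shared.lock().unwrap().push(value);
    // No lock is held while we are suspended here.
    YieldOnce { yielded: false }.await;
    shared.lock().unwrap().push(value + 10);
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

type Task = Pin<Box<dyn Future<Output = ()>>>;

/// Polls two tasks in turn, on one thread, until both have completed.
pub fn interleave_two_tasks() -> Vec<u64> {
    let shared = Arc::new(Mutex::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let a: Task = Box::pin(push_twice(shared.clone(), 1));
    let b: Task = Box::pin(push_twice(shared.clone(), 2));
    let mut tasks = vec![a, b];

    while !tasks.is_empty() {
        // Keep only the tasks that are still pending after this round.
        tasks.retain_mut(|task| task.as_mut().poll(&mut cx).is_pending());
    }

    let values = shared.lock().unwrap().clone();
    values
}
```

Both tasks make progress in every round because neither is suspended while owning the lock. Binding the guard to a variable that stays alive across the `.await` would recreate the deadlock scenario described earlier.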

View file

@ -1,6 +1,6 @@
# Cancellation
What happens when a pending future is dropped?
What happens when a pending future is dropped?\
The runtime will no longer poll it, therefore it won't make any further progress.
In other words, its execution has been **cancelled**.
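The effect of dropping a pending future can be observed with a small `std`-only experiment. In this sketch (all names, including `CleanupGuard`, `two_steps` and `cancel_after_first_poll`, are invented for the illustration) a future is polled once, suspended at its `.await`, and then dropped: the code after the `.await` never runs, while destructors of values it owned still do.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type Log = Arc<Mutex<Vec<&'static str>>>;

/// Hypothetical future that yields to its poller exactly once.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Records an event when it is dropped.
struct CleanupGuard(Log);

impl Drop for CleanupGuard {
    fn drop(&mut self) {
        self.0.lock().unwrap().push("guard dropped");
    }
}

async fn two_steps(log: Log) {
    let _guard = CleanupGuard(log.clone());
    log.lock().unwrap().push("before await");
    YieldOnce { yielded: false }.await;
    // Not reached if the future is dropped while suspended above.
    log.lock().unwrap().push("after await");
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls `two_steps` once, then drops it, and returns the recorded events.
pub fn cancel_after_first_poll() -> Vec<&'static str> {
    let log: Log = Arc::new(Mutex::new(Vec::new()));
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut future = Box::pin(two_steps(log.clone()));
    assert!(future.as_mut().poll(&mut cx).is_pending());
    // Dropping the pending future cancels it.
    drop(future);

    let events = log.lock().unwrap().clone();
    events
}
```

The recorded events stop at the yield point, apart from the synchronous `Drop` of `_guard`.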
@ -38,9 +38,9 @@ async fn http_call() {
}
```
Each yield point becomes a **cancellation point**.
Each yield point becomes a **cancellation point**.\
`http_call` can't be preempted by the runtime, so it can only be discarded after
it has yielded control back to the executor via `.await`.
it has yielded control back to the executor via `.await`.
This applies recursively—e.g. `stream.write_all(&request)` is likely to have multiple
yield points in its implementation. It is perfectly possible to see `http_call` pushing
a _partial_ request before being cancelled, thus dropping the connection and never
@ -49,7 +49,7 @@ finishing transmitting the body.
## Clean up
Rust's cancellation mechanism is quite powerful—it allows the caller to cancel an ongoing task
without needing any form of cooperation from the task itself.
without needing any form of cooperation from the task itself.\
At the same time, this can be quite dangerous. It may be desirable to perform a
**graceful cancellation**, to ensure that some clean-up tasks are performed
before aborting the operation.
@ -71,7 +71,7 @@ async fn transfer_money(
```
On cancellation, it'd be ideal to explicitly abort the pending transaction rather
than leaving it hanging.
than leaving it hanging.
Rust, unfortunately, doesn't provide a bullet-proof mechanism for this kind of
**asynchronous** clean up operations.
@ -86,8 +86,8 @@ The optimal choice is contextual.
## Cancelling spawned tasks
When you spawn a task using `tokio::spawn`, you can no longer drop it;
it belongs to the runtime.
When you spawn a task using `tokio::spawn`, you can no longer drop it;
it belongs to the runtime.\
Nonetheless, you can use its `JoinHandle` to cancel it if needed:
```rust
@ -102,8 +102,8 @@ async fn run() {
- Be extremely careful when using `tokio`'s `select!` macro to "race" two different futures.
Retrying the same task in a loop is dangerous unless you can ensure **cancellation safety**.
Check out [`select!`'s documentation](https://tokio.rs/tokio/tutorial/select) for more details.
If you need to interleave two asynchronous streams of data (e.g. a socket and a channel), prefer using
[`StreamExt::merge`](https://docs.rs/tokio-stream/latest/tokio_stream/trait.StreamExt.html#method.merge) instead.
- Rather than "abrupt" cancellation, it can be preferable to rely
on [`CancellationToken`](https://docs.rs/tokio-util/latest/tokio_util/sync/struct.CancellationToken.html).
Check out [`select!`'s documentation](https://tokio.rs/tokio/tutorial/select) for more details.\
If you need to interleave two asynchronous streams of data (e.g. a socket and a channel), prefer using
[`StreamExt::merge`](https://docs.rs/tokio-stream/latest/tokio_stream/trait.StreamExt.html#method.merge) instead.
- Rather than "abrupt" cancellation, it can be preferable to rely
on [`CancellationToken`](https://docs.rs/tokio-util/latest/tokio_util/sync/struct.CancellationToken.html).

View file

@ -1,7 +1,7 @@
# Outro
Rust's asynchronous model is quite powerful, but it does introduce additional
complexity. Take time to know your tools: dive deep into `tokio`'s documentation
complexity. Take time to know your tools: dive deep into `tokio`'s documentation
and get familiar with its primitives to make the most out of it.
Keep in mind, as well, that there is ongoing work at the language and `std` level
@ -10,25 +10,25 @@ rough edges in your day-to-day work due to some of these missing pieces.
A few recommendations for a mostly-pain-free async experience:
- **Pick a runtime and stick to it.**
- **Pick a runtime and stick to it.**\
Some primitives (e.g. timers, I/O) are not portable across runtimes. Trying to
mix runtimes is likely to cause you pain. Trying to write code that's runtime
agnostic can significantly increase the complexity of your codebase. Avoid it
if you can.
- **There is no stable `Stream`/`AsyncIterator` interface yet.**
An `AsyncIterator` is, conceptually, an iterator that yields new items
agnostic can significantly increase the complexity of your codebase. Avoid it
if you can.
- **There is no stable `Stream`/`AsyncIterator` interface yet.**\
An `AsyncIterator` is, conceptually, an iterator that yields new items
asynchronously. There is ongoing design work, but no consensus (yet).
If you're using `tokio`, refer to [`tokio_stream`](https://docs.rs/tokio-stream/latest/tokio_stream/)
If you're using `tokio`, refer to [`tokio_stream`](https://docs.rs/tokio-stream/latest/tokio_stream/)
as your go-to interface.
- **Be careful with buffering.**
It is often the cause of subtle bugs. Check out
- **Be careful with buffering.**\
It is often the cause of subtle bugs. Check out
["Barbara battles buffered streams"](https://rust-lang.github.io/wg-async/vision/submitted_stories/status_quo/barbara_battles_buffered_streams.html)
for more details.
- **There is no equivalent of scoped threads for asynchronous tasks**.
for more details.
- **There is no equivalent of scoped threads for asynchronous tasks**.\
Check out ["The scoped task trilemma"](https://without.boats/blog/the-scoped-task-trilemma/)
for more details.
Don't let these caveats scare you: asynchronous Rust is being used effectively
at _massive_ scale (e.g. AWS, Meta) to power foundational services.
at _massive_ scale (e.g. AWS, Meta) to power foundational services.\
You will have to master it if you're planning to build networked applications
in Rust.
