Reading Club: The Book Ch 4 "Understanding Ownership" [PROJECT]

maegul (he/they)@lemmy.ml · 7 months ago

If I had to explain ownership in rust (based on The Book, Ch 4)

I had a crack at this and found myself writing for a while. I thought I’d pitch it at a basic level and try to provide a sort of core, essential conceptual basis (something which I think The Book might be lacking a little??)

Dunno if this will be useful or interesting to anyone, but I found it useful to write. If anyone does find any significant errors, please let me know!

General Idea or Purpose

Generally, the whole point is to prevent memory mismanagement.
IE “undefined behaviour”: whenever memory can be read/written when it is no longer controlled by a variable in the program.
- Rust leans a little cautious in preventing this. It will raise compilation errors for some code that won’t actually cause undefined behaviour. This is in large part, AFAICT, because its means of detecting how long a variable “lives” can be somewhat coarse/incomplete (see the Rustonomicon). Thus, rust enforces relatively clean variable management, and simply copying data will probably be worth it at times.

Ownership

Variables live in, or are “owned by” a particular scope (or stack frames, eg functions).
Data, memory, or “values” are owned by variables, and only one at a time.
Variables are stuck in their scopes (they live and die in a single scope and can’t be moved out).
Data or memory can be moved from one owning variable to another. In doing so they can also move from one scope to another (eg, by passing a variable into a function).
Once a variable has its data/memory moved to another, that variable is dead.
If data/memory is not moved away from its variable by the completion of its scope, that data/memory “dies” along with the variable (IE, the memory is deallocated).

// > ON THE HEAP

// Ownership will be "moved" into this function's scope
fn take_ownership_heap(_: Vec<i32>) {}

let a = vec![1, 2, 3];
take_ownership_heap(a);

// ERROR
let b = a[0];
// CAN'T DO: value of `a` is borrowed/used after move
// `a` is now "dead", it died in `take_ownership_heap()`;
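
For contrast with the error above, here's a minimal sketch of the ownership rules when everything goes right: a move between two variables, a move into a function and back out again, and a final move into a function where the data dies. The function names `take_and_give_back` and `consume` are made up for illustration; they're not from The Book.

```rust
// Takes ownership of the vector, mutates it, then hands ownership back
// to the caller through the return value.
fn take_and_give_back(mut v: Vec<i32>) -> Vec<i32> {
    v.push(4);
    v
}

// Takes ownership and returns only the length. The vector itself is
// deallocated when this function's scope ends.
fn consume(v: Vec<i32>) -> usize {
    v.len()
}

fn main() {
    let a = vec![1, 2, 3];

    // Move from one owning variable to another: `a` is dead from here on.
    let b = a;

    // Move into the function's scope and back out into a new owner `c`.
    let c = take_and_give_back(b);
    assert_eq!(c, vec![1, 2, 3, 4]);

    // `c` is moved into `consume` and its data dies there.
    let n = consume(c);
    println!("length was {}", n);
}
```

Returning the value is the only way for `take_and_give_back` to keep the data alive past its own scope, which is part of why borrowing (below) is usually the nicer option.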

Variables of data on the stack (eg integers) are implicitly copied (as copying basic data types like integers is cheap and unproblematic), so ownership isn’t so much of an issue.
Copying (or cloning) data/memory on the heap is not trivial and so must be done explicitly (eg, with my_variable.clone()). In the case of custom types (eg structs), cloning has to be added to or implemented for that particular type (which isn’t necessarily difficult).

// > ON THE STACK

// An integer will be copied into `_`, and no ownership will be moved
fn take_ownership_stack(_: i32) {}

let x = 11;
take_ownership_stack(x);

let y = x * 10;
// NOT A PROBLEM, as x was copied into take_ownership_stack
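
And a rough sketch of the explicit cloning side of things for heap data, including a custom type. The `Playlist` struct and `count_tracks` function are hypothetical, purely for illustration, and deriving `Clone` is just one way of adding cloning to a type.

```rust
// Deriving `Clone` adds explicit copying to this (hypothetical) custom type.
#[derive(Clone, Debug, PartialEq)]
struct Playlist {
    name: String,
    track_ids: Vec<u32>,
}

// Takes ownership of a playlist and returns how many tracks it has.
fn count_tracks(p: Playlist) -> usize {
    p.track_ids.len()
}

fn main() {
    let original = Playlist {
        name: String::from("mix"),
        track_ids: vec![1, 2, 3],
    };

    // Explicit clone: the copy is moved into the function,
    // so `original` keeps ownership of its own data.
    let n = count_tracks(original.clone());
    assert_eq!(n, 3);

    // Still usable, because only the clone was moved.
    assert_eq!(original.name, "mix");
}
```

Without the `.clone()`, using `original` after the call would be the same "use after move" error as in the heap example above.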

Borrowing (with references)

Data can be “borrowed” without taking ownership.
This kind of variable is a “reference” (AKA a “non-owning pointer”).
As the variable doesn’t “own” the data, the data can “outlive” the reference.
- Useful for passing a variable’s data into a function without it “dying” in that function.

fn borrow_heap(_: &Vec<i32>) {}

let e = vec![1, 2, 3];
// pass in a reference
borrow_heap(&e);

let f = e[0];
// NOT A PROBLEM, as the data survived `borrow_heap`
// because `e` retained ownership.
// &e, a reference, only "borrowed" the data

But it also means that the abilities or “permissions” of the reference with respect to the data are limited and more closely managed in order to prevent undefined behaviour.
The chief limitation is that two references cannot exist at the same time where one can mutate the data it points to and another can read the same data.
Multiple references can exist that only have permission to read the same data, that’s fine.
The basic idea is to prevent data from being altered/mutated while something else is reading the same data, as this is a common cause of problems.
Commonly expressed as the Pointer Safety Principle: data should never be aliased and mutated at the same time.
For this reason, shared references are “read only” references, while unique references are the ones that enable their underlying data to be mutated (AKA, mutable references).
- A minor confusion that can arise here is between mutable or unique references and reference variables that are mutable. A unique reference is able to mutate the data pointed to. A mutable variable that holds a reference, on the other hand, can have its pointer changed so that it points to different data. These are independent aspects and can be freely combined.
- Perhaps easily understood by recognising that a reference is just another variable whose data is a pointer or memory address.
Additionally, while variables of data on the stack typically don’t run into ownership issues because whenever ownership would be moved the data is implicitly copied, references to such variables can exist and they are subject to the same rules and monitoring by the compiler.
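
The difference between a unique reference and a mutable variable holding a reference is probably easiest to see side by side. This is only a small sketch with made-up names (`demo`, `x_ref`, `r`), and it uses stack integers to also show that references to stack data follow the same rules.

```rust
fn demo() -> (i32, i32, i32) {
    let mut x = 1;
    let y = 2;
    let z = 3;

    // Immutable variable holding a unique (mutable) reference:
    // the data behind it can be changed, but `x_ref` can't be re-pointed.
    let x_ref = &mut x;
    *x_ref += 10;

    // Mutable variable holding a shared reference:
    // `r` can be re-pointed, but the data behind it can't be changed.
    let mut r = &y;
    let before = *r;
    r = &z;
    let after = *r;

    (x, before, after)
}

fn main() {
    let (x, before, after) = demo();
    println!("x = {}, r pointed at {} then {}", x, before, after);
}
```

Combining the two (`let mut r = &mut x;`) gives a reference that can both be re-pointed and mutate what it points to.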

// >>> Can have multiple "shared references"

let e_ref1 = &e;
let e_ref2 = &e;

let e1 = e_ref1[0];
let e2 = e_ref2[0];

// >>> CANNOT have shared and mutable/unique references

let mut j = vec![1, 2, 3];

// A single mutable or "unique" reference
let j_mut_ref = &mut j;
// can mutate the actual vector
j_mut_ref[0] = 11;

// ERROR
let j_ref = &j;
// CANNOT now have a shared/read-only reference while the mutable one (j_mut_ref) is still alive
// the mutable reference needs to be used after the shared reference is created
// in order for rust to care; otherwise it can recognise that the mutable
// reference is no longer used and so doesn't matter any more
j_mut_ref[1] = 22;

// same as above but for stack data
let mut j_int = 11;
let j_int_mut_ref = &mut j_int;
// ERROR
let j_int_ref = &j_int;
// CANNOT assign another reference as mutable reference already exists

*j_int_mut_ref = 22;
// dereference to mutate here and force rust to think the mutable reference is still "alive"

Ownership and read/write permissions are altered when references are created

The state of a variable’s ownership and read-only or mutable permissions is not really static.
Instead, they are altered as variables and references are created, used, left unused and then “die” (ie, depending on their “lifetimes”).
This is because the problem being averted is multiple variables mangling the same data. So what a variable or reference can or cannot do depends on what other related variables exist and what they are able to do.
Generally, these “abilities” can be thought of as “permissions”.
- “Ownership”: the permission a variable has to move its ownership to another variable or “kill” the “owned” data/memory when the variable falls out of scope.
- “Read”: permission to read the data
- “Write”: permission to mutate the data or write to the referenced heap memory
As an example of permissions changing: a variable loses “ownership” of its data when a reference to it is created. This prevents the variable from moving its data into another scope, where the data could “die” and be deallocated while the reference is still around to read or write random/arbitrary data in the now deallocated memory.
Similarly, a variable that owns its data/memory/value will lose all permissions if a mutable reference (or unique reference) is made to the same data/variable. This is why a mutable reference is also known as a unique reference.
Permissions are returned when the reference(s) that altered permissions are no longer used, or “die” (IE, their lifetime comes to an end).

// >>> References remove ownership permissions

fn take_ownership_heap(_: Vec<i32>) {}

let k = vec![1, 2, 3];

let k_ref = &k;

// ERROR
take_ownership_heap(k);
// Cannot move out of `k` into `take_ownership_heap()` as it is currently borrowed
let k1 = k_ref[0];
// if the shared reference weren't used here, rust would be happy...
// as the reference's lifetime would be considered over

// >>> Mutable references remove read permissions

let mut m = 13;

let m_mut_ref = &mut m;

// ERROR
let n = m * 10;
// CANNOT read or use `m` as it's mutably borrowed
*m_mut_ref += 1;
// again, must use the mutable reference here to "keep it alive"
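
To see the permissions being handed back, here's a sketch of the same two situations reordered so that each reference is used for the last time before the original variable is touched again. `take_ownership_len` is a hypothetical variant of `take_ownership_heap` that returns the length so there's something to check; `demo` is likewise just a name for illustration.

```rust
// Hypothetical variant of `take_ownership_heap` that reports the length.
fn take_ownership_len(v: Vec<i32>) -> usize {
    v.len()
}

fn demo() -> (i32, i32, usize) {
    let mut m = 13;
    let m_mut_ref = &mut m;
    // Last use of the unique reference: its lifetime ends here.
    *m_mut_ref += 1;
    // `m` has its read permission back, so this is fine now.
    let n = m * 10;

    let k = vec![1, 2, 3];
    let k_ref = &k;
    // Last use of the shared reference.
    let k1 = k_ref[0];
    // `k` has its ownership permission back, so it can be moved.
    let len = take_ownership_len(k);

    (n, k1, len)
}

fn main() {
    let (n, k1, len) = demo();
    println!("n = {}, k1 = {}, len = {}", n, k1, len);
}
```

The only change from the failing versions is the order of the lines, which is what "permissions depend on lifetimes" boils down to in practice.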

Lifetimes are coming

fn first_or(strings: &Vec<String>, default: &String) -> &String {
    if strings.len() > 0 {
        &strings[0]
    } else {
        default
    }
}

// Does not compile
error[E0106]: missing lifetime specifier
 --> test.rs:1:57
  |
1 | fn first_or(strings: &Vec<String>, default: &String) -> &String {
  |                      ------------           -------     ^ expected named lifetime parameter
  |
  = help: this function's return type contains a borrowed value, but the signature does not say whether it is borrowed from `strings` or `default`

In all of the above, the dynamics of what permissions are available depends on how long a variable is used for, or its “lifetime”.
Lifetimes are something that rust detects by inspecting the code. As stated above, it can be a bit cautious or coarse in this detection.
This can get to the point where you will need to explicitly provide information as to the length of a variable’s lifetime in the code base. This is done with lifetime annotations, which are the 'a in the following code: fn longest<'a>(x: &'a str, y: &'a str) -> &'a str.
They won’t be covered here … but they’re coming.
Suffice it to appreciate why this is a problem needing a solution, with the code above as an example:
- the function first_or takes two references but returns only one reference, which borrows from one of the two inputs depending entirely on runtime logic. IE, which input the returned reference must not outlive is only decided at runtime. As Rust cannot work out how the lifetimes of the three references relate, the programmer has to provide that information. A topic for later.
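
Without getting into the details, here's a sketch of what providing that information could look like for first_or, following the same pattern as the longest signature above. Tying everything to a single 'a is my assumption of the simplest annotation that compiles, not necessarily the one a later chapter would settle on.

```rust
// One lifetime 'a for both inputs and the output: the returned reference
// is only valid while BOTH `strings` and `default` are still alive.
fn first_or<'a>(strings: &'a Vec<String>, default: &'a String) -> &'a String {
    if strings.len() > 0 {
        &strings[0]
    } else {
        default
    }
}

fn main() {
    let names = vec![String::from("ferris")];
    let empty: Vec<String> = Vec::new();
    let fallback = String::from("nobody");

    // Non-empty vector: the result borrows from `names`.
    assert_eq!(first_or(&names, &fallback).as_str(), "ferris");
    // Empty vector: the result borrows from `fallback`.
    assert_eq!(first_or(&empty, &fallback).as_str(), "nobody");
}
```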
