Definitions of Rust's ownership

Written on 2021-12-04 in 1508 words ✍️.
Part of reflection cs software-development programming-languages rustlang

Motivation

I recently stumbled upon two different definitions of ownership in Rust. Are they compatible?

The definitions

The student paper “A Practical Analysis of Rust’s Concurrency Story” (by Saligrama, Shen, and Gjengset, 2019) explains the basic elements of Rust. It presents ownership with the following wording:

Ownership: In Rust, every variable is owned by some scope. When a scope ends, it is responsible for cleaning up any resources used by the variables that it owns. For example, when a pointer to heap-allocated memory leaves a scope, that automatically frees the allocated memory. Similarly, if a file is opened, the handle to that file will close the file when it goes out of scope. Each variable can only have one owner at a given time, but ownership can be passed to other scopes through function calls or returns. Variables that have gone out of scope cannot be accessed any more (e.g., no dangling pointers), and this is checked at compile-time.

The focus here is on the first sentence. Its immediate conclusion is that a scope is the owner of a variable.

In contrast, the official Rust book uses the following definition:

First, let’s take a look at the ownership rules. Keep these rules in mind as we work through the examples that illustrate them:

  • Each value in Rust has a variable that’s called its owner.

  • There can only be one owner at a time.

  • When the owner goes out of scope, the value will be dropped.

— Rust book

Here, a variable is the owner of a value.

For more examples, depth-first.com uses the second definition whereas tutorialedge.net uses the first one.

A generic example

I tend to use the following example to illustrate ownership:

#[derive(Debug)]
struct Stats { score: u32 }

fn sub(mut s: Stats) {        // (3)
  s.score += 1;               // (4)
}

fn main() {
  let a = Stats { score: 8 }; // (1)
  sub(a);                     // (2)
  println!("{:?}", a);        // (5)
}

A Stats instance is created (1) and supplied as an argument to the subroutine sub (2). To make it explicit that the value itself is supplied, not a reference to it, I put mut on the parameter of sub (3), but not on the declaration (1). Inside sub, the Stats instance is modified (4). The final use of a (5) is invalid, and the compiler reports an error:

error[E0382]: borrow of moved value: `a`
  --> src/main.rs:10:20
   |
8  |     let a = Stats { score: 8 };
   |         - move occurs because `a` has type `Stats`,
   |           which does not implement the `Copy` trait
9  |     sub(a);
   |         - value moved here
10 |     println!("{}", a);
   |                    ^
   |         value borrowed here after move

Removing line (5) makes the program compile.
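If a should remain usable after the call, one option is to hand ownership back through the return value. The following is a minimal sketch of that idea and not part of the original example: the name sub_and_return is hypothetical, and everything else mirrors the listing above.

```rust
#[derive(Debug)]
struct Stats { score: u32 }

// Hypothetical variant of `sub`: takes ownership and returns it to the caller.
fn sub_and_return(mut s: Stats) -> Stats {
    s.score += 1;
    s // ownership moves back to the caller
}

fn main() {
    let a = Stats { score: 8 };
    let a = sub_and_return(a); // the old `a` is moved in, the result is bound to a new `a`
    println!("{:?}", a);       // prints: Stats { score: 9 }
}
```

The second let introduces a new variable that happens to reuse the name a, so the moved-from variable is never touched again.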

Remarks on the terminology

  1. A value is the state of allocated bits in memory, for example an instance of Stats.

  2. A variable is the name a value is bound to.

  3. A scope is a block, denoted in Rust by curly braces { … }.

Ownership applied to the generic example

With respect to the first definition, the behavior can be explained in the following way:

  1. Let main be the scope of the main function. Let sub be the scope of the subroutine.

  2. Variable a is owned by scope main.

  3. In the call, ownership is transferred from scope main to scope sub, where the variable is named s.

  4. When scope sub ends, s goes out of scope and is dropped.

  5. Due to the ownership transfer, variable a can no longer be used after calling sub.

With respect to the second definition, the behavior can be explained in the following way:

  1. Variable a owns value Stats { score: 8 }.

  2. In the call, ownership of the value is transferred from a to s.

  3. When s goes out of scope at the end of sub, the value is dropped.

  4. Due to the ownership transfer, variable a can no longer be used after calling sub.
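Both explanations agree that the value is dropped at the end of sub. A small sketch can make that point observable. The Drop implementation and the DROPS counter below are my additions for illustration only and are not part of the original example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Illustration only: counts how many `Stats` values have been dropped so far.
static DROPS: AtomicUsize = AtomicUsize::new(0);

struct Stats { score: u32 }

impl Drop for Stats {
    // Runs when the owner of a `Stats` value goes out of scope.
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
        println!("dropping Stats with score {}", self.score);
    }
}

fn sub(mut s: Stats) {
    s.score += 1;
} // `s` goes out of scope here, so the value is dropped here

fn main() {
    let a = Stats { score: 8 };
    println!("before sub");
    sub(a); // ownership moves into `sub`
    println!("after sub");
    // Nothing is dropped at the end of `main`: `a` no longer owns a value.
}
```

The drop message appears between "before sub" and "after sub", and it reports a score of 9, so the drop happens inside sub and only once.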

So far, the definitions seem compatible. But I think I found a counterexample where the second definition fits and the first one does not.

A counterexample

#[derive(Debug)]
struct Stats {
    score: u8,
}

fn main() {
    let a = Stats { score: 8 };
    let b = a;
    println!("{:?}", a);
}

This example also triggers an error:

error[E0382]: borrow of moved value: `a`
 --> src/main.rs:9:22
  |
7 |     let a = Stats{ score: 8 };
  |         - move occurs because `a` has type `Stats`, which does not implement the `Copy` trait
8 |     let b = a;
  |             - value moved here
9 |     println!("{:?}", a);
  |                      ^ value borrowed here after move

For more information about this error, try `rustc --explain E0382`.

The second definition can be applied in the following way:

  1. Variable a owns value Stats { score: 8 }.

  2. Ownership of the value is transferred from a to b.

  3. Since a is no longer the owner, it cannot be used to access the Stats instance, and the compiler reports an error.

The first definition offers no proper explanation:

  1. Let main be the scope of the main function.

  2. Variable a is owned by main. Variable b is owned by main.

  3. No ownership transfer occurs, because both variables belong to the same scope. The definition therefore gives no reason why a can no longer be used.
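A related sketch, under the same assumptions as the counterexample, shows that a value can outlive the block in which its first variable was declared. The function name moved_out_of_inner_scope is hypothetical. I read this as further support for tying ownership to variables and not to scopes.

```rust
struct Stats {
    score: u8,
}

// Hypothetical helper: moves a value out of an inner block before that block ends.
fn moved_out_of_inner_scope() -> Stats {
    let b;
    {
        let a = Stats { score: 8 };
        b = a; // the value moves from `a` to `b`
    } // `a` goes out of scope here, but nothing is dropped: `a` owns nothing anymore
    b // the value is still alive and is handed to the caller
}

fn main() {
    let b = moved_out_of_inner_scope();
    println!("{}", b.score); // prints: 8
}
```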

Conclusion

I think only the second definition should be used to explain ownership in Rust. To summarize ownership in full, this is the terminology I use to explain Rust concepts:

ownership

Each value is owned by exactly one variable.

ownership transfer

An ownership transfer occurs if a non-copyable value is assigned to a variable or supplied as a non-reference argument in a function call.

borrowing

A reference to a value is supplied as an argument in a function call. Ownership is not transferred: the caller remains the owner, and the function may use the value only through the reference.

drop

Whenever the owner of a value goes out of scope, the value is dropped. A drop usually means deallocation, but it does not have to. For example, when an Rc reaches zero strong references while weak references remain, the value is dropped but its allocation is not freed yet. Be aware that Rust’s memory safety model does not guarantee the absence of memory leaks: a Drop implementation is not guaranteed to run, for example for values caught in an Rc reference cycle or passed to std::mem::forget.
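The terms above can be sketched in one small program. The function names bump, consume, and rc_after_drop are hypothetical and chosen only for this illustration. The Rc part relies on Rc::downgrade, Weak::strong_count, and Weak::upgrade from the standard library.

```rust
use std::rc::{Rc, Weak};

struct Stats { score: u32 }

// Borrowing: `s` is a reference, the caller keeps ownership.
fn bump(s: &mut Stats) {
    s.score += 1;
}

// Ownership transfer: the value moves into `s` and is dropped when `consume` returns.
fn consume(s: Stats) -> u32 {
    s.score
}

// Drop without deallocation: returns the strong count and whether the value
// is gone after the last strong reference has been dropped.
fn rc_after_drop() -> (usize, bool) {
    let strong: Rc<Stats> = Rc::new(Stats { score: 8 });
    let weak: Weak<Stats> = Rc::downgrade(&strong);
    drop(strong); // last strong reference: the `Stats` value is dropped here
    // `weak` still points at the allocation, but the value can no longer be reached.
    (weak.strong_count(), weak.upgrade().is_none())
}

fn main() {
    let mut a = Stats { score: 8 };
    bump(&mut a);           // borrow: `a` stays the owner
    let score = consume(a); // move: `a` cannot be used from here on
    println!("score = {}", score); // prints: score = 9

    let (strong, gone) = rc_after_drop();
    println!("strong = {}, value gone = {}", strong, gone); // prints: strong = 0, value gone = true
}
```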

The most important reason why ownership is useful is that we can pinpoint where a value will be dropped without any need for explicit free() calls. Accidental double-frees and use-after-frees are ruled out at compile time.