If you read the Rust naming guidelines, you are presented with this table:

Prefix  Cost       Ownership
as_     Free       borrowed -> borrowed
to_     Expensive  borrowed -> borrowed
                   borrowed -> owned (non-Copy types)
                   owned -> owned (Copy types)
into_   Variable   owned -> owned (non-Copy types)

But what does the ownership notation mean, and what does the cost mean? Can I come up with some rusty code that explains this to me, a common man with only a fragile understanding of the vast and complex type system of the most powerful programming language ever created? CHALLENGE ACCEPTED!

Ownership

For me to start understanding this, I need to lay out what ownership really is. Again: for the layman (not the professor-type Rust linguist magician). What is ownership exactly? Taken directly from the docs:

  • Each value in Rust has an owner
  • There can only be one owner at a time
  • When the owner goes out of scope, the value will be dropped
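To make the three rules a bit more concrete, here is a small sketch. The Noisy type and the pass_through function are made up for this illustration (they are not from the docs); Noisy just prints a line when it is dropped, so you can see when a value's owner goes out of scope.

```rust
// Hypothetical type for illustration: it prints a line when it is dropped.
struct Noisy(&'static str);

impl Drop for Noisy {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

// Takes ownership of `n` and hands it straight back to the caller.
fn pass_through(n: Noisy) -> Noisy {
    n
}

fn main() {
    let a = Noisy("a");      // `a` is the owner of the value
    let b = pass_through(a); // ownership moves into the function and back out into `b`
    // println!("{}", a.0);  // would not compile: `a` was moved, only one owner at a time

    {
        let _c = Noisy("c"); // `_c` owns this value inside the inner scope
    }                        // `_c` goes out of scope here, so its value is dropped

    println!("still alive: {}", b.0);
}                            // `b` goes out of scope here, so its value is dropped
```

The commented-out line is the interesting one: once the value has moved to a new owner, the old name can no longer be used.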

Borrow vs owned vs non-copy vs copy - a journey

Owned

An owned value is simple, so let's begin with that. Let's freaking OWN an i32:

let my_own_int = 42;

There we are. We now own an i32. Let's take over the world :)

Borrow (Borrowing and borrowed)

This leads us to the first question that I have: what is borrowing and what is borrowed? How is it described in Rust? According to the docs,

> Creating a reference is borrowing

YES, EASY. I can describe that with a bit of code:

let my_own_int = 42;
let borrowed_int = &my_own_int; // creating this reference is called borrowing, and you are creating a borrowed value

So by creating a reference, we are borrowing a value, and thus we have created a borrowed variable, in our case borrowed_int contains the borrowed value: CONTAINS, the variable is not the borrowed reference, it just contains it.
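A small sketch of what that buys us, assuming a made-up helper function add_one (not from the docs): the function only borrows the value, so the owner can keep using it afterwards.

```rust
// Hypothetical helper: it only borrows the i32, it never owns it.
fn add_one(value: &i32) -> i32 {
    *value + 1
}

fn main() {
    let my_own_int = 42;
    let borrowed_int = &my_own_int; // borrowing: we create a reference

    let result = add_one(borrowed_int);

    // The owner, the reference and the result can all be used here,
    // because borrowing did not move the value anywhere.
    println!("{} {} {}", my_own_int, borrowed_int, result);
}
```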

Copy and non-copy

TLDR

  1. copy: when a struct is marked with Copy (which also requires Clone)
  2. non-copy: when a struct isn't marked with Copy (it may still be marked with Clone)

Non-TLDR

When I started my career in IT, we did copy a lot on the Xerox machine. A marvelous piece of machinery! Nowadays we don't use this piece of magic so much, which I like because … well … environment, climate, trees etc. I like those things a lot! But I still use copying when I program: it makes programming easier, and I think it is actually this type of copying they refer to in the table above: types that can or can't be copied by copying them bit by bit.

In Rust, you make your own struct a copy type by marking it with the Clone and Copy traits (primitives such as i32 and f64 are already copy types), like this:

#[derive(Copy, Clone, Debug)] // we need Debug, else we can't print the type with {:?}
struct MyType;

This type is a copy type! It makes it easy to assign it to another variable, like this:

let first = MyType;
let second = first;

We can now use first as well as second. This will compile:

println!("{:?}", second);
println!("{:?}", first);

This works simply because first is copied into second: no move is made.

A non-copy type does not implement the Copy marker trait! (It may still implement Clone, but then you have to call clone() yourself.) With such a type the above example will not compile, because a move is made: we move first into second.
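Here is a sketch of the non-copy case. MyMovedType and duplicate are names I made up for the illustration. The type derives Clone but not Copy (it couldn't derive Copy anyway, because String is not a copy type), so assignment moves it and a duplicate has to be requested explicitly.

```rust
// Hypothetical non-copy type: it derives Clone, but not Copy.
#[derive(Clone, Debug)]
struct MyMovedType {
    name: String,
}

// Explicitly duplicates the value; the original stays with its owner.
fn duplicate(original: &MyMovedType) -> MyMovedType {
    original.clone()
}

fn main() {
    let first = MyMovedType { name: String::from("first") };
    let second = first; // a move: `first` is no longer usable

    // println!("{:?}", first); // would not compile: borrow of moved value `first`

    let third = duplicate(&second); // explicit clone, so `second` stays usable
    println!("{:?} {:?}", second, third);
}
```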

So, a TLDR on the first part

  1. Borrowing: when we are creating a reference
  2. Borrowed: a variable that contains a reference to an owned value
  3. Copy: when a struct is marked with Copy (which also requires Clone)
  4. Non-copy: when a struct isn't marked with Copy (it may still be marked with Clone)

Cost

The last part: what does cost mean in the table, especially when we are talking about owned, borrowed, copy and non-copy? I will give it my best shot:

as_

Will always just give a view into something. The functions operate on a borrowed self, &self, and return a borrowed value. For example str::as_bytes(): it takes a borrowed str (the self is &self), and returns the underlying byte array, which is borrowed: &[u8]. We are just viewing into the underlying structure, and we are not creating new owned objects or doing expensive checks. I think of this as slicing into a string, where you just want some part of it.

An example: borrowed -> borrowed

A somewhat simple example would look like this

struct AnObject {
    pub val: String,
}

impl AnObject {
    fn as_bytes(&self) -> &[u8] {   
        self.val.as_bytes()
    }
}

Here we just return as_bytes() on the underlying String, because this just returns a reference to the underlying data structure of the String that is … well … a vector of bytes. Because we return a reference, we are not creating something new, and we are not validating something: we are just returning a reference, which is free.

Here we also see that as_bytes is just a function accepting a borrowed value (&self) and returning a borrowed value (&[u8]).
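One way to convince yourself that nothing is copied is to compare addresses. This is just a sketch that repeats the AnObject type from above so it can run on its own; the pointer comparison in main is my own addition.

```rust
struct AnObject {
    pub val: String,
}

impl AnObject {
    // as_ borrowed (&self) -> borrowed (&[u8])
    fn as_bytes(&self) -> &[u8] {
        self.val.as_bytes()
    }
}

fn main() {
    let obj = AnObject { val: String::from("hello") };
    let view = obj.as_bytes();

    // The view starts at the same address as the String's own buffer:
    // no bytes were copied, we only got a window onto existing memory.
    assert_eq!(view.as_ptr(), obj.val.as_ptr());

    // The object is still fully usable, we only borrowed from it.
    println!("{:?} {}", view, obj.val);
}
```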

to_

This is always expensive, even if we take in a &self and return a borrowed value. There might be an expensive UTF-8 check, a traversal of all the bytes, or a conversion. It typically involves copying stuff, creating new stuff from borrowed stuff (that is: from borrowed to owned), and validating stuff, which is expensive. Going from owned to owned for copy types (where self is not a borrowed self, so calling the method copies self) does sound cheap for an f64, and for simple types it might be, but it still involves copying and creating a new value, which is not cheap for complex types. It is still more expensive than borrowing, which I think is why the table labels it expensive compared to just borrowing.

It is common that we stay at the same level of abstraction: that means that we convert from &str to a String. We typically don't go from a String to a range of bytes. I do this in my examples, because I want to emphasize the fact that it is still technically okay to do it, but it is not the typical case :)
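For comparison, this is what the typical case looks like with the standard library: both calls start with a &str and end with a String, so we stay on the string level. The shout function is a made-up wrapper for the illustration; to_owned and to_uppercase are the real str methods.

```rust
// Hypothetical wrapper: &str in, String out, same level of abstraction.
fn shout(input: &str) -> String {
    // to_uppercase walks every char and allocates a new String
    input.to_uppercase()
}

fn main() {
    let borrowed: &str = "hello";

    // borrowed -> owned: allocates a new String and copies the bytes over
    let owned: String = borrowed.to_owned();
    let loud = shout(borrowed);

    println!("{} {} {}", borrowed, owned, loud);
}
```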

An example: borrowed -> borrowed

I have made up an example of something I call guarded bytes … it means that every char in the string must fall within the range from c to v. This is to show that to_ often contains something that could be expensive: the bigger the string, the bigger the loop. This returns None if the string is flagged as not valid: that is, if a given char falls outside that range. It is implemented for a struct that simply has a String as its underlying data structure.

Note that we take a borrowed self as input, &self, and return a reference to the underlying byte vector, &[u8], thus making it borrowed -> borrowed. The expensive part is the for loop.

struct AnObject {
    pub val: String,
}

impl AnObject {
    // to_ borrowed (&self) -> borrowed (&[u8])
    fn to_guarded_bytes(&self) -> Option<&[u8]> {
        // This is really a dumb example, but it kind of proves the point: we
        // expose the bytes of the underlying String. The input and output
        // smell like the as_ above, which is free, but this really isn't,
        // because we are doing some expensive validation first.
        let mut is_valid = true;

        for s in self.val.chars() {
            // not valid as soon as one char falls outside the range c to v
            if s < 'c' || s > 'v' {
                is_valid = false;
                break;
            }
        }

        if is_valid {
            return Some(self.val.as_bytes());
        }

        None
    }
}
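The same guard can be written more compactly with an iterator. This is only a sketch, and it assumes the rule is the one described in the text: every char must lie between c and v (both included), otherwise we get None. It still walks the string, so the cost is the same.

```rust
struct AnObject {
    pub val: String,
}

impl AnObject {
    // to_ borrowed (&self) -> borrowed (&[u8])
    fn to_guarded_bytes(&self) -> Option<&[u8]> {
        // `all` stops at the first char outside the range, like the `break` in the loop
        let all_in_range = self.val.chars().all(|c| ('c'..='v').contains(&c));

        if all_in_range {
            Some(self.val.as_bytes())
        } else {
            None
        }
    }
}

fn main() {
    let valid = AnObject { val: String::from("hello") };
    let invalid = AnObject { val: String::from("hello world") }; // the space is out of range

    println!("{:?}", valid.to_guarded_bytes());
    println!("{:?}", invalid.to_guarded_bytes());
}
```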

An example: borrowed -> owned (non-copy type)

Here we build on top of the previous example, but for a non-copy type. It is still expensive: we have the for loop from the guarded call. Neither of my structs is a copy type, and the method returns a new object that is owned.


struct AnObject {
    pub val: String,
}

struct TheOtherObject {
    pub val: Vec<u8>,
}

impl AnObject {
    fn as_bytes(&self) -> &[u8] {   
        self.val.as_bytes()
    }

    fn to_guarded_bytes(&self) -> Option<&[u8]> {
        let mut is_valid = true;

        for s in self.val.chars() {
            if s < 'c' || s > 'v' {
                is_valid = false;
                break;
            }
        }

        if is_valid {
            return Some(self.val.as_bytes());
        }

        None
    }

    // to_ borrowed (&self) -> owned (TheOtherObject) (non-copy types)
    fn to_guarded_object(&self) -> Option<TheOtherObject> {
        // This performs our previous check and, if everything is fine,
        // converts our object to the new object by simply creating it and
        // making some more to_ calls. Really expensive, but no copy is made
        // of `self`.
        let guarded_bytes = self.to_guarded_bytes();

        match guarded_bytes {
            Some(bytes) => {
                Some(TheOtherObject {val: bytes.to_vec()})
            },
            None => None
        }
    }
}
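A sketch of how this can be used, and a shorter way to write the match with Option::map. To keep the sketch runnable on its own, it repeats both structs and uses a compact version of the guard that assumes the rule from the text (every char must lie between c and v).

```rust
struct AnObject {
    pub val: String,
}

struct TheOtherObject {
    pub val: Vec<u8>,
}

impl AnObject {
    // Compact guard, assuming the rule is "every char lies in c..=v".
    fn to_guarded_bytes(&self) -> Option<&[u8]> {
        if self.val.chars().all(|c| ('c'..='v').contains(&c)) {
            Some(self.val.as_bytes())
        } else {
            None
        }
    }

    // to_ borrowed (&self) -> owned (TheOtherObject)
    fn to_guarded_object(&self) -> Option<TheOtherObject> {
        // `map` only runs for Some, so it replaces the explicit match
        self.to_guarded_bytes()
            .map(|bytes| TheOtherObject { val: bytes.to_vec() })
    }
}

fn main() {
    let obj = AnObject { val: String::from("hello") };
    let other = obj.to_guarded_object();

    // `obj` was only borrowed, so it is still usable next to the new owned object.
    println!("{}", obj.val);
    println!("{:?}", other.map(|o| o.val));
}
```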

An example: owned -> owned (copy type)

This is where it gets interesting: we have a copy type, our CopyObject. The method takes a self, thus making a copy of itself, because the type implements Copy and Clone. Then it returns another object that contains a copy of its underlying data structure.

#[derive(Copy, Clone, Debug)]
struct CopyObject<'a> {
    pub val: &'a str,
}

struct TheOtherObject {
    pub val: Vec<u8>,
}

impl<'a> CopyObject<'a> {

    // to_ owned (self) -> owned (TheOtherObject) (copy types)
    fn to_object(self) -> TheOtherObject {  
        // Because CopyObject is Copy, taking self by value does not consume
        // the caller's object: it is copied instead. (Nothing would be
        // consumed with &self either, and yes, we could use &self here, but
        // that would defeat the purpose of the example.) The copy itself is
        // more expensive than just taking a reference, and on top of that we
        // do a costly to_vec conversion of our free as_ view.

        TheOtherObject { val: self.val.as_bytes().to_vec()}
    }
}

The special case here is that we can actually use the initial object afterwards, like this:

fn main() {
    let cp = CopyObject {val: "hello world"};
    let copied = cp.to_object();
    
    println!("{}", cp.val);
    println!("{:?}", copied.val);
}

I can use both cp and copied, because self is copied when to_object is called. The potentially expensive part here is the copy (plus, in this example, the to_vec call). This isn't possible with the non-copy example from before: if CopyObject didn't derive Copy and Clone, we couldn't do this. Expensive indeed!

into_

The cost of into_ is variable. Check the docs if you want to be sure what cost it has. Sometimes calling into_ doesn't cost a thing, and sometimes it does. It will always expose the underlying data structure, as with as_, but the difference is that it also hands over ownership to the new object it creates, thus it takes self as parameter, and not &self. It has to be non-copy, because you are going from something INTO something different: the original is consumed. It also decreases abstraction, as with as_, so for example you can go from a String to bytes. It exposes and gives you the underlying data structure.

An example: owned -> owned (non-copy type)

This example returns the underlying bytes for the string, and in the process it also takes ownership of the String, thus allowing you to only use the bytes afterwards. The extraction of the bytes is free, so in this particular example it isn't expensive to do this.

struct AnObject {
    pub val: String,
}

impl AnObject {
    // into_ owned -> owned (non-copy types)
    fn into_bytes(self) -> Vec<u8> {
        // Because the underlying data structure of a String is a Vec<u8>,
        // into_bytes just returns that underlying data structure, really
        // making this a dumb proxy.
        self.val.into_bytes()
    }
}
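A sketch of what happens at the call site. It repeats AnObject so it can run on its own, and the pointer comparison is my own addition to show that the Vec we get back reuses the String's buffer instead of copying it.

```rust
struct AnObject {
    pub val: String,
}

impl AnObject {
    // into_ owned -> owned (non-copy types)
    fn into_bytes(self) -> Vec<u8> {
        self.val.into_bytes()
    }
}

fn main() {
    let obj = AnObject { val: String::from("hello") };
    let buffer_before = obj.val.as_ptr();

    let bytes = obj.into_bytes(); // `obj` is consumed here

    // println!("{}", obj.val); // would not compile: `obj` was moved

    // Same heap buffer as before: ownership changed hands, no bytes were copied.
    assert_eq!(buffer_before, bytes.as_ptr());
    println!("{:?}", bytes);
}
```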

Some into_ calls are expensive. The docs have a pretty good example (BufWriter::into_inner) where the underlying writer is returned, calling a flush on the writer beforehand. This is potentially an expensive thing to do.
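A minimal sketch of that case, using a Vec<u8> as the underlying writer so it runs without touching the file system. The function name write_and_unwrap is made up; BufWriter::new, write_all and into_inner are the real std::io calls.

```rust
use std::io::{BufWriter, Write};

// Hypothetical helper: writes through a buffer, then takes the inner writer back out.
fn write_and_unwrap(data: &[u8]) -> Vec<u8> {
    let mut writer = BufWriter::new(Vec::new());
    writer
        .write_all(data)
        .expect("writing to an in-memory Vec should not fail");

    // into_inner consumes the BufWriter. Before it hands back the Vec, it has
    // to flush whatever is still sitting in the buffer. That flush is the
    // part that can be expensive (think of a file or a socket instead of a Vec).
    writer
        .into_inner()
        .expect("flushing to an in-memory Vec should not fail")
}

fn main() {
    let inner = write_and_unwrap(b"hello");
    println!("{:?}", inner);
}
```

So the name is the same into_ as in String::into_bytes, but here there is real work hiding behind it.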

The small things

Please note that I don't follow the normal convention, because my to_ methods don't stay at the same level of abstraction, that is: I am not returning a String that is guarded, but a range of bytes. I still accept my own code here, because it is not a rule set in stone. In the linked doc, it says:

> Conversions prefixed to_, on the other hand, typically stay at the same level of abstraction but do some work to change from one representation to another.

So I am still allowed to do this kind of to_.

General assumptions

My rule of thumb is going to be (when reading code): as_ is free, the others are expensive. If you want to know if an into_ is free, you need to read the docs carefully.

Abstraction levels

as_ and into_ decrease abstraction, either exposing the underlying data structure or giving a view of it. to_ typically stays on the same level, but changes the representation.

Code

The code for the above examples can be found here:

Did you know that an as_ often also has an into_? as_ gives you a view into the underlying data, and into_ gives you ownership. So basically as_bytes and into_bytes on a String are free as in beer.

TLDR

  1. as_ cost is NOTHING, gives you a view of the underlying data structure
  2. to_ cost is EXPENSIVE, stays on the same level. Converts stuff
  3. into_ cost is EXPENSIVE, until you read the docs. Exposes the underlying data structure, giving you ownership