What is a reference?

An alias (an alternate name) for an object.

References are frequently used for pass-by-reference:

Here i and j are aliases for main’s x and y respectively. In other words, i is x — not a pointer to x , nor a copy of x , but x itself. Anything you do to i gets done to x , and vice versa. This includes taking the address of it. The values of &i and &x are identical.

That’s how you should think of references as a programmer. Now, at the risk of confusing you by giving you a different perspective, here’s how references are implemented. Underneath it all, a reference i to object x is typically the machine address of the object x . But when the programmer says i++ , the compiler generates code that increments x . In particular, the address bits that the compiler uses to find x are not changed. A C programmer will think of this as if you used the C style pass-by-pointer, with the syntactic variant of (1) moving the & from the caller into the callee, and (2) eliminating the * s. In other words, a C programmer will think of i as a macro for (*p) , where p is a pointer to x (e.g., the compiler automatically dereferences the underlying pointer; i++ is changed to (*p)++ ; i = 7 is automatically changed to *p = 7 ).

Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object, just with another name. It is neither a pointer to the object, nor a copy of the object. It is the object. There is no C++ syntax that lets you operate on the reference itself separate from the object to which it refers.

What happens if you assign to a reference?

You change the state of the referent (the referent is the object to which the reference refers).

Remember: the reference is the referent, so changing the reference changes the state of the referent. In compiler writer lingo, a reference is an “lvalue” (something that can appear on the left hand side of an assignment operator ).

What happens if you return a reference?

The function call can appear on the left hand side of an assignment operator .

This ability may seem strange at first. For example, no one thinks the expression f() = 7 makes sense. Yet, if a is an object of class Array , most people think that a[i] = 7 makes sense even though a[i] is really just a function call in disguise (it calls Array::operator[](int) , which is the subscript operator for class Array ).

What does object.method1().method2() mean?

It chains these method calls, which is why this is called method chaining .

The first thing that gets executed is object.method1() . This returns some object, which might be a reference to object (i.e., method1() might end with return *this; ), or it might be some other object. Let’s call the returned object objectB . Then objectB becomes the this object of method2() .

The most common use of method chaining is in the iostream library. E.g., cout << x << y works because cout << x is a function that returns cout .

A less common, but still rather slick, use for method chaining is in the Named Parameter Idiom .

How can you reseat a reference to make it refer to a different object?

You can’t separate the reference from the referent.

Unlike a pointer, once a reference is bound to an object, it can not be “reseated” to another object. The reference isn’t a separate object. It has no identity. Taking the address of a reference gives you the address of the referent. Remember: the reference is its referent.

In that sense, a reference is similar to a const pointer such as int* const p (as opposed to a pointer to const such as const int* p ). But please don’t confuse references with pointers; they’re very different from the programmer’s standpoint.

Why does C++ have both pointers and references?

C++ inherited pointers from C, so they couldn’t be removed without causing serious compatibility problems. References are useful for several things, but the direct reason they were introduced in C++ was to support operator overloading. For example:

More generally, if you want to have both the functionality of pointers and the functionality of references, you need either two different types (as in C++) or two different sets of operations on a single type. For example, with a single type you need both an operation to assign to the object referred to and an operation to assign to the reference/pointer. This can be done using separate operators (as in Simula). For example:

Alternatively, you could rely on type checking (overloading). For example:

When should I use references, and when should I use pointers?

Use references when you can, and pointers when you have to.

References are usually preferred over pointers whenever you don’t need “reseating” . This usually means that references are most useful in a class’s public interface. References typically appear on the skin of an object, and pointers on the inside.

The exception to the above is where a function’s parameter or return value needs a “sentinel” reference — a reference that does not refer to an object. This is usually best done by returning/taking a pointer, and giving the nullptr value this special significance ( references must always alias objects, not a dereferenced null pointer ).

Note: Old line C programmers sometimes don’t like references since they provide reference semantics that isn’t explicit in the caller’s code. After some C++ experience, however, one quickly realizes this is a form of information hiding, which is an asset rather than a liability. E.g., programmers should write code in the language of the problem rather than the language of the machine.

What does it mean that a reference must refer to an object, not a dereferenced null pointer?

It means this is illegal:

NOTE: Please do not email us saying the above works on your particular version of your particular compiler. It’s still illegal. The C++ language, as defined by the C++ standard, says it’s illegal; that makes it illegal. The C++ standard does not require a diagnostic for this particular error, which means your particular compiler is not obliged to notice that p is nullptr or to give an error message, but it’s still illegal. The C++ language also does not require the compiler to generate code that would blow up at runtime. In fact, your particular version of your particular compiler may, or may not, generate code that you think makes sense if you do the above. But that’s the point: since the compiler is not required to generate sensible code, you don’t know what the compiler will do. So please do not email us saying your particular compiler generates good code; we don’t care. It’s still illegal. See the C++ standard for more, for example, C++ 2014 section 8.3.2 [dcl.ref] p5.

By way of example and not by way of limitation, some compilers do optimize nullptr tests since they “know” all references refer to real objects — that references are never (legally) a dereferenced nullptr . That can cause a compiler to optimize away the following test:

As stated above, this is just an example of the sort of thing your compiler might do based on the language rule that says a reference must refer to a valid object. Do not limit your thinking to the above example; the message of this FAQ is that the compiler is not required to do something sensible if you violate the rules. So don’t violate the rules.

Patient: “Doctor, doctor, my eye hurts when I poke it with a spoon.” Doctor: “Don’t poke it, then.”

What is a handle to an object? Is it a pointer? Is it a reference? Is it a pointer-to-a-pointer? What is it?

The term handle is used to mean any technique that lets you get to another object — a generalized pseudo-pointer. The term is (intentionally) ambiguous and vague.

Ambiguity is actually an asset in certain cases. For example, during early design you might not be ready to commit to a specific representation for the handles. You might not be sure whether you’ll want simple pointers vs. references vs. pointers-to-pointers vs. references-to-pointers vs. integer indices into an array vs. strings (or other key) that can be looked up in a hash-table (or other data structure) vs. database keys vs. some other technique. If you merely know that you’ll need some sort of thingy that will uniquely identify and get to an object, you call the thingy a Handle.

So if your ultimate goal is to enable a glop of code to uniquely identify/look-up a specific object of some class Fred , you need to pass a Fred handle into that glop of code. The handle might be a string that can be used as a key in some well-known lookup table (e.g., a key in a std::map<std::string,Fred> or a std::map<std::string,Fred*> ), or it might be an integer that would be an index into some well-known array (e.g., Fred* array = new Fred[maxNumFreds] ), or it might be a simple Fred* , or it might be something else.

Novices often think in terms of pointers, but in reality there are downside risks to using raw pointers. E.g., what if the Fred object needs to move? How do we know when it’s safe to delete the Fred objects? What if the Fred object needs to (temporarily) get serialized on disk? etc., etc. Most of the time we add more layers of indirection to manage situations like these. For example, the handles might be Fred** , where the pointed-to Fred* pointers are guaranteed to never move but when the Fred objects need to move, you just update the pointed-to Fred* pointers. Or you make the handle an integer then have the Fred objects (or pointers to the Fred objects) looked up in a table/array/whatever. Or whatever.

The point is that we use the word Handle when we don’t yet know the details of what we’re going to do.

Another time we use the word Handle is when we want to be vague about what we’ve already done (sometimes the term magic cookie is used for this as well, as in, “The software passes around a magic cookie that is used to uniquely identify and locate the appropriate Fred object”). The reason we (sometimes) want to be vague about what we’ve already done is to minimize the ripple effect if/when the specific details/representation of the handle change. E.g., if/when someone changes the handle from a string that is used in a lookup table to an integer that is looked up in an array, we don’t want to go and update a zillion lines of code.

To further ease maintenance if/when the details/representation of a handle changes (or to generally make the code easier to read/write), we often encapsulate the handle in a class. This class often overloads operators operator-> and operator* (since the handle acts like a pointer, it might as well look like a pointer).

Should I use call-by-value or call-by-reference?

(Note: This FAQ needs to be updated for C++11.)

That depends on what you are trying to achieve:

  • If you want to change the object passed, call by reference or use a pointer; e.g., void f(X&); or void f(X*); .
  • If you don’t want to change the object passed and it is big, call by const reference; e.g., void f(const X&); .
  • Otherwise, call by value; e.g. void f(X); .

What does “big” mean? Anything larger than a couple of words.

Why would you want to change an argument? Well, often we have to, but often we have an alternative: produce a new value. Consider:

For a reader, incr2() is likely easier to understand. That is, incr1() is more likely to lead to mistakes and errors. So, we should prefer the style that returns a new value over the one that modifies a value as long as the creation and copy of a new value isn’t expensive.

What if you do want to change the argument, should you use a pointer or use a reference? If passing “not an object” (e.g., a null pointer) is acceptable, using a pointer makes sense. One style is to use a pointer when you want to modify an object because in some contexts that makes it easier to spot that a modification is possible.

Note also that a call of a member function is essentially a call-by-reference on the object, so we often use member functions when we want to modify the value/state of an object.

Why is this not a reference?

Because this was introduced into C++ (really into C with Classes) before references were added. Also, Stroustrup chose this to follow Simula usage, rather than the (later) Smalltalk use of self .

Learn C++

12.3 — Lvalue references

In C++, a reference is an alias for an existing object. Once a reference has been defined, any operation on the reference is applied to the object being referenced.

Key insight

A reference is essentially identical to the object being referenced.

This means we can use a reference to read or modify the object being referenced. Although references might seem silly, useless, or redundant at first, references are used everywhere in C++ (we’ll see examples of this in a few lessons).

You can also create references to functions, though this is done less often.

Modern C++ contains two types of references: lvalue references , and rvalue references . In this chapter, we’ll discuss lvalue references.

Related content

Because we’ll be talking about lvalues and rvalues in this lesson, please review 12.2 -- Value categories (lvalues and rvalues) if you need a refresher on these terms before proceeding.

Rvalue references are covered in the chapter on move semantics ( chapter 22 ).

Lvalue reference types

An lvalue reference (commonly just called a reference since prior to C++11 there was only one type of reference) acts as an alias for an existing lvalue (such as a variable).

To declare an lvalue reference type, we use an ampersand (&) in the type declaration:

Lvalue reference variables

One of the things we can do with an lvalue reference type is create an lvalue reference variable. An lvalue reference variable is a variable that acts as a reference to an lvalue (usually another variable).

To create an lvalue reference variable, we simply define a variable with an lvalue reference type:

In the above example, the type int& defines ref as an lvalue reference to an int, which we then initialize with lvalue expression x . Thereafter, ref and x can be used synonymously. This program thus prints:

From the compiler’s perspective, it doesn’t matter whether the ampersand is “attached” to the type name ( int& ref ) or the variable’s name ( int &ref ), and which you choose is a matter of style. Modern C++ programmers tend to prefer attaching the ampersand to the type, as it makes clearer that the reference is part of the type information, not the identifier.

Best practice

When defining a reference, place the ampersand next to the type (not the reference variable’s name).

For advanced readers

For those of you already familiar with pointers, the ampersand in this context does not mean “address of”, it means “lvalue reference to”.

Modifying values through an lvalue reference

In the above example, we showed that we can use a reference to read the value of the object being referenced. We can also use a reference to modify the value of the object being referenced:

This code prints:

In the above example, ref is an alias for x , so we are able to change the value of x through either x or ref .

Initialization of lvalue references

Much like constants, all references must be initialized.

When a reference is initialized with an object (or function), we say it is bound to that object (or function). The process by which such a reference is bound is called reference binding . The object (or function) being referenced is sometimes called the referent .

Lvalue references must be bound to a modifiable lvalue.

Lvalue references can’t be bound to non-modifiable lvalues or rvalues (otherwise you’d be able to change those values through the reference, which would be a violation of their const-ness). For this reason, lvalue references are occasionally called lvalue references to non-const (sometimes shortened to non-const reference ).

In most cases, the type of the reference must match the type of the referent (there are some exceptions to this rule that we’ll discuss when we get into inheritance):

Lvalue references to void are disallowed (what would be the point?).

References can’t be reseated (changed to refer to another object)

Once initialized, a reference in C++ cannot be reseated , meaning it cannot be changed to reference another object.

New C++ programmers often try to reseat a reference by using assignment to provide the reference with another variable to reference. This will compile and run -- but not function as expected. Consider the following program:

Perhaps surprisingly, this prints:

When a reference is evaluated in an expression, it resolves to the object it’s referencing. So ref = y doesn’t change ref to now reference y . Rather, because ref is an alias for x , the expression evaluates as if it was written x = y -- and since y evaluates to value 6 , x is assigned the value 6 .

Lvalue reference scope and duration

Reference variables follow the same scoping and duration rules that normal variables do:

References and referents have independent lifetimes

With one exception (that we’ll cover next lesson), the lifetime of a reference and the lifetime of its referent are independent. In other words, both of the following are true:

  • A reference can be destroyed before the object it is referencing.
  • The object being referenced can be destroyed before the reference.

When a reference is destroyed before the referent, the referent is not impacted. The following program demonstrates this:

The above prints:

When ref dies, variable x carries on as normal, blissfully unaware that a reference to it has been destroyed.

Dangling references

When an object being referenced is destroyed before a reference to it, the reference is left referencing an object that no longer exists. Such a reference is called a dangling reference . Accessing a dangling reference leads to undefined behavior.

Dangling references are fairly easy to avoid, but we’ll show a case where this can happen in practice in lesson 12.12 -- Return by reference and return by address .

References aren’t objects

Perhaps surprisingly, references are not objects in C++. A reference is not required to exist or occupy storage. If possible, the compiler will optimize references away by replacing all occurrences of a reference with the referent. However, this isn’t always possible, and in such cases, references may require storage.

This also means that the term “reference variable” is a bit of a misnomer, as variables are objects with a name, and references aren’t objects.

Because references aren’t objects, they can’t be used anywhere an object is required (e.g. you can’t have a reference to a reference, since an lvalue reference must reference an identifiable object). In cases where you need a reference that is an object or a reference that can be reseated, std::reference_wrapper (which we cover in lesson 23.3 -- Aggregation ) provides a solution.

As an aside…

Consider the following variables:

Because ref2 (a reference) is initialized with ref1 (a reference), you might be tempted to conclude that ref2 is a reference to a reference. It is not. Because ref1 is a reference to var , when used in an expression (such as an initializer), ref1 evaluates to var . So ref2 is just a normal lvalue reference (as indicated by its type int& ), bound to var .

A reference to a reference (to an int ) would have syntax int&& -- but since C++ doesn’t support references to references, this syntax was repurposed in C++11 to indicate an rvalue reference (which we cover in lesson 22.2 -- R-value references ).

Question #1

Determine what values the following program prints by yourself (do not compile the program).

Because ref is bound to x , x and ref are synonymous, so they will always print the same value. The line ref = y assigns the value of y (2) to ref -- it does not change ref to reference y . The subsequent line y = 3 only changes y .

C++ Return Reference

In C++, a reference is an alias (an alternate name) for an existing variable. It is not a copy: the reference and the original variable name the same object, so modifying one modifies the other. Compilers often implement a reference using an address, much like a pointer, but the language treats the reference as the object itself. A function that returns a reference gives the caller access to the original object rather than to a copy of its value, which is not possible with a normal return by value.

C++ Return by Reference

Introduction.

A reference is declared with the ampersand symbol ( & ) in C++ and is an alias for the original variable, not a copy of it. Binding a reference copies nothing: the reference names the same object as the original variable, so any change made through the reference is reflected in the original variable. For example, if b is a reference to a , then changing the value of b also changes the value of a , because b has the same address as a .

The following syntax is used to create a reference variable,

The & denotes that reference_var is a reference, and original_var is the variable to which it is bound. A reference must be initialized when it is declared. Therefore, the declaration int& variable; will result in an error.

The C++ return by reference concept is used to create a function that returns a reference to the original variable.

The syntax of the function which returns a reference in C++ is,

  • The data_type& before the function name is the return type: the function returns a reference to an object of type data_type .
  • The function_name is the name of the function.
  • The parameter data_type& original_variable is a reference named original_variable , bound to the variable passed as an argument to the function.
  • The statement return original_variable; returns a reference to that same variable and marks the end of the function.

In the above syntax, a normal variable is passed to the function by reference, and the function returns a reference to that same variable. No new object is created. It is important to note that a reference is used both for the parameter and for the return type.

Consider the following case,

Here original_variable is a normal variable. On passing this to the function, the following occurs,

The parameter is bound to original_variable , and the function returns a reference to it.

The algorithm for the C++ return reference function is as follows,

  • Declare a function with return type and parameter as reference variables.
  • Perform the required actions with the reference variable inside the function.
  • Return the reference variable.

For example, the following piece of code declares and defines a function that returns a reference of type int ,

  • Call the function with a normal variable.

For example, the following piece of code illustrates the calling of the function declared in the above code snippet,

Examples of C++ Return by Reference

Let us see different examples to understand the working of C++ return by reference,

Illustration I

The following is a basic example that utilizes the concept of incrementing a variable to understand the concept of return by reference in C++. The example also illustrates the difference between returning the reference variable itself and returning a copy of the reference variable,

Copy of reference variable

  • In the above example, the int copy_number stores the copy of the reference variable returned by the copy() function.
  • This occurs because the copy_number is declared as a normal int variable instead of a reference variable ( int& ).
  • This stores the copy_number in a new location of the memory and this number will be independent of the original variable.
  • Any changes made to the copy_number variable will not be reflected in the original variable.

The returned reference variable

  • The int& reference_number is declared as a reference variable and will store the reference variable returned by the copy() function.
  • When the reference_number is modified, then the original variable referenced by it will also be modified.
  • This is illustrated by incrementing the reference_number and checking the increment of the original variable.

The output for the above example is,

In the output, we can see that even when the copy number is incremented, the original number remains the same.

Illustration II

The following example illustrates the manipulation of address in functions that return a reference and also explains the concept of using a function that returns a reference on the left side of an assignment operator.

This example builds on the previous one.

Address of reference variables

  • The above example is similar to the example in the previous section; in addition, it prints the address of each variable after the function call.
  • We can see that the addresses of variable a and variable b are the same, because b is a reference to a . This is not the case for copies.

Calling function in LHS of operator

  • Since the function return_by_ref() returns a reference to the variable passed as its argument, the function call can be used on the LHS of the assignment operator.
  • The value in the RHS of the assignment operator will be assigned to the returned reference which results in the change of the original variable.

The output of the above example is,

From the output, we can see that the address for the copied variable, c is different from all the other addresses which are referenced from the variable a .

Illustration III: Return by Reference using Global Variable

Functions that return a reference are used throughout the Standard Template Library (STL), for example in element access such as operator[] , so that a stored value can be updated in place.

The following example illustrates a program that updates values in an array using the return reference in C++,

Update a value using return reference

In the above example, a function that returns a reference is used to update the values in the array. The function call setValues(1) is equivalent to original_array[1] , because setValues() returns a reference to that array element, and the value on the right-hand side (RHS) of the assignment is stored in the array.

Illustration IV: Return by Reference using Local Variable

Returning a local variable by reference in C++ is a bug and should never be done. A local variable is destroyed when the function returns, so the returned reference refers to an object that no longer exists (a dangling reference), and using it is undefined behavior.

A function's local variables live on the stack, and that storage is released automatically when the function returns. The danger is therefore not that memory is left unfreed, but that the caller reads or writes storage that has already been released and may since have been reused. Bugs of this kind can also be exploited.

The following program shows an example of returning a local variable by reference using C++. Please remember that these practices must never be followed,

Returning local variable

The above example shows a function that returns a reference to a local variable. The problems this causes are explained below.

The reason for the error in printing local_refernce is explained below,

  • On calling the function, the local variable is created, and a reference to it is returned.
  • A function call stores its local variables on the stack. When the function returns, its stack frame is released and the local variable is destroyed.
  • So when the program tries to print local_refernce , the reference refers to an object that no longer exists. The behavior is undefined: the program may print garbage, crash, or appear to work.

This is not a memory leak: the local variable's storage is released automatically when the function returns. The problem is the opposite one. The reference outlives the object it refers to, and delete cannot fix that, because delete applies only to objects created with new .

The statement delete &new_string is an error for the following reason,

  • The variable new_string is initialized as a copy of the value obtained through the reference returned by the return_local_pointer() function.
  • Since new_string is an ordinary variable, its storage was not allocated with new . The delete operator may be applied only to a pointer obtained from new (memory obtained from malloc() is released with free() , not delete ). The storage of an ordinary variable is managed automatically, so applying delete to its address is undefined behavior.

The following code can be used to understand the above point,

The above code is valid and will not throw any errors, as the space is manually allocated using the new operator.

The following code is an error, as the variable has automatic storage that was not obtained from new ,

The const keyword can be used to make a reference to const, through which the referred-to variable cannot be modified. This technique is commonly used to pass large class or struct objects to functions without copying them, which improves the performance of the program.

As with any reference parameter, the original data is not copied when it is passed to a function by const reference.

A safer alternative to returning a reference to a local variable is to allocate the object dynamically and return a unique_ptr (a unique pointer) that owns it.

  • A unique pointer is a type of smart pointer; the object it owns can persist longer than the function call that created it.
  • The smart pointer wraps a raw pointer. A unique pointer is the sole owner of the object it points to.
  • When a unique pointer is destroyed, its destructor deletes the owned object and frees its memory.
  • The <memory> header has to be included to use the unique pointer in the code.

The following syntax is used to declare a unique pointer ,

  • The data_type is the data type of the unique pointer.
  • The pointer_name denotes the name given to the unique pointer which can be later used to refer to the pointer.
  • The value is the value assigned to the unique pointer

The following program illustrates the usage of the unique pointer to avoid the dangling reference,

The output of the above program is,

We may wonder how a reference relates to a pointer. A reference is typically implemented as an implicit pointer that is dereferenced automatically, which means that both of the following functions have the same effect,

It's just that a reference is easier to use.

Another way to avoid the dangling reference when returning a reference from a function is to make the local variable static. Static local variables are not destroyed when the function returns; they live until the end of the program. Since the static variable still exists after the call, the returned reference remains valid. Note that every call returns a reference to the same object.

The output of the above program will be,

Important Points to Remember While Returning by Reference in C++

  • Unlike ordinary functions that return by value, a function that returns a non-const reference cannot return a literal or a temporary, because such a value is an rvalue and there is no lasting object for the reference to refer to. For example, a function declared as returning int& whose body is return 10; fails to compile with a "cannot bind non-const lvalue reference" error.
  • Never return a reference to a local variable from a function: the local is destroyed when the function returns, so the caller receives a dangling reference, and using it is undefined behavior. Return the object by value instead; this also lets the compiler apply return value optimization (RVO), a technique in which the creation of temporary objects to hold return values is avoided.
  • A function that returns a reference can be used on both sides of the assignment operator. A program can be designed with this in mind, as it avoids the need for a separate variable and can be more efficient.
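
As a small sketch of the points above (element_at and demo_both_sides are hypothetical names), the function below returns a reference to an object that outlives the call, so the call expression can appear on either side of an assignment. The commented-out line shows the kind of function that is rejected by the compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a reference to an element that lives in `v`, not in this function.
int& element_at(std::vector<int>& v, std::size_t i) { return v[i]; }

// int& bad() { return 10; }   // does not compile: an rvalue cannot bind to int&

int demo_both_sides() {
    std::vector<int> v{1, 2, 3};
    element_at(v, 0) = 10;                        // call on the left: writes v[0]
    int copy = element_at(v, 0);                  // call on the right: reads v[0]
    element_at(v, 2) = element_at(v, 0) + copy;   // calls on both sides
    return v[2];
}
```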

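Q: Why does the segmentation fault error occur while working with local references in C++?

A: A segmentation fault occurs when the program tries to access a memory location that it is not allowed to access. A reference to a local variable becomes dangling once the function returns, because the local's storage is released; using that reference is undefined behavior and may crash with a segmentation fault. Returning by value, returning a reference to a static variable, or handing out ownership through the unique pointers explained in the above sections avoids this error.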

Related articles

  • A static variable is initialized only one time through the program. Learn more about static variables in C++ .
  • A pointer is used to store an address in which a variable is present. Learn more about pointers .
  • A constant variable can't be updated after initialization. Learn more about the const keyword .
  • The & symbol must be placed after the return type of the function to specify that the function returns a reference.
  • The variable in which the result of the function is stored can be either a copy of the original variable or a reference to it, depending on how that variable is declared.
  • Be careful when returning a reference to a local variable, as this produces a dangling reference.
  • Smart pointers and static variables can be used to avoid returning a reference to a destroyed object.
  • The Standard Template Library (STL) uses functions that return references to update values in vectors and lists.

reference assignment return value


The new C++ 11 rvalue reference && and why you should start using it


Introduction                      

This is an attempt to explain the new && reference present in the latest versions of compilers as part of their implementation of the new C++11 standard, such as those shipping with Visual Studio 10, 11 and 12, gcc 4.3 and 4.4, or Clang, the beautiful and equally fast (if not faster) open-source alternative to gcc.

Why should you start using it? In short: a nontrivial performance gain. For example, inserts into std::vector (or in fact any array creation) will no longer cost a huge number of allocations and copies.

But before we go into the details of the new C++11 "move semantics", we need to understand the core of the problem. We need to understand why C++ has a performance problem where assignment with = so often results in a useless allocation and copy.

We live in a world where we need to work with a lot of data. We need to store and process it in an effective manner. The problem with C and C++ is that we got used to doing it ineffectively.

In C, the real price of copying via = was small, since pointers are just numbers (containing memory addresses). But with objects the story is different.

How the performance problem started 

I will try to keep this short, but it is important to really understand why the && reference was born.

In C++, working with references became prevalent since they are a safer alternative to pointers and, very importantly, new language features such as automatic calls of constructors and destructors (automatic cleanup) or operators worked only with them.

Now make no mistake: references are typically still implemented as pointers, just made a little bit safer and automatically dereferenced. How are they made safer? A reference has to be bound to an existing object when it is declared and cannot be rebound afterwards, so it is never null or uninitialized (although it can still dangle if the object it refers to is destroyed first). But whenever you pass a reference to a function, it is in fact still just a pointer that is passed internally, typically on the stack or in a register.

C++ wanted to make our life easier by doing routine pointer referencing and dereferencing for us (and hiding this pointer wizardry from us). The illusion of passing or working with objects instead of pointers to them was so perfect that many lost the sense of what is actually a reference and what is an object.

Exactly this ambiguity had an unfortunate side effect: it led us into believing that this

is just a safer alternative to this

We just got rid of unsafe pointers in favor of safer references, like everywhere else. And more importantly, this has given us automatic calls of constructors, destructors and operators. Right? Wrong. There is no such thing as an array of references in C++, and no pointer magic is going on behind the scenes this time, as there is with function parameters. What we actually created is an algorithmically very bad decision: an array of objects stored by value. And no, removing the * from a declaration doesn't automatically turn a pointer variable into a reference.

From a performance point of view there is really no alternative to an array of pointers. With big objects containing statically declared structured data, there is a big performance difference between creating, sorting, searching and, mainly, reallocating a 100 MB array of values and doing the same with a few bytes of pointers to them. Storing by value pretty much kills the possibility of working within the processor cache with as much data as possible, so we sort, search and reallocate at the speed of main memory, an order of magnitude slower than the CPU cache, which we now uselessly pollute with unimportant object data. With objects containing big dynamic data the situation is better when sorting, but still bad when assigning to the array, since we then need to allocate and copy large chunks of dynamic memory plus the rest of the object. But I would say the reallocation mentioned above is the worst consequence of storing objects by value.

So every time you think "nah, let's put a vector<large-object> there", your memory requirements can be twice what they need to be and your performance abysmal, because the whole allocated memory is uselessly moved around after each reallocation. One would think that this is just the price we agreed to pay for generic approaches and laziness. But in my opinion this is the price for storing objects instead of pointers for the OO reasons (automatic constructors, destructors and operators) mentioned above.

But back to the topic. As we remember, assignment with = actually copies data. And this leads us to another big performance problem with arrays.

How do you efficiently build an array of large objects? Suppose you want to build a city full of skyscrapers, and you obviously (due to the large scale) can't afford any waste of time or resources.

Now think of the city as an analogy for an array. What you actually want to do is "create" a building inside the city:

  • you allocate a large space for the building inside the city
  • and obviously you just build the building in this space.

But thanks to our habit of using the = operator to store object pointers into an array in good old C, it's only natural that we attempt to use the same "create and assign" paradigm with references too. In other words, we simply got used to it, and what's worse, every C++ book teaches us to do it this ineffective way.

  • you allocate a large temporary space outside the city // = Skyscraper();
  • you build the skyscraper in this temporary space
  • you allocate the same large space inside the city // city[1].operator = (Skyscraper &in)
  • you use a large amount of trucks, gas and time to move this skyscraper to its space in the city
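
The four steps above can be made visible with a small counting sketch. The Skyscraper type here is a hypothetical stand-in (4096 bytes of in-object data and two counters), not the author's benchmark class. It assumes a pre-C++11 style class with only a copy assignment operator.

```cpp
#include <cassert>

// Hypothetical stand-in for a large object; it only counts operations.
struct Skyscraper {
    static int constructed;
    static int copy_assigned;
    char floors[4096];                        // data stored inside the object

    Skyscraper() : floors() { ++constructed; }

    Skyscraper& operator=(const Skyscraper& in) {
        for (int i = 0; i < 4096; ++i) floors[i] = in.floors[i];   // the "trucks"
        ++copy_assigned;
        return *this;
    }
};
int Skyscraper::constructed = 0;
int Skyscraper::copy_assigned = 0;

int demo_create_and_assign() {
    Skyscraper city[4];                       // 4 constructions inside the city
    for (int i = 0; i < 4; ++i)
        city[i] = Skyscraper();               // 1 temporary + 1 full copy each
    return Skyscraper::copy_assigned;
}
```

For four slots the sketch performs eight constructions and four full copies, even though only four finished buildings are ever needed.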

It manifests so strongly with arrays because of the sheer number of ineffective assignments. In the case of our benchmark it's 500 assignments, but if you do any assignment that could be handled by moving (as explained later) rather than copying in a loop with 500 iterations, you have basically the same problem.

Yes, there are ways to avoid this, but they are neither intuitive nor obvious and are hack-like in nature. If C++ allowed us to invoke a specialized constructor and create objects in the already allocated space inside the array (allocated just once for all elements by one efficient allocation), this could save the zillions of allocations and copies most of us usually cause by assigning new objects via =.

Still, there are some ways to do it. You can, for example, move all your initialization code into a method and invoke it on the array element explicitly.

  • you allocate large space for building inside city     
  • you build building in this space.     

Hurray. The problem introduced by using = is gone.   

That means it now doesn't matter whether the object has a large statically declared array or structure. No copy, no problem. Most importantly, the CPU time wasted on moving mostly useless empty bytes is gone. Also gone is the reading and writing of such a large chunk of memory, which pretty much flushed all CPU caches and brought down the performance of the rest of the application.

That's all nice and great. But the chances that people will stop putting code into constructors in favor of some standard create method are pretty slim.

Using constructors to create everything is a paradigm that we got so used to and love, exactly as we got trained by misleading books to use the = operator for storing data into arrays.

Still, there is a way to do it with a constructor, via a little-known variant of operator new. It's called "placement new", where you construct an object in existing memory, with the this pointer provided by you. But now we are entering very weird, confusing and slightly dangerous territory, due to the word new flying around a statically declared array: a new that doesn't allocate anything, a new that is here just as a kludge to invoke a constructor. Why dangerous? The moment you overload something as fundamental as the allocator new, brace yourself for all kinds of trouble: http://www.drdobbs.com/article/print?articleID=184401369&dept_url=/cpp/

Most importantly, when the vector goes out of scope, no destructors are called automatically. It can be done manually, but that reintroduces a source of bugs.
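
A minimal sketch of placement new on raw storage, under the assumption that the objects are built into a plain byte buffer rather than into the author's array (his code is not shown on this page). Building and demo_placement_new are hypothetical names. Note the manual destructor calls at the end, which are exactly the bug-prone part mentioned above.

```cpp
#include <cassert>
#include <new>

// Hypothetical small class that tracks how many instances are alive.
struct Building {
    static int alive;
    int height;
    explicit Building(int h) : height(h) { ++alive; }
    ~Building() { --alive; }
};
int Building::alive = 0;

int demo_placement_new() {
    // Raw, suitably aligned storage for 3 objects; nothing is constructed yet.
    alignas(Building) unsigned char storage[3 * sizeof(Building)];
    Building* city[3];

    // Placement new runs the constructor at the given address, no allocation.
    for (int i = 0; i < 3; ++i)
        city[i] = new (storage + i * sizeof(Building)) Building(100 + i);

    int total = 0;
    for (int i = 0; i < 3; ++i) total += city[i]->height;

    // Nothing destroys these objects automatically: call destructors by hand.
    for (int i = 0; i < 3; ++i) city[i]->~Building();
    return total;
}
```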

Is automatic cleanup of objects stored by pointers really that impossible in current C++?

Now consider the following weird but perfectly working example. Remember that the stack is a resource that is limited by default (unless you change it in the linker settings), so take this as a purely academic example showing that an array of pointers that automatically calls destructors when going out of scope is possible.

The moment the array of objects stored by pointers goes out of scope, they are automatically deallocated without any manual destructor invocation. How come this works? What is going on? = A() is internally much the same as = new A(); the same constructor is invoked, except that the stack is used by the first and the heap by the second. Both hand back the location of the new object, and references are pointers (just meeting certain criteria to deserve the label reference), as we remember, right?

Yes, for heap pointers (created by new) there is the kludge of wrapping all pointers in objects that simulate pointers via operator overloading (aka smart pointers), such as std::shared_ptr and the like, and storing just those. So if you don't mind wrapping all your variables in functionality-hiding, impossible-to-debug macros and templates, then this is a very good solution for you.

But I strongly believe that simple things should not be encrypted or hidden from sight, and neither should every variable. The programmer must be aware of what is going on as much as possible, like it was in C, without having a template and macro expander built into his head. And if you asked Linus to obfuscate every pointer in a template wrapper, he would most probably kill you. I remember a strong backlash against excessive macro usage in C, and there was a rational reason for that.

That reason was "complexity and hiding code logic is a source of bugs".

Possibly the best solution is to fix the C++ compiler's = operation

So this internally can be optimized (via an explicit optimization switch) to something like

This would fix both the dynamic and the static (object memory) waste: zero allocations and copies, since elements are created just once, in already allocated memory, as it always should have been for performance reasons. Why is the static (non-dynamic) memory waste equally if not more important? The majority of objects are small and self-contained. And when you look at the benchmark below, storing an object containing a statically declared array took 5662 ms, yet storing an object containing a dynamic array of the same size took 1372 ms. Also, after such a change to the compiler, all old code using big objects would start pretty much flying at completely different speeds just by recompiling.

Because I am a curious person, I am attempting to implement and test it in Clang, the fantastic open-source C++ compiler, as an optimization switch or pragma. Should you want to lend a hand, I will be more than thankful: http://clang-developers.42468.n3.nabble.com/Proposed-C-optimization-with-big-speed-gains-with-big-objects-tt4026886.html But let's focus on the latest C++ solution to it (unfortunately only for heap memory in your objects, and with a lot of code changes).

The new C++ 11  Move semantics 

Move semantics enables you to write code that transfers dynamically allocated memory from one object to another. Move semantics works because it enables this memory to be transferred (by copying just pointers) from temporary objects that cannot be referenced elsewhere in the program. Unfortunately, large statically declared (contained within the object) arrays, structs and member data must still be uselessly copied, since, as mentioned, they are contained within the temporary object itself, which is about to be destroyed.

To implement move semantics, you typically provide a move constructor, and optionally a move assignment operator=, to your class. Copy and assignment operations whose sources are temporary objects (or data that can't change) then automatically take advantage of move semantics. Unlike the default copy constructor, Visual C++ 2010 and 2012 do not provide a default move constructor; the C++11 standard itself generates one only for classes that declare no copy operations, move assignment operator or destructor of their own.

You can also overload ordinary functions and operators to take advantage of move semantics. Visual C++ 2010 introduces move semantics into the Standard Template Library (STL). For example, the string class implements operations that perform move semantics. Consider the following example that concatenates several strings.

Before && references existed, each call to operator+ allocated and returned a new temp object. operator+ couldn't append one string to the other because it didn't know whether content of the source can be tampered with (temps) or not (variables). If the source strings are both variables, they might be referenced elsewhere in the program and therefore must not be modified. But now thanks to && reference we now know that temp (which cannot be referenced elsewhere in the program) was passed in. Therefore, operator+ can now safely append one string to another. This can significantly reduce the number of dynamic memory allocations that the string class must perform. 

Move semantics also helps when the compiler cannot use Return Value Optimization (RVO) or Named Return Value Optimization (NRVO). In these cases, the compiler calls the move constructor if the type defines it.    

As another example, consider inserting an element into a vector object. If the capacity of the vector object is exceeded, the vector object must reallocate memory for its elements and then copy each element to another memory location to make room for the inserted element. When an insertion operation copies an element, it creates a new element, calls the copy constructor to copy the data from the previous element to the new element, and then destroys the previous element. Move semantics enables you to move objects directly without having to perform expensive memory allocation and copy operations.

So, to take advantage of move semantics and allow efficient insertion of your objects into a std::vector, you must write a move constructor that moves data from one object to another. Declare it noexcept, otherwise std::vector may fall back to copying when it reallocates.

  • you allocate a large space for the building outside of the city // notice = Skyscraper();
  • you build the building in this space.
  • you just mark this already created building as part of the city // no trucks (copying) needed
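
The three steps above correspond to a move constructor that takes over the temporary's heap buffer. The following is a sketch only; this Skyscraper class, its counters and demo_move_into_vector are hypothetical and are not the benchmark code from the article.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class that owns a heap buffer and counts copies and moves.
class Skyscraper {
public:
    static int copies;
    static int moves;

    explicit Skyscraper(std::size_t n) : size_(n), data_(new char[n]()) {}
    ~Skyscraper() { delete[] data_; }

    // Copy constructor: allocates a second buffer and copies every byte.
    Skyscraper(const Skyscraper& in) : size_(in.size_), data_(new char[in.size_]) {
        for (std::size_t i = 0; i < size_; ++i) data_[i] = in.data_[i];
        ++copies;
    }

    // Move constructor: takes over the buffer and leaves the source empty.
    Skyscraper(Skyscraper&& in) noexcept : size_(in.size_), data_(in.data_) {
        in.size_ = 0;
        in.data_ = nullptr;
        ++moves;
    }

    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    char* data_;
};
int Skyscraper::copies = 0;
int Skyscraper::moves = 0;

std::size_t demo_move_into_vector() {
    std::vector<Skyscraper> city;
    city.reserve(2);                      // avoid reallocation in this sketch
    city.push_back(Skyscraper(1000));     // temporary: move constructor is used
    Skyscraper tower(2000);
    city.push_back(tower);                // named object: copy constructor is used
    return city[0].size() + city[1].size();
}
```

In this sketch the temporary costs one pointer swap, while the named object still costs a full allocation and copy, which matches the distinction the article draws between temps and variables.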

For a complete compilable example, copy the benchmark code below into the development environment of your choice.

Now, for those who think everything was clear and obvious in the previous example: don't let your eyes fool you.

Inside the function, Skyscraper && in no longer behaves like an && reference. Its declared type is still Skyscraper&&, but because it has a name, the expression in is an lvalue and is treated like a plain & reference. So if you want to forward it as && to another function, you need to cast it to && again (in the STL via std::move). Why did C++ decide to do this behind your back? Well, I am told that it's a safety precaution: anything that has a name can be referenced again later in the code, and thus it's not deemed safe to keep its "moveable" status. Seems like some unfinished C++ business to me, since I can't imagine referencing a local variable outside of this procedure. Also, there is a little-known feature called ref-qualifiers, where you can restrict operators and methods to accept just temps or just variables.

Now, you can't anymore say

Unfortunately, this doesn't yet seem to be supported in Visual Studio 2012 RC1, which I am using right now.
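
Both points, that a named && parameter is an lvalue inside the function and that ref-qualifiers can tell temps from variables, can be checked with a short sketch. All names here (kind, pass_along, pass_along_moved, Texture::who) are hypothetical and chosen for illustration; the sketch needs a compiler that supports C++11 ref-qualifiers.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Overloads that report which kind of argument they received.
std::string kind(const std::string&) { return "lvalue"; }
std::string kind(std::string&&)      { return "rvalue"; }

// Inside the function the named parameter `in` is an lvalue expression.
std::string pass_along(std::string&& in) { return kind(in); }

// std::move casts it back to an rvalue so the && overload is chosen.
std::string pass_along_moved(std::string&& in) { return kind(std::move(in)); }

// Ref-qualifiers: one member function for variables, one for temps.
struct Texture {
    std::string who() &  { return "variable"; }
    std::string who() && { return "temp"; }
};

std::string demo_ref_qualifiers() {
    Texture t;
    return t.who() + "/" + Texture().who();
}
```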

So to summarize: unless you use big statically declared (stored within the object) structures, arrays or members, the result is a significant speedup of your existing code. To see how much you can speed up your existing code (well... actually you stop slowing it down), I created a simple practical && example along with benchmark results. And if you do more than just a stupid memset in your constructors and destructors, the speedups will be significantly higher.

Benchmark Results:

  • store objects containing the array themselves took 5662 ms // even with && this is still a problem
  • sort objects containing the array themselves took 17660 ms // this is why you should not store objects
  • store objects containing a dynamic array by copying took 1372 ms
  • store objects containing a dynamic array by moving (C++11) took 500 ms
  • store just pointers to objects took 484 ms
  • sort just pointers to objects took 0 ms

Benchmark code

To give an idea of how bad the usual careless assignment with = is, I created an example that stores 500 large objects into an array via different methods and measures the time it takes. Texture represents the standard large object we work with in C++ on a daily basis: just a large chunk of data and its size, plus the mandatory operator = so that it can be stored by value. It is stripped to the bare minimum on purpose (no chaining, consts etc.), with only simple types, so you can focus only on those two operators. And sort has < reversed to simulate the worst-case scenario.

Now the more observant of you would probably start arguing...

"This is nothing new. I could do this data "moving" (or just passing data along between objects without copying) the same way in the current standard C++ operator = &, so why do I need the new && operator?"

Yes you can, and no you can't. Imagine what would happen if you did the moving in operator = & like this.

If we moved (copied just the pointers to the data) in the standard operator = &

then whenever b changes, c changes too, and this was not intended.

So we actually want to make a copy when assigning from data that can change, and just copy the pointers to the data when we are sure it will not change.

Unfortunately, up to C++11 the & reference could not distinguish whether the data passed in can change, so moving was not possible in the previous C++ standard, for the reasons explained in the c = b example. The new && can in turn distinguish that data which can't change was passed in, and thus it's safe to just point to its data and skip the copying. So to summarize: in the new C++11 standard you are now supposed to keep two sets of operators and constructors:

  • operator = and constructor taking &, where you copy from data that can change (variables etc.)
  • operator = and constructor taking &&, where you just point to data that will not change, and save memory and CPU by skipping the copying (temp objects etc.)

Unfortunately, that means you will now have to implement two sets of pretty much every operator under the sun that you declared, with generic copy/paste-like code, while still fixing only the heap side of the performance problem. So reconsider using = on objects at all, at least until compiler writers fix the heap and static memory waste by doing an internal placement new for = on movable objects.
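
A sketch of the two sets of assignment operators, using a hypothetical Texture that owns a heap buffer (the article's own Texture is not shown on this page, so the members and helper functions below are assumptions). The copy version keeps b and c independent, which is exactly what breaks if operator = & merely copies the pointer; the move version is only chosen when the source is a temp or is explicitly marked with std::move.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical Texture that owns a heap buffer of ints.
struct Texture {
    std::size_t size;
    int* data;

    explicit Texture(std::size_t n, int fill = 0) : size(n), data(new int[n]) {
        for (std::size_t i = 0; i < n; ++i) data[i] = fill;
    }
    Texture(const Texture& in) : size(in.size), data(new int[in.size]) {
        for (std::size_t i = 0; i < size; ++i) data[i] = in.data[i];
    }
    ~Texture() { delete[] data; }

    // Taking &: the source may still be used later, so duplicate its data.
    Texture& operator=(const Texture& in) {
        if (this != &in) {
            int* fresh = new int[in.size];
            for (std::size_t i = 0; i < in.size; ++i) fresh[i] = in.data[i];
            delete[] data;
            data = fresh;
            size = in.size;
        }
        return *this;
    }

    // Taking &&: the source is about to die, so just take over its buffer.
    Texture& operator=(Texture&& in) noexcept {
        if (this != &in) {
            delete[] data;
            data = in.data;
            size = in.size;
            in.data = nullptr;
            in.size = 0;
        }
        return *this;
    }
};

bool demo_copy_keeps_objects_independent() {
    Texture b(4, 1);
    Texture c(4, 0);
    c = b;                    // b is a variable: copy assignment
    b.data[0] = 99;           // changing b must not change c
    return c.data[0] == 1;
}

bool demo_move_steals_buffer() {
    Texture c(4, 0);
    Texture b(4, 7);
    int* original = b.data;
    c = std::move(b);         // explicitly marked as movable: move assignment
    return c.data == original && b.data == nullptr;
}
```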

Why it's called rvalue reference &&

Throughout the whole article I deliberately avoided terms like rvalues (can't change) and lvalues (can change), since they are not what their names imply.

  • lvalue should have been named something like "variable"
  • rvalue should have been named something like "temp"

They are just technical grammar remnants that confuse people. They were born from how the C grammar in lex and yacc was described, i.e. on which side of "that particular grammar rule" they were located: left or right. BUT that particular rule can be part of a larger expression, and an lvalue is suddenly an rvalue. Or let me explain it this way: as a rough rule of thumb, anything that doesn't have a name is an rvalue; otherwise it's an lvalue.

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Returning values by reference in C++

A C++ program can be made easier to read and maintain by using references rather than pointers. A C++ function can return a reference in a similar way as it returns a pointer.

When a function returns a reference, it returns an implicit pointer to its return value. This way, a function can be used on the left side of an assignment statement. For example, consider this simple program −

When the above code is compiled together and executed, it produces the following result −

When returning a reference, be careful that the object being referred to does not go out of scope. Returning a reference to a local variable compiles (usually with a warning) but leaves the caller with a dangling reference, so it must be avoided. But you can always return a reference to a static variable.
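
As a minimal sketch of that last point (the program originally shown on this page is not available, so counter and demo_counter are hypothetical names), a function returning a reference to a static local can safely be used on the left side of an assignment:

```cpp
#include <cassert>

// The static local lives until the program ends, so the reference stays valid.
int& counter() {
    static int count = 0;
    return count;
}

// int& broken() { int local = 0; return local; }   // dangling: local dies at return

int demo_counter() {
    counter() = 5;     // the call is an lvalue, so it can be assigned to
    ++counter();       // modifies the same static object
    return counter();
}
```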

Bounded-Error Log

theoretical and applied randomness by betaveros

C++ Rvalue References: The Unnecessarily Detailed Guide

Move semantics, perfect forwarding, and... everything else.

2020-11-16 (12005 words) filed under CS

By a strange quirk of fate, I have started writing C++ for a living.

Learning C++ was about as complicated as I think I expected it to be. By line count, I’ve written a lot of C++ for programming competitions, but I knew that I had only ever used a small cross-section of the language: basic control flow and variables, STL containers and algorithms, structs on which you mechanically define bool operator<(const T& other) const so STL algorithms can order them, and the very occasional macro or templated helper function. There were many features I wasn’t even aware existed.

In the process of learning C++ professionally, one rabbit hole I fell into quickly was C++11’s defining feature, the rvalue reference, and how it can be used to implement move semantics and perfect forwarding. By poring over a copy of the widely recommended book Effective Modern C++, by Scott Meyers, and a few dozen StackOverflow answers and blog posts, I roughly understood it after a few days, but still had a sort of blind-men-feeling-the-elephant feeling. I was confused about what lay under some of the abstractions I had been using, unsure of the full shape of the pitfalls that some of the guides had pointed out to me, and generally uncomfortable that there were still many small variations of the code I had seen that I couldn’t predict the behavior of. It took many more days to work myself out of there, and I wished I had had a guide that explained rvalue references and their applications to a bit more depth than what might be necessary for day-to-day use. So here’s my attempt to explain rvalue references in my own fundamental I-want-to-know-how-things-work-no-really style.

(If this vision doesn’t resonate with you, there are many other posts explaining rvalue references out there that you might prefer. Feel free to just skim the executive summary and/or check out some of the linked articles in the Background section.)

Executive Summary

I got… pretty carried away when writing this post, and a lot of it is just for my own understanding, which may or may not be useful to readers. Here’s a much more concise rundown (assuming you know basic C++ already):

  • Every C++ expression is either an lvalue or rvalue. Roughly, it’s an lvalue if you can “take its address”, and an rvalue otherwise. For example, if you’ve declared a variable int x; , then x is an lvalue, but 253 and x + 6 are rvalues. If you can assign to it, it’s definitely an lvalue.
  • Rvalue references are a new kind of reference in C++11, declared with two && s instead of one, e.g.  int&& x; The old single- & kind are now called lvalue references. Lvalue references and rvalue references differ only in the rules surrounding their initialization, which includes when a function has a reference parameter and the compiler determines whether a certain argument is acceptable for that parameter. In particular, they do not differ when you use them in an expression: both kinds of references will be lvalues!
  • The rules for initializing a reference are: You can only initialize a non-const lvalue reference to an lvalue, which makes sense since the reference has to refer to something. However, you can initialize a const lvalue reference to either an lvalue or an rvalue; if you provide an rvalue, it will implicitly declare a new variable, initialize it to that rvalue, and produce a const reference to that variable instead. And you can only initialize an rvalue reference to an rvalue, which will do the same thing. Also, you must be careful of the lifetime of the implicitly declared variable; if it expires too quickly, you end up with a dangling reference.
  • However, you can call std::move on an lvalue to produce an rvalue that can be “smuggled” into an rvalue reference. If you do so, no new implicit variable will be created; the rvalue reference will actually refer to the lvalue passed into std::move .
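
The initialization rules in the bullets above can be tried out with a short sketch (the function name demo_reference_rules is made up for illustration). The commented-out lines are the ones a conforming compiler rejects.

```cpp
#include <cassert>
#include <utility>

int demo_reference_rules() {
    int x = 1;

    int& a = x;                // OK: non-const lvalue reference to an lvalue
    // int& bad = x + 6;       // error: cannot bind to an rvalue
    const int& b = x + 6;      // OK: binds to an implicit temporary holding 7
    int&& c = x + 6;           // OK: rvalue reference to an rvalue (another 7)
    // int&& bad2 = x;         // error: cannot bind to an lvalue
    int&& d = std::move(x);    // OK: d refers to x itself, no temporary

    c += 1;                    // used in an expression, c is an lvalue
    d = 10;                    // writes through to x
    a += 1;                    // x is now 11

    return x + b + c;          // 11 + 7 + 8
}
```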

Sometimes, you want to write a function taking an argument that can be implemented in one of two ways: a slow way that treats its argument as read-only, perhaps making a new copy, and returns something new, or an efficient way that clobbers its argument and reuses its resources to produce a return value. Loosely speaking, the latter type of behavior or the ability to offer it as an option is referred to as “moving” or move semantics . A popular way to support move semantics that automatically works with many types of client code is to overload the function as follows:

One overload will have an lvalue reference parameter and do the slow, argument-preserving thing; it’ll be called if the argument is an lvalue.

The other overload will have an rvalue reference parameter and do the efficient, argument-clobbering thing; it’ll be called if the argument is an rvalue.
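
A minimal sketch of such an overload pair, using a hypothetical function shout on std::string (not an example from the original post):

```cpp
#include <cassert>
#include <string>
#include <utility>

// Slow, argument-preserving overload: chosen for lvalue arguments.
std::string shout(const std::string& s) {
    std::string result = s;   // makes a copy, s is left untouched
    result += "!";
    return result;
}

// Efficient, argument-clobbering overload: chosen for rvalue arguments.
std::string shout(std::string&& s) {
    s += "!";                 // reuses the argument's own buffer
    return std::move(s);
}

bool demo_overloads() {
    std::string name = "hi";
    std::string a = shout(name);                 // lvalue: copy overload
    std::string b = shout(std::string("yo"));    // rvalue: clobbering overload
    return a == "hi!" && name == "hi" && b == "yo!";
}
```

Client code does not have to choose between the two; overload resolution picks one based on whether the argument is an lvalue or an rvalue.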

Although you can’t write such a type directly, the new reference collapsing rules in C++11 state that a reference to a reference is just a reference. In particular, T& && = T& . So if T is a type variable, T&& can be either an lvalue reference or an rvalue reference, and if a templated function has a parameter type T&& , a natural value for T can be inferred for any argument depending on if it’s an lvalue or an rvalue. This enables you to write a templated function that can be called with any arguments, is aware of whether its arguments are lvalues and rvalues, and can forward those arguments to another function while preserving both their type and their lvalue/rvalue-ness. However, you will need to call std::forward to reconstruct the lvalue/rvalue-ness. This is called perfect forwarding and typically looks like this:
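
The original snippet is not reproduced on this page; a minimal sketch of the usual shape, with hypothetical names kind, relay and relay_without_forward, might be:

```cpp
#include <cassert>
#include <string>
#include <utility>

// Target overloads that report which kind of argument reached them.
std::string kind(const std::string&) { return "lvalue"; }
std::string kind(std::string&&)      { return "rvalue"; }

// T&& on a deduced T is a forwarding reference: T becomes std::string& for
// lvalue arguments and std::string for rvalue arguments.
template <typename T>
std::string relay(T&& arg) {
    return kind(std::forward<T>(arg));   // restores the original value category
}

// Without std::forward, the named parameter is always passed on as an lvalue.
template <typename T>
std::string relay_without_forward(T&& arg) {
    return kind(arg);
}

bool demo_forwarding() {
    std::string s = "x";
    return relay(s) == "lvalue"
        && relay(std::string("x")) == "rvalue"
        && relay_without_forward(std::string("x")) == "lvalue";
}
```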

Read on for the long, detailed version with (way) more examples and links.

I assume you know simple C++, understand and are comfortable with pointers and in particular how pointers can dangle, and understand references, templates, classes, and constructors on at least a basic level. If you understand why the below function doesn’t work, how to fix it, and how to change it so it also applies to vectors that hold any numeric type, you should be good.

In addition, this post will make the most sense if you already understand, on a high level, why move semantics and perfect forwarding are nice features to have in C++; I will discuss them briefly but not try particularly hard to motivate them. Some other posts that cover overlapping material and do motivate them:

For motivating move semantics:

  • Triangles, of InternalPointers, C++ rvalue references and move semantics for beginners , motivates move semantics with a standard managed- int[] class. The predecessor Understanding the meaning of lvalues and rvalues in C++ also covers some of the same material.
  • Eli Bendersky, Understanding lvalues and rvalues in C and C++ , motivates move semantics with the standard int vector example.

For motivating perfect forwarding:

  • Eli Bendersky, Perfect forwarding and universal references , motivates perfect forwarding through vector “emplacement” and variants.
  • oopscene, C++11: Perfect forwarding , motivates perfect forwarding through a higher-order function and variants.
  • Thomas Becker, C++ Rvalue References Explained , motivates move semantics with a generic resource-holding class and perfect forwarding with a generic factory function.

We’re also going to do the pretentious language-lawyer thing where we differentiate parameters from arguments, because the difference will matter. Parameters are the variables that a function is declared as taking and that it can use in its function body; arguments are the expressions that you actually pass into a function to call it. Below, param1 and param2 are the parameters to f , and arg1 and arg2_1 + arg2_2 are the arguments it’s called with. Note already from this example that parameters are variables, but arguments are expressions that can be variables or can be more complicated.
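
The snippet this paragraph refers to is missing here. A reconstruction using the same names could look like the following; the int types, the body of f and the wrapper demo_call are assumptions made for the sketch.

```cpp
#include <cassert>

// param1 and param2 are parameters: variables usable inside the body of f.
int f(int param1, int param2) {
    return param1 * param2;
}

int demo_call() {
    int arg1 = 2, arg2_1 = 3, arg2_2 = 4;
    // arg1 and arg2_1 + arg2_2 are arguments: expressions evaluated at the call.
    return f(arg1, arg2_1 + arg2_2);
}
```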

Finally, we’ll be using C++11. If you choose to compile along at home, make sure to pass -std=c++11 or otherwise specify the C++ edition to your compiler! I didn’t and was really confused when certain code snippets didn’t do what I expected them to do. Shows how narrow my C++ knowledge was until now.

Without further ado:

Lvalues and Rvalues

The names “lvalues” and “rvalues” come from early C and are named after the following rather poor approximation to what they are now, which I mention mostly just to help you remember which one is which:

  • Lvalues are expressions that can be assigned to, i.e. they can be on the left side of an = sign in an assignment;
  • Rvalues are all other expressions, which you will typically find on the right side of an assignment.

For example, if x is an int variable, the statement x = 6; makes sense, so the expression x is an lvalue. But 4 = x; doesn’t make sense — you can’t assign to 4 ; what would that do, change the meaning of 4 everywhere else it appears in the program? 1 — so the expression 4 is an rvalue, as are all other numeric literals. Some other familiar examples of lvalues include a[i] if a is an array variable and s.f if s is a variable holding a struct with a field called f . Some other familiar examples of rvalues are arithmetic expressions between primitive numeric types, for example something like x + 4 .

While easy to remember, this breaks down quickly in modern C++ (and modern C too): x is still an lvalue even if it’s const , but const variables can no longer be assigned to. Also, surprisingly, string literals are lvalues . A better rule of thumb is that lvalues are expressions you can take the address of . And since lvalues can actually also go on the right side of an assignment, the name is sometimes retconned to be short for “locator value”. As far as I’m aware, though, nobody has come up with a good retcon for “rvalue”. And it is still the case that lvalues and rvalues form a perfect dichotomy of all expressions: every expression is exactly one of the two. So rvalues are expressions you can’t take the address of .
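The address-of rule of thumb can be tried directly. The following is only a sketch: the struct S, the variable names, and the values are made up for illustration, and the two lines that would not compile are left as comments.

```cpp
#include <cassert>

struct S { int f; };

bool address_of_examples() {
    int x = 6;
    const int c = 7;
    int a[3] = {1, 2, 3};
    S s = {4};

    int* px = &x;                     // x is an lvalue
    const int* pc = &c;               // still an lvalue, although it cannot be assigned to
    int* pa = &a[1];                  // a[i] is an lvalue
    int* pf = &s.f;                   // s.f is an lvalue
    const char (*ps)[6] = &"hello";   // a string literal is an lvalue of type const char[6]

    // int* p4 = &4;                  // error: 4 is an rvalue
    // int* p_sum = &(x + 4);         // error: x + 4 is an rvalue

    bool ok = *px == 6 && *pc == 7 && *pa == 2 && *pf == 4 && (*ps)[0] == 'h';
    assert(ok);
    return ok;
}
```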

In C++11, the category of “rvalues” was further subdivided into “xvalues” (sometimes “eXpiring values”, though this is also a retcon) and “prvalues” (“pure rvalues”). These categories are called value categories , by the way, and they still form a perfect trichotomy of all expressions: every expression is exactly one of an lvalue, an xvalue, and a prvalue. The term “glvalue” (“generalized l-value”) refers to simply “either an lvalue or an xvalue”. cppreference.com has a very detailed explanation of value categories , and there’s a classic StackOverflow question What are rvalues, lvalues, xvalues, glvalues, and prvalues with many good answers that are worth reading. But at a high level, I think the difference between xvalues and prvalues is less important to know than the difference between lvalues and rvalues. xvalues are pretty rare and you have to write somewhat tricky code to produce an xvalue. On the other hand, the innovative bits of C++11 that we’re here to discuss are exactly those that enable that tricky code. On the gripping hand, the goal of that tricky code is often simply to produce any kind of rvalue rather than specifically an xvalue.

One thing I want to make sure gets across is that a value category is a property of an expression, and not of a variable . This is confusing because expressions can consist of a single variable and will thus have a value category, but later we’ll see why we want to distinguish variables from expressions consisting of a single variable. 2 To be clear when this comes up, I’ll call such an expression that just consists of a single variable a “variable expression”, although I don’t think this is established terminology (cppreference calls them “id-expressions” ).

As I mentioned in the introduction, I’m assuming you understand pointers, so, well, a reference is like a pointer. The C++ FAQ says not to think of a reference as a funny pointer , but I think the comparison is useful in the sense that given a pointer to some data and a reference to some data, the things you can learn about that data and the ways you can modify it are basically the same. You can assign to it and modify it directly; you’ll be affecting exactly the same data, not a copy of it. You can get the address. You can convert between a pointer and a reference easily.

One way I think about references is that they’re pointers where when you first initialize them, there’s an implicit & (reference operator) applied to the expression you use to initialize them with, and whenever you use them in an expression (no matter if they’re on the left or right side of an assignment!), there’s an implicit * (dereference operator) applied to them. These implicit operators cannot be circumvented, which limits some of the ways you can manipulate references compared to pointers. Unlike pointers, references can’t be null 3 , because you have to initialize each reference to the & of some expression; also, pointers can change to point at something else, but references can’t change to refer to something else, because to change where a pointer p points (as opposed to changing the data at the location where it points), you have to directly assign to it without dereferencing it: p = ...; . So a reference is basically another name for a variable that exists somewhere else. Finally, although you can have pointers to pointers (and pointers to pointers to pointers, and so on), and you can have references to pointers, you can’t have a reference to a reference or a pointer to a reference. There can only be one level of “reference-of-ness” in a type and (ignoring templated types) it can only be at the outermost edge of the type.
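To make the "implicit & and implicit *" picture concrete, here is a small side-by-side sketch of a pointer and a reference to the same int. All names and values are hypothetical.

```cpp
#include <cassert>

bool reference_versus_pointer() {
    int x = 1;
    int other = 10;

    int* p = &x;   // explicit & when initializing the pointer
    int& r = x;    // the & is implicit when initializing the reference

    *p += 1;       // explicit * on every use
    r += 1;        // the * is implicit on every use
    bool same_data = (x == 3) && (&r == &x) && (p == &r);

    p = &other;    // a pointer can be made to point somewhere else
    r = other;     // this does NOT rebind r; it assigns other's value to x
    bool not_rebound = (x == 10) && (&r == &x) && (p == &other);

    // Converting between the two is straightforward.
    int& from_pointer = *p;
    int* from_reference = &r;
    bool converted = (&from_pointer == &other) && (from_reference == &x);

    assert(same_data && not_rebound && converted);
    return same_data && not_rebound && converted;
}
```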

The presence of the implicit & when you initialize a reference also immediately implies that you must initialize a (non-const) reference to a (non-const) lvalue . (We’ll see how that’s not true for const references later. Also, variables and references can also be volatile, which often affects types in a way similar to but orthogonal to const-ness; but for simplicity, I’m not even going to touch that for the rest of the post.) That’s why these references are more precisely called “ lvalue references ”, to differentiate them from the rvalue references that are the main target of this post, and which we’ll see soon. So, you can write the following, because x is an lvalue expression:

But you can’t write

because that would involve taking the address of “253”, which doesn’t make sense; it’s a constant that could be produced by hardcoded assembly instructions and isn’t necessarily ever stored anywhere. You also can’t write, for example,

because x + 6 is an intermediate expression. It isn’t necessarily stored anywhere, certainly not in a way that is guaranteed to persist after the statement, and so likely doesn’t have an address.
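The three initializations discussed above, collected into one sketch. The initial value of x and the names bad1 and bad2 are assumptions; the two rejected declarations are commented out so that the rest compiles.

```cpp
#include <cassert>

bool lvalue_reference_needs_lvalue() {
    int x = 5;

    int& y = x;            // fine: the expression x is an lvalue

    // int& bad1 = 253;    // error: 253 is an rvalue
    // int& bad2 = x + 6;  // error: x + 6 is an rvalue

    y = 8;                 // writes through to x
    bool ok = (x == 8) && (&y == &x);
    assert(ok);
    return ok;
}
```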

However, in a sense, this isn’t fundamentally impossible. You could imagine that C++ might have been designed to accept the code above and just treat it as syntax sugar for code like the following, which declares a plain non-reference-type variable in the same scope and takes a reference to it:

There wouldn’t be any way to access the variable implicitly_created_temp_y other than through y , but this code could still make sense and y might still behave the way you’d expect it to behave. And in fact, you can initialize a const lvalue reference to an rvalue expression ( or an lvalue expression), which produces code that works basically exactly as I described:

behaves just like

The lifetime of this new temporary value is the same as the lifetime of the reference.
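A sketch of the const case next to its hand-written equivalent. The name implicitly_created_temp_z follows the naming used in the text, while z and the initial value of x are assumptions:

```cpp
#include <cassert>

bool const_reference_to_rvalue() {
    int x = 5;

    // What the language accepts directly:
    const int& y = x + 6;

    // Roughly what that behaves like, written out by hand:
    const int implicitly_created_temp_z = x + 6;
    const int& z = implicitly_created_temp_z;

    x = 100;   // neither reference notices: both refer to separate storage
    bool ok = (y == 11) && (z == 11) && (&y != &x) && (&z != &x);
    assert(ok);
    return ok;
}
```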

Similar things happen with calling a function with a parameter that’s a reference type. If you have a function f declared as f(int& arg) , you can call it with the expression f(x) , but not f(253) . On the other hand, if you have a function f(const int& arg) , you can call it with f(253) ; this implicitly creates a variable initialized to 253, takes a const reference to it, and calls the function with that. As a result, even before C++11, it was quite idiomatic for functions that only needed read-only versions of their arguments to be declared with const reference parameters, as those would be more efficient than non-reference parameters on lvalue arguments by avoiding needing to copy them, but would still work on rvalue arguments. However, note that the lifetime of any such implicitly created variable only lasts as long as the “full expression” containing the function call, so it’ll be gone by the following statement and you need to make sure you don’t still have dangling references to it. A contrived example to illustrate this: 4
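A sketch of one such example, assuming a function named silly that simply hands back the reference it receives. The name silly and the values 253 and 492 come from the discussion that follows; safe_use and dangling_use are hypothetical wrappers, and only safe_use is meant to be called.

```cpp
#include <cassert>
#include <cstdio>

// Returns the very reference it was given.
const int& silly(const int& x) {
    return x;
}

// Safe: the temporary holding 253 lives until the end of the full expression,
// and its value is copied into a plain int before then.
int safe_use() {
    int copy = silly(253);
    assert(copy == 253);
    return copy;
}

// Unsafe: each temporary is destroyed at the end of its own declaration,
// so x and y both dangle by the time they are read.
void dangling_use() {
    const int& x = silly(253);
    const int& y = silly(492);
    std::printf("%d %d\n", x, y);   // undefined behavior
}
```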

This compiles with no warnings, but when I run it, it prints 492 492. The issue is that x is a reference to a temporary variable initialized to 253 that only lives as long as the expression it’s initialized in, which is silly(253) (and in particular, not as long as x itself). So, whatever memory location x refers to, it’s freely overwritable by the time we finish its definition and get to the definition of y, and certainly by the time we printf it. It’s undefined behavior (and would be even if we deleted the definition of y). The term “full expression” is a formal one but it roughly means “an expression that’s not part of another expression”. If you see a semicolon, that’s almost certainly the end of a full expression.

Finally, if you’re implementing a function whose return type is a const reference, you can also write a return statement that returns an rvalue… but you should never do this because this particular case never extends the lifetime of the temporary variable. You are guaranteed to produce a dangling reference. Consistent with this observation, the C++ compilers I tested actually warn if you try to return a const reference to a local variable from a function, whereas they didn’t warn about silly above.

For the interested, cppreference.com documents the nitty-gritty details of extending the lifetime of a temporary . In general, given a fixed reference type T (which might or might not be const, and might be an lvalue reference or an rvalue reference as we’re about to see) and an expression /* some expression */ with a fixed type and value category, the rules for whether these three snippets will typecheck and compile are the same:
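One plausible reading of "these three snippets" is: initializing a variable, passing an argument, and returning a value. The sketch below instantiates all three with T = int& and a global named counter standing in for the expression. Both choices, and every name here, are assumptions made for illustration.

```cpp
#include <cassert>

int counter = 0;    // hypothetical lvalue standing in for the expression
typedef int& T;     // hypothetical reference type

// 1. Initializing a variable of type T.
T as_variable = counter;

// 2. Passing the expression as the argument for a parameter of type T.
void takes_T(T param) { param += 1; }
void pass_it() { takes_T(counter); }

// 3. Returning the expression from a function whose return type is T.
T returns_T() { return counter; }

// Replacing counter with 253 in all three places makes all three fail to
// compile for the same reason: an int& cannot be initialized with an rvalue.

void check_three_initializations() {
    pass_it();           // counter becomes 1
    returns_T() += 1;    // counter becomes 2
    assert(counter == 2 && &as_variable == &counter);
}
```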

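These rules are documented in the full page on reference initialization. In fact, I find it kind of useful to try imagining manually inlining function calls: that is, temporarily ignore scoping issues and mentally translate a call like VarType v = f(/* arg expr */); into a sequence of variable definitions, in which each parameter is initialized from its argument expression, the function body runs, an imaginary variable ret is initialized from /* return expr */, and finally v is initialized from ret.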

Note the imaginary variable ret with type ReturnType , which we never named and can be a reference. I think this may be a mental model you build early on when learning programming and then stop thinking about because it’s too obvious, but when there are references involved, the exact semantics can become nonobvious.
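Here is a concrete instance of that mental inlining, as a sketch. ParamType, ReturnType, and VarType are given arbitrary definitions, and the body of f is invented; only the shape of the translation matters.

```cpp
#include <cassert>

typedef int  ParamType;
typedef int  ReturnType;
typedef long VarType;

// The function as written.
ReturnType f(ParamType param) {
    param += 1;                   // body
    return param * 2;             // return expr
}

VarType call_normally() {
    VarType v = f(20 + 1);        // arg expr is 20 + 1
    return v;
}

// The same call, inlined by hand.
VarType call_inlined() {
    ParamType param = 20 + 1;     // parameter initialized from arg expr
    param += 1;                   // body
    ReturnType ret = param * 2;   // imaginary variable holding the return value
    VarType v = ret;              // caller's variable initialized from ret
    return v;
}

void check_inlining() {
    assert(call_normally() == call_inlined());
}
```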

Although I think the above mental inlining helps you reason about whether some expressions that replace /* return expr */ or /* arg expr */ will compile, it doesn’t necessarily represent the operations that actually happen, because of return value optimization , or RVO. The above code suggests that if ReturnType and VarType are classes with a nontrivial constructor, the constructor will be called twice, once to initialize ret and once to initialize v . (In case you haven’t encountered this before: yes, despite appearances, T x = y; calls a single-argument constructor of T because it’s a variable definition; it has nothing to do with operator= . 5 But after that declaration, x = y; would call operator= .) Even worse, /* return expr */ might just be an expression that calls the constructor — that’s the natural way to write it, since there’s no syntax to directly construct into ret — and then we’d be calling the constructor three times. However, if ReturnType and VarType are the same class and you actually compile and run such code, you will likely find that the constructor is only called once, simply because the compiler can tell where the constructed object will end up. This optimization is so common that it’s named. There are other ways constructors can be elided from the above inlined version; Shahar Mike’s article on Return Value Optimization goes into more depth on this phenomenon.
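One way to observe this is to count constructor calls. The sketch below uses a hypothetical class Thing with static counters. The exact number of copies depends on the compiler and language version, so the check only bounds it.

```cpp
#include <cassert>

struct Thing {
    static int constructed;   // calls to Thing()
    static int copied;        // calls to the copy constructor
    Thing() { ++constructed; }
    Thing(const Thing&) { ++copied; }
};
int Thing::constructed = 0;
int Thing::copied = 0;

Thing make_thing() {
    return Thing();           // the natural way to write the return expr
}

void count_constructor_calls() {
    Thing t = make_thing();
    (void)t;
    assert(Thing::constructed == 1);
    // Naive inlining predicts two copies (into ret, then into t). With RVO the
    // count is typically 0, and C++17 requires it to be 0 for this pattern.
    assert(Thing::copied <= 2);
}
```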

Here I will put another important note to mirror the one I concluded the last section with: Variables (including function parameters) can be either references or not, as can function return types; but expressions (including function arguments) can never be reference types ! Whether you end up taking a reference to any given expression or not depends on how the expression is used. (Don’t be confused by the many ways the word “reference” has popped up in this post. You can apply the reference operator to some expressions to get new expressions, whose types are pointer types; and you can apply the dereference operator to an expression if its type is a pointer type to get another expression. None of these expressions are necessarily reference types.)

Rvalue References

So what is an rvalue reference? It’s just a slightly different kind of reference introduced in C++11. The differences are actually smaller than I expected when I started learning about them. An rvalue reference still has to refer to something you can take the address of, and every time you use it in an expression, it still gets implicitly dereferenced in an uncircumventable way.

You declare a variable, parameter, or function return type with an rvalue reference type just like you would for an lvalue reference type, except with two & s instead of one: int&& y; Note that the two ampersands && are a single syntactic unit. It does not mean, and there is no confusion with, a “reference to a reference to” something.

The key difference lies in how you initialize rvalue references: you can only initialize an rvalue reference with an rvalue. You cannot initialize it to an lvalue! This might seem bizarre because, of course, you can’t take the address of an rvalue, which is what we need to produce a reference. But as I described earlier, you could imagine treating such an initialization as syntax sugar that implicitly defines a variable that is initialized to the rvalue and then takes a reference to that, and similar syntax sugar already exists and has well-defined semantics for const lvalue references. That sort of implicit variable-declaration-and-reference-taking is often, but not necessarily, what happens. The two lines that do compile below are simple examples of where it does happen: they behave as if they create variables that last as long as t and z and then take references to those variables.
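A minimal sketch of those initializations, using the names y, z, and t from the text. The initial value of x is an assumption, and the rejected line is commented out.

```cpp
#include <cassert>

void rvalue_reference_initialization() {
    int x = 5;

    // int&& y = x;    // error: the expression x is an lvalue
    int&& z = x + 6;   // fine: x + 6 is an rvalue
    int&& t = 253;     // fine: 253 is an rvalue

    // z and t refer to implicitly created ints, so they can be modified
    // without touching x.
    z += 1;
    t += 1;
    assert(x == 5 && z == 12 && t == 254);
}
```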

The same rule applies when you’re calling a function that has a parameter with an rvalue reference:
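A function of that shape might look like the sketch below. The name f and its int&& parameter follow the text; the body, the int return type, and the helper call_f_with_rvalues are assumptions added so that the calls can be checked.

```cpp
#include <cassert>

// Hypothetical body: add one to the referenced int and report the result.
int f(int&& arg) {
    arg += 1;
    return arg;
}

void call_f_with_rvalues() {
    int x = 5;
    assert(f(253) == 254);
    assert(f(x + 6) == 12);
    // f(x);           // error: the expression x is an lvalue
    assert(x == 5);    // x was never touched
}
```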

You can call this function as f(253) . If x is an int variable, you cannot call f(x) because the expression x is an lvalue, but you can call f(x + 6) . The rules defining the lifetime of the implicitly defined variable are the same as before: it lasts until the end of the full expression with the function call. (And for completeness, in a function int&& f() { ... } , you could return an rvalue and it would compile, but just as it was with const lvalue references, doing this would always produce a dangling reference, so you shouldn’t.)

Given that the above works, it may be a little surprising that defining int&& y = x; or calling f(x) doesn’t work, because it’s even easier to imagine the syntax sugar that it could expand to — you just initialize an implicit variable in the same way, but with the expression x . Sure, it’s an lvalue, but lvalues can be on the right side of an assignment too. However, it doesn’t work because making it hard for yourself to do that is sort of the point of having rvalue references. We’ll see how you could nevertheless force it to happen soon.

Another thing I should mention is that, like lvalue references, rvalue references can be const, and a non-const rvalue reference can only be initialized to a non-const rvalue. None of the rvalues we’ve seen so far have been const, and the idea of a const rvalue might even seem paradoxical: the point of an rvalue is that it doesn’t have an address, so doesn’t that mean nobody else has a way to access it, so nobody will care or even notice if we modify an rvalue we got a reference to? We’ll see later how that’s false, so just keep this at the back of your mind for now.

But perhaps the most important thing to understand is that these are the rules for initializing an rvalue reference, not for using an rvalue reference in expressions. Even if the variable y is an rvalue reference to an int , the variable expression y will still be an lvalue — it has an address, which is the same as the address of the int it refers to. This is the number one confusing thing that every tutorial about rvalue references will invariably point out specifically, 6 and I still had to read like five of these tutorials to really understand why, so let me try to spell it out as explicitly as possible.

If we have this variable declaration,

the variable v has type rvalue reference int&& , but you cannot describe it as an lvalue or rvalue. The variable expression v is an lvalue of type int . And in general, every variable expression — that’s every expression that consists solely of a single identifier of a variable — is always an lvalue , no matter whether that variable’s type is non-reference, lvalue reference, or rvalue reference. You should think of “lvalue reference” and “rvalue reference” as compound words that cannot be naively analyzed as the combination of the two words inside them, like how the compound word “hot dog” refers to something that is neither necessarily “hot” nor a “dog”. The first parts of those compound words refer to the rules surrounding their initialization, but have nothing to do with how they get used in expressions. After you’ve initialized a reference, it’s actually quite hard to tell whether it’s an lvalue reference or rvalue reference — every time you use it, you’ll just get an lvalue. 7

Some concrete consequences are that you cannot initialize another rvalue reference to v :

Even though the variables v and vv have the same type, the variable vv can’t be initialized with the variable expression v because vv ’s reference type doesn’t match v ’s value category. For the exact same reason, if you have a function f that has an rvalue reference as a parameter, like the one defined above, you still can’t call f(v) .
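The difference between the type of the variable v and the value category of the expression v can be checked with decltype. This is only a sketch: the initializer 253 and the name lref are assumptions. It relies on decltype(v) reporting the declared type, whereas decltype((v)) yields T& exactly when the expression is an lvalue of type T.

```cpp
#include <cassert>
#include <type_traits>

void named_rvalue_reference_is_an_lvalue() {
    int&& v = 253;

    // The variable's declared type is an rvalue reference...
    static_assert(std::is_same<decltype(v), int&&>::value,
                  "v is declared as int&&");
    // ...but the expression v is an lvalue of type int.
    static_assert(std::is_same<decltype((v)), int&>::value,
                  "the expression v is an lvalue");

    // int&& vv = v;   // error: an rvalue reference cannot bind to an lvalue
    int& lref = v;     // fine, precisely because the expression v is an lvalue
    lref = 7;
    assert(v == 7 && &lref == &v);
}
```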

More interesting than the value categories of variable expressions are the value categories of function call expressions. Here, the rules are as follows. If you write a function call expression that calls a function, and the function has a return type that is…

  • a non-reference type T , then the function call expression is an rvalue, specifically a prvalue.
  • an lvalue reference type T& , then the function call expression is an lvalue.
  • an rvalue reference type T&& , then the function call expression is an rvalue, specifically an xvalue. (These functions are quite rare, but the fact that they can now exist in C++11 is the entire reason this post exists and has, like, 11,000 words.)
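The three rules above can be checked at compile time. In this sketch the function names are hypothetical, and the functions are only declared, since decltype never actually calls them.

```cpp
#include <type_traits>

int   returns_value();
int&  returns_lref();
int&& returns_rref();

// For an expression that is not a bare name, decltype yields
// T for a prvalue, T& for an lvalue, and T&& for an xvalue.
static_assert(std::is_same<decltype(returns_value()), int>::value,
              "call to a function returning T is a prvalue");
static_assert(std::is_same<decltype(returns_lref()), int&>::value,
              "call to a function returning T& is an lvalue");
static_assert(std::is_same<decltype(returns_rref()), int&&>::value,
              "call to a function returning T&& is an xvalue");
```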

C++ operators are kind of like function calls — on instances of classes, they literally are calls to the special operator+ functions and company, but even on primitives, you can basically think of arithmetic operators as like functions that return non-reference types, and assignment operators as like functions that return lvalue reference types. So if you are comfortable with the above list of understanding functions, you should be comfortable with determining the value category of quite a lot of expressions. However, you should be aware of implicit conversions secretly turning lvalues into rvalues and making them assignable to rvalue references. For example:
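One example of this kind, offered as a sketch (the types and names are assumptions): an int lvalue cannot initialize an int&&, but it can initialize a long&&, because the conversion from int to long produces a temporary, and that temporary is an rvalue.

```cpp
#include <cassert>

void conversion_creates_a_temporary() {
    int x = 5;

    // int&& r = x;   // error: x is an int lvalue
    long&& y = x;     // compiles: x is converted to a temporary long

    y = 7;            // modifies the temporary, not x
    assert(x == 5 && y == 7);
}
```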

By the way, the value category of the function call expression is where the “mental inlining” I proposed earlier fails: if ReturnType and VarType are both rvalue references, the below compiles:

whereas the below does not:

We’ll see how to patch this mental inlining in a few sections.
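A sketch of that mismatch, with ReturnType and VarType both taken to be int&&. The global storage and the use of static_cast inside f are assumptions, chosen so that the reference has something valid to refer to.

```cpp
#include <cassert>

int storage = 253;   // hypothetical object for the references to refer to

int&& f() {
    return static_cast<int&&>(storage);   // the cast makes the return compile
}

void call_versus_inlined() {
    int&& v = f();   // fine: a call to a function returning int&& is an rvalue
    assert(&v == &storage);

    // The naive inlining of the same call:
    int&& ret = static_cast<int&&>(storage);
    // int&& v2 = ret;   // error: the variable expression ret is an lvalue
    assert(&ret == &storage);
}
```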

We can now understand the standard library function std::move and resolve some earlier questions with it. The second most popular thing for rvalue reference tutorials to say is that std::move is kind of a misnomer. It doesn’t “move” anything. (“move” is not an idea with a strict technical definition anyway — it just loosely describes destructively operating on an object to move its data to another object.) All std::move does is cast its argument to an rvalue reference type and return it. It’s not a complicated function — you can find simple definitions of it all over the place, with varying degrees of pedagogical simplification 8 — but when first trying to really understand it, I thought even the few lines of templating were kind of gross. What I found really illuminating was trying to write out the specializations of std::move that would work with a specific non-reference type, say, only int s. They’re very short.
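For int, the two specializations can be sketched as follows. They live in a hypothetical namespace sketch so that they do not clash with the real std::move.

```cpp
#include <cassert>

namespace sketch {

// Called with an rvalue argument: the value category is already right.
int&& move(int&& x) { return static_cast<int&&>(x); }

// Called with an lvalue argument: the cast turns it into an rvalue.
int&& move(int& x) { return static_cast<int&&>(x); }

}  // namespace sketch

void check_sketch_move() {
    int x = 1;
    int&& r = sketch::move(x);   // binds directly to x; no temporary is created
    r = 2;
    assert(x == 2 && &r == &x);
    assert(sketch::move(253) == 253);
}
```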

move is nothing more than a punchily-named function that performs a typecast. ( static_cast isn’t in the list of things I assume you understand, but it’s just the modern C++ way to cast expressions to types. And even in this case, appearances notwithstanding, the expression resulting from the static_cast isn’t a reference type; the reference-ness of the type just affects the value category of the resulting expression. Here, static_cast turns an lvalue into an rvalue, so that an rvalue reference can be initialized to it.) In the first instantiation, it doesn’t even do anything (but you would still need the static_cast to compile, because again, the variable expression x is an lvalue and return x; wouldn’t work in a function whose return type is an rvalue reference). But in the second instantiation, it does change the value category, which is exactly what we need. By calling it on an lvalue, you get an expression that’s an rvalue but refers to the same data.

Now, we know how to fix our code, where we tried to initialize an rvalue reference to an lvalue, that wouldn’t compile earlier (although whether we should is of course another question):

We just apply std::move to the expression we’re initializing the variable with:

Note that this doesn’t implicitly declare a new variable and take a reference to it. y is actually truly a reference to x , so assigning to y will assign to x and vice versa.

It’s almost exactly as if you had defined it as an lvalue reference instead:

We’ll see why this behavior is desirable in the next section. By contrast, if you had defined y with even a trivial expression that’s equal to x , you would have gotten a reference to something else, an implicit temporary variable. With the following definition of y , the variables x and y now refer to distinct int variables and can be assigned to without affecting each other.
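The contrast between the two initializations can be sketched like this. The second rvalue reference is named z here rather than y so that both can live in one function, and x + 0 is just one possible "trivial expression that's equal to x".

```cpp
#include <cassert>
#include <utility>

void move_versus_temporary() {
    int x = 5;

    // int&& bad = x;         // error, as before
    int&& y = std::move(x);   // y refers to x itself
    int& lref = x;            // almost the same thing, as an lvalue reference
    y = 6;
    assert(x == 6 && &y == &x && &lref == &x);

    int&& z = x + 0;          // z refers to an implicit temporary holding 6
    z = 7;
    assert(x == 6 && &z != &x);
}
```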

There’s one more thing I haven’t mentioned: std::move preserves const-ness from its input type to its output type, so there are two more instantiations that are useful to know about:

So calling move on a const lvalue will produce a const rvalue, which is something that you wouldn’t be able to initialize a non-const rvalue reference with:

You should rarely need to write code that uses such an instantiation, but their existence will have consequences later.
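The const instantiations, and the initialization they rule out, can be sketched in the same style. The namespace sketch and all variable names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

namespace sketch {

const int&& move(const int&& x) { return static_cast<const int&&>(x); }
const int&& move(const int& x)  { return static_cast<const int&&>(x); }

}  // namespace sketch

void check_const_move() {
    const int cx = 5;

    static_assert(std::is_same<decltype(std::move(cx)), const int&&>::value,
                  "std::move on a const lvalue yields a const rvalue");
    static_assert(std::is_same<decltype(sketch::move(cx)), const int&&>::value,
                  "the int-only sketch agrees");

    // int&& bad = std::move(cx);        // error: would drop the const
    const int&& ok = std::move(cx);      // fine
    const int& also_ok = std::move(cx);  // a const lvalue reference accepts it too
    assert(&ok == &cx && &also_ok == &cx);
}
```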

Move Semantics

We can finally fully understand move semantics and how they’re implemented and used. As a reminder, I won’t spend much time motivating why move semantics are desirable; I linked some posts in the introduction of this post that do that instead. But the one-sentence goal of move semantics is that if you’re writing a function that does something with an object and might benefit from modifying it in-place or stealing its resources to use elsewhere (i.e. moving it), you’d want to know whether you’re allowed to do that.

To make things fully concrete, suppose you’re working with vector<int> s and you want to write a function sorted that takes a vector<int> and returns a sorted version of that vector. It would be nice to distinguish callers that don’t want their original vector to be modified from callers that don’t care, because in the latter case, you can sort the vector you received in-place for more efficiency and less memory allocation and return the same vector, and the caller won’t notice. That is, you’d want to write a function that works correctly for this caller:

but is still efficient when called like this:

Okay, so you still can’t write a single (non-overloaded, non-templated) function that does this. But what you can do, as of C++11, is write two overloads of the same function that do accomplish this. The two overloads have parameters that are a const lvalue reference and a non-const rvalue reference, respectively:

The first overload should copy the vector and allocate a new one; the second overload can modify the one it received. The first client above would call the first overload, because x is an lvalue; the second client above would call the second overload, because generateTestVector() is an rvalue. More often, this kind of overloading is used to write constructors and assignment operators ( operator= ), since you can’t rename those and there are many syntaxes that use them. All in all, this is a big improvement: you make a copy when you need to and don’t when you don’t.
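Putting the pieces together, the two overloads and the two callers might look like the sketch below. The names sorted and generateTestVector come from the text; the function bodies and the test data are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for the generateTestVector mentioned in the text.
std::vector<int> generateTestVector() {
    return {3, 1, 2};
}

// Chosen for lvalue arguments: must leave the caller's vector alone, so copy.
std::vector<int> sorted(const std::vector<int>& v) {
    std::vector<int> copy(v);
    std::sort(copy.begin(), copy.end());
    return copy;
}

// Chosen for non-const rvalue arguments: free to sort in place.
std::vector<int> sorted(std::vector<int>&& v) {
    std::sort(v.begin(), v.end());
    return std::move(v);   // v is a reference parameter, so the move is explicit
}

void check_sorted() {
    std::vector<int> x = generateTestVector();
    std::vector<int> y = sorted(x);                      // first overload
    assert(x == generateTestVector());                   // x is untouched
    std::vector<int> z = sorted(generateTestVector());   // second overload
    std::vector<int> expected = {1, 2, 3};
    assert(y == expected && z == expected);
}
```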

However, if you’re a client of this function, you might sometimes find that you want to call it with an lvalue, say the variable expression x, as an argument, but you don’t want to preserve x. That is, you want the more efficient implementation of the function that won’t make a copy of x, at the cost of it potentially clobbering the contents of x. So you’d want a way to deliberately invoke the second overload. Furthermore, note that you do not in fact want to do this by declaring a new variable vector<int> xx = x; and then somehow getting sorted’s parameter to be an rvalue reference to xx; nor do you want syntax that implicitly translates to code like that, because then xx would have to be a copy of x, and copying x is precisely the inefficiency you want to avoid. No, you want to convince sorted to have its rvalue reference refer to your variable x, even though the expression x is an lvalue.

That is precisely the setting for which std::move is designed. If you call std::move on the lvalue that you want to pass in as an argument, it makes the argument an rvalue, causing the second overload of sorted to be called instead of the first. However, crucially, when the second overload initializes its rvalue reference to that rvalue, it will refer to the same lvalue you supplied — no copy will occur.
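The "no copy will occur" claim can be checked by comparing addresses. In this sketch, refers_to is a hypothetical helper whose rvalue reference parameter plays the role of the parameter of sorted.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Reports whether the rvalue reference parameter refers to the object at the
// given address.
bool refers_to(std::vector<int>&& v, const std::vector<int>* original) {
    return &v == original;
}

void check_no_copy() {
    std::vector<int> x = {3, 1, 2};
    // refers_to(x, &x);                  // error: x is an lvalue
    assert(refers_to(std::move(x), &x));  // same object: nothing was copied
    assert(x.size() == 3);                // and nothing was moved out of it either
}
```

Because refers_to never constructs a new vector from its parameter, x is left completely intact here, even though it was passed through std::move.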

If you’d like to see this in a full program, we can use the vector<int> constructor, which is overloaded just like this:
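A sketch of such a program, written as a function rather than a full main. The names a and c follow the text; b, the six element values, and the function name are assumptions. It prints the size of a after the copy and again after the move.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>

void copy_then_move() {
    std::vector<int> a = {3, 1, 4, 1, 5, 9};   // six arbitrary elements

    std::vector<int> b(a);                     // a is an lvalue: copy constructor
    std::size_t size_after_copy = a.size();

    std::vector<int> c(std::move(a));          // rvalue argument: move constructor
    std::size_t size_after_move = a.size();    // a is now in a moved-from state

    std::cout << size_after_copy << " " << size_after_move << "\n";

    assert(size_after_copy == 6);
    assert(b.size() == 6 && c.size() == 6);
}
```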

When I run this, it prints 6 0. The first number must be 6 because we called the vector<int> constructor with an lvalue, so we would have called the overload with a const lvalue reference parameter, which is called the “copy constructor” 9 . But intuitively, there’s no guarantee what the second number printed will be at all, because by passing std::move(a) into the vector<int> constructor when initializing c, we’re deliberately passing an rvalue to invoke the overload with an rvalue reference parameter, which is called the “move constructor”. Intuitively, that’s a way of saying, “do whatever you want with a, I don’t care about it any more.” 10

So in terms of its relation to move semantics, std::move just sort of adds a flag to your lvalue expression saying, “Hey, I’m okay with being clobbered or otherwise moved out of”. (If you call std::move with an rvalue for an argument, it doesn’t really do anything.) But it’s up to the function receiving such an argument to notice that type-level flag and handle it by actually performing efficient move-semantics actions. If you didn’t overload sorted and only defined the version with the const vector<int>& parameter, client code could still call it by passing an rvalue, possibly produced by calling std::move on an lvalue, and your code would work, but it would copy the vector once unnecessarily 11 and no moving would occur. So std::move doesn’t necessarily imply moving at all; it’s just a suggestion to the function that it can be moved for efficiency, a suggestion that could be heeded, ignored, or even willfully misinterpreted.

One way it could be ignored is if you try to std::move a const lvalue. Consider this slight modification of our above program:
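A sketch of that modification, in which the only intended change is that a is now const. The function name, b, and the element values are assumptions.

```cpp
#include <cassert>
#include <iostream>
#include <utility>
#include <vector>

void move_from_const() {
    const std::vector<int> a = {3, 1, 4, 1, 5, 9};

    std::vector<int> b(a);              // copy constructor, as before
    std::cout << a.size() << " ";

    std::vector<int> c(std::move(a));   // const rvalue: the copy constructor again
    std::cout << a.size() << "\n";

    assert(a.size() == 6 && b.size() == 6 && c.size() == 6);
    assert(a.data() != c.data());       // c owns its own buffer
}
```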

This compiles fine, but it prints 6 6 : c was not able to move the data out of a . That’s because even though std::move(a) is still an rvalue, this time it’s a const rvalue, and can’t be used to initialize a non-const rvalue reference. But it can be used to initialize a const lvalue reference, so the compiler silently picks the overload of the constructor with that as its parameter, i.e. the copy constructor, instead. The overload with a non-const rvalue reference parameter will only be called on non-const rvalues, not all rvalues.

Even more dramatically: if you wanted, you could overload sorted, or any other constructor or assignment method, in a way such that it treats lvalues and rvalues exactly oppositely for move semantics! That is, you could write overloads of sorted that steal the resources from its argument if you pass in an lvalue argument via an lvalue reference parameter (although it would have to be non-const), but perform a deep copy and allocate a new vector if you pass in an rvalue argument via an rvalue reference parameter (which could be const). The first behavior would likely break your clients’ code and the second behavior would be obtusely inefficient in most cases, but there’s no technical reason you couldn’t write this code. And if you did, then whenever one of your clients tries to pass an lvalue to your function as an argument, they would have to call std::move on it only if they didn’t want it to be moved. Hopefully that thought experiment really drives home how std::move, on its own, doesn’t do any moving at all.

Still, if you ever find yourself passing an lvalue into a function supporting move semantics and you don’t care about the lvalue any more, std::move may save you a copy. However, I must caution here that there’s one place you might think of using it immediately, in the return statement of a function returning an object it constructed, that you almost always shouldn’t.

The logic is compelling, to be sure. As we’ve discussed before, mentally inlining this

results in this code:

That’s one constructor call and two copy constructor calls, because inner_t and ret are both lvalues. If we replaced return inner_t; with return std::move(inner_t); , then in the inlined version, we’d be able to invoke the move constructor rather than the copy constructor for ret. Assuming Thing implements move semantics sensibly, isn’t that better?

Actually, like I mentioned earlier, without the std::move , most compilers will already elide both copy constructor calls and directly construct the Thing into t because of return value optimization, or RVO. There’s a section in the RVO article I linked earlier on why returning by std::move() is an anti-pattern and can even actively make things worse. Part of Effective Modern C++ ’s Item 25 also discusses this. I won’t belabor the details.
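The effect can be measured with a small counting class. This is a sketch: Thing and inner_t are named after the text, but the counters and function names are assumptions, and the exact counts depend on the compiler, so the checks only state what must hold in any case.

```cpp
#include <cassert>
#include <utility>

struct Thing {
    static int moves;
    static int copies;
    Thing() {}
    Thing(const Thing&) { ++copies; }
    Thing(Thing&&) { ++moves; }
};
int Thing::moves = 0;
int Thing::copies = 0;

Thing make_plain() {
    Thing inner_t;
    return inner_t;              // eligible for RVO; otherwise moved, never copied
}

Thing make_with_move() {
    Thing inner_t;
    return std::move(inner_t);   // not eligible for RVO: at least one move
}

void compare_return_styles() {
    Thing a = make_plain();
    int plain_moves = Thing::moves;

    Thing::moves = 0;
    Thing b = make_with_move();
    int forced_moves = Thing::moves;

    (void)a;
    (void)b;
    assert(Thing::copies == 0);
    assert(forced_moves >= 1);
    assert(plain_moves <= forced_moves);
}
```

Some compilers flag the return std::move(inner_t); line with a warning about a pessimizing move, which is the same point made from the compiler's side.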

  • Lvalue and rvalue references enable you to write functions or overloads of functions that can only be called with lvalues or rvalues as arguments.
  • This is useful because many functions can be implemented more efficiently if they know they can clobber or steal their argument’s resources (“moving”), which is strongly but not perfectly correlated with the argument being an rvalue.
  • If a caller has a variable that they are OK with being clobbered or stolen from, they might want to deliberately break the correlation above by casting their variable to an rvalue and passing that into a function. They can do that casting by calling std::move . However, whether the called function will recognize this and do any moving depends on its implementation and is purely a matter of convention, albeit a very strong one.

Aside: Reference Qualifiers

Incidentally, class methods can also have reference qualifiers constraining whether they can be called on lvalues or rvalues, which look like & or && at the end of the signature:

These notations turn the “implicit object parameter” into an lvalue reference or an rvalue reference, sort of as if method were declared as below (and as if it were an operator, or as if C++ supported Uniform Function Call Syntax like D or Nim):

You could use this feature to implement move semantics in methods in terms of the expression they’re invoked on. However, the convention for doing so is not as strong, and the standard library uses it only sparingly (the value() overloads of std::optional, added in C++17, are one example).
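A minimal sketch of reference qualifiers in action, assuming a hypothetical Widget class with a method named which (neither name comes from the text):

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Widget {
    // Chosen when the object expression is an lvalue.
    std::string which() &  { return "lvalue"; }
    // Chosen when the object expression is an rvalue.
    std::string which() && { return "rvalue"; }
};

void demo() {
    Widget w;
    assert(w.which() == "lvalue");             // named variable: lvalue
    assert(Widget().which() == "rvalue");      // temporary: prvalue
    assert(std::move(w).which() == "rvalue");  // cast to xvalue
}
```

An &&-qualified overload could, for example, move a member out of the object instead of copying it, since the object is about to go away anyway.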

Reference Collapsing, Universal References, and Perfect Forwarding

Now that we understand move semantics, we turn to the second application of rvalue references in C++11: allowing perfect forwarding . Again, I won’t try to motivate this very hard, but it might be useful to understand the problem of perfect forwarding with the precise terminology we’ve worked out in this post so far. Let’s keep our setup simple: say you have a function f , which might have overloads and which you can’t change, and you want to write a function g so that calling g with some arguments behaves exactly like calling f with the same arguments. (Defining such a g could be useful if g also does something else additionally or postprocesses the return value of f .) So g would be “forwarding” calls it received to f instead. Let’s even make things easy and say we know f has a single parameter and we know its return type is void . Then a first attempt at implementing g would be:

This seems okay because T can be deduced to any type, including a reference type, so the parameter of g should be inferred to be the same type as the parameter of f . But, by applying what we’ve learned so far, we can see that that actually isn’t true because we lose the information of the value category of the argument. The argument g was called with might have been an lvalue or an rvalue, but the argument that f was called with, which is the variable expression x , is always an lvalue.

Concretely, if f ’s parameter’s type is an rvalue reference, then we could have called f with an rvalue as an argument; but trying to call g with an rvalue as an argument will cause it to try to call f with an lvalue as an argument, which won’t work. Even worse, if f has two overloads that have an lvalue reference parameter and an rvalue reference parameter, respectively, our attempt at forwarding will silently always call the former overload even if passed an rvalue, and then g will not behave like f even though replacing a call to f with a call to g still resulted in code that compiles.
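The silent misdirection described above can be sketched as follows. The two overloads of f and the last_called bookkeeping variable are assumptions made up for the illustration.

```cpp
#include <cassert>
#include <string>

std::string last_called;  // hypothetical bookkeeping, not part of the text

void f(int&)  { last_called = "f(int&)"; }
void f(int&&) { last_called = "f(int&&)"; }

// First attempt at forwarding: x is a named variable, so the expression x
// is always an lvalue, whatever g itself was called with.
template <typename T>
void g(T x) { f(x); }

void demo() {
    int n = 1;
    f(2);
    assert(last_called == "f(int&&)");  // calling f directly with an rvalue
    g(2);
    assert(last_called == "f(int&)");   // the same argument through g
    g(n);
    assert(last_called == "f(int&)");
}
```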

Before we get to how “universal references” resolve this issue, we need to talk about reference collapsing . Much earlier, I said that “you can’t have a reference of a reference”. This is sort of a half-truth. You can’t write int& & y; — it won’t compile, and you have no reason to, as we’ll see very soon. But you can get into a situation where you write something equivalent, with things like typedefs and template expansion. This code compiles:

What is the type of the parameter p ? Is it a reference to a reference to an int ? Well, it turns out that taking a reference to a reference to a type collapses to just taking a reference to that type directly. The resulting type is an lvalue reference if either level of reference was an lvalue-reference; it’s an rvalue reference if both levels of reference were rvalue references. As a list:

  • & & = &
  • & && = &
  • && & = &
  • && && = &&

It’s binary AND where && is true. You can read more about reference collapse on cppreference.com. But, in any case, the trichotomy that every type is exactly one of a non-reference, lvalue reference, and rvalue reference remains complete. You can’t write int& & y; not because there’s no sensible definition, but because, in a rare instance of C++ preventing you from shooting yourself in the foot, you wouldn’t gain anything: that would be exactly equivalent to int& y; .
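The four collapsing rules can be checked at compile time. This is a small sketch; the alias names LRef and RRef and the function takes_ref are illustrative assumptions.

```cpp
#include <cassert>
#include <type_traits>

using LRef = int&;
using RRef = int&&;

// Adding a reference to an alias that is already a reference collapses.
static_assert(std::is_same<LRef&,  int&>::value,  "&  &  collapses to &");
static_assert(std::is_same<LRef&&, int&>::value,  "&  && collapses to &");
static_assert(std::is_same<RRef&,  int&>::value,  "&& &  collapses to &");
static_assert(std::is_same<RRef&&, int&&>::value, "&& && collapses to &&");

// With T explicitly set to int&, the parameter type int& & collapses to int&.
template <typename T>
void takes_ref(T& p) { p += 1; }

void demo() {
    int n = 0;
    takes_ref<int&>(n);
    assert(n == 1);  // p referred to n itself
}
```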

With that in mind, a prototypical universal reference, useful for forwarding, is the type T&& of v in this templated function:

As we learned about when we first met rvalue references, if you have a function f(int&& x) that has a parameter of type int&& , you can only call it with an argument that’s an rvalue. And in fact, if you had a function g2 that was declared to take T& (an lvalue reference to T ) like so,

you could only call g2 with an argument that’s an lvalue. However, because of reference collapsing, you can pass either an lvalue or an rvalue as an argument to g !

  • If you pass an lvalue of, say, type int , then T can be inferred to be int& , so that the parameter is of type int& && = int& (by reference collapsing).
  • If you pass an rvalue of type int , then T can be inferred to be int so that the parameter is of type int&& . (Passing an int rvalue would also work if T were inferred to be int&& and the parameter’s type would be int&& && = int&& , and you could explicitly specify that T be int&& if you wanted, but that just turns out to not be how the template type inference rules are written.)
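The deduction described in these two bullets can be observed directly. In this sketch, t_is_lvalue_ref is a hypothetical helper that reports what T was deduced to be, and g2 mirrors the T& version mentioned above.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Universal reference: T is deduced as an lvalue reference for lvalue
// arguments and as a non-reference for rvalue arguments.
template <typename T>
bool t_is_lvalue_ref(T&&) { return std::is_lvalue_reference<T>::value; }

// Plain lvalue reference parameter: only lvalues are accepted.
template <typename T>
void g2(T&) {}

void demo() {
    int n = 0;
    const int c = 0;
    assert(t_is_lvalue_ref(n));              // T = int&
    assert(t_is_lvalue_ref(c));              // T = const int&
    assert(!t_is_lvalue_ref(5));             // T = int
    assert(!t_is_lvalue_ref(std::move(n)));  // T = int
    g2(n);     // fine: n is an lvalue
    // g2(5);  // would not compile: an rvalue cannot bind to int&
}
```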

This is why T&& is called a universal reference: it’s a reference, but it can be initialized to any argument, lvalue or rvalue, and for that matter, const or non-const.

Does that mean we’re done? Not at all: if we wrote,

we would still always be passing an lvalue as an argument to f , and no amount of fiddling with the type of the parameter v or other aspects of the templating will fix this, because the variable expression v we’re passing as an argument is always an lvalue. To have a chance of passing an rvalue to f , we must pass it some other kind of expression. The best candidate (the only one we’ve really looked at in this post) would be a function call expression. And one function that will solve our problem neatly is the function std::forward .

Here’s how std::forward , which is a templated function taking one type variable T , works:

  • If T is a non-reference (or rvalue reference, but this case won’t really be relevant), std::forward<T> ’s return type is the rvalue reference T&& and its parameter type is the lvalue reference T& . So the argument to std::forward<T> must be an lvalue and the result of calling std::forward<T>(...) will be an rvalue (specifically an xvalue).
  • If T is an lvalue reference, std::forward<T> ’s return type is T itself, which is an lvalue reference, and its parameter type is also T itself, which is an lvalue reference. So the argument to std::forward<T> must be an lvalue and the result of calling std::forward<T>(...) will also be an lvalue.

More briefly, the int specializations of std::forward are:
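As a sketch, those two specializations behave like the plain functions below. The names forward_int and forward_int_ref are hypothetical stand-ins for std::forward<int> and std::forward<int&>; the real std::forward is a template and is not implemented this way literally.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Behaves like std::forward<int>: takes an lvalue, returns an rvalue reference.
int&& forward_int(int& v) { return static_cast<int&&>(v); }

// Behaves like std::forward<int&>: takes an lvalue, returns an lvalue reference.
int& forward_int_ref(int& v) { return v; }

// The real std::forward has the same return types for these two cases.
static_assert(std::is_same<decltype(std::forward<int>(std::declval<int&>())),
                           int&&>::value, "forward<int> yields int&&");
static_assert(std::is_same<decltype(std::forward<int&>(std::declval<int&>())),
                           int&>::value, "forward<int&> yields int&");

void demo() {
    int n = 3;
    int&& r = forward_int(n);
    int& l = forward_int_ref(n);
    assert(&r == &n);  // no copy is made: both results refer to n
    assert(&l == &n);
}
```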

In particular, all specializations of forward have a parameter of lvalue reference type, so you can’t expect the desired reference-ness of T to be inferred solely based on the argument passed to std::forward . You will need to specify T yourself in order to get forward to do anything interesting. And that is exactly what we do to accomplish perfect forwarding:

It may help to imagine the int specializations of g as well. They simplify down to:

The argument we supply to forward in g is always an lvalue, which tracks with the fact that forward ’s parameter is always an lvalue reference. But you can work out how the forwarding occurs now:

  • if g is called with an lvalue argument of type A (which must be non-reference — expressions aren’t reference types), then T will be inferred to be A& , so std::forward<A&> will have return type A& . Thus, calling it will give an lvalue, so f will be called with an lvalue argument;
  • if g is called with an rvalue argument of type A (which, again, must be non-reference), then T will be inferred to be A , so std::forward<A> will have return type A&& and thus calling it will give an rvalue, so f will be called with an rvalue argument. What’s more, the argument is always passed by reference, so f ’s parameter will be an rvalue reference to the exact same thing that g ’s parameter references.

That was a mouthful, but the result is perfect forwarding: g will call f with the same arguments it receives, with the same value categories, so it behaves just as if you had called f.
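Putting the pieces together, here is a minimal sketch that contrasts forwarding without and with std::forward. The overloads of f, the last_called variable, and the name g_naive are assumptions for the illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string last_called;  // hypothetical bookkeeping

void f(int&)  { last_called = "f(int&)"; }
void f(int&&) { last_called = "f(int&&)"; }

// Universal reference, but v is passed on as a plain variable expression,
// which is always an lvalue.
template <typename T>
void g_naive(T&& v) { f(v); }

// Perfect forwarding: std::forward<T> restores the original value category.
template <typename T>
void g(T&& v) { f(std::forward<T>(v)); }

void demo() {
    int n = 0;
    g_naive(2);
    assert(last_called == "f(int&)");   // rvalue-ness was lost
    g(n);
    assert(last_called == "f(int&)");   // T = int&, forwarded as an lvalue
    g(2);
    assert(last_called == "f(int&&)");  // T = int, forwarded as an rvalue
    g(std::move(n));
    assert(last_called == "f(int&&)");
}
```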

For completeness, I’ll quickly mention that you can do perfect forwarding even without knowing how many arguments you’re trying to forward, by using a template parameter pack . But in terms of the types and value categories involved, nothing fundamentally different is going on here. It would look like this, and is likely how you’ll actually see perfect forwarding in the wild or implement it in practice:
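A sketch of the variadic form, assuming a hypothetical overload set for f and a last_called variable that records which overload ran:

```cpp
#include <cassert>
#include <string>
#include <utility>

std::string last_called;  // hypothetical bookkeeping

void f()                    { last_called = "f()"; }
void f(int&)                { last_called = "f(int&)"; }
void f(int&&)               { last_called = "f(int&&)"; }
void f(int&, std::string&&) { last_called = "f(int&, string&&)"; }

// Each argument gets its own deduced type in the pack, and each one is
// forwarded with its own value category.
template <typename... Args>
void g(Args&&... args) { f(std::forward<Args>(args)...); }

void demo() {
    int n = 0;
    g();
    assert(last_called == "f()");
    g(n);
    assert(last_called == "f(int&)");
    g(1);
    assert(last_called == "f(int&&)");
    g(n, std::string("x"));
    assert(last_called == "f(int&, string&&)");
}
```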

Perfect forwarding appears in the standard library in places such as the data structure “emplace” functions (e.g.  vector::emplace_back and map::emplace ), smart pointer construction functions (e.g.  std::make_unique and std::make_shared ), std::forward_as_tuple , and probably others.

(Unsurprisingly, there are actually quite a few ways in which “perfect forwarding” is still imperfect: for some suitably crafted functions f and some arguments, the above g will not behave like f . If you want to learn about them, you may actually want to buy Effective Modern C++ and read Item 30, because wowzers, there are some crazy corner cases and there’s no way I know enough C++ to cover them more effectively.)

Incidentally, std::forward also lets us patch the “manual inlining” model of understanding how functions return values. Ignoring scoping issues and lifetimes, this

should be equivalent to this:

The expression std::forward<ReturnType>(ret) has the same value category as a call expression to a function with return type ReturnType . 12 Admittedly, we introduced another (templated!) function call with this patch, so if we’re trying to strictly simplify the rules we have to remember, we didn’t gain any ground, but I mentioned it in case it helps with intuition anyway.

If perfect forwarding is so good why isn’t there a perfect forwarding 2

A question to think about: Can you write a function that perfectly forwards two disjoint argument lists to two different functions?

That is, if you have two functions f1 and f2 , and you don’t know how many parameters either takes or what types they are, can you write a templated function g such that any caller of g can specify two lists of arguments, and g will behave just as if f1 were called with the first list and f2 were called with the second list?

It’s not easy, but std::pair has a constructor overload that does it, which you have to invoke by prepending a std::piecewise_construct argument and then packing each argument list into a tuple. Honestly, though, I don’t understand this deeply and this post is already far too long, so I’ll leave it at that.
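For a concrete feel, here is a small sketch of that std::pair constructor. The element types and the function name make_pair_piecewise are arbitrary choices for the illustration.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Each tuple holds the constructor arguments for one member of the pair.
std::pair<std::string, std::vector<int>> make_pair_piecewise() {
    return std::pair<std::string, std::vector<int>>(
        std::piecewise_construct,
        std::forward_as_tuple(3, 'a'),  // first  is built as std::string(3, 'a')
        std::forward_as_tuple(2, 7));   // second is built as std::vector<int>(2, 7)
}

void demo() {
    auto p = make_pair_piecewise();
    assert(p.first == "aaa");
    assert(p.second.size() == 2);
    assert(p.second[0] == 7 && p.second[1] == 7);
}
```

std::forward_as_tuple builds a tuple of references, so the two argument lists reach the member constructors with their value categories intact.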

Some unanswered questions

We now understand deeply how C++11 uses rvalue references to achieve move semantics and perfect forwarding, but I don’t know if you have this mathematician’s unease that we made some arbitrary choices along the way about exactly how rvalue references work. In particular, are the reference collapse rules really “canonical”?

It seems that the most direct impetus for the choice of reference collapse rules is just to allow perfect forwarding by allowing universal references to exist — specifically to make it so that, under template<typename T> , the type T&& can be either an lvalue reference or an rvalue reference, but not a non-reference. What’s more, note that you do want the expression-in-terms-of-T to be simple and probably have direct syntax support, because you want to be able to infer T from the type and value category of your argument by following canonical, unsurprising rules when possible. It’s not sufficient to just say your parameter’s type is an unadorned type variable T and require that it be a reference through type utilities through other parts of the templating. That’s already possible with enable_if :

This does produce a function where T must be either an lvalue reference type or an rvalue reference type, and, depending on what T is, can either only be called with lvalues or only be called with rvalues. Unfortunately, T isn’t correctly inferred in calls. Given an argument, the compiler infers its non-reference type for T , which would have worked and would be the most sensible choice without the enable_if , but then finds that it fails to substitute because of the enable_if and doesn’t try to backtrack.

It does work if you explicitly specify T , but of course, that defeats the purpose of type inference entirely:
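A sketch of that enable_if approach, under the assumption that the function looks roughly like the hypothetical h below (the original listing may have differed in detail):

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// T is unadorned in the parameter list, but the enable_if requires it to be
// a reference type. Returns whether T is an lvalue reference.
template <typename T,
          typename = typename std::enable_if<std::is_reference<T>::value>::type>
bool h(T x) {
    (void)x;
    return std::is_lvalue_reference<T>::value;
}

void demo() {
    int n = 0;
    assert(h<int&>(n));               // explicit T = int&: accepts an lvalue
    assert(!h<int&&>(5));             // explicit T = int&&: accepts an rvalue
    assert(!h<int&&>(std::move(n)));
    // h(n);  // would not compile: T is deduced as int, which fails the
    //        // enable_if, and the compiler does not try int& instead
}
```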

So not only do we want a simple type-level expression in terms of T that can be either an lvalue reference or an rvalue reference but not a non-reference, the expression has to be “canonical” enough that we can standardize how to infer T given what kind of reference the expression is. And although we could make T& that expression if we made references collapse the other way (so a reference of a reference would be an lvalue reference only if both original reference operations were lvalue references), we probably wouldn’t want to for backwards-compatibility. So choosing T&& and adopting the direction of reference collapsing to make it universal is a plausible choice.

Still, I suspect that a version of C++ where references didn’t collapse (i.e. taking a reference of a reference would just fail to compile, whether or not there was a typedef or using in the way) and we found a different way to represent universal references would still hold up. One thing to observe is that, even if reference collapse were gone, it’s sufficient to implement perfect forwarding for a single argument with two overloads that take T& and T&& . But we really want something more uniform so we can handle varargs (and so that, even for a finite, known number of arguments n, we don’t need 2^n overloads). It’s possible to try to make the enable_if<is_reference> mess earlier work, because it sort of already means the right thing, but I don’t see a compelling way.

  • One strategy to make it work is giving the compiler the intelligence to change its inference rules after seeing such a template expression, but that seems too much of a brittle special case.
  • Or we could make the compiler backtrack, trying both the non-reference type and the reference type for T , but that threatens exponential blow-up in compilation time with many parameters.

Perhaps we could have chosen some brand new syntax that forces T to be a reference without affecting it, and causes T to be inferred to be either an lvalue reference or an rvalue reference. For example, the natural syntax extension I’m the most confident wouldn’t affect any other part of the syntax would be:

It even seems useful if we could find a way to make this work with non-type-variables, something like:

This would be a function that can take any int argument (in particular, causes arguments to be implicitly converted to int s), but takes it by reference and knows whether the argument was an lvalue or an rvalue. But there’s no obvious way or syntax for the function to access that knowledge. Perhaps we could make it so, if a variable’s type is an rvalue reference, its variable expression is an rvalue? And to allow us to do everything with rvalue references we could do before, we might have an additional std function that casts rvalue references to lvalue references. But this is also doomed because you can’t wait until f is actually called to know if p is an lvalue or an rvalue: if you turn around and call another function with it, you might be calling different overloads with radically different behavior. So then you’d have to template the function twice, which is extremely suspicious if our made-up syntax doesn’t have any template variables in it. 13 Attempting to retrofit other type machinery to dig the information out of p , for example with decltype , seems doomed to failure for the same reasons. Oh well.

Still, there are other ways reference collapse feels like a somewhat arbitrary consequence of trying to achieve a goal while maintaining backwards compatibility. For example, it’s quite annoying to write a templated function that can take arguments of any type, but only if they’re rvalues. That is, you’d want the function’s parameter type to be any rvalue reference, but not an lvalue reference or a non-reference. StackOverflow shows it’s possible , but it’s tough. Compare to how easy it is to write a templated function that can take arguments of any type, but only if they’re lvalues: template<typename T> void f(T& x) , end of story. No matter whether T is an lvalue reference, an rvalue reference, or a non-reference, T& will be an lvalue reference type (possibly via reference collapse), and given the type that T& is equal to, the type that T should be inferred to be is obvious.
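One commonly used way to get an rvalues-only template is to reject the deduction in which T is an lvalue reference. The sketch below assumes hypothetical names rvalue_only and accepts; it is one possible approach, not necessarily the one the linked answer uses.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// T&& is a universal reference, but the enable_if removes the overload
// whenever T is deduced as an lvalue reference (i.e. for lvalue arguments).
template <typename T,
          typename = typename std::enable_if<
              !std::is_lvalue_reference<T>::value>::type>
void rvalue_only(T&&) {}

// Detection helper: true if rvalue_only can be called with an expression
// whose type and value category are described by Arg.
template <typename Arg, typename = void>
struct accepts : std::false_type {};

// rvalue_only returns void, so the decltype matches the default argument
// exactly when the call is well-formed.
template <typename Arg>
struct accepts<Arg, decltype(rvalue_only(std::declval<Arg>()))>
    : std::true_type {};

static_assert(accepts<int>::value, "rvalues are accepted");
static_assert(accepts<int&&>::value, "xvalues are accepted");
static_assert(!accepts<int&>::value, "lvalues are rejected");
static_assert(!accepts<const int&>::value, "const lvalues are rejected");

void demo() {
    int n = 0;
    rvalue_only(5);
    rvalue_only(std::move(n));
    // rvalue_only(n);  // would not compile
}
```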

Rvalue references are a new type of variable or parameter in C++11 that can only be initialized to rvalues. Firstly, this lets you write functions or overloads of functions that only accept rvalues as arguments, which turns out to usually be a great way to detect whether you have permission to clobber your argument or take resources from it. In addition, std::move lets you explicitly give permission to such a function or overload of a function in a call where it would normally not detect it. Secondly, in conjunction with reference collapsing, rvalue references let you write a templated function that can tell whether its argument was an lvalue or an rvalue and make templating decisions accordingly. This enables you to write functions that preserve the value category of their arguments when calling other functions with them, an ability called perfect forwarding. These abilities make C++11 a much more powerful language than its predecessor, just like your C++ skills are probably much more powerful than when you started reading this post which I guess justifies why they incremented the version number so much but Rust’s lifetimes are actually both better and simpler and nobody can convince me otherwise why am I even bothering to write a conclusion, this isn’t an AP exam.

  • New footnote about decltype and its distinguishing variables from expressions. Edited other footnotes to refer to it.
  • New footnote on constructors.
  • New section on reference qualifiers.
  • New footnote on generic lambdas and potential implications.

It may be more likely than you think. You can do it in Python , you can do it in Java … ↩

Although this is a bit out of the way, one place this does make a visible difference in C++ is with decltype . If x is a variable or has a similarly simple form (I won’t list this out; refer to cppreference.com on decltype ), decltype(x) gives the type of the variable, reference-ness and all. But on any more complicated expression, even (x) , decltype gives a type derived from the value category of the expression: lvalue reference for lvalue, rvalue reference for xvalue, and non-reference for prvalue. Here’s a program to show that:
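A compile-time sketch of that difference; the namespace and variable names here are made up for the illustration.

```cpp
#include <type_traits>
#include <utility>

namespace decltype_demo {

int x = 0;
int& rx = x;
int&& rr = std::move(x);

// Applied to a bare variable name, decltype reports the declared type.
static_assert(std::is_same<decltype(x),  int>::value,   "declared type");
static_assert(std::is_same<decltype(rx), int&>::value,  "declared type");
static_assert(std::is_same<decltype(rr), int&&>::value, "declared type");

// Applied to any other expression, it reports the value category.
static_assert(std::is_same<decltype((x)),  int&>::value, "lvalue gives T&");
static_assert(std::is_same<decltype((rr)), int&>::value, "lvalue gives T&");
static_assert(std::is_same<decltype(std::move(x)), int&&>::value,
              "xvalue gives T&&");
static_assert(std::is_same<decltype(x + 1), int>::value, "prvalue gives T");

}  // namespace decltype_demo
```

Note that (rr) is an lvalue even though rr was declared as an rvalue reference, which is the same point the main text makes about named variables.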

This might not seem too bad since, whenever you see decltype , you can immediately tell what variable or expression it’s being applied to. But it can get into much spookier action-at-a-distance when applied to auto . To steal another example from Effective Modern C++ , Item 3:

Well, you could shoehorn it in with code like

but you really shouldn’t. Don’t take my word for it, the C++ FAQ is perfectly adamant about it. ↩

Although the full example is contrived, note that if you pass an lvalue to silly , it just returns a reference to the same lvalue; there’s no undefined behavior, and a function with similar parameter and return types could be useful (e.g. producing a const reference to a field in a struct that it also takes by const reference). And, you can write a function that takes a const reference, does computations with it, and returns a non-reference, which also wouldn’t cause any undefined behavior:

So silly ’s type and the action of passing an rvalue as an argument to a function with silly ’s parameter’s type could both individually make sense (so it’s plausible that compilers don’t warn about the above code), but when combined as in the contrived example above, they don’t. ↩

See also this table by Nicolai Josuttis for the plethora of syntaxes C++ has for initialization. ↩

Section 5 of Thomas Becker’s explainer is dedicated to the question: “Is an Rvalue Reference an Rvalue?” . Jonathan Boccara bolds it twice in “Understanding lvalues, rvalues and their references” . It gets stated explicitly on literally page 3 of Effective Modern C++ in the first code snippet in the introduction. ↩

If you try, you can at least do it with decltype and type support utilities . There may be much easier ways; I’m not good enough at C++ to know. But note that, as mentioned in an earlier footnote, this hinges crucially on the fact that decltype may treat the thing you apply it to as a variable rather than an expression. Anything that can’t do that is doomed.

Examples include A Brief Introduction to Rvalue References and Item 23 of Effective Modern C++ . ↩

The constructor with this exact signature, one parameter of type const lvalue reference to the class the constructor is defined on, is one of a few special methods in that it has a default implementation where it just copies all the fields with each field’s copy constructors. If the class satisfies certain constraints and you don’t opt-out explicitly, the compiler will generate such a constructor automatically. You can also explicitly request this default implementation with = default .

In addition to this constructor, called the copy constructor, the other constructors and methods with default implementations are the default (parameterless) constructor, the move constructor (taking one Thing&& parameter), the copy assignment operator (taking one const Thing& parameter), the move assignment operator (taking one Thing&& parameter), and the destructor. For more details, cppreference.com’s classes page links to each of these. ↩
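For reference, here is a sketch of a class that explicitly requests all six default implementations. Thing and its single data member are illustrative assumptions.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Thing {
    std::vector<int> data;

    Thing() = default;                         // default constructor
    Thing(const Thing&) = default;             // copy constructor
    Thing(Thing&&) = default;                  // move constructor
    Thing& operator=(const Thing&) = default;  // copy assignment operator
    Thing& operator=(Thing&&) = default;       // move assignment operator
    ~Thing() = default;                        // destructor
};

void demo() {
    Thing a;
    a.data = {1, 2, 3};
    Thing b = a;              // defaulted copy constructor copies each field
    b.data.push_back(4);
    assert(a.data.size() == 3);  // a is unaffected by changes to b
    assert(b.data.size() == 4);
    Thing c = std::move(b);   // defaulted move constructor moves each field
    assert(c.data.size() == 4);
}
```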

However, if you actually look it up, vector ’s move-constructor — constructor overload (8) on cppreference.com , as of time of writing — actually explicitly states that the moved-from vector will be empty() , so this program is guaranteed to print 6 0 . vector ’s move-assignment-operator overload (2) might have been a better example: the moved-from vector is “in a valid but unspecified state afterwards.” But the intuitive role that std::move plays is the same. ↩

Well, there’s no guarantee this copy will happen — it’s possible the compiler will optimize it away. ↩

Okay, fine, this is another lie: if ReturnType is a non-reference, then std::forward<ReturnType>(ret) will be an xvalue, but the call expression will be a prvalue. But they’re either both lvalues or both rvalues, which is good enough. ↩

But one reason to have hope anyway is that, as of C++14, auto can also introduce templating in lambdas (“generic lambdas”). Declaring a lambda like this:

roughly declares an implicit class with a templated operator() function like

and then initializes f to an instance of this class. This also works if you replace auto with auto&& : T becomes T&& , a universal reference. And then, you can in fact use decltype(x) to dig out the deduced reference-ness of the parameter, which tells you whether the argument is an lvalue or rvalue.
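A sketch of that idea (C++14 or later). The lambda name is_lvalue_arg and the class name IsLvalueArg are hypothetical; the class only approximates what the compiler generates.

```cpp
#include <cassert>
#include <type_traits>

// Generic lambda: auto&& makes x a universal reference, and decltype(x)
// recovers whether the argument was an lvalue or an rvalue.
auto is_lvalue_arg = [](auto&& x) {
    return std::is_lvalue_reference<decltype(x)>::value;
};

// Roughly the kind of class the lambda stands for.
struct IsLvalueArg {
    template <typename T>
    bool operator()(T&& x) const {
        return std::is_lvalue_reference<decltype(x)>::value;
    }
};

void demo() {
    int n = 0;
    assert(is_lvalue_arg(n));    // decltype(x) is int&
    assert(!is_lvalue_arg(5));   // decltype(x) is int&&
    assert(IsLvalueArg{}(n));
    assert(!IsLvalueArg{}(5));
}
```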

Admittedly, auto already shared a lot of the same type deduction machinery as templates in C++11, so perhaps this is natural. And this doesn’t actually make f ’s type itself templated: it’s an instance of a concrete class, just one with a multitude of instantiations of one method. We’re trying to come up with syntax that turns a method declaration into a templated one with many instantiations. ↩


cppreference.com

Return statement.

Terminates the current function and returns the specified value (if any) to the caller.

Syntax

Explanation

Notes

If control reaches the end of

  • a function with the return type (possibly cv-qualified) void ,
  • a constructor,
  • a destructor, or
  • a function-try-block for a function with the return type (possibly cv-qualified) void

without encountering a return statement, return ; is executed.

If control reaches the end of the main function , return 0 ; is executed.

Flowing off the end of a value-returning function, except main and specific coroutines (since C++20) , without a return statement is undefined behavior.

In a function returning (possibly cv-qualified) void , the return statement with expression can be used, if the expression type is (possibly cv-qualified) void .

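Returning by value may involve construction and copy/move of a temporary object, unless copy elision is used. Specifically, the conditions for copy/move are as follows:

The expression is move-eligible if it is a (possibly parenthesized) id-expression that names a variable of automatic storage duration whose type is a non-volatile object type (or, since C++20, a non-volatile rvalue reference to object type), and that variable is declared

  • in the body or
  • as a parameter

of the innermost enclosing function or lambda expression.

Until C++23, if the expression is move-eligible, overload resolution to select the constructor for the returned value is performed twice: first as if the expression were an rvalue (thus it may select the move constructor ), and, if that fails, then overload resolution is performed as usual, with expression considered as an lvalue (so it may select the copy constructor ).

Since C++23, if the expression is move-eligible, it is treated as an xvalue (thus overload resolution may select the move constructor ).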
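A move-only type makes the effect visible: the functions below compile only because the returned variable is treated as an rvalue. This is a minimal sketch, and the function names are illustrative.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// std::unique_ptr has no copy constructor, so returning a named local by
// value works only because the local is move-eligible.
std::unique_ptr<int> make_value() {
    std::unique_ptr<int> p(new int(42));
    return p;  // local variable declared in the function body
}

// The same applies to a by-value function parameter.
std::unique_ptr<int> pass_through(std::unique_ptr<int> p) {
    return p;  // parameter of the enclosing function
}

void demo() {
    std::unique_ptr<int> a = make_value();
    assert(*a == 42);
    std::unique_ptr<int> b = pass_through(std::move(a));
    assert(a == nullptr);  // ownership was transferred into the parameter
    assert(*b == 42);
}
```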

Keywords

return , co_return

Example

Defect reports

The following behavior-changing defect reports were applied retroactively to previously published C++ standards.

See also

  • copy elision
  • This page was last modified on 22 March 2024, at 18:43.

Pass-by-value, reference, and assignment | Pydon't 🐍

When you call a function in Python and give it some arguments... Are they passed by value? No! By reference? No! They're passed by assignment.

(If you are new here and have no idea what a Pydon't is, you may want to read the Pydon't Manifesto .)

Introduction

Many traditional programming languages employ either one of two models when passing arguments to functions:

  • some languages use the pass-by-value model; and
  • most of the others use the pass-by-reference model.

Having said that, it is important to know the model that Python uses, because that influences the way your code behaves.

In this Pydon't, you will:

  • see that Python doesn't use the pass-by-value nor the pass-by-reference models;
  • understand that Python uses a pass-by-assignment model;
  • learn about the built-in function id ;
  • develop a better understanding of the Python object model;
  • realise that every object has 3 very important properties that define it;
  • understand the difference between mutable and immutable objects;
  • learn the difference between shallow and deep copies; and
  • learn how to use the module copy to do both types of object copies.

Is Python pass-by-value?

In the pass-by-value model, when you call a function with a set of arguments, the data is copied into the function. This means that you can modify the arguments however you please and that you won't be able to alter the state of the program outside the function. This is not what Python does: Python does not use the pass-by-value model.

Looking at the snippet of code that follows, it might look like Python uses pass-by-value:

This looks like the pass-by-value model because we gave it a 3, changed it to a 4, and the change wasn't reflected on the outside ( a is still 3).

But, in fact, Python is not copying the data into the function.

To prove this, I'll show you a different function:

As we can see, the list l , that was defined outside of the function, changed after calling the function clearly_not_pass_by_value . Hence, Python does not use a pass-by-value model.

Is Python pass-by-reference?

In a true pass-by-reference model, the called function gets access to the variables of the caller! Sometimes, it can look like that's what Python does, but Python does not use the pass-by-reference model.

I'll do my best to explain why that's not what Python does:

If Python used a pass-by-reference model, the function would've managed to completely change the value of l outside the function, but that's not what happened, as we can see.

Let me show you an actual pass-by-reference situation.

Here's some Pascal code:

Look at the last lines of that code:

  • we assign 2 to x with x := 2 ;
  • we print x ;
  • we call foo with x as argument; and
  • we print x again.

What's the output of this program?

I imagine that most of you won't have a Pascal interpreter lying around, so you can just go to tio.run and run this code online.

If you run this, you'll see that the output is

which can be rather surprising, if the majority of your programming experience is in Python!

The procedure foo effectively received the variable x and changed the value that it contained. After foo was done, the variable x (that lives outside foo ) had a different value. You can't do anything like this in Python.
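C++ is another language with true pass-by-reference, so the same effect can be sketched there. The bodies of foo and foo_by_value below are hypothetical and are not a translation of the Pascal program above.

```cpp
#include <cassert>

// Pass-by-reference: x is another name for the caller's variable.
void foo(int& x) { x = x + 2; }

// Pass-by-value: x is a private copy, so the caller never sees the change.
void foo_by_value(int x) {
    x = x + 2;
    (void)x;
}

void demo() {
    int x = 2;
    foo_by_value(x);
    assert(x == 2);  // unchanged
    foo(x);
    assert(x == 4);  // the caller's variable itself was modified
}
```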

Python object model

To really understand the way Python behaves when calling functions, it's best if we first understand what Python objects are, and how to characterise them.

The three characteristics of objects

In Python, everything is an object, and each object is characterised by three things:

  • its identity (an integer that uniquely identifies the object, much like social security numbers identify people);
  • a type (that identifies the operations you can do with your object); and
  • the object's content.

Here is an object and its three characteristics:

As we can see above, id is the built-in function you use to query the identity of an object, and type is the built-in function you use to query the type of an object.

(Im)mutability

The (im)mutability of an object depends on its type. In other words, (im)mutability is a characteristic of types, not of specific objects!

But what exactly does it mean for an object to be mutable? Or for an object to be immutable?

Recall that an object is characterised by its identity, its type, and its contents. A type is mutable if you can change the contents of its objects without changing their identity or their type.

Lists are a great example of a mutable data type. Why? Because lists are containers : you can put things inside lists and you can remove stuff from inside those same lists.

Below, you can see how the contents of the list obj change as we make method calls, but the identity of the list remains the same:

However, when dealing with immutable objects, it's a completely different story. If we check an English dictionary, this is what we get for the definition of “immutable”:

adjective: immutable – unchanging over time or unable to be changed.

Immutable objects' contents never change. Take a string as an example:

Strings are a good example for this discussion because, sometimes, they can look mutable. But they are not!

A very good indicator that an object is immutable is when all its methods return something. This is unlike list's .append method, for example! If you use .append on a list, you get no return value. On the other hand, whatever method you use on a string, the result is returned to you:

Notice how obj wasn't updated automatically to "HELLO, WORLD!" . Instead, the new string was created and returned to you.
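A small sketch of this contrast, assuming obj starts out as "Hello, world!" (the names shouted, nums, and result are only illustrative):

```python
obj = "Hello, world!"

shouted = obj.upper()  # builds and returns a brand new string
print(shouted)         # HELLO, WORLD!
print(obj)             # Hello, world!  (the original is untouched)

nums = [1, 2]
result = nums.append(3)  # mutates nums in place...
print(result)            # None  (...and returns nothing useful)
print(nums)              # [1, 2, 3]
```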

Another great hint at the fact that strings are immutable is that you cannot assign to a string's indices:

This shows that, when a string is created, it remains the same. It can be used to build other strings, but the string itself always. stays. unchanged.
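To illustrate, here is a short sketch of what happens if we try to assign to an index of a string. The try/except is only there so that the snippet runs to completion; the variable message is an illustrative name.

```python
obj = "Hello, world!"

try:
    obj[0] = "J"  # strings do not allow item assignment
except TypeError as error:
    message = str(error)
    print(message)  # 'str' object does not support item assignment

print(obj)  # Hello, world!
```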

As a reference, int , float , bool , str , tuple , and complex are the most common types of immutable objects; list , set , and dict are the most common types of mutable objects.

Variable names as labels

Another important thing to understand is that a variable name has very little to do with the object itself.

In fact, the name obj was just a label that I decided to attach to the object that has identity 2698212637504, has the list type, and contents 1, 2, 3.

Just like I attached the label obj to that object, I can attach many more names to it:

Again, these names are just labels. Labels that I decided to stick to the same object. How can we know it's the same object? Well, all their “social security numbers” (the ids) match, so they must be the same object:

Therefore, we conclude that foo , bar , baz , and obj , are variable names that all refer to the same object.
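In code, attaching those extra labels and comparing the ids could look like the sketch below (the actual id values vary between runs, so only their equality is shown):

```python
obj = [1, 2, 3]

# Attach three more labels to the same object.
foo = bar = baz = obj

# All four names report the same identity.
print(id(foo) == id(bar) == id(baz) == id(obj))  # True
```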

The operator is

This is exactly what the operator is does: it checks if the two objects are the same .

For two objects to be the same, they must have the same identity:

It is not enough to have the same type and contents! We can create a new list with contents [1, 2, 3] and that will not be the same object as obj :

Think of it in terms of perfect twins. When two siblings are perfect twins, they look identical. However, they are different people!
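A brief sketch of the difference between being the same object and merely looking the same (the name twin is illustrative):

```python
obj = [1, 2, 3]
foo = obj          # another label for the same object
twin = [1, 2, 3]   # a new object with the same type and contents

print(foo is obj)   # True: same identity
print(twin == obj)  # True: same contents
print(twin is obj)  # False: different objects, like perfect twins
```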

Just as a side note, but an important one, you should be aware of the operator is not .

Generally speaking, when you want to negate a condition, you put a not in front of it:

So, if you wanted to check if two variables point to different objects, you could be tempted to write

However, Python has the operator is not , which is much more similar to a proper English sentence, which I think is really cool!

Therefore, the example above should actually be written

Python does a similar thing for the in operator, providing a not in operator as well... How cool is that?!
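The following sketch puts the two spellings side by side, using two illustrative lists a and b:

```python
a = [1, 2, 3]
b = [1, 2, 3]

# Negating with a leading `not` works, but reads awkwardly.
print(not a is b)  # True

# The dedicated operator reads like English.
print(a is not b)  # True

# The same idea applies to membership tests.
print(4 not in a)  # True
```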

Assignment as nicknaming

If we keep pushing this metaphor forward, assigning variables is just like giving a new nickname to someone.

My friends from middle school call me “Rojer”. My friends from college call me “Girão”. People I am not close to call me by my first name – “Rodrigo”. However, regardless of what they call me, I am still me , right?

If one day I decide to change my haircut, everyone will see the new haircut, regardless of what they call me!

In a similar fashion, if I modify the contents of an object, I can use whatever nickname I prefer to see that those changes happened. For example, we can change the middle element of the list we have been playing around with:

We used the nickname foo to modify the middle element, but that change is visible from all other nicknames as well.

Because they all pointed at the same list object.
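As a sketch (the replacement value 42 is arbitrary), modifying the list through one nickname and reading it back through the others looks like this:

```python
obj = [1, 2, 3]
foo = bar = baz = obj

foo[1] = 42  # change the middle element through the nickname foo

print(obj)  # [1, 42, 3]
print(bar)  # [1, 42, 3]
print(baz)  # [1, 42, 3]
```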

Python is pass-by-assignment

Having laid out all of this, we are now ready to understand how Python passes arguments to functions.

When we call a function, each parameter of the function is assigned the object that was passed in for it. In essence, each parameter becomes a new nickname for the object that was given.

Immutable arguments

If we pass in immutable arguments, then we have no way of modifying the arguments themselves. After all, that's what immutable means: “doesn't change”.

That is why it can look like Python uses the pass-by-value model. Because the only way in which we can have the parameter hold something else is by assigning it to a completely different thing. When we do that, we are reusing the same nickname for a different object:

In the example above, when we call foo with the argument 5 , it's as if we were doing bar = 5 at the beginning of the function.

Immediately after that, we have bar = 3 . This means “take the nickname "bar" and point it at the integer 3 ”. Python doesn't care that bar , as a nickname (as a variable name) had already been used. It is now pointing at that 3 !
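A minimal sketch of a function matching that description follows. The names foo and bar come from the text; the variable x and the return statement are added here so the effect can be observed.

```python
def foo(bar):
    # On entry it is as if `bar = 5` had just run.
    bar = 3  # re-point the nickname "bar" at a different object
    return bar

x = 5
result = foo(x)

print(result)  # 3
print(x)       # 5: the caller's object was never touched
```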

Mutable arguments

On the other hand, mutable arguments can be changed. We can modify their internal contents. A prime example of a mutable object is a list: its elements can change (and so can its length).

That is why it can look like Python uses a pass-by-reference model. However, when we change the contents of an object, we don't change the identity of the object itself. Similarly, when you change your haircut or your clothes, your social security number does not change:
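Here is a sketch of that situation with a hypothetical function append_zero that mutates the list it receives:

```python
def append_zero(values):
    # `values` is just another nickname for the caller's list.
    values.append(0)

numbers = [1, 2]
identity_before = id(numbers)

append_zero(numbers)

print(numbers)                         # [1, 2, 0]
print(id(numbers) == identity_before)  # True: new contents, same object
```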


Beware when calling functions

This goes to show you should be careful when defining your functions. If your function expects mutable arguments, you should do one of two things:

  • do not mutate the argument in any way whatsoever; or
  • document explicitly that the argument may be mutated.

Personally, I prefer to go with the first approach: to not mutate the argument; but there are times and places for the second approach.

Sometimes, you do need to use the argument as the basis for some kind of transformation, which means you may want to mutate it. You might consider making a copy of the argument first (discussed in the next section), but making that copy can be resource intensive. In those cases, mutating the argument might be the only sensible choice.

Making copies

Shallow vs deep copies.

“Copying an object” means creating a second object that has a different identity (therefore, is a different object) but that has the same contents. Generally speaking, we copy one object so that we can work with it and mutate it, while also preserving the first object.

When copying objects, there are a couple of nuances that should be discussed.

Copying immutable objects

The first thing that needs to be said is that, for immutable objects, it does not make sense to talk about copies.

“Copies” only make sense for mutable objects. If your object is immutable, and if you want to preserve a reference to it, you can just do a second assignment and work on it:

Or, sometimes, you can just call methods and other functions directly on the original, because the original is not going anywhere:

So, we only need to worry about mutable objects.

Shallow copy

Many mutable objects can contain, themselves, mutable objects. Because of that, two types of copies exist:

  • shallow copies; and
  • deep copies.

The difference lies in what happens to the mutable objects inside the mutable objects.

Lists and dictionaries have a method .copy that returns a shallow copy of the corresponding object.

Let's look at an example with a list:

First, we create a list inside a list, and we copy the outer list. Now, because it is a copy , the copied list isn't the same object as the original outer list:

But if they are not the same object, then we can modify the contents of one of the lists, and the other won't reflect the change:

That's what we saw: we changed the first element of the copy_list , and the outer_list remained unchanged.

Now, we try to modify the contents of sublist , and that's when the fun begins!

When we modify the contents of sublist , both the outer_list and the copy_list reflect those changes...

But wasn't the copy supposed to give me a second list that I could change without affecting the first one? Yes! And that is what happened!

In fact, modifying the contents of sublist doesn't really modify the contents of either copy_list or outer_list : after all, the third element of both was pointing at a list object, and it still is! It's the (inner) contents of the object to which we are pointing that changed.

Sometimes, we don't want this to happen: sometimes, we don't want mutable objects to share inner mutable objects.
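The whole walkthrough above can be sketched as follows. The names sublist, outer_list, and copy_list come from the text; the concrete values (42, 73, 0, 999) are assumptions made for illustration.

```python
sublist = []
outer_list = [42, 73, sublist]
copy_list = outer_list.copy()  # shallow copy of the outer list

print(copy_list is outer_list)  # False: two distinct outer lists

copy_list[0] = 0   # change the first element of the copy only
print(outer_list)  # [42, 73, []]

sublist.append(999)  # mutate the shared inner list
print(outer_list)    # [42, 73, [999]]
print(copy_list)     # [0, 73, [999]]

# Both outer lists still point at the very same inner list.
print(copy_list[2] is outer_list[2])  # True
```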

Common shallow copy techniques

When working with lists, it is common to use slicing to produce a shallow copy of a list:

Using the built-in function for the respective type, on the object itself, also builds shallow copies. This works for lists and dictionaries, and is likely to work for other mutable types.

Here is an example with a list inside a list:

And here is an example with a list inside a dictionary:
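A sketch of these techniques, with illustrative names and values, showing that each one produces a new outer object that still shares its inner list:

```python
outer_list = [42, 73, [999]]

by_slice = outer_list[:]         # slicing builds a shallow copy
by_builtin = list(outer_list)    # so does calling list on the list

print(by_slice is outer_list)            # False
print(by_builtin is outer_list)          # False
print(by_slice[2] is outer_list[2])      # True: inner list is shared
print(by_builtin[2] is outer_list[2])    # True

outer_dict = {"values": [1, 2]}
dict_copy = dict(outer_dict)     # shallow copy of a dictionary

outer_dict["values"].append(3)   # mutate the shared inner list
print(dict_copy)                 # {'values': [1, 2, 3]}
```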

When you want to copy an object “thoroughly”, and you don't want the copy to share references to inner objects, you need to do a “deep copy” of your object. You can think of a deep copy as a recursive algorithm.

You copy the elements of the first level and, whenever you find a mutable element on the first level, you recurse down and copy the contents of those elements.

To show this idea, here is a simple recursive implementation of a deep copy for lists that contain other lists:

We can use this function to copy the previous outer_list and see what happens:

As you can see here, modifying the contents of sublist only affected outer_list indirectly; it didn't affect copy_list .
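The original implementation is not reproduced here, so the following is only a sketch of what a function like list_deepcopy could look like, assuming the input contains nothing mutable other than nested lists. The values 42 and 73 are illustrative.

```python
def list_deepcopy(lst):
    """Recursively copy a list whose only mutable elements are lists."""
    return [
        list_deepcopy(elem) if isinstance(elem, list) else elem
        for elem in lst
    ]

sublist = []
outer_list = [42, 73, sublist]
copy_list = list_deepcopy(outer_list)

sublist.append(73)  # mutate the original inner list

print(outer_list)  # [42, 73, [73]]
print(copy_list)   # [42, 73, []]  (the copy has its own inner list)
```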

Sadly, the list_deepcopy function I implemented isn't very robust, nor versatile, but the Python Standard Library has got us covered!

The module copy and the method deepcopy

The module copy is exactly what we need. The module provides two useful functions:

  • copy.copy for shallow copies; and
  • copy.deepcopy for deep copies.

And that's it! And, what is more, the method copy.deepcopy is smart enough to handle issues that might arise with circular definitions, for example! That is, when an object contains another that contains the first one: a naïve recursive implementation of a deep copy algorithm would enter an infinite loop!
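As a small sketch of the circular case, here is a list that contains itself (the names loop and clone are illustrative). copy.deepcopy terminates and reproduces the cycle inside the copy:

```python
import copy

loop = [1, 2]
loop.append(loop)  # the list now contains itself

clone = copy.deepcopy(loop)  # no infinite recursion

print(clone is loop)      # False: a genuinely new object
print(clone[2] is clone)  # True: the cycle points at the copy, not the original
```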

If you write your own custom objects and you want to specify how shallow and deep copies of those should be made, you only need to implement __copy__ and __deepcopy__ , respectively!
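For illustration, here is a sketch with a hypothetical class Basket that customises both kinds of copy. It is deliberately simple: a production __deepcopy__ would usually also register the new object in memo so that cycles through the object itself are handled.

```python
import copy

class Basket:
    """Hypothetical container used only to illustrate the two hooks."""

    def __init__(self, items):
        self.items = items

    def __copy__(self):
        # Shallow copy: a new Basket that shares the same items list.
        return Basket(self.items)

    def __deepcopy__(self, memo):
        # Deep copy: a new Basket with its own deep copy of the items.
        return Basket(copy.deepcopy(self.items, memo))

basket = Basket([["apple"], ["pear"]])
shallow = copy.copy(basket)
deep = copy.deepcopy(basket)

print(shallow.items is basket.items)       # True
print(deep.items is basket.items)          # False
print(deep.items[0] is basket.items[0])    # False
```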

It's a great module, in my opinion.

Examples in code

Now that we have gone deep into the theory – pun intended –, it is time to show you some actual code that plays with these concepts.

Mutable default arguments

Let's start with a Twitter favourite:

Python 🐍 is an incredible language but sometimes appears to have quirks 🤪 For example, one thing that often confuses beginners is why you shouldn't use lists as default values 👇 Here is a thread 👇🧵 that will help you understand this 💯 pic.twitter.com/HVhPjS2PSH — Rodrigo 🐍📝 (@mathsppblog) October 5, 2021

Apparently, it's a bad idea to use mutable objects as default arguments. Here is a snippet showing you why:

The function above appends an element to a list and, if no list is given, appends it to an empty list by default.

Great, let's put this function to good use:

We use it once with 1 , and we get a list with the 1 inside. Then, we use it to append a 1 to another list we had. And finally, we use it to append a 3 to an empty list... Except that's not what happens!

As it turns out, when we define a function, the default arguments are created and stored in a special place:

What this means is that the default argument is always the same object. Therefore, because it is a mutable object, its contents can change over time. That is why, in the code above, __defaults__ shows a list with two items already.
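The snippets for this example are not available here, so below is a sketch that follows the description in the text. The function name my_append, the parameter name lst, and the list [42, 73] are assumptions made for illustration.

```python
def my_append(elem, lst=[]):  # the default list is created once, at definition time
    lst.append(elem)
    return lst

print(my_append(1))            # [1]
print(my_append(1, [42, 73]))  # [42, 73, 1]
print(my_append(3))            # [1, 3]  (not [3]: the default list was reused)

# The default object lives on the function and has accumulated two items.
print(my_append.__defaults__)  # ([1, 3],)
```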

If we redefine the function, then its __defaults__ shows an empty list:

This is why, in general, mutable objects shouldn't be used as default arguments.

The standard practice, in these cases, is to use None and then use Boolean short-circuiting to assign the default value:

With this implementation, the function now works as expected:
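A sketch of that practice, again using the assumed names my_append and lst:

```python
def my_append(elem, lst=None):
    # `lst or []` evaluates to a fresh list whenever lst is None.
    lst = lst or []
    lst.append(elem)
    return lst

print(my_append(1))            # [1]
print(my_append(1, [42, 73]))  # [42, 73, 1]
print(my_append(3))            # [3]
print(my_append.__defaults__)  # (None,)
```

One design caveat with the short-circuiting form: an empty list is also falsy, so a caller who passes in their own empty list gets a new list back instead of having theirs extended. If that matters, an explicit check such as "if lst is None: lst = []" avoids it.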

is not None

Searching through the Python Standard Library shows that the is not operator is used a bit over 5,000 times. That's a lot.

And, by and large, that operator is almost always followed by None . In fact, is not None appears 3169 times in the standard library!

x is not None does exactly what it says: it checks whether x is something other than None .

Here is a simple example usage of that, from the argparse module to create command line interfaces:

Even without a great deal of context, we can see what is happening: when displaying command help for a given section, we may want to indent it (or not) to show hierarchical dependencies.

If a section's parent is None , then that section has no parent, and there is no need to indent. In other words, if a section's parent is not None , then we want to indent it. Notice how my English matches the code exactly!
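The argparse source itself is not reproduced here. The following is a hypothetical sketch of the same idea, with an invented Section class that indents its title once for every ancestor it has:

```python
class Section:
    """Hypothetical help section; this is not the argparse implementation."""

    def __init__(self, title, parent=None):
        self.title = title
        self.parent = parent

    def depth(self):
        depth = 0
        node = self
        # Walk up the hierarchy for as long as there is a parent.
        while node.parent is not None:
            depth += 1
            node = node.parent
        return depth

    def render(self):
        # Two spaces of indentation per level of nesting.
        return "  " * self.depth() + self.title

root = Section("usage")
child = Section("options", parent=root)

print(root.render())   # usage
print(child.render())  #   options
```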

Deep copy of the system environment

The method copy.deepcopy is used a couple of times in the standard library, and here I'd like to show an example usage where a dictionary is copied.

The module os provides the attribute environ , similar to a dictionary, that contains the environment variables that are defined.

Here are a couple of examples from my (Windows) machine:

The module http.server provides some classes for basic HTTP servers.

One of those classes, CGIHTTPRequestHandler , implements an HTTP server that can also run CGI scripts and, in its run_cgi method, it needs to set a bunch of environment variables.

These environment variables are set to give the necessary context for the CGI script that is going to be run. However, we don't want to actually modify the current environment!

So, what we do is create a deep copy of the environment, and then we modify it to our heart's content! After we are done, we tell Python to execute the CGI script, and we provide the altered environment as an argument.

The exact way in which this is done may not be trivial to understand. I, for one, don't think I could explain it to you. But that doesn't mean we can't infer parts of it:

Here is the code:

As we can see, we copied the environment and defined some variables. Finally, we created a new subprocess that gets the modified environment.
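The standard library code is not reproduced here. As a rough sketch of the same pattern (not the run_cgi implementation), the snippet below copies the environment, adds a made-up variable PYDONT_DEMO, and hands the altered copy to a child Python process. Converting os.environ to a plain dict before copying is a choice made for this sketch, so that writes to the copy cannot reach the real environment.

```python
import copy
import os
import subprocess
import sys

# Work on a copy so the current process's environment stays untouched.
env = copy.deepcopy(dict(os.environ))
env["PYDONT_DEMO"] = "hello"  # hypothetical variable, for illustration only

# Run a child process that receives the altered environment.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['PYDONT_DEMO'])"],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)

print(child.stdout.strip())  # hello
```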

Here's the main takeaway of this Pydon't, for you, on a silver platter:

“ Python uses a pass-by-assignment model, and understanding it requires you to realise all objects are characterised by an identity number, their type, and their contents. ”

This Pydon't showed you that:

  • Python doesn't use the pass-by-value model, nor the pass-by-reference one;
  • Python uses a pass-by-assignment model (using “nicknames”);
  • every object is characterised by three things:
      • its identity;
      • its type; and
      • its contents;
  • the id function is used to query an object's identifier;
  • the type function is used to query an object's type;
  • the type of an object determines whether it is mutable or immutable;
  • shallow copies copy the reference of nested mutable objects;
  • deep copies perform copies that allow one object, and its inner elements, to be changed without ever affecting the copy;
  • copy.copy and copy.deepcopy can be used to perform shallow/deep copies; and
  • you can implement __copy__ and __deepcopy__ if you want your own objects to be copiable.




Assignment (=)

The assignment ( = ) operator is used to assign a value to a variable or property. The assignment expression itself has a value, which is the assigned value. This allows multiple assignments to be chained in order to assign a single value to multiple variables.

x: A valid assignment target, including an identifier or a property accessor . It can also be a destructuring assignment pattern .

y: An expression specifying the value to be assigned to x .

Return value

The value of y .

Exceptions

ReferenceError: Thrown in strict mode if assigning to an identifier that is not declared in the scope.

TypeError: Thrown in strict mode if assigning to a property that is not modifiable .

Description

The assignment operator is completely different from the equals ( = ) sign used as syntactic separators in other locations, which include:

  • Initializers of var , let , and const declarations
  • Default values of destructuring
  • Default parameters
  • Initializers of class fields

All these places accept an assignment expression on the right-hand side of the = , so if you have multiple equals signs chained together, as in const x = y = 5; , this is equivalent to const x = (y = 5); .

This means y must be a pre-existing variable, and x is a newly declared const variable. y is assigned the value 5 , and x is initialized with the value of the y = 5 expression, which is also 5 . If y is not a pre-existing variable, a global variable y is implicitly created in non-strict mode , or a ReferenceError is thrown in strict mode. To declare two variables within the same declaration, use const x = 5, y = 5; instead.

Simple assignment and chaining

Value of assignment expressions

The assignment expression itself evaluates to the value of the right-hand side, so you can log the value and assign to a variable at the same time.

Unqualified identifier assignment

The global object sits at the top of the scope chain. When attempting to resolve a name to a value, the scope chain is searched. This means that properties on the global object are conveniently visible from every scope, without having to qualify the names with globalThis. or window. or global. .

Because the global object has a String property ( Object.hasOwn(globalThis, "String") ), you can use the following code:

So the global object will ultimately be searched for unqualified identifiers. You don't have to type globalThis.String ; you can just type the unqualified String . To make this feature more conceptually consistent, assignment to unqualified identifiers will assume you want to create a property with that name on the global object (with globalThis. omitted), if there is no variable of the same name declared in the scope chain.

In strict mode , assignment to an unqualified identifier that has not been declared will result in a ReferenceError , to avoid the accidental creation of properties on the global object.

Note that the implication of the above is that, contrary to popular misinformation, JavaScript does not have implicit or undeclared variables. It just conflates the global object with the global scope and allows omitting the global object qualifier during property creation.

Assignment with destructuring

The left-hand side can also be an assignment pattern. This allows assigning to multiple variables at once.

For more information, see Destructuring assignment .



named return values and reference-assignment operators #286

andrewrk commented Mar 27, 2017 • edited

raulgrell commented Mar 27, 2017 • edited

thejoshwolfe commented Mar 28, 2017

raulgrell commented Mar 28, 2017

andrewrk commented Mar 28, 2017


Resolve nullable warnings


This article covers the following compiler warnings:

  • CS8597 - Thrown value may be null.
  • CS8600 - Converting null literal or possible null value to non-nullable type.
  • CS8601 - Possible null reference assignment.
  • CS8602 - Dereference of a possibly null reference.
  • CS8603 - Possible null reference return.
  • CS8604 - Possible null reference argument for parameter.
  • CS8605 - Unboxing a possibly null value.
  • CS8607 - A possible null value may not be used for a type marked with [NotNull] or [DisallowNull]
  • CS8608 - Nullability of reference types in type doesn't match overridden member.
  • CS8609 - Nullability of reference types in return type doesn't match overridden member.
  • CS8610 - Nullability of reference types in type parameter doesn't match overridden member.
  • CS8611 - Nullability of reference types in type parameter doesn't match partial method declaration.
  • CS8612 - Nullability of reference types in type doesn't match implicitly implemented member.
  • CS8613 - Nullability of reference types in return type doesn't match implicitly implemented member.
  • CS8614 - Nullability of reference types in type of parameter doesn't match implicitly implemented member.
  • CS8615 - Nullability of reference types in type doesn't match implemented member.
  • CS8616 - Nullability of reference types in return type doesn't match implemented member.
  • CS8617 - Nullability of reference types in type of parameter doesn't match implemented member.
  • CS8618 - Non-nullable variable must contain a non-null value when exiting constructor. Consider declaring it as nullable.
  • CS8619 - Nullability of reference types in value doesn't match target type.
  • CS8620 - Argument cannot be used for parameter due to differences in the nullability of reference types.
  • CS8621 - Nullability of reference types in return type doesn't match the target delegate (possibly because of nullability attributes).
  • CS8622 - Nullability of reference types in type of parameter doesn't match the target delegate (possibly because of nullability attributes).
  • CS8624 - Argument cannot be used as an output due to differences in the nullability of reference types.
  • CS8625 - Cannot convert null literal to non-nullable reference type.
  • CS8629 - Nullable value type may be null.
  • CS8631 - The type cannot be used as type parameter in the generic type or method. Nullability of type argument doesn't match constraint type.
  • CS8633 - Nullability in constraints for type parameter of method doesn't match the constraints for type parameter of interface method. Consider using an explicit interface implementation instead.
  • CS8634 - The type cannot be used as type parameter in the generic type or method. Nullability of type argument doesn't match 'class' constraint.
  • CS8643 - Nullability of reference types in explicit interface specifier doesn't match interface implemented by the type.
  • CS8644 - Type does not implement interface member. Nullability of reference types in interface implemented by the base type doesn't match.
  • CS8645 - Member is already listed in the interface list on type with different nullability of reference types.
  • CS8655 - The switch expression does not handle some null inputs (it is not exhaustive).
  • CS8667 - Partial method declarations have inconsistent nullability in constraints for type parameter.
  • CS8670 - Object or collection initializer implicitly dereferences possibly null member.
  • CS8714 - The type cannot be used as type parameter in the generic type or method. Nullability of type argument doesn't match 'notnull' constraint.
  • CS8762 - Parameter must have a non-null value when exiting.
  • CS8763 - A method marked [DoesNotReturn] should not return.
  • CS8764 - Nullability of return type doesn't match overridden member (possibly because of nullability attributes).
  • CS8765 - Nullability of type of parameter doesn't match overridden member (possibly because of nullability attributes).
  • CS8766 - Nullability of reference types in return type of doesn't match implicitly implemented member (possibly because of nullability attributes).
  • CS8767 - Nullability of reference types in type of parameter of doesn't match implicitly implemented member (possibly because of nullability attributes).
  • CS8768 - Nullability of reference types in return type doesn't match implemented member (possibly because of nullability attributes).
  • CS8769 - Nullability of reference types in type of parameter doesn't match implemented member (possibly because of nullability attributes).
  • CS8770 - Method lacks [DoesNotReturn] annotation to match implemented or overridden member.
  • CS8774 - Member must have a non-null value when exiting.
  • CS8776 - Member cannot be used in this attribute.
  • CS8775 - Member must have a non-null value when exiting.
  • CS8777 - Parameter must have a non-null value when exiting.
  • CS8819 - Nullability of reference types in return type doesn't match partial method declaration.
  • CS8824 - Parameter must have a non-null value when exiting because parameter is non-null.
  • CS8825 - Return value must be non-null because parameter is non-null.
  • CS8847 - The switch expression does not handle some null inputs (it is not exhaustive). However, a pattern with a 'when' clause might successfully match this value.

The purpose of nullable warnings is to minimize the chance that your application throws a System.NullReferenceException when run. To achieve this goal, the compiler uses static analysis and issues warnings when your code has constructs that may lead to null reference exceptions. You provide the compiler with information for its static analysis by applying type annotations and attributes. These annotations and attributes describe the nullability of arguments, parameters, and members of your types. In this article, you'll learn different techniques to address the nullable warnings the compiler generates from its static analysis. The techniques described here are for general C# code. Learn to work with nullable reference types and Entity Framework core in Working with nullable reference types .

You'll address almost all warnings using one of four techniques:

  • Adding necessary null checks.
  • Adding ? or ! nullable annotations.
  • Adding attributes that describe null semantics.
  • Initializing variables correctly.

Possible dereference of null

This set of warnings alerts you that you're dereferencing a variable whose null-state is maybe-null . These warnings are:

The following code demonstrates one example of each of the preceding warnings:

In the example above, the warning is because the Container , c , may have a null value for the States property. Assigning new states to a collection that might be null causes the warning.

To remove these warnings, you need to add code to change that variable's null-state to not-null before dereferencing it. The collection initializer warning may be harder to spot. The compiler detects that the collection is maybe-null when the initializer adds elements to it.

In many instances, you can fix these warnings by checking that a variable isn't null before dereferencing it. Consider the following that adds a null check before dereferencing the message parameter:

The following example initializes the backing storage for the States and removes the set accessor. Consumers of the class can modify the contents of the collection, and the storage for the collection is never null :

Other instances when you get these warnings may be false positives. You may have a private utility method that tests for null. The compiler doesn't know that the method provides a null check. Consider the following example that uses a private utility method, IsNotNull :

The compiler warns that you may be dereferencing null when you write the property message.Length because its static analysis determines that message may be null . You may know that IsNotNull provides a null check, and when it returns true , the null-state of message should be not-null . You must tell the compiler those facts. One way is to use the null forgiving operator, ! . You can change the WriteLine statement to match the following code:

The null forgiving operator makes the expression not-null even if it was maybe-null without the ! applied. In this example, a better solution is to add an attribute to the signature of IsNotNull :

The System.Diagnostics.CodeAnalysis.NotNullWhenAttribute informs the compiler that the argument used for the obj parameter is not-null when the method returns true . When the method returns false , the argument has the same null-state it had before the method was called.

There's a rich set of attributes you can use to describe how your methods and properties affect null-state . You can learn about them in the language reference article on Nullable static analysis attributes .

Fixing a warning for dereferencing a maybe-null variable involves one of three techniques:

  • Add a missing null check.
  • Add null analysis attributes on APIs to affect the compiler's null-state static analysis. These attributes inform the compiler when a return value or argument should be maybe-null or not-null after calling the method.
  • Apply the null forgiving operator ! to the expression to force the state to not-null .

Possible null assigned to a nonnullable reference

This set of warnings alerts you that you're assigning a variable whose type is nonnullable to an expression whose null-state is maybe-null . These warnings are:

The compiler emits these warnings when you attempt to assign an expression that is maybe-null to a variable that is nonnullable. For example:

The different warnings provide details about the code, such as assignment, unboxing assignment, return statements, arguments to methods, and throw expressions.

You can take one of three actions to address these warnings. One is to add the ? annotation to make the variable a nullable reference type. That change may cause other warnings. Changing a variable from a non-nullable reference to a nullable reference changes its default null-state from not-null to maybe-null . The compiler's static analysis may find instances where you dereference a variable that is maybe-null .

The other actions instruct the compiler that the right-hand-side of the assignment is not-null. The expression on the right-hand-side could be null-checked before assignment, as shown in the following example:
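One way to write such a check is the null-coalescing operator, sketched here with the same hypothetical TryGetMessage lookup:

```csharp
#nullable enable
using System;

// Hypothetical lookup that returns null for unknown ids.
static string? TryGetMessage(int id) => id == 42 ? "Hello" : null;

// The ?? operator supplies a fallback, so the whole expression is not-null.
string notNullMsg = TryGetMessage(7) ?? "Unknown message id: 7";
Console.WriteLine(notNullMsg); // prints "Unknown message id: 7"
```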

The previous examples demonstrate assignment of the return value of a method. You may annotate the method (or property) to indicate when a method returns a not-null value. The System.Diagnostics.CodeAnalysis.NotNullIfNotNullAttribute often specifies that a return value is not-null when an input argument is not-null. Another alternative is to add the null forgiving operator, !, to the right-hand side:
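The sketch below illustrates both options. The Text class and its TrimOrNull and TryGetMessage methods are hypothetical names chosen for this example:

```csharp
#nullable enable
using System;
using System.Diagnostics.CodeAnalysis;

// No warning: the argument is not-null, so the annotated result is not-null too.
string trimmed = Text.TrimOrNull("  hi  ");

// The ! suppresses the warning only; the value is still null at run time if the lookup fails.
string msg = Text.TryGetMessage(42)!;

Console.WriteLine($"{trimmed} {msg}");

static class Text // hypothetical helper class
{
    // The return value is not-null whenever the input argument is not-null.
    [return: NotNullIfNotNull("input")]
    public static string? TrimOrNull(string? input)
    {
        if (input is null) return null;
        return input.Trim();
    }

    public static string? TryGetMessage(int id) => id == 42 ? "Hello" : null;
}
```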

Fixing a warning for assigning a maybe-null expression to a not-null variable involves one of four techniques:

  • Change the left side of the assignment to a nullable type. This action may introduce new warnings when you dereference that variable.
  • Provide a null-check before the assignment.
  • Annotate the API that produces the right-hand side of the assignment.
  • Add the null forgiving operator to the right-hand side of the assignment.

Nonnullable reference not initialized

This set of warnings alerts you that a nonnullable field or property isn't guaranteed to be initialized to a non-null value. These warnings include CS8618, which the compiler reports when a nonnullable member might still be null when a constructor exits.

Consider the following class as an example:
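A sketch of what such a class might look like, with a short usage snippet added here for illustration:

```csharp
#nullable enable
using System;

var p = new Person();
// The declared type says "never null", but nothing initialized the property.
Console.WriteLine(p.FirstName is null); // prints True

public class Person
{
    // Warning on both properties: a nonnullable member isn't initialized
    // when the (implicit) constructor exits.
    public string FirstName { get; set; }
    public string LastName { get; set; }
}
```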

Neither FirstName nor LastName is guaranteed to be initialized. If this code is new, consider changing the public interface. The above example could be updated as follows:
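One possible shape for that update is a constructor that requires both names. The parameter names are assumptions:

```csharp
#nullable enable
using System;

var p = new Person("Ada", "Lovelace");
Console.WriteLine($"{p.FirstName} {p.LastName}");

public class Person
{
    // Every way of constructing a Person now assigns both properties.
    public Person(string first, string last)
    {
        FirstName = first;
        LastName = last;
    }

    public string FirstName { get; set; }
    public string LastName { get; set; }
}
```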

If you require creating a Person object before setting the name, you can initialize the properties using a default non-null value:
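A sketch of that approach, using string.Empty as the assumed default value:

```csharp
#nullable enable
using System;

var p = new Person();
Console.WriteLine(p.FirstName.Length); // prints 0

public class Person
{
    // Property initializers guarantee a non-null value before any constructor body runs.
    public string FirstName { get; set; } = string.Empty;
    public string LastName { get; set; } = string.Empty;
}
```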

Another alternative may be to change those members to nullable reference types. The Person class could be defined as follows if null should be allowed for the name:
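A sketch of the nullable variant. Callers then need to handle null before dereferencing:

```csharp
#nullable enable
using System;

var p = new Person();
// FirstName is maybe-null, so use ?. and ?? rather than dereferencing directly.
Console.WriteLine(p.FirstName?.Length ?? 0); // prints 0

public class Person
{
    public string? FirstName { get; set; }
    public string? LastName { get; set; }
}
```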

Existing code may require other changes to inform the compiler about the null semantics for those members. You may have created multiple constructors, and your class may have a private helper method that initializes one or more members. You can move the initialization code into a single constructor and ensure all constructors call the one with the common initialization code. Or, you can use the System.Diagnostics.CodeAnalysis.MemberNotNullAttribute and System.Diagnostics.CodeAnalysis.MemberNotNullWhenAttribute attributes. These attributes inform the compiler that a member is not-null after the method has been called. The following code shows an example of each. The Person class uses a common constructor called by all other constructors. The Student class has a helper method annotated with the System.Diagnostics.CodeAnalysis.MemberNotNullAttribute attribute:
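A sketch along those lines follows. The Major property, the SetMajor helper, and the default values are assumptions made for illustration. MemberNotNullAttribute is available in .NET 5 and later.

```csharp
#nullable enable
using System;
using System.Diagnostics.CodeAnalysis;

Console.WriteLine(new Student().Major); // prints "Undeclared"

public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    // The common constructor performs all initialization.
    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    // Other constructors chain to the common one.
    public Person() : this("John", "Doe") { }
}

public class Student : Person
{
    public string Major { get; set; }

    public Student(string firstName, string lastName, string major)
        : base(firstName, lastName)
    {
        SetMajor(major);
    }

    public Student()
    {
        SetMajor();
    }

    // Tells the compiler that Major is not-null after this method returns.
    [MemberNotNull(nameof(Major))]
    private void SetMajor(string? major = default)
    {
        Major = major ?? "Undeclared";
    }
}
```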

Finally, you can use the null forgiving operator to indicate that a member is initialized in other code. For another example, consider the following classes representing an Entity Framework Core model:
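A sketch of such a model. It assumes the Microsoft.EntityFrameworkCore package is referenced, and the TodoItem and TodoContext names are hypothetical:

```csharp
#nullable enable
using Microsoft.EntityFrameworkCore;

public class TodoItem // hypothetical entity
{
    public long Id { get; set; }
    public string? Name { get; set; }
    public bool IsComplete { get; set; }
}

public class TodoContext : DbContext // hypothetical context
{
    public TodoContext(DbContextOptions<TodoContext> options)
        : base(options)
    {
    }

    // null! silences the "not initialized" warning for this property.
    public DbSet<TodoItem> TodoItems { get; set; } = null!;
}
```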

The DbSet property is initialized to null!. That tells the compiler that the property is set to a not-null value. In fact, the base DbContext performs the initialization of the set. The compiler's static analysis doesn't pick that up. For more information on working with nullable reference types and Entity Framework Core, see the article on Working with Nullable Reference Types in EF Core.

Fixing a warning for not initializing a nonnullable member involves one of four techniques:

  • Change the constructors or field initializers to ensure all nonnullable members are initialized.
  • Change one or more members to be nullable types.
  • Annotate any helper methods to indicate which members are assigned.
  • Add an initializer to null! to indicate that the member is initialized in other code.

Mismatch in nullability declaration

Many warnings indicate nullability mismatches between signatures for methods, delegates, or type parameters.

The following code demonstrates CS8764:
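A sketch of a base class and an override whose nullability differs. The class and method names are assumptions:

```csharp
#nullable enable

public class B
{
    // The base method promises a non-nullable string result.
    public virtual string GetMessage(string id) => string.Empty;
}

public class D : B
{
    // Warning: the nullable return type doesn't match the overridden member.
    // The nullable parameter is fine, because accepting more inputs is allowed.
    public override string? GetMessage(string? id) => default;
}
```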

The preceding example shows a virtual method in a base class and an override with different nullability. The base class returns a non-nullable string, but the derived class returns a nullable string. If the string and string? are reversed, it would be allowed because the derived class is more restrictive. Similarly, parameter declarations should match. Parameters in the override method can allow null even when the base class doesn't.

Other situations can generate these warnings. You may have a mismatch in an interface method declaration and the implementation of that method. Or a delegate type and the expression for that delegate may differ. A type parameter and the type argument may differ in nullability.

To fix these warnings, update the appropriate declaration.

Code doesn't match attribute declaration

The preceding sections have discussed how you can use Attributes for nullable static analysis to inform the compiler about the null semantics of your code. The compiler warns you if the code doesn't adhere to the promises of that attribute.

Consider the following method:
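A sketch of a method that breaks its own attribute's promise. The Messages class and the TryGetMessage name are hypothetical:

```csharp
#nullable enable
using System.Diagnostics.CodeAnalysis;

public static class Messages // hypothetical container class
{
    // [NotNullWhen(true)] promises that message is not-null whenever the method returns true.
    public static bool TryGetMessage(int id, [NotNullWhen(true)] out string? message)
    {
        message = null;
        return true; // Warning: message must be non-null when returning true.
    }
}
```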

The compiler produces a warning because the message parameter is assigned null and the method returns true. The NotNullWhen attribute indicates that shouldn't happen.

To address these warnings, update your code so it matches the expectations of the attributes you've applied. You may change the attributes, or the algorithm.

Exhaustive switch expression

Switch expressions must be exhaustive, meaning that all input values must be handled. Even for non-nullable reference types, the null value must be accounted for. The compiler issues a warning when the null value isn't handled.

The following example code demonstrates this condition:
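A sketch of a switch expression that handles every non-null string but not null. The AsScale name and the color values are assumptions:

```csharp
#nullable enable
using System;

Console.WriteLine(AsScale("Yellow")); // prints 5

// Warning: the switch expression doesn't handle a null input.
static int AsScale(string status) =>
    status switch
    {
        "Red" => 0,
        "Yellow" => 5,
        "Green" => 10,
        { } => -1 // matches any non-null value, but not null
    };
```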

The input expression is a string, not a string?. The compiler still generates this warning. The { } pattern handles all non-null values, but doesn't match null. To address these warnings, you can either add an explicit null case, or replace the { } with the _ (discard) pattern. The discard pattern matches null as well as any other value.
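A sketch of the discard-pattern fix, using a hypothetical AsScale function. The null! argument is there only to show that null now reaches the last arm:

```csharp
#nullable enable
using System;

Console.WriteLine(AsScale(null!)); // prints -1

// No warning: the discard pattern makes the switch expression exhaustive.
static int AsScale(string status) =>
    status switch
    {
        "Red" => 0,
        "Yellow" => 5,
        "Green" => 10,
        _ => -1 // matches null as well as any other value
    };
```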

COMMENTS

  1. c++

    Take a const-reference input (const MyClass &rhs) as the right hand side of the assignment. The reason for this should be obvious, since we don't want to accidentally change that value; we only want to change what's on the left hand side. Always return a reference to the newly altered left hand side, return *this.

  2. c++

    Returning a reference from assignment allows chaining: a = b = c; // shorter than the equivalent "b = c; a = b;" (This would also work (in most cases) if the operator returned a copy of the new value, but that's generally less efficient.) We can't return a reference from arithmetic operations, since they produce a new value.

  3. Copy elision

    This variant of copy elision is known as NRVO, "named return value optimization." In the initialization of an object, when the source object is a nameless temporary and is of the same class type (ignoring cv-qualification) as the target object. When the nameless temporary is the operand of a return statement, this variant of copy elision is ...

  4. Standard C++

    Remember: the reference is the referent, so changing the reference changes the state of the referent. In compiler writer lingo, a reference is an "lvalue" (something that can appear on the left hand side of an assignment operator). What happens if you return a reference? The function call can appear on the left hand side of an assignment ...

  5. Assignment operators

    For the built-in simple assignment, the object referred to by target-expr is modified by replacing its value with the result of new-value. target-expr must be a modifiable lvalue. The result of a built-in simple assignment is an lvalue of the type of target-expr, referring to target-expr. If target-expr is a bit-field, the result is also a bit ...

  6. 12.3

    New C++ programmers often try to reseat a reference by using assignment to provide the reference with another variable to reference. This will compile and run -- but not function as expected. ... { 5 }; // normal integer int& ref { x }; // reference to variable value return 0; } // x and ref die here. References and referents have independent ...

  7. How to Return by Reference in C++?

    Update a value using return reference In the above example, the C++ return reference function is used to update the values in the array. The function call of setValues(1) is similar to original_array[1] as the function returns a reference to the setValues() function and the respective values on the right-hand side(RHS) of the assignment will be ...

  8. The new C++ 11 rvalue reference && and why you should start using it

    Introduction. This is an attempt to explain new && reference present in latest versions of compilers as part of implementing the new C++ 11 standard. Such as those shipping with Visual studio 10-11-12 and gcc 4.3-4, or beautiful fast ( equally if not more) open-source alternative to gcc Clang.

  9. Returning values by reference in C++

    A C++ program can be made easier to read and maintain by using references rather than pointers. A C++ function can return a reference in a similar way as it returns a pointer. When a function returns a reference, it returns an implicit pointer to its return value. This way, a function can be used on the left side of an assignment statement.

  10. C++ Rvalue References: The Unnecessarily Detailed Guide

    Every C++ expression is either an lvalue or rvalue. Roughly, it's an lvalue if you can "take its address", and an rvalue otherwise. For example, if you've declared a variable int x;, then x is an lvalue, but 253 and x + 6 are rvalues. If you can assign to it, it's definitely an lvalue. Rvalue references are a new kind of reference in ...

  11. Rvalue reference declarator: &&

    The compiler uses reference collapsing rules to reduce the signature: print_type_and_value<string&>(string& t) This version of the print_type_and_value function then forwards its parameter to the correct specialized version of the S::print method. The following table summarizes the reference collapsing rules for template argument type deduction:

  12. Why assignment operator overloading must return reference

    The reason you should return a mutable reference from your assignment operator is that not doing so causes your returned value to be copied each time your assignment operator is called. This means that, for every assignment operator invocation, you cause not just an assignment operator invocation, but also a copy constructor invocation and a ...

  13. return statement

    a constructor, a destructor, or a function-try-block for a function with the return type (possibly cv-qualified) void without encountering a return statement, return; is executed. If control reaches the end of the main function, return 0; is executed. Flowing off the end of a value-returning function, except main and specific coroutines (since ...

  14. Pass-by-value, reference, and assignment

    most of the others use the pass-by-reference model. Having said that, it is important to know the model that Python uses, because that influences the way your code behaves. In this Pydon't, you will: see that Python doesn't use the pass-by-value nor the pass-by-reference models; understand that Python uses a pass-by-assignment model;

  15. Assignment operator's return value can be a reference?

    The canonical assignment operator for a class Foo looks like this: class Foo { ... Foo& operator=(const Foo& other) { // make this the same as other return *this; } }; As you can see, it does return a reference. In this case the object is ok, and so it returns a reference to ok, which is an lvalue, which is why the call works.

  16. Assignment (=)

    The assignment operator is completely different from the equals (=) sign used as syntactic separators in other locations, which include:Initializers of var, let, and const declarations; Default values of destructuring; Default parameters; Initializers of class fields; All these places accept an assignment expression on the right-hand side of the =, so if you have multiple equals signs chained ...

  17. What is the benefit of having the assignment operator return a value?

    Generally speaking, no. The idea of having the value of an assignment expression be the value that was assigned means that we have an expression which may be used for both its side effect and its value, and that is considered by many to be confusing. Common usages are typically to make expressions compact: x = y = z;

  18. Why does the assignment operator return a value and not a reference?

    4.Call PutValue(Result(1), Result(3)). 5.Return Result(3). Basically when you make (foo.bar = foo.bar) the actual assignment ( Step 4.) has no effect because PutValue will only get the value of the reference and will place it back, with the same base object. The key is that the assignment operator returns ( Step 5) the value obtained in the ...

  19. named return values and reference-assignment operators #286

    Proposal: There are 2 forms of return values. Status quo, and named return values. Copyable types must use normal return syntax. Non-copyable types must use named return syntax. const Point = struct { x: i32, y: i32, }; const Vec3 = stru...

  20. Resolve nullable warnings

    CS8601 - Possible null reference assignment. CS8602 - Dereference of a possibly null reference. CS8603 - Possible null reference return. CS8604 - Possible null reference argument for parameter. CS8605 - Unboxing a possibly null value. CS8607 - A possible null value may not be used for a type marked with [NotNull] or [DisallowNull]

  21. c

    The rule is to return the right-hand operand of = converted to the type of the variable which is assigned to. int a; float b; a = b = 4.5; // 4.5 is a double, it gets converted to float and stored into b. // this returns a float which is converted to an int and stored in a. // the whole expression returns an int.