Saturday, November 3, 2012

rvalue and universal references

Introductory note: this is the first article in a series about some aspects of C++ that I don't like. Please take this as the start of a discussion, and not as sterile complaints.

While writing down some considerations about Scott Meyers' great article "Universal References in C++11", I remembered why I initially found the name "rvalue reference" at least a bit inappropriate.

One of the first counter-intuitive things I found when I was introduced to this kind of reference was the fact that named rvalue references can be used as lvalues. At the same time, unnamed rvalue references can only be used as rvalues.

A quick recap here:

int && a = 3;
a = 4;

is valid code. Unexpected, at least for a C++98 user, but valid :)

The temporary object "3" is bound to the rvalue reference and its lifetime is extended. Now, according to the rule above, that temporary object can be used as if it were an lvalue.
In fact, it can also be bound to an lvalue reference!

int & b = a;

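If instead you use an unnamed rvalue reference (say, one returned from a function), you can only use it as an rvalue.
int && ret () { return 3; }  // illustration only: the returned reference dangles
ret () = 4;  // ERROR: cannot assign to an rvalue of type int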

It doesn't compile because the rvalue reference is unnamed. It could be used as a function argument instead.
All of this is pretty logical and useful: in this way you can reuse temporary objects and extend their lifetime, without making a copy.
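
To make the named/unnamed rule concrete, here is a minimal sketch. The overloaded kind() helper and the demo function are hypothetical names introduced only for this illustration; they report which overload a given expression selects.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helpers: report which overload an expression selects.
inline std::string kind(int&)  { return "lvalue"; }
inline std::string kind(int&&) { return "rvalue"; }

// Returns the final value of the lifetime-extended temporary.
inline int named_rvalue_reference_demo() {
    int&& a = 3;                             // temporary bound, lifetime extended
    assert(kind(a) == "lvalue");             // the name `a` is an lvalue expression
    assert(kind(std::move(a)) == "rvalue");  // std::move(a) is unnamed: an rvalue
    int& b = a;                              // so `a` can bind to an lvalue reference
    b = 5;                                   // writes through to the same object
    return a;
}
```

Calling named_rvalue_reference_demo() should return 5, since a and b refer to the same object.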
--

Returning to the point: the "rvalue" part of the name "rvalue reference" is not related to rvalues in any special way. At least no more than it is related to lvalues.

"Temporary object reference" could be an alternative name that explains the concept better and doesn't mix the names (but I certainly won't dispute the standard committee's difficult naming choices).
This is how I usually think of rvalue references.

Now, the article above gave me another surprise about this C++11 topic: universal references.

(Admittedly I don't do advanced template metaprogramming, so I haven't yet had to use auto and decltype with rvalue references; I didn't know these rules before.)

To summarize quickly: using the && syntax with a deduced type doesn't necessarily produce an rvalue reference.

int value = 10; 
auto && a =  value;
int && b = std::move(value);  // an explicit int && cannot bind directly to the lvalue "value"

a is actually an lvalue reference, while b is actually an rvalue reference.
This is because a has a deduced type, while b has an explicit type.
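
The deduction can be checked at compile time. The following is a small sketch, not taken from the original article: the function name and the extra variable c are assumptions added for illustration, and b is initialized through std::move because an explicitly typed int && needs an rvalue initializer.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical demo function; returns true when all runtime checks hold.
inline bool auto_deduction_demo() {
    int value = 10;
    auto&& a = value;             // initializer is an lvalue -> a is int&
    auto&& c = 10;                // initializer is an rvalue -> c is int&&
    int&&  b = std::move(value);  // explicit type: always int&&

    static_assert(std::is_same<decltype(a), int&>::value,  "a is an lvalue reference");
    static_assert(std::is_same<decltype(c), int&&>::value, "c is an rvalue reference");
    static_assert(std::is_same<decltype(b), int&&>::value, "b is an rvalue reference");

    a = 11;                       // writes through to value
    assert(value == 11);
    return c == 10 && &b == &value;  // b refers to value, not to a copy
}
```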


This use of "universal references" has nothing at all to do with rvalues, or with the other uses of rvalue references.

--
So in the end we have that
- deduced types that appear as && become lvalue references or rvalue references, depending on the case;
- named rvalue references (&&) are used as lvalues;
- unnamed rvalue references are used as rvalues;
even though all of these have the same appearance.
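
The first case in the list applies to function templates as well as to auto. As a rough sketch (param_is_lvalue_ref is a hypothetical helper written for this illustration), the same T&& parameter ends up as a different kind of reference depending on the argument:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: true when the parameter ends up as an lvalue reference.
template <typename T>
constexpr bool param_is_lvalue_ref(T&&) {
    // For an lvalue argument of type int, T is deduced as int&, and
    // int& && collapses to int&. For an rvalue, T is int and T&& is int&&.
    return std::is_lvalue_reference<T&&>::value;
}
```

With int x = 0, param_is_lvalue_ref(x) should be true and param_is_lvalue_ref(10) should be false, although the declaration reads T&& in both cases.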

Scott Meyers, writing about "universal references" in his blog article, says the following:
"I really think that the idea of universal references helps make C++11's new "&&" syntax a lot easier to understand. If you agree, please use this term and encourage others to do it, too."

Sorry, I don't agree here.
Don't you see that you have two really different things, with different behavior and rules, that appear exactly the same in the code?

This seems terrible to me. At the very least, it is counter-intuitive.

Why did these concepts get mixed in that way?
What about maintainability and readability?
And, given all of these explanations, why the name "rvalue references"?


In the snippet above, you can't figure out the type of a unless you know the language rules exactly (or you have IntelliSense).
While ignorance is not an excuse, this will make programmers' lives difficult, especially for those who don't regularly do advanced things with templates and metaprogramming, or who don't follow the evolution of the language.
It will end up that universal references are either mistaken for rvalue references and used as such (eventually leading to bugs), or not used at all.
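
One way such a bug could look is sketched below. The Names type and its two member functions are hypothetical, made up for this illustration: treating a universal reference as if it were a plain rvalue reference and calling std::move on it silently steals from the caller's lvalue, while std::forward moves only when the caller actually passed an rvalue.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container used only for this illustration.
struct Names {
    std::vector<std::string> items;

    // BUG: T&& is a universal reference here, so std::move also
    // moves from arguments that the caller passed as lvalues.
    template <typename T>
    void add_wrong(T&& s) { items.push_back(std::move(s)); }

    // std::forward moves only when the caller passed an rvalue.
    template <typename T>
    void add_right(T&& s) { items.push_back(std::forward<T>(s)); }
};
```

After add_right(name) the caller's string is unchanged; after add_wrong(name) it is left in a valid but unspecified state, typically empty.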

2 comments:

  1. The confusing thing is that '&&' is context dependent. The concept of a 'universal reference' gives us a framework to think about the context, so I would agree with Scott Meyers on this.

    With this conceptual framework, I think your first example is fairly easy to understand.

    int value = 10;
    auto && a = value;
    int && b = std::move(value);  // an explicit int && cannot bind directly to the lvalue "value"

    '&& a' only means that a is a reference -- it can be an lvalue reference or an rvalue reference. Since 'value' is an lvalue, 'auto &&' deduces the type to be an lvalue reference. When you say 'int &&', you are explicitly defining an rvalue reference. Note, however, that b itself is still an lvalue!

    I think Scott's Blog does a great job outlining why it works this way. Consider the example from his blog:

    template<typename T>
    void f(T&& param);

    int x;
    f(10); // invoke f on rvalue
    f(x); // invoke f on lvalue

    As Scott explains, because of type deduction, the parameter type of f could ultimately be a reference-to-a-reference, which is not valid. What are the options?

    - We could declare f(x) to be invalid, but that would be highly inconsistent. We would have to declare a second f taking type T&, resulting in unnecessary code duplication. Worse, without understanding the context, now there is ambiguity as to which f gets called.
    - We could invent a second symbol instead of && to use with f, but that would lead to inconsistent syntax for function declarations.
    - Or we could pick the lesser of evils and collapse the references. Sometimes && means lvalue, and sometimes it means rvalue.

    If this bothers you, ask yourself two questions:

    1. What does '&' mean?
    2. What does 'const' mean?

    The answer, of course, is it depends entirely on the context. '&' could declare a reference, take an address, or perform a bitwise AND. 'const' could be used to declare a constant pointer to an int or a pointer to a constant int, two very different concepts.

    I think it will take time for programmers to digest rvalue references, and the concept of a 'universal reference' helps us get there.

    Replies
    1. Yep, personally I would have picked the second option: a specific syntax for universal references. It's my opinion, of course. They probably considered it too while writing the standard, and there were other drawbacks (for instance &&&).

      While in the context of templates I could understand the && ambiguity (there are many specific rules anyway), the "auto &&" case, in non-templated code, is really unintuitive to read.

      Many keywords are used in different contexts, but & is always an lvalue reference in the context of type definitions.

      Also, the name "rvalue reference" is a bit confusing in itself, at least because named rvalue references can themselves be used as lvalues.

      Anyway, at this point nothing can be done; let's use the goodness of these new reference types and hope that people will be able to use them correctly :)
