Pizer’s Weblog

programming, DSP, math

C++0x: Do people understand rvalue references?


[last update: 2009-05-27]

Occasionally I happen to see a piece of C++ code from an eager author that tries to utilize rvalue references but doesn’t understand them and gets it wrong. I remember that I had trouble fully grasping the rvalue reference concept. But now, it seems like that was way in the past. I can hardly remember why it was so difficult to understand rvalue references.

It may have something to do with the sources people use to learn about this feature. For example, I wouldn’t consider the C++0x Wikipedia article a good introduction to rvalue references. It is too short for that. It is, however, a nice overview of the kinds of features that can be expected.

Let me just mention the real sources worth reading:

  • N1377: Move Proposal (2002)
  • N1385: Perfect Forwarding Problem (2002)
  • N2027: A Brief Introduction to Rvalue References (2006)
  • N2812: A Safety Problem with Rvalue References (2008)
  • N2844: Fixing a Safety Problem with Rvalue References (Revision 1, 2009)
  • Rvalue References 101 by Dave Abrahams

I would recommend N2027 as an introduction and Dave Abrahams’s article for more practical details, including exception safety and return value optimizations.

How do you know whether you got something wrong? Well, if you feel the need to declare a function that returns a reference (be it lvalue reference or rvalue reference) to a function-local object, you got it wrong:

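  #include <string>
  #include <utility>   // std::move

  // Wrong: returns a reference to a function-local object that is
  // destroyed on return, so the caller gets a dangling reference.
  std::string&& wrong() {
     std::string r = "foo";
     r += "bar";
     return std::move(r);
  }

  // Right: returns by value; the compiler applies NRVO or moves from r.
  std::string right() {
     std::string r = "foo";
     r += "bar";
     return r;
  }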

This is an example adapted from a piece of code by another author who shall remain nameless. Part of the original author’s confusion is probably due to the name of the function move and the fact that it returns an rvalue reference. The truth is that move doesn’t move anything. It just converts an lvalue reference to an unnamed rvalue reference. Only when this unnamed rvalue reference is later used as the source for constructing a new object might a move-construction actually take place. It could also be a copy-construction if the type doesn’t support moving.

In this example you don’t need any C++0x-specific syntax to benefit from movable string objects. Firstly, returning references to function-local objects is still a no-go. Rvalue references don’t make that any less true. Secondly, the std::move() call is unnecessary. In fact, it may even prevent a common optimization called NRVO (named return value optimization). Since this optimization is even better than a move-construction of the return value, you should not disable NRVO by being “smart”. The new language rules already force compilers to move-construct return values from function-local objects when they don’t support (N)RVO or simply can’t apply NRVO in a specific case.
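The claims above can be checked with a small counting type. This is only a sketch: Tracer, CopyOnly and the helper functions are hypothetical names made up for illustration, not part of the original code.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper type: counts how often it is copy- or move-constructed.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() {}
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Hypothetical type with a copy constructor but no move constructor.
struct CopyOnly {
    static int copies;
    CopyOnly() {}
    CopyOnly(const CopyOnly&) { ++copies; }
};
int CopyOnly::copies = 0;

// std::move on its own is only a cast: no constructor runs.
int moves_caused_by_cast_alone() {
    Tracer::moves = 0;
    Tracer a;
    Tracer&& r = std::move(a);   // binds a reference, nothing is moved
    (void)r;
    return Tracer::moves;
}

// Constructing a new object from the cast result runs the move constructor.
int moves_caused_by_construction() {
    Tracer::moves = 0;
    Tracer a;
    Tracer b(std::move(a));
    (void)b;
    return Tracer::moves;
}

// Without a move constructor the same expression falls back to a copy.
int copies_for_copy_only_type() {
    CopyOnly::copies = 0;
    CopyOnly a;
    CopyOnly b(std::move(a));    // the rvalue binds to const CopyOnly&
    (void)b;
    return CopyOnly::copies;
}

// Returning a local by name: the compiler elides (NRVO) or moves, never copies.
Tracer make_plain() {
    Tracer t;
    return t;
}

int copies_when_returning_local() {
    Tracer::copies = 0;
    Tracer x = make_plain();
    (void)x;
    return Tracer::copies;
}
```

How many moves make_plain() causes depends on whether the compiler applies NRVO, so the sketch only counts copies there.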

I also saw an early example by Howard Hinnant (one of the authors of the rvalue reference proposal) in which he wrote a function that returned an rvalue reference. It didn’t refer to function-local objects but it referred to a reference parameter object which could have been a temporary and thus led to a dangling reference. He later said that he considers this example to be buggy and dangerous.

  
  string operator+(string const& a, string const& b) {
     string result;
     result.reserve( a.length() + b.length() );
     result += a;
     result += b;
     return result;
  }

  #ifdef WRONG

  string&& operator+(string && a, string const& b) {
     a += b;
     return move(a);
  }

  #else // correct version following ...

  string operator+(string && a, string const& b) {
     a += b;
     return move(a);
  }

  #endif

The version marked WRONG in this example has an advantage and a disadvantage. It allows temporary objects to be recycled in an expression like get_directory() + "/" + get_name() + ".png", but it also lets the user bind a reference (an rvalue reference or an ordinary reference to const) to the result, which leads to a dangling reference. The second version is much safer: it returns by value, and a special rule makes the compiler extend the lifetime of a returned temporary when it is bound directly to a reference. Instead of recycling a temporary, new temporaries are move-constructed from old ones. This is believed to be a good trade-off.
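A minimal sketch of the safe variant in use follows. The function join is a hypothetical stand-in for the by-value operator+ (renamed so it does not clash with the standard library’s own overloads), and the bodies of get_directory and get_name are assumed for illustration only.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the "correct" operator+: takes a temporary,
// appends to it, and returns the result BY VALUE.
std::string join(std::string&& a, const std::string& b) {
    a += b;
    return std::move(a);   // a has a name, so std::move is needed here
}

// Assumed helpers; the post only mentions their names.
std::string get_directory() { return "/tmp"; }
std::string get_name() { return "image"; }

std::string build_path() {
    // The chained calls move-construct a new temporary at each step.
    // The final temporary is bound to a reference, which extends its
    // lifetime to the end of this scope, so `path` is safe to use.
    const std::string& path =
        join(join(join(get_directory(), "/"), get_name()), ".png");
    return path;
}
```

Had join returned std::string&& instead, path would refer to the temporary created by get_directory(), which is destroyed at the end of the full expression.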

In this situation we actually need std::move because a named rvalue reference behaves just like an lvalue. Anything that has a name or is an lvalue reference behaves like an lvalue. This includes a named rvalue reference.
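Overload resolution makes this rule visible. The sketch below uses a hypothetical pair of overloads named category; the other function names are made up as well.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair that reports which kind of argument it received.
const char* category(std::string&)  { return "lvalue"; }
const char* category(std::string&&) { return "rvalue"; }

// s is declared as an rvalue reference, but because it has a name,
// the expression `s` is an lvalue and selects the first overload.
const char* pass_by_name(std::string&& s) {
    return category(s);
}

// std::move turns the named reference back into an unnamed rvalue
// reference, which selects the second overload.
const char* pass_with_move(std::string&& s) {
    return category(std::move(s));
}
```

The same mechanism decides between the copy and move constructor in the return statement of the operator+ example.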

Conclusions:

  • Read a high-quality introduction to rvalue references before using this feature
  • Never ever return references to non-static function-local objects.
  • As a rule of thumb: Avoid returning rvalue references to reference parameters. The functions that do so are std::move and std::forward. They serve a special purpose.
  • Don’t use std::move on a non-static function-local object in a return expression. The compiler already takes care of it and may even be able to elide the move (NRVO).
  • Don’t use std::move on expressions that are already known to yield an rvalue (temporary object). If you do, it may disable RVO.
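The last point can be illustrated by counting move constructions. This is a sketch with a hypothetical type Counted and a hypothetical factory make. The count without std::move depends on whether the compiler elides the temporary, so no exact number is claimed for it.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts move constructions.
struct Counted {
    static int moves;
    Counted() {}
    Counted(const Counted&) {}
    Counted(Counted&&) { ++moves; }
};
int Counted::moves = 0;

// Hypothetical factory returning a temporary.
Counted make() { return Counted(); }

// Initializing directly from the temporary lets the compiler apply RVO.
int moves_without_std_move() {
    Counted::moves = 0;
    Counted c = make();
    (void)c;
    return Counted::moves;
}

// Wrapping the temporary in std::move turns the initializer into a
// reference, so the move construction of c can no longer be elided.
int moves_with_std_move() {
    Counted::moves = 0;
    Counted c = std::move(make());
    (void)c;
    return Counted::moves;
}
```

With std::move at least one move construction always happens. Without it, a compiler that applies RVO performs none.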

- P


Written by pizer

April 13, 2009 at 12:50 pm

8 Responses


  1. Shouldn’t the second code example, line 3 read

    `result.reserve( a.length() + b.length() );`

    instead of

    `result.reserve( a.length(), b.length() );`

    ?

    Jack

    April 13, 2009 at 1:52 pm

  2. Yes, you’re right. Thanks for pointing out the error. I’ve corrected it.

    pizer

    April 13, 2009 at 2:00 pm

  3. string operator+(string && a, string const& b) {
    a += b;
    return move(a);
    }

    in the above you don’t need the move(a).. because all that does is cast to an rvalue.. which is immediately ignored

    goalieca

    October 29, 2009 at 1:24 am

  4. It’s not immediately ignored. It makes the compiler move-construct the return value from the object referenced by “a” instead of copy-construct from it.

    pizer

    October 29, 2009 at 1:29 am

  5. “The version named “wrong” in this example has … a disadvantage … [it] allows the user to bind a reference (rvalue reference or just an ordinary reference to const) to the result which leads to a dangling reference.”

    I’m not certain I followed why this is a problem. What problematic code can the user of the operator+ example do with “wrong” that they couldn’t also write with “correct version”?

    Tristan Wibberley

    March 10, 2012 at 5:38 pm

  6. For example:

    string foo();
    string bar();

    int main()
    {
        auto&& x = foo() + bar();
        // use of x
        // in case string+string returned a reference to a temporary, this temporary
        // would not be kept alive long enough which would make x a dangling reference
    }

    This might seem like an artificial corner case, but this auto&& trick is actually used inside the new for range loop to avoid unnecessary copying of the range. So, if you write

    int main()
    {
        for(char c : foo()+bar()) {
            cout << '[' << c << ']';
        }
        cout << endl;
    }

    and operator+ returned a reference, you’d hold on to a string reference which would become invalid and then the code would invoke undefined behaviour by accessing the non-existing range.

    pizer

    July 4, 2012 at 11:19 am

  7. Which versions of operator+ does your example “main” function compile successfully against, and which does it suffer problems with?

    I thought both. Have I missed something?

    Tristan Wibberley

    July 4, 2012 at 8:37 pm


  8. template<class T> T&& ohno(T&& x) {return std::forward<T>(x);}
    template<class T> T okay(T&& x) {return std::forward<T>(x);}

    int main()
    {
        auto&& x = ohno(42);
        auto&& y = okay(42);
        // x is a dangling reference
        // y is NOT
    }

    There is a rule in the C++ standard which extends the life-time of the temporary in certain cases. This rule applies in the case of the reference called y, but not in the case of the reference x. That’s because the information that the returned reference from ohno actually refers to a temporary is lost and not apparent by just looking at the function signature. Does that answer your question?

    pizer

    July 5, 2012 at 9:41 am

