Archive for June, 2009

Rvalue reference

This new C++0x feature was rather confusing to me until yesterday, when I suddenly realized that my code could be more efficient if we had rvalue references.

As I understand it, the main practical use of rvalue references is to eliminate spurious copies by introducing “move” semantics alongside the existing “copy” semantics.

Suppose we have a map object: map<int,SomeComplexType> my_map;

The most intuitive statement to add something to it is my_map[key] = value;.

In current C++, this statement must trigger a copy assignment, which is potentially unnecessary and expensive. (“Copy” semantics.)

If value will not be used later (especially if it’s a temporary object), we may want to “move” it into the map instead of “copying” it. (“Move” semantics.)
[Sure, we can use value.swap (my_map[key]); if swapping is efficient (e.g. STL strings & containers). But this is rather unreadable.]
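As a rough illustration of the swap workaround in current C++, here is a minimal sketch. The helper name insert_by_swap and the choice of std::string as the mapped type are assumptions made purely for the example.

```cpp
#include <map>
#include <string>

// Hypothetical helper: stores `value` under `key` without copying its
// contents, by swapping it with the map entry (default-constructed if the
// key was absent). Afterwards `value` holds whatever the entry held before.
void insert_by_swap(std::map<int, std::string> &my_map, int key,
                    std::string &value) {
    value.swap(my_map[key]);
}
```

Calling insert_by_swap(m, 7, v) on an empty map leaves the text in m[7] and v empty, which is exactly the "moved" effect, just spelled in a way that hides the intent.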

In C++0x, with rvalue references, we can distinguish them easily:

  1. Use "copy" semantics in SomeComplexType::operator = (const SomeComplexType &);
  2. Use "move" semantics in SomeComplexType::operator = (SomeComplexType &&); (Should we call it a "move assignment"?)

Now the compiler automatically chooses between "copy" and "move" semantics for my_map[key] = value;, depending on whether value is an rvalue or not.
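The two overloads listed above can be sketched as follows. Payload is a hypothetical stand-in for SomeComplexType, and the last_assignment marker exists only so the example can report which overload the compiler picked.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for SomeComplexType.
struct Payload {
    std::string data;
    char last_assignment;  // 'c' = copy, 'm' = move, '-' = never assigned

    Payload() : last_assignment('-') {}
    explicit Payload(const std::string &s) : data(s), last_assignment('-') {}
    Payload(const Payload &) = default;
    Payload(Payload &&) = default;

    // "Copy" semantics: the source is left untouched.
    Payload &operator=(const Payload &other) {
        data = other.data;
        last_assignment = 'c';
        return *this;
    }

    // "Move" semantics: the source's contents may be taken over.
    Payload &operator=(Payload &&other) {
        data = std::move(other.data);
        last_assignment = 'm';
        return *this;
    }
};

// `value` is a named object (an lvalue), so the copy overload is chosen.
char assign_lvalue(std::map<int, Payload> &my_map, int key,
                   const Payload &value) {
    my_map[key] = value;
    return my_map[key].last_assignment;
}

// The right-hand side is a temporary (an rvalue), so the move overload is chosen.
char assign_temporary(std::map<int, Payload> &my_map, int key) {
    my_map[key] = Payload("temporary");
    return my_map[key].last_assignment;
}
```

With this sketch, assign_lvalue reports 'c' and assign_temporary reports 'm', without any change to the assignment statement itself.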

It is also possible to force the "move" semantics: my_map[key] = std::move (value);

What std::move does is accept either an lvalue or rvalue reference, and return it as an rvalue reference.
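To make that description concrete, here is an illustrative re-implementation. The names sketch_move and take are made up for this example, and real code should simply call std::move. std::unique_ptr is used only because its moved-from state (null) is guaranteed.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Illustrative only: accepts an lvalue or an rvalue and returns it as an
// rvalue reference. It is nothing more than a cast; no data is moved here.
template <typename T>
typename std::remove_reference<T>::type &&sketch_move(T &&arg) {
    return static_cast<typename std::remove_reference<T>::type &&>(arg);
}

// Hypothetical helper: transfers ownership out of `src`, leaving it null.
// The actual move happens when the return value is constructed.
std::unique_ptr<int> take(std::unique_ptr<int> &src) {
    assert(src && "expects an owning pointer");
    return sketch_move(src);
}
```

The point of the sketch is that the "move" itself is performed by whichever constructor or assignment operator receives the rvalue reference, not by the cast.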


Microsoft Visual C++ supports, as a non-standard extension, binding temporary objects to non-const (lvalue) references. This extension cannot substitute for rvalue references, because a non-const lvalue reference also binds to ordinary named objects:

string a = "Hello";
string b = a;

If we use move semantics in string::string (string &), then a will be empty after b's construction. This is usually not what we want.
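The problem can be reproduced with a toy class. StealingString is a hypothetical type written for this example; it is not how any real string class behaves.

```cpp
#include <string>

// Hypothetical string-like type whose constructor from a non-const lvalue
// reference "moves" by stealing the argument's contents.
class StealingString {
public:
    explicit StealingString(const char *s) : text_(s) {}

    // Binds to any non-const named object, so it also fires for plain
    // "copies" such as `StealingString b = a;`.
    StealingString(StealingString &other) { text_.swap(other.text_); }

    const std::string &text() const { return text_; }

private:
    std::string text_;
};
```

After StealingString a("Hello"); StealingString b = a;, b holds "Hello" and a is empty, even though the code reads like an ordinary copy.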


Again, my main concern about C++0x is that it's going to be too complicated to learn.


Reference:
A Brief Introduction to Rvalue References


wprintf(“%s”,…)

Microsoft and GNU interpret %s differently in the wide-string versions of the printf-family functions (wprintf, etc.).

Microsoft: “when used with wprintf functions, specifies a wide-character string.”

C99 and GNU: “If no l modifier is present: The const char * argument is expected to be a pointer to an array of character type (pointer to a string).”

Fortunately, both accept “%ls” for wide strings.

Unfortunately, the only supported format specifier for multi-byte (narrow) strings in C99 is “%s”, which Microsoft interprets differently.

Fortunately, the specifier that Microsoft recommends for multi-byte strings, “%hs”, is also accepted by many other C libraries, though undocumented. Such acceptance is very reasonable: the unknown prefix h is simply ignored. (I tested it with the GNU and Solaris C libraries.) It seems such acceptance is necessary in order to strictly conform to the wording of C99.

Specifier  Microsoft wprintf  GNU wprintf            C99
%s         Wide               Narrow                 Narrow
%S         Narrow             Wide (deprecated)      -
%hs        Narrow             Narrow (undocumented)  -
%ls        Wide               Wide                   Wide

To draw a conclusion:

  1. Everybody agrees that, in wprintf, “%ls” specifies a wide string. (I’m not sure whether VC6 supports it.)
  2. There is no consensus on the specifier for multi-byte strings. The best practical choice is “%hs”.

This table and conclusion also apply to the “%c” family.
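A small sketch of the two recommended specifiers, written against swprintf so the result can be inspected. The function names format_wide and format_narrow are invented for the example. The “%hs” variant relies on the undocumented behaviour described above, so treat it as an assumption about your C library rather than a guarantee.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// "%ls" means a wide string in every implementation discussed above.
std::wstring format_wide(const wchar_t *ws) {
    wchar_t buf[64];
    int n = std::swprintf(buf, sizeof buf / sizeof buf[0], L"[%ls]", ws);
    return n < 0 ? std::wstring()
                 : std::wstring(buf, static_cast<std::size_t>(n));
}

// "%hs" for a narrow string: documented by Microsoft, accepted but
// undocumented in the GNU library, and not specified by C99. A library
// that rejects it makes swprintf fail, in which case this returns an
// empty string.
std::wstring format_narrow(const char *s) {
    wchar_t buf[64];
    int n = std::swprintf(buf, sizeof buf / sizeof buf[0], L"[%hs]", s);
    return n < 0 ? std::wstring()
                 : std::wstring(buf, static_cast<std::size_t>(n));
}
```

format_wide(L"wide") yields L"[wide]" everywhere; format_narrow is worth testing on each target platform before relying on it.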


I hate the “c…” headers

What’s the reason for using <cstdio> instead of <stdio.h>? Merely to appear more standards-compliant?

Framers of the C++ standard probably wished to “clean” the global namespace by pulling everything into std. Unfortunately, many implementations (Microsoft, GNU, etc.) instead put all those symbols in both the global and std namespaces, rendering this argument invalid in practice.

Even more unfortunately, a few other well-known implementations (e.g. Solaris) actually followed the standards.

I actually lost some points in a course for exactly this reason: the TA could not compile on Solaris a program of mine that compiled fine on Linux. In that program I included <cstdio> but forgot to prepend std:: to two printf calls. Since then, I have always used <name.h> rather than <cname>, even though it is “deprecated.”

To write strictly conforming programs, we need to remember which symbols are macros and which are not, because macros cannot be qualified with std::. The C++ standard lists those symbols which are macros:

[Note: the names defined as macros in C include the following: assert, errno, offsetof, setjmp, va_arg, va_end, and va_start. -end note]

They included the rarely used setjmp here, but omitted three very important ones which the C standard says should be macros. Let’s look into the header stdio.h provided by glibc:

/* Standard streams. */
extern struct _IO_FILE *stdin; /* Standard input stream. */
extern struct _IO_FILE *stdout; /* Standard output stream. */
extern struct _IO_FILE *stderr; /* Standard error output stream. */
/* C89/C99 say they're macros. Make them happy. */
#define stdin stdin
#define stdout stdout
#define stderr stderr

They’re not in std either.
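In practice this means a strictly conforming <cstdio> program mixes qualified functions with unqualified macros. A minimal sketch, with print_line as a made-up helper name:

```cpp
#include <cstdio>

// The function is qualified with std::, but stdout is a macro and so
// cannot be written as std::stdout.
// Returns the number of characters written, or a negative value on error.
int print_line(const char *text) {
    return std::fprintf(stdout, "%s\n", text);
}
```

Writing plain fprintf here would compile on implementations that also declare the name globally, which is exactly the portability trap described above.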
