# Pizer’s Weblog

programming, DSP, math

## Writing portable C and C++ code

### Width and value ranges of integer types

C and C++ specify only minimum requirements for the width and value range of the integer types and for how values are converted between them. If you want to write portable code, you should be aware of this. Check out the article Integer Types in C and C++.
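For example, the standards only guarantee that int can represent at least the range -32767..32767. If your code silently assumes a 32-bit int, one way (a sketch, not the only one) is to make that assumption fail loudly at compile time:

```cpp
#include <climits>

// The standards guarantee only 16 bits for int. If this translation unit
// assumes a 32-bit int, reject unsuitable platforms up front:
#if INT_MAX < 2147483647
#error This code requires an int of at least 32 bits
#endif
```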

The header <stdint.h> is also very useful. Unfortunately, it's a C99 header, but you may be able to use it in C++ code, too. Some vendors (e.g. recent GNU C++ compilers) even let you include <cstdint>, which will officially be part of the upcoming C++ standard. This header provides various typedefs for integer types such as int_fast16_t.

### Integer conversion

Signed to unsigned conversion:

```cpp
#include <iostream>

int a = -5;
unsigned b = ~unsigned(a);
std::cout << b << std::endl; // prints 4
```

This will output 4: converting -5 to unsigned yields a value congruent to -5 modulo 2^32, i.e. 2^32 − 5, and the bitwise NOT then gives (2^32 − 1) − (2^32 − 5) = 4. The result of a signed-to-unsigned conversion is guaranteed to be congruent to the original value modulo 2^N, where N is the number of bits of the unsigned type. I emphasize the word value because this does not imply that the unsigned variable has the same bit representation. That is only true for signed numbers stored in two's complement, which is not mandated by the standards.

Unsigned-to-signed conversion isn't as well-defined as the other direction. The conversion is only guaranteed for values that can be represented by the signed type. Values above that range yield an implementation-defined result.

```cpp
#include <climits>
#include <iostream>

unsigned a = INT_MAX + 3u;
int b = a;
std::cout << b << std::endl;
```

Your C/C++ implementation is basically allowed to produce any value here (C99 even permits raising an implementation-defined signal). Your compiler's manual should document this among its implementation-defined behaviour. If your platform uses two's complement for negative numbers, it's very likely that your compiler simply reinterprets the bit pattern as a signed number, which also implicitly obeys the "congruent modulo 2^N" rule. Still, you might want to include some unit tests to verify this if you rely on this behaviour.

### Making your code 64-bit compatible

An int is still usually 32 bits wide, while on modern platforms pointers are likely to be 64 bits long. Be aware of that. Use std::size_t and std::ptrdiff_t where they make sense: std::size_t is the unsigned type returned by sizeof and is big enough to describe the size of any object in memory; std::ptrdiff_t is the signed type of a pointer difference and may be able to represent more values than int.
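A small sketch of the point (the function sum is a hypothetical helper, not from any library): indexing with std::size_t matches the container's own size type, so the index cannot be truncated on a 64-bit platform the way an int index could with a sufficiently large container.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: std::size_t is the right type for an element index.
double sum(const std::vector<double>& v) {
    double s = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i)  // never truncates, unlike int
        s += v[i];
    return s;
}
```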

### Handling binary data portably

Sometimes you may want to access raw data via a pointer to unsigned char, for example to decode the contents of a binary file that contains fields in various formats (e.g. a signed 16-bit two's complement integer in little-endian byte order). The standards allow this, but only for char and unsigned char. Casting pointers to other pointer types may violate alignment and aliasing rules, leaving the territory of well-defined behaviour. Even the common union trick for converting data is not sanctioned by the official C and C++ standards. But accessing raw data as a sequence of characters (both plain and unsigned char) is fine.

A reasonable implementation to read 16 bit integers in little endian byte order might look like this:

```cpp
#include <climits>
#include <stdint.h>

#if CHAR_BIT != 8
#error Only supported for 8-bit chars
#endif

// minimal compile-time switch (std::enable_if is not available pre-C++11)
template<bool B, typename R> struct enable_if {};
template<typename R> struct enable_if<true, R> { typedef R type; };

inline uint16_t get_u16le(const void* pv) {
    const unsigned char* pc = static_cast<const unsigned char*>(pv);
    return pc[0] | (static_cast<uint16_t>(pc[1]) << 8);
}

inline int16_t get_s16le(const void* pv) {
    // compiles only if signed numbers use two's complement
    enable_if<(signed(~1u) == -2), int16_t>::type tmp = get_u16le(pv);
    // relying on the implementation-defined unsigned-to-signed conversion
    return tmp;
}
```

The last bit is probably overkill and might not catch all weird platforms that don’t support this trick. But I feel the need to express that I’m making use of a common but still implementation-defined behaviour.
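If you want to avoid the implementation-defined conversion entirely, one possible sketch is to do the two's complement arithmetic by hand, so that only in-range conversions ever happen (the name get_s16le_portable is mine, not from any library):

```cpp
#include <stdint.h>

// Fully portable signed decode: every conversion here is in range,
// so no implementation-defined behaviour is involved.
inline int16_t get_s16le_portable(const void* pv) {
    const unsigned char* pc = static_cast<const unsigned char*>(pv);
    uint16_t u = pc[0] | (static_cast<uint16_t>(pc[1]) << 8);
    if (u < 0x8000u)
        return static_cast<int16_t>(u);  // non-negative half fits as-is
    // map the upper half to [-32768, -1] without overflow
    return static_cast<int16_t>(static_cast<int32_t>(u) - 65536);
}
```

Note that int16_t itself is required by C99 to be a two's complement type with no padding bits, so both return statements stay within its range.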

Cheers!
– P

Written by pizer

December 18, 2008 at 5:11 pm