Processor
The choice of a central unit and its processor imposes design decisions that affect portability, because a program may depend on how that processor is designed. Most machines use the byte, composed of eight bits, as their elementary unit. Bytes are accessed in groups called words, generally 2 or 4 bytes long, i.e. 16 or 32 bits. Two methods are used to arrange the bytes within a word.
Big Endianism The bytes are ordered from left to right: the leftmost (most significant) byte has the lowest address.
Little Endianism The least significant (rightmost) byte has the lowest address, but several cases must be distinguished:
– Little Endian 32-bit machine of the VAX type;
– Little Endian 32-bit machine of the Intel 80386 type, where an additional “swab” (byte swap) is applied on a half-word to ensure compatibility with the next case;
– Little Endian 16-bit machine of the Intel 80286 type.
This means that:
1 for people who read from left to right, the Big Endian mode is easier to read;
2 a program that relies on an assumption about the internal representation of the data is not portable.
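The two byte orders described above can be observed at run time. The following is a minimal sketch, not a definitive implementation: the names `byte_order` and `detect_byte_order` are hypothetical, and the probe value is an arbitrary choice made for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names, introduced only for this illustration. */
enum byte_order { ORDER_LITTLE, ORDER_BIG, ORDER_OTHER };

/* Look at how a known 32-bit value is laid out in memory. */
enum byte_order detect_byte_order(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char bytes[sizeof probe];

    /* Copy the object representation so it can be inspected byte by byte. */
    memcpy(bytes, &probe, sizeof probe);

    if (bytes[0] == 0x04 && bytes[1] == 0x03 &&
        bytes[2] == 0x02 && bytes[3] == 0x01)
        return ORDER_LITTLE;   /* least significant byte at the lowest address */

    if (bytes[0] == 0x01 && bytes[1] == 0x02 &&
        bytes[2] == 0x03 && bytes[3] == 0x04)
        return ORDER_BIG;      /* most significant byte at the lowest address */

    return ORDER_OTHER;        /* any other (mixed) arrangement */
}
```

A program that needs such a test is already depending on the internal representation, so it is better suited to diagnostics than to the main logic of portable code.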
The number of registers also varies from one processor to another (RISC machines generally have more than CISC machines), and this must be taken into account when programs access registers directly. The representation of real numbers (sign, characteristic and mantissa) may follow conventions that differ from one machine to another. The IEEE 754 standard defines a common representation: single precision on 32 bits and double precision on 64 bits. It allows floating-point data to be exchanged in binary form, without conversion, between machines that follow the standard and share the same byte order.
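As an illustration of the single-precision layout, the sketch below splits a value into its three fields. It assumes that `float` is a 32-bit IEEE 754 single on the target machine, which the C language does not guarantee; the names `ieee754_single` and `split_single` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type holding the three fields of a 32-bit IEEE 754 single. */
struct ieee754_single {
    unsigned sign;      /* 1 bit                                  */
    unsigned exponent;  /* 8 bits, biased by 127 (characteristic) */
    uint32_t fraction;  /* 23 bits of mantissa, hidden bit absent */
};

/* Assumption: float is a 32-bit IEEE 754 single on this machine. */
struct ieee754_single split_single(float value)
{
    uint32_t bits;
    struct ieee754_single parts;

    /* Copy the bit pattern instead of casting pointers between types. */
    memcpy(&bits, &value, sizeof bits);

    parts.sign     = (unsigned)(bits >> 31);
    parts.exponent = (unsigned)((bits >> 23) & 0xFFu);
    parts.fraction = bits & 0x7FFFFFu;
    return parts;
}
```

For example, under that assumption 1.0f yields sign 0, exponent 127 and fraction 0, since 1.0 is 1.0 × 2^0 and the exponent is stored with a bias of 127.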
Differences in the number of interrupt levels and in the clock rate can also affect portability and produce strange behaviour. The resulting problems are difficult to detect.
The portability problems are tied to the different data representations. Two addressing methods are used (see above):
Little Endian: VAX and Intel machines. The address of a ‘long int’ or a ‘short int’ always designates the low-order byte, so the value can still be reached when it is read through a narrower type, but characters packed into an integer read in reversed order.
Big Endian: Motorola, SPARC and IBM POWER processors. The address of a ‘long int’ or a ‘short int’ designates the high-order byte, so reading through a narrower type does not yield the value, but characters packed into an integer read from left to right.
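Both effects can be reproduced with a short sketch. The function names `low_address_half` and `packed_text` are hypothetical, and the character codes assume ASCII. The results depend on the machine, which is precisely the portability problem.

```c
#include <stdint.h>
#include <string.h>

/* Read the two bytes stored at the address of a 32-bit value as a
 * 16-bit value. For a value that fits in 16 bits, a little-endian
 * machine returns the value itself and a big-endian machine returns 0. */
uint16_t low_address_half(uint32_t value)
{
    uint16_t half;
    memcpy(&half, &value, sizeof half);  /* first two bytes in memory */
    return half;
}

/* Store 'A','B','C','D' (ASCII assumed) in one 32-bit integer, most
 * significant byte first, and read the memory back as a string.
 * Big-endian machines give "ABCD", little-endian machines "DCBA". */
void packed_text(char out[5])
{
    const uint32_t packed = 0x41424344u;
    memcpy(out, &packed, sizeof packed);
    out[4] = '\0';
}
```

Neither function should appear in portable code; they only make the difference between the two conventions visible.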
It is recommended to use a method that makes no assumption about the internal representation of the data, because of the problems raised by data alignment and pointer casts.
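One possible method of this kind, given here as a sketch rather than as the approach prescribed by the text, is to build and decompose values with shifts, which operate on the numeric value and not on its layout in memory. The names `store_be32` and `load_be32` are hypothetical, and the choice of a most-significant-byte-first external format is an assumption made for the example.

```c
#include <stdint.h>

/* Write a 32-bit value as four bytes, most significant byte first.
 * The result is the same on any machine, whatever its byte order. */
void store_be32(unsigned char *out, uint32_t value)
{
    out[0] = (unsigned char)((value >> 24) & 0xFFu);
    out[1] = (unsigned char)((value >> 16) & 0xFFu);
    out[2] = (unsigned char)((value >> 8) & 0xFFu);
    out[3] = (unsigned char)(value & 0xFFu);
}

/* Rebuild the 32-bit value from four bytes written by store_be32. */
uint32_t load_be32(const unsigned char *in)
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Because the buffer is handled as individual bytes, the functions also work at any address, aligned or not.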
Another way to look at the alignment problem is through the use of structures. The sizes and offsets of the members can differ between a 16-bit and a 32-bit machine. These problems appear when writing a device driver. Say we have a board on which we plan to address three cells of 8, 16 and 32 bits respectively, contiguous in memory. On a machine that aligns data on boundaries wider than a byte, a structure laid over these cells may contain padding, and some programs then do not access the cells correctly.
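The board example can be checked with `offsetof`. This is a sketch under stated assumptions: the structure `board_cells` and the function `layout_matches_board` are hypothetical names, and the three cells are taken to sit at byte offsets 0, 1 and 3.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical overlay for a board with three contiguous cells
 * of 8, 16 and 32 bits. */
struct board_cells {
    uint8_t  cell8;
    uint16_t cell16;
    uint32_t cell32;
};

/* Size of the three cells when they really are contiguous. */
enum { PACKED_SIZE = 1 + 2 + 4 };

/* Return 1 only if the compiler laid the structure out exactly like
 * the board, i.e. without inserting any padding between members. */
int layout_matches_board(void)
{
    return offsetof(struct board_cells, cell16) == 1
        && offsetof(struct board_cells, cell32) == 3
        && sizeof(struct board_cells) == PACKED_SIZE;
}
```

On a compiler that aligns each member on its natural boundary, the function returns 0, and the structure cannot safely be overlaid on the board's memory.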
For the pointer cast problem, it is highly recommended to cast to char*, which gives the same result whether or not the data is aligned. Data accessed as char data is not subject to alignment constraints.
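The recommendation can be sketched as follows: the 32-bit cell is copied byte by byte through a character pointer, so it can sit at any address. The names `read_cell32` and `write_cell32` are hypothetical. The copy keeps the byte order of the machine and does not use `volatile`, which a real device driver would likely need.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit value from a possibly unaligned address by copying
 * it one byte at a time through unsigned char pointers. */
uint32_t read_cell32(const unsigned char *cell)
{
    uint32_t value;
    unsigned char *dst = (unsigned char *)&value;  /* the char* cast */
    size_t i;

    for (i = 0; i < sizeof value; i++)
        dst[i] = cell[i];
    return value;
}

/* Write a 32-bit value to a possibly unaligned address, same method. */
void write_cell32(unsigned char *cell, uint32_t value)
{
    const unsigned char *src = (const unsigned char *)&value;
    size_t i;

    for (i = 0; i < sizeof value; i++)
        cell[i] = src[i];
}
```

By contrast, casting the cell's address directly to a `uint32_t` pointer and dereferencing it may fault or return wrong data on machines that require aligned accesses.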