mirror of
https://github.com/radareorg/radare2.git
synced 2025-02-04 04:28:20 +00:00
merge endian in CONTRIBUTING
This commit is contained in:
parent
6f53a6a69f
commit
e44286124a
@ -197,6 +197,62 @@ r_core_wrap.cxx:32103:61: error: assigning to 'RDebugReasonType' from incompatib
|
||||
* Never ever use %lld or %llx. This is not portable. Always use the PFMT64x
|
||||
macros. Those are similar to the ones in GLIB.
|
||||
|
||||
# Manage Endianness
|
||||
|
||||
As hackers, we need to be aware of endianness.
|
||||
|
||||
Endianness can become a problem when you try to process buffers or streams
|
||||
of bytes and store intermediate values as integers with width larger than
|
||||
a single byte.
|
||||
|
||||
It can seem very easy to write the following code:
|
||||
|
||||
ut8 opcode[4] = {0x10, 0x20, 0x30, 0x40};
|
||||
ut32 value = *(ut32*)opcode;
|
||||
|
||||
... and then continue to use "value" in the code to represent the opcode.
|
||||
|
||||
This needs to be avoided!
|
||||
|
||||
Why? What is actually happening?
|
||||
|
||||
When you cast the opcode stream to a unsigned int, the compiler uses the endianness
|
||||
of the host to interpret the bytes and stores it in host endianness. This leads to
|
||||
very unportable code, because if you compile on a different endian machine, the
|
||||
value stored in "value" might be 0x40302010 instead of 0x10203040.
|
||||
|
||||
## Solution
|
||||
|
||||
Use bitshifts and OR instructions to interpret bytes in a known endian.
|
||||
Instead of casting streams of bytes to larger width integers, do the following:
|
||||
|
||||
ut8 opcode[4] = {0x10, 0x20, 0x30, 0x40};
|
||||
ut32 value = opcode[0] | opcode[1] << 8 | opcode[2] << 16 | opcode[3] << 24;
|
||||
|
||||
or if you prefer the other endian:
|
||||
|
||||
ut32 value = opcode[3] | opcode[2] << 8 | opcode[1] << 16 | opcode[0] << 24;
|
||||
|
||||
This is much better because you actually know which endian your bytes are stored in
|
||||
within the integer value, REGARDLESS of the host endian of the machine.
|
||||
|
||||
## Endian helper functions
|
||||
|
||||
Radare2 now uses helper functions to interpret all byte streams in a known endian.
|
||||
|
||||
Please use these at all times, eg:
|
||||
|
||||
val32 = r_read_be32(buffer) // reads 4 bytes from a stream in BE
|
||||
val32 = r_read_le32(buffer) // reads 4 bytes from a stream in LE
|
||||
val32 = r_read_ble32(buffer, isbig) // reads 4 bytes from a stream:
|
||||
// if isbig is true, reads in BE
|
||||
// otherwise reads in LE
|
||||
|
||||
There are a number of helper functions for 64, 32, 16, and 8 bit reads and writes.
|
||||
|
||||
(Note that 8 bit reads are equivalent to casting a single byte of the buffer
|
||||
to a ut8 value, ie endian is irrelevant).
|
||||
|
||||
# Additional resources
|
||||
|
||||
* [README.md](https://github.com/radare/radare2/blob/master/README.md)
|
||||
|
68
doc/endian
68
doc/endian
@ -1,68 +0,0 @@
|
||||
Endian issues
|
||||
=============
|
||||
|
||||
As hackers, we need to be aware of endianness.
|
||||
|
||||
Endianness can become a problem when you try to process buffers or streams
|
||||
of bytes and store intermediate values as integers with width larger than
|
||||
a single byte.
|
||||
|
||||
It can seem very easy to write the following code:
|
||||
|
||||
ut8 opcode[4] = {0x10, 0x20, 0x30, 0x40};
|
||||
ut32 value = *(ut32*)opcode;
|
||||
|
||||
... and then continue to use "value" in the code to represent the opcode.
|
||||
|
||||
This needs to be avoided!
|
||||
|
||||
Why? What is actually happening?
|
||||
|
||||
When you cast the opcode stream to a unsigned int, the compiler uses the endianness
|
||||
of the host to interpret the bytes and stores it in host endianness. This leads to
|
||||
very unportable code, because if you compile on a different endian machine, the
|
||||
value stored in "value" might be 0x40302010 instead of 0x10203040.
|
||||
|
||||
In the past, radare devs were not as strict about this issue, and as a result,
|
||||
needed to swap the endian of values regularly in the code.
|
||||
|
||||
Solution
|
||||
========
|
||||
|
||||
Use bitshifts and OR instructions to interpret bytes in a known endian.
|
||||
Instead of casting streams of bytes to larger width integers, do the following:
|
||||
|
||||
ut8 opcode[4] = {0x10, 0x20, 0x30, 0x40};
|
||||
ut32 value = opcode[0] | opcode[1] << 8 | opcode[2] << 16 | opcode[3] << 24;
|
||||
|
||||
or if you prefer the other endian:
|
||||
|
||||
ut32 value = opcode[3] | opcode[2] << 8 | opcode[1] << 16 | opcode[0] << 24;
|
||||
|
||||
This is much better because you actually know which endian your bytes are stored in
|
||||
within the integer value, REGARDLESS of the host endian of the machine.
|
||||
|
||||
|
||||
Endian helper functions
|
||||
=======================
|
||||
|
||||
Radare2 now uses helper functions to interpret all byte streams in a known endian.
|
||||
|
||||
Please use these at all times, eg:
|
||||
|
||||
val32 = r_read_be32(buffer) // reads 4 bytes from a stream in BE
|
||||
val32 = r_read_le32(buffer) // reads 4 bytes from a stream in LE
|
||||
val32 = r_read_ble32(buffer, isbig) // reads 4 bytes from a stream:
|
||||
// if isbig is true, reads in BE
|
||||
// otherwise reads in LE
|
||||
|
||||
There are a number of helper functions for 64, 32, 16, and 8 bit reads and writes.
|
||||
|
||||
(Note that 8 bit reads are equivalent to casting a single byte of the buffer
|
||||
to a ut8 value, ie endian is irrelevant).
|
||||
|
||||
Happy hacking!
|
||||
|
||||
- damo22
|
||||
|
||||
|
Loading…
x
Reference in New Issue
Block a user