Segment Address Space and Creation
This section explains IDA's address space model and how to create segments with the proper address translation. It covers the relationship between linear addresses, virtual addresses, and segment bases, with practical examples for different processor architectures.
Address space model
Internally, IDA has a 32-bit linear address space (IDA64 uses a 64-bit address space). The internal addresses are called “linear addresses”. The input program is loaded into this linear address space.
Please note that the linear addresses are usually not used in the program directly. During disassembly, we use so-called “virtual addresses”, which are calculated by the following formula:
VirtualAddress = LinearAddress - (SegmentBase << 4);
We see that the SegmentBase determines which addresses are displayed on the screen. Moreover, IDA allows you to create several segments that contain the same virtual addresses. For this, you just need to create the segments with the correct segment base values.
Normally, a SegmentBase is a 16-bit quantity. To create a segment with a base >= 0x10000, you need to use the selectors window. However, if you simply try to create a segment with a segment base >= 0x10000, IDA will automatically choose a free selector and set it up for the new segment.
All SegmentBases are looked up in the selector table.
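The formula and the selector lookup can be illustrated with a small sketch. This is plain Python, not the IDA API: the names `selectors`, `selector_value` and `virtual_address` are made up for illustration, and the sketch assumes that a base without an entry in the selector table stands for itself.

```python
# Hypothetical model of the address translation described above (not IDA API).

# Selector table: selector number -> paragraph value.
selectors = {}


def selector_value(base):
    """Return the paragraph value a segment base stands for.

    Assumption: a base without a selector table entry maps to itself.
    """
    return selectors.get(base, base)


def virtual_address(linear, base):
    """VirtualAddress = LinearAddress - (SelectorValue(SegmentBase) << 4)."""
    return linear - (selector_value(base) << 4)


# Two segments at different linear addresses that show the same virtual address.
print(hex(virtual_address(0x10100, 0x1000)))  # 0x100
print(hex(virtual_address(0x20100, 0x2000)))  # 0x100
```

Only the segment base differs between the two calls, which is why both locations are displayed with the same virtual address.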
There are some address restrictions in IDA.
There is a range of addresses that are used for internal housekeeping. This range can be specified by the configuration variable PRIVRANGE (start address and size). It is not recommended to use these addresses for other purposes.
There is also one address which must never be used in the disassembly. It is the ‘all ones’ address, or -1. Internally, it is used as a BADADDR (bad address). No address or address range can include BADADDR.
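As a rough illustration of these restrictions, the sketch below checks a candidate address range. It is plain Python, not the IDA API. The function names are hypothetical, and the private range values are placeholders rather than the real PRIVRANGE settings. The sketch also treats the private range as off limits, although the text above only discourages its use.

```python
# Hypothetical range check (not IDA API).

def badaddr(bits=32):
    """The 'all ones' address for the given address width."""
    return (1 << bits) - 1


def range_is_usable(start, end, priv_start, priv_size, bits=32):
    """Check a half-open range [start, end) against the restrictions above."""
    if not (0 <= start < end <= badaddr(bits)):
        return False  # empty, reversed, or reaching BADADDR
    priv_end = priv_start + priv_size
    # Two half-open ranges overlap when each starts before the other ends.
    overlaps_private = start < priv_end and priv_start < end
    return not overlaps_private


# Placeholder values for illustration only; take the real ones from PRIVRANGE.
PRIV_START = 0xFF000000
PRIV_SIZE = 0x00100000

print(range_is_usable(0xF1000, 0xF2000, PRIV_START, PRIV_SIZE))         # True
print(range_is_usable(0xFF000800, 0xFF001000, PRIV_START, PRIV_SIZE))   # False
print(range_is_usable(0xFFFFF000, 0x100000000, PRIV_START, PRIV_SIZE))  # False
```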
Segment Creation Examples
Create segment - simple case (PC)
IBM PC case
Suppose we need to create a segment occupying addresses F000:1000..F000:2000. Let’s calculate the linear addresses:
start = (0xF000 << 4) + 0x1000 = 0xF1000
end = (0xF000 << 4) + 0x2000 = 0xF2000
The segment base must be selected so that the first offset in our segment will be 0x1000. Let’s find it using the following equation:
VirtualAddress = LinearAddress - (SegmentBase << 4);
0x1000 = 0xF1000 - (base << 4);
After solving this equation, we see that the segment base is equal to 0xF000 (as you can see, this is really a very simple case).
Now, we can create a segment entering:
segment start address: 0xF1000
segment end address: 0xF2000
segment base: 0xF000
Please note that the end address never belongs to the segment in IDA.
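The arithmetic of this example can be reproduced with a few lines of Python. This is only a sketch of the calculation, not IDA code, and the helper names `linear_from_seg_off` and `base_for` are made up here.

```python
# Hypothetical helpers for the IBM PC example (not IDA API).

def linear_from_seg_off(segment, offset):
    """Convert a real-mode segment:offset pair to a linear address."""
    return (segment << 4) + offset


def base_for(linear, virtual):
    """Solve VirtualAddress = LinearAddress - (base << 4) for base."""
    delta = linear - virtual
    if delta < 0 or delta % 16:
        raise ValueError("no paragraph-aligned base gives this mapping")
    return delta >> 4


start = linear_from_seg_off(0xF000, 0x1000)
end = linear_from_seg_off(0xF000, 0x2000)
base = base_for(start, 0x1000)
print(hex(start), hex(end), hex(base))  # 0xf1000 0xf2000 0xf000
```

The `ValueError` branch reflects the fact that the base is shifted left by 4 bits, so the difference between the linear and the virtual address must be a multiple of 16.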
Create segment - simple case (Z80)
Suppose we need to create a segment occupying virtual addresses 8000-C000. Since we are free to place our segment anywhere in the linear address space, we choose the linear addresses at our convenience. Let’s say we choose a linear address 0x20000:
start = 0x20000
end = start + 0x4000 = 0x24000
The segment base must be selected so that the first virtual address in our segment will be 0x8000. Let’s find it using the following equation:
VirtualAddress = LinearAddress - (SegmentBase << 4);
0x8000 = 0x20000 - (base << 4);
base << 4 = 0x20000 - 0x8000
base << 4 = 0x18000
base = 0x1800
After solving this equation, we see that the segment base is equal to 0x1800.
Now we can create a segment entering:
segment start address: 0x20000
segment end address: 0x24000
segment base: 0x1800
Please note that the end address never belongs to the segment in IDA.
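Because the linear placement is our own choice, the same virtual range can be placed more than once, for example for several memory banks. The sketch below is plain Python with a hypothetical helper `place_segment`. The second placement at 0x30000 is an extra illustration that is not part of the example above.

```python
# Hypothetical helper for the Z80 example (not IDA API).

def place_segment(virt_start, virt_end, linear_start):
    """Return (start, end, base) for the virtual range [virt_start, virt_end)
    placed at linear_start in the linear address space."""
    delta = linear_start - virt_start
    if delta < 0 or delta % 16:
        raise ValueError("no paragraph-aligned base gives this mapping")
    size = virt_end - virt_start
    return linear_start, linear_start + size, delta >> 4


start, end, base = place_segment(0x8000, 0xC000, 0x20000)
print(hex(start), hex(end), hex(base))  # 0x20000 0x24000 0x1800

# The last byte of the segment is end - 1; the end address itself is excluded.
print(hex((end - 1) - (base << 4)))     # 0xbfff

# Same virtual range at another linear address (illustration only).
print([hex(v) for v in place_segment(0x8000, 0xC000, 0x30000)])
# ['0x30000', '0x34000', '0x2800']
```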
Create segment - automatically chosen selector case
Suppose we need to create a segment occupying linear addresses 200000-200C00, and the virtual addresses must be 0000..0C00. If we simply enter:
segment start address: 0x200000
segment end address: 0x200C00
segment base: 0x20000
then IDA will notice that the segment base is too big and does not fit into 16 bits. Because of this, IDA will find a free selector (let’s say it has found selector number 5), define it to point at paragraph 0x20000, and create the segment. After all this, we will have:
- a new selector is defined (5 -> 0x20000)
- a new segment is created. Its attributes:
start = 0x200000
end = 0x200C00
base = 5
The first virtual address in the segment will be 0:
VirtualAddress = LinearAddress - (SelectorValue(SegmentBase) << 4)
= 0x200000 - (SelectorValue(5) << 4)
= 0x200000 - (0x20000 << 4)
= 0x200000 - 0x200000
= 0
Please note that the end address never belongs to the segment in IDA.
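The behavior described in this example can be modelled as follows. This is a plain Python sketch, not IDA code. How IDA actually picks a free selector is not specified here, so the "lowest unused number" rule and the pre-filled selectors 1 to 4 are assumptions made only to reproduce selector number 5.

```python
# Hypothetical model of automatic selector allocation (not IDA API).

def create_segment(selectors, start, end, base):
    """Create a segment record; allocate a selector if base exceeds 16 bits."""
    if base > 0xFFFF:
        sel = 1
        while sel in selectors:  # assumption: take the lowest unused number
            sel += 1
        selectors[sel] = base    # the selector points at the paragraph
        base = sel               # the segment stores the selector number
    return {"start": start, "end": end, "base": base}


# Assume selectors 1..4 are already taken (arbitrary placeholder values).
selectors = {1: 0x1000, 2: 0x2000, 3: 0x3000, 4: 0x4000}
seg = create_segment(selectors, 0x200000, 0x200C00, 0x20000)

print(seg["base"], hex(selectors[seg["base"]]))          # 5 0x20000
print(hex(seg["start"] - (selectors[seg["base"]] << 4)))  # 0x0
```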
Create segment - user-defined selector case
In the previous example, we saw how IDA allocates a selector automatically. You can also do it yourself:
- Create a selector. For this, open the selectors window and press Ins. Enter a selector number and its value.
- Create a segment. Specify the selector number as the segment base.
Change segment translation
A call like:
call 1000
in the segment C obviously refers to the segment B, while the instruction:
call 500
refers to the segment A.
However, IDA does not try to link these references unless you tell it to do so by including the segments A and B in the translation list of the segment C. This means that you have to create the translation list
A B
for the segment C.
Below is a more complicated example:
start end
A 0000 1000
B 1000 2000
C 1000 2000
D 3000 4000
E 3000 4000
The translations
B: A
C: A
D: A B
E: A C
allow you to emulate overlays (the first set is A B D, the second is A C E).
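The effect of these translation lists can be sketched in plain Python. This is a simplified model, not IDA's implementation: it reads the table values as hexadecimal virtual addresses, and it assumes that the referencing segment itself is tried first, followed by the segments of its translation list in order. The name `resolve` is hypothetical.

```python
# Hypothetical model of code reference resolution (not IDA API).

# Virtual ranges [start, end) from the table above, read as hexadecimal.
segments = {
    "A": (0x0000, 0x1000),
    "B": (0x1000, 0x2000),
    "C": (0x1000, 0x2000),
    "D": (0x3000, 0x4000),
    "E": (0x3000, 0x4000),
}

# Translation list of each segment.
translations = {"B": ["A"], "C": ["A"], "D": ["A", "B"], "E": ["A", "C"]}


def resolve(from_segment, target):
    """Return the segment a code reference from from_segment lands in, or None."""
    for name in [from_segment] + translations.get(from_segment, []):
        start, end = segments[name]
        if start <= target < end:
            return name
    return None


print(resolve("D", 0x1800))  # B
print(resolve("E", 0x1800))  # C
print(resolve("E", 0x0500))  # A
print(resolve("A", 0x1800))  # None
```

The same target 0x1800 resolves to B from segment D and to C from segment E, which is the overlay behavior described above.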
{% hint style="info" %} If you use segment translations, make sure that all segments have unique segment bases. If two segments are placed in the linear address space so that they must have the same segment base, you may assign different selectors with equal values to them. {% endhint %}
{% hint style="info" %} IDA supports only one translation list per segment. This translation is applied by default to all instructions in the segment. If the segment uses other mappings, these individual mappings can be specified for each instruction separately by using the convert operand commands. {% endhint %}
{% hint style="info" %} Since only code references are affected by segment translations, try to create the RAM segment at its usual place (i.e., so that its linear address in IDA corresponds to its address in the processor memory). This makes all data references to it correct without any segment translation. For data references to other segments, you’ll need to use the convert operand commands for each such reference. {% endhint %}