Implementation of an Operating System(4-Segmentation)…

Lahiru Chathuranga
5 min readAug 13, 2021

If you new to my series fist you must read in in beginning. Because if you don’t read it in beginning then you couldn’t understand what am I say now, In here you can go to first one.

This chapter I will explain how to Segmentation in x86. Segmentation in x86 means accessing the memory through segments. In Operating Systems, Segmentation is a memory management technique in which the memory is divided into the variable size parts. Each part is known as a segment which can be allocated to a process. The details about each segment are stored in a table called a segment table.

To enable segmentation you need to set up a table that describes each segment. There are two types of descriptor tables,

  1. The Global Descriptor Table (GDT).
  2. Local Descriptor Tables (LDT).

An LDT is set up and managed by user-space processes, and all processes have their own LDT.

Accessing Memory

Most of the time when accessing memory there is no need to explicitly specify the segment to use. The processor has six 16-bit segment registers: cs, ss, ds, es, gs and fs. The register cs is the code segment register and specifies the segment to use when fetching instructions. The register ss is used whenever accessing the stack (through the stack pointer esp), and ds is used for other data accesses. The OS is free to use the registers es, gs and fs however it want.

Below is an example showing implicit use of the segment registers:

func:
mov eax, [esp+4]
mov ebx, [eax]
add ebx, 8
mov [eax], ebx
ret

The above example can be compared with the following one that makes explicit use of the segment registers:

func:
mov eax, [ss:esp+4]
mov ebx, [ds:eax]
add ebx, 8
mov [ds:eax], ebx
ret

You don’t need to use ss for storing the stack segment selector, or ds for the data segment selector. You could store the stack segment selector in ds and vice versa. However, in order to use the implicit style shown above, you must store the segment selectors in their indented registers.

The Global Descriptor Table (GDT)

The GDT can hold things other than segment descriptors as well. Every 8-byte entry in the GDT is a descriptor, but these descriptors can be references not only to memory segments but also to Task State Segment (TSS), Local Descriptor Table (LDT), or Call Gate structures in memory. The last ones, Call Gates, are particularly important for transferring control between x86 privilege levels although this mechanism is not used on most modern operating systems.

There is also a Local Descriptor Table (LDT). Multiple LDTs can be defined in the GDT, but only one is current at any one time: usually associated with the current Task. While the LDT contains memory segments which are private to a specific program, the GDT contains global segments. The x86 processors have facilities for automatically switching the current LDT on specific machine events, but no facilities for automatically switching the GDT.

Every memory access which a program can perform always goes through a segment. On the 80386 processor and later, because of 32-bit segment offsets and limits, it is possible to make segments cover the entire addressable memory, which makes segment-relative addressing transparent to the user.

In order to reference a segment, a program must use its index inside the GDT or the LDT. Such an index is called a segment selector (or selector). The selector must generally be loaded into a segment register to be used. Apart from the machine instructions which allow one to set/get the position of the GDT, and of the Interrupt Descriptor Table (IDT), in memory, every machine instruction referencing memory has an implicit Segment Register, occasionally two. Most of the time this Segment Register can be overridden by adding a Segment Prefix before the instruction.

Loading a selector into a segment register automatically reads the GDT or the LDT and stores the properties of the segment inside the processor itself. Subsequent modifications to the GDT or LDT will not be effective unless the segment register is reloaded.

Now we can see how to loading the GDT.

Loading the GDT

Loading the GDT into the processor is done with the lgdt assembly code instruction, which takes the address of a struct that specifies the start and size of the GDT. It is easiest to encode this information using a “packed struct” as shown in the following example:

struct gdt {
unsigned int address;
unsigned short size;
} __attribute__((packed));

If the content of the eax register is the address to such a struct, then the GDT can be loaded with the assembly code shown below:

lgdt [eax]

It might be easier if you make this instruction available from C, the same way as was done with the assembly code instructions in and out.

After the GDT has been loaded the segment registers needs to be loaded with their corresponding segment selectors. The content of a segment selector is described in the figure and table below:

Bit:     | 15                                3 | 2  | 1 0 |
Content: | offset (index) | ti | rpl |

The layout of segment selectors.NameDescriptionrplRequested Privilege Level — we want to execute in PL0 for now.tiTable Indicator. 0 means that this specifies a GDT segment, 1 means an LDT Segment.offset (index)Offset within descriptor table.

The offset of the segment selector is added to the start of the GDT to get the address of the segment descriptor: 0x08 for the first descriptor and 0x10 for the second, since each descriptor is 8 bytes. The Requested Privilege Level (RPL) should be 0 since the kernel of the OS should execute in privilege level 0.

Loading the segment selector registers is easy for the data registers — just copy the correct offsets to the registers:

mov ds, 0x10
mov ss, 0x10
mov es, 0x10
.
.
.

To load cs we have to do a “far jump”:

; code here uses the previous cs
jmp 0x08:flush_cs ; specify cs when jumping to flush_cs
flush_cs:
; now we've changed cs to 0x08

A far jump is a jump where we explicitly specify the full 48-bit logical address: the segment selector to use and the absolute address to jump to. It will first set cs to 0x08 and then jump to flush_cs using its absolute address.

Thank you!

--

--

Lahiru Chathuranga

Undergraduate from University of Kelaniya, Sri Lanka. And following Software Engineering course