Paging is famously known as the hardest challenge for a newbie OSDever.
TODO: Explain more in-depth

(On the left, our theoretical CPU, with a nice MMU mapping physical address 0x2000 to virtual address 0x3000)
The virtual-to-physical translation happens at your leaves (that is, the lowest page table levels); the virtual address that an entry translates is determined by the position (offset) of that entry within the page tables.
In this example we will assume 2-level paging and 4 KB pages, with level 1 being the leaf level:
(Figure: leaf page)
So, to recapitulate, this would be the code to represent the pages and their memory:
```c
#include <stdint.h>

#define PAGE_ENTRIES 1024
#define PAGE_SIZE 4096

/* 1024 32-bit entries: a page table is 4096 bytes, so 4096 / 4 = 1024 */
typedef uint32_t page_t;

/* The page table itself, containing 1024 page entries */
page_t PageTable[PAGE_ENTRIES];

/* How the virtual memory could be laid out according to the page table */
unsigned char Memory[PAGE_ENTRIES][PAGE_SIZE];

/* The accesses are wrapped in a function (name chosen for illustration)
 * because C does not allow statements at file scope */
void example_accesses(void)
{
    /* Page 1 - Offset 0 */
    Memory[0x01][0x00] = 0;

    /* Page 3 - Offset 0x20 */
    Memory[0x03][0x20] = 0;

    /* Etcetera... */
}
```
On most sane architectures, the entries in a page table are ordered sequentially: entry N translates the Nth page of the range that the table covers.
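Because the entries are sequential, the entry used for an address can be computed directly from the address bits. The following is a minimal sketch, assuming a 32-bit virtual address split 10/10/12 (which is what 2 levels of 1024-entry tables and 4 KB pages work out to). The function names `top_index`, `leaf_index` and `page_offset` are made up for this illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT  12u   /* 4 KB pages: the low 12 bits are the offset */
#define INDEX_BITS  10u   /* 1024 entries per table: 10 bits per level */
#define INDEX_MASK  ((1u << INDEX_BITS) - 1u)
#define OFFSET_MASK ((1u << PAGE_SHIFT) - 1u)

/* Index into the level 2 (top) table: bits 31..22 (assumed layout) */
static inline uint32_t top_index(uint32_t virt)
{
    return (virt >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK;
}

/* Index into the level 1 (leaf) table: bits 21..12 */
static inline uint32_t leaf_index(uint32_t virt)
{
    return (virt >> PAGE_SHIFT) & INDEX_MASK;
}

/* Offset inside the 4 KB page: bits 11..0 */
static inline uint32_t page_offset(uint32_t virt)
{
    return virt & OFFSET_MASK;
}
```

Under these assumptions, the address 0x00403020 selects entry 1 of the top table, entry 3 of the leaf table, and offset 0x20 within the page.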
If we mapped page 0 to 0x4000, every write done to Memory[0x00][Offset] would land on physical address 0x4000 + Offset, not on 0x0000 + Offset. This is because a page table entry holds exactly one physical address, so an access to a virtual address cannot be handled by multiple entries (if the architecture is sane, of course).
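The effect of a mapping can be sketched as a small software model of the lookup the MMU performs. This is only an illustration for a single leaf table: the helpers `map_page` and `translate` are hypothetical, and real entries also carry flag bits (present, writable and so on) that are left out here.

```c
#include <stdint.h>

#define PAGE_ENTRIES 1024u
#define PAGE_SIZE    4096u

typedef uint32_t page_t;

/* One leaf table; each entry holds the page-aligned physical address
 * that the corresponding virtual page maps to (flag bits omitted) */
static page_t PageTable[PAGE_ENTRIES];

/* Hypothetical helper: point virtual page `page` at physical address `phys` */
static void map_page(uint32_t page, uint32_t phys)
{
    PageTable[page % PAGE_ENTRIES] = phys & ~(PAGE_SIZE - 1u);
}

/* Hypothetical helper: translate a virtual address the way the MMU
 * would for this single table */
static uint32_t translate(uint32_t virt)
{
    uint32_t page   = (virt / PAGE_SIZE) % PAGE_ENTRIES;
    uint32_t offset = virt % PAGE_SIZE;
    return PageTable[page] + offset;
}
```

After `map_page(0, 0x4000)`, `translate(0x0010)` yields 0x4010. Each virtual page has exactly one entry, so there is only ever one possible result per address.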
The page table and the memory that it maps are known together as the virtual space or address space.
On modern operating systems, a unique virtual space is given to each program. The address space contains the program's executable image (for example, an ELF file), its libraries, and any other dependencies the program may have.
In order to save space, the operating system can load a library once at a physical location, then map it into every program that uses it, as if the library had been loaded along with each program. This works well except when libraries depend on global state.
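Sharing a library between programs comes down to two address spaces pointing at the same physical frame. Below is a rough sketch using a tiny pretend physical memory. The `address_space_t` type, the `PhysMem` array and the `virt_to_ptr` helper are all invented for the example and do not correspond to any particular kernel.

```c
#include <stdint.h>

#define PAGE_ENTRIES 1024u
#define PAGE_SIZE    4096u
#define FRAMES       16u    /* tiny pretend physical memory */

static unsigned char PhysMem[FRAMES][PAGE_SIZE];

/* Each address space has its own table of physical frame numbers */
typedef struct {
    uint32_t frame[PAGE_ENTRIES];
} address_space_t;

/* Resolve a virtual address of the given address space to a pointer
 * into the pretend physical memory */
static unsigned char *virt_to_ptr(address_space_t *as, uint32_t virt)
{
    uint32_t page   = (virt / PAGE_SIZE) % PAGE_ENTRIES;
    uint32_t offset = virt % PAGE_SIZE;
    return &PhysMem[as->frame[page] % FRAMES][offset];
}
```

If program A maps its page 5 to frame 7 and program B maps its page 9 to the same frame 7, both see the same bytes at different virtual addresses. It also shows why global state is a problem: a write made through one mapping is visible through the other.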
On some architectures, the cache uses the top bits of a virtual address to decide where a page's data is cached, so that only the most recently used page out of every X pages stays cached. You can exploit this by placing high-priority pages at addresses aligned to the cache's requirements (page colouring); normally 255 coloured pages can exist in a system.
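How a page's colour could be computed depends entirely on the cache in question. As a hedged sketch, assume a cache indexed by the address bits directly above the page offset, so that the number of colours is the cache size divided by (associativity × page size). The function names and the example cache parameters are assumptions, not values taken from any specific CPU.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Number of page colours for the assumed cache model.
 * `ways` (the associativity) must be non-zero. */
static uint32_t num_colours(uint32_t cache_bytes, uint32_t ways)
{
    uint32_t n = cache_bytes / (ways * PAGE_SIZE);
    return n ? n : 1u;   /* a small cache has a single colour */
}

/* Colour of the page containing `addr`: pages with the same colour
 * compete for the same part of the cache */
static uint32_t page_colour(uint32_t addr, uint32_t colours)
{
    return (addr / PAGE_SIZE) % colours;
}
```

With this model, a 2 MiB 8-way cache has 64 colours, and an allocator could hand out pages of different colours to data that should not evict each other.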
Swapping consists of reporting pages to the application as available even though they are not present in RAM, trapping the accesses to them, and then moving the data to or from the disk. This takes a lot of time and allows something called overcommitment, where the system may run out of RAM and the disk is used as extra memory.
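The trap-and-swap cycle can be modelled in user space to make the idea concrete. This is a toy sketch and not how a real kernel is structured: the arrays `Ram` and `Disk`, the function `access_page` standing in for the MMU check, and the round-robin choice of which page to evict are all assumptions made for the example.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define NPAGES    4u    /* virtual pages promised to the application */
#define NFRAMES   2u    /* physical frames actually available */

static unsigned char Ram[NFRAMES][PAGE_SIZE];
static unsigned char Disk[NPAGES][PAGE_SIZE];      /* pretend swap area */

static int frame_of[NPAGES] = { -1, -1, -1, -1 };  /* -1 = not present */
static int page_in[NFRAMES] = { -1, -1 };          /* -1 = frame is free */
static unsigned next_victim;                       /* round-robin eviction */
static unsigned fault_count;

/* Stand-in for the page fault handler: bring `page` into RAM,
 * writing the previous occupant of the chosen frame out to disk */
static int swap_in(unsigned page)
{
    unsigned f = next_victim;
    next_victim = (next_victim + 1u) % NFRAMES;
    if (page_in[f] >= 0) {
        memcpy(Disk[page_in[f]], Ram[f], PAGE_SIZE);
        frame_of[page_in[f]] = -1;
    }
    memcpy(Ram[f], Disk[page], PAGE_SIZE);
    page_in[f] = (int)page;
    frame_of[page] = (int)f;
    fault_count++;
    return (int)f;
}

/* Every access goes through here, standing in for the MMU's check
 * of whether the page is present */
static unsigned char *access_page(unsigned page, unsigned offset)
{
    int f = frame_of[page % NPAGES];
    if (f < 0)
        f = swap_in(page % NPAGES);
    return &Ram[f][offset % PAGE_SIZE];
}
```

The application sees four pages while only two frames exist. Touching a third page forces an eviction, and the data written earlier comes back from the pretend disk the next time its page is accessed.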
Optimizations to context switches are described in Multitasking.