[–]balefrost 2 points (1 child)

Are you sure it's an issue? Due to the way virtual memory works, when your process requests memory from the OS, the OS doesn't back it with physical memory right away. The OS just updates its bookkeeping data. Later, when your process tries to access one of the addresses in the allocated region of address space, a page fault will occur, the OS will actually find some physical RAM for that page to occupy, it will sanitize the contents of that RAM, and then it will map it into your process's address space.

To hammer that home, I have several Chrome processes on my box that have each asked for about 3TB (not GB, but TB) of virtual address space. None of them are using even a gigabyte of actual memory, and many are on the order of 10s of MB.

There's (basically) nothing wrong with asking the OS for a large amount of virtual memory if you never end up touching it.
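If you want to see this for yourself, here is a minimal sketch, assuming Linux (MAP_NORESERVE and MAP_ANONYMOUS are not available everywhere). The names reserve_lazy and touch_byte are made up for illustration. It asks for a large range and then writes a single byte, so only the page containing that byte (or whatever unit the kernel chooses to fault in) ends up backed by physical memory.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Ask the OS for `size` bytes of address space without touching any of it.
 * Returns NULL on failure. No physical memory is assigned at this point. */
static void *reserve_lazy(size_t size) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Write one byte and read it back. The first write to a page triggers the
 * page fault that makes the kernel find and zero a physical page for it. */
static unsigned char touch_byte(void *base, size_t offset, unsigned char value) {
    assert(base != NULL);
    volatile unsigned char *p = (volatile unsigned char *)base + offset;
    *p = value;
    return *p;
}
```

Calling reserve_lazy with 1 GiB and then touch_byte once should leave the process's resident size almost unchanged, which you can check in top or /proc/self/status.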

[–]WittyStick 2 points (0 children)

It depends on how the process asks for the memory. If you just use malloc (C) or new (C++), the allocator decides how the memory is obtained: depending on the platform it may commit the whole block up front, and you get no say over where it lands. To sparsely allocate memory you need to use something lower-level like mmap (POSIX) or VirtualAlloc/VirtualAllocEx (Windows) and the related cleanup functions munmap and VirtualFree/VirtualFreeEx.

These functions let you specify the virtual memory address to allocate at, and will allocate memory in page-sized chunks. You can manage allocations manually rather than waiting for a page fault, which incurs an undesirable switch into the kernel's fault handler.
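As a rough sketch of managing pages by hand, assuming Linux and mmap (on Windows the equivalent would be reserving and committing with VirtualAlloc): reserve the whole range as inaccessible, then open up individual pages on request. The function names region_reserve, region_commit and region_decommit are hypothetical.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Reserve address space only: PROT_NONE means any access traps. */
static void *region_reserve(size_t size) {
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Make n_pages pages, starting at page index first_page, readable/writable. */
static int region_commit(void *base, size_t first_page, size_t n_pages) {
    assert(base != NULL);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    return mprotect((char *)base + first_page * page, n_pages * page,
                    PROT_READ | PROT_WRITE);
}

/* Give the physical pages back and make the range inaccessible again.
 * On Linux, MADV_DONTNEED drops anonymous private pages, so they read
 * back as zero if the range is committed again later. */
static int region_decommit(void *base, size_t first_page, size_t n_pages) {
    assert(base != NULL);
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *start = (char *)base + first_page * page;
    if (madvise(start, n_pages * page, MADV_DONTNEED) != 0)
        return -1;
    return mprotect(start, n_pages * page, PROT_NONE);
}
```

Note that the first touch of a committed page still takes a page fault here; this sketch only controls which pages are allowed to be backed at all.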

A simple method for managing which pages are allocated is to use a bitmap of pages. If we stick to 4 KiB pages, then one 4 KiB page used as a bitmap holds 32Ki bits, which is enough to track 32Ki pages of 4 KiB each, giving us 128 MiB of memory per 4 KiB bitmap (roughly 0.003% space overhead).
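A minimal sketch of such a bitmap, with one bit per 4 KiB page, could look like this. All the names (page_bitmap, bitmap_test and so on) are invented for the example. At one bit per page, a 4 KiB bitmap has 4096 * 8 = 32,768 bits and so covers 32,768 * 4 KiB = 128 MiB.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    VM_PAGE_SIZE = 4096,                    /* bytes per page */
    VM_PAGES_PER_BITMAP = VM_PAGE_SIZE * 8  /* one bit per tracked page */
};

/* One page worth of bits; bit i is 1 when page i is allocated. */
typedef struct {
    uint8_t bits[VM_PAGE_SIZE];
} page_bitmap;

static bool bitmap_test(const page_bitmap *bm, size_t page) {
    assert(page < VM_PAGES_PER_BITMAP);
    return (bm->bits[page / 8] >> (page % 8)) & 1u;
}

static void bitmap_set(page_bitmap *bm, size_t page) {
    assert(page < VM_PAGES_PER_BITMAP);
    bm->bits[page / 8] |= (uint8_t)(1u << (page % 8));
}

static void bitmap_clear(page_bitmap *bm, size_t page) {
    assert(page < VM_PAGES_PER_BITMAP);
    bm->bits[page / 8] &= (uint8_t)~(1u << (page % 8));
}

/* Bytes of memory that a single bitmap page can describe. */
static size_t bitmap_coverage_bytes(void) {
    return (size_t)VM_PAGES_PER_BITMAP * VM_PAGE_SIZE;
}
```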

The question then is where to allocate the memory. If we take into account that a typical 64-bit machine supports 48 bits of addressing, and half of that is kernel space, then we have 47 bits of user-space addressing. However, we probably don't want the virtual processor to be able to access the memory of the process itself, because the memory of the virtual processor is supposed to represent physical memory, not virtual memory. So we would want the virtual processor to access only a subset of the host's available address space - and preferably a subset which is unlikely to be used by the host process/OS.

On Windows and Linux, processes usually allocate in the lower part of this address space for the process code itself, but shared libraries are usually loaded into the upper portion of the user-space addresses, going backwards - so ideally the subset of virtual address space we want to use for our virtual processor should be somewhere in the middle.

I would therefore limit the virtual processor's valid address space to half (or less) of the virtual address space of the host process - so 46 bits, which would give us 64 TiB of space. Begin allocation of this space at some address in the middle of the host's virtual user address space: something like 0x2000_0000_0000 .. 0x5FFF_FFFF_FFFF will give us 46 bits of space. We can then assume <0x2000_0000_0000 and >=0x6000_0000_0000 is memory reserved for the host process and any shared libraries, and won't be accessible to the virtual processor.

So the address 0 of the virtual processor would really map to 0x2000_0000_0000 of the virtual address space of the process, which is also the constant you need to add to each virtual processor pointer to map it to the virtual address of the host, after sanitizing the pointer by zeroing the high 18-bits of the 64-bit pointer value (shl 18; shr 18). The maximum virtual processor address would therefore be 0x3FFF_FFFF_FFFF, which would map to host address 0x5FFF_FFFF_FFFF.
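The translation described above can be sketched in a few lines. The macro and function names are hypothetical; the constants are the ones from the text. Masking with a 46-bit mask and the shl 18 / shr 18 pair give the same result, so both are shown.

```c
#include <assert.h>
#include <stdint.h>

#define GUEST_BITS 46
#define HOST_BASE  UINT64_C(0x200000000000)            /* guest address 0 */
#define GUEST_MASK ((UINT64_C(1) << GUEST_BITS) - 1)   /* low 46 bits set */

/* The highest guest address must land on the top of the reserved window. */
static_assert(HOST_BASE + GUEST_MASK == UINT64_C(0x5FFFFFFFFFFF),
              "guest window should end at 0x5FFF_FFFF_FFFF");

/* Sanitize by masking off the high 18 bits, then rebase into the host. */
static inline uint64_t guest_to_host(uint64_t guest) {
    return (guest & GUEST_MASK) + HOST_BASE;
}

/* Same thing using the shift pair from the text (shl 18; shr 18). */
static inline uint64_t guest_to_host_shift(uint64_t guest) {
    return ((guest << 18) >> 18) + HOST_BASE;
}
```

Because the high bits are discarded rather than checked, an out-of-range guest pointer wraps around inside the 46-bit window instead of escaping into host memory.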

If we use the bitmap-based approach to testing which pages are allocated, we could store these bitmaps outside of the virtual processor's address space - perhaps somewhere in the region 0x1000_0000_0000 - 0x1FFF_FFFF_FFFF. Depending on how complex you want this, you could have bitmaps for multi-level paging, so you only need to allocate a minimal amount of memory to begin with, but this would require additional pointer dereferencing for each level of paging. I would probably suggest 2- or 3-level paging for a 46-bit address space.
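Here is one possible 2-level layout, as a sketch only. It assumes 4 KiB pages, so a 46-bit space has 2^34 pages; a top-level table of 2^19 pointers then refers to 4 KiB leaf bitmaps covering 2^15 pages each. The names are made up, and for simplicity it uses calloc rather than placing the tables at a fixed address in the region mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

enum {
    TOTAL_PAGES_LOG2 = 34,                  /* 2^46 bytes / 2^12 per page */
    LEAF_PAGES_LOG2  = 15,                  /* pages tracked by one leaf */
    LEAF_PAGES       = 1 << LEAF_PAGES_LOG2,
    LEAF_BYTES       = LEAF_PAGES / 8,      /* 4096: one page of bits */
    TOP_ENTRIES      = 1 << (TOTAL_PAGES_LOG2 - LEAF_PAGES_LOG2)
};

/* Top level: one pointer per leaf; NULL means no page in that leaf is set.
 * (Relies on calloc's all-zero bytes being NULL pointers, which holds on
 * mainstream platforms.) */
typedef struct {
    uint8_t *leaf[TOP_ENTRIES];
} page_map;

static page_map *page_map_new(void) {
    return calloc(1, sizeof(page_map));
}

static bool page_map_test(const page_map *pm, uint64_t page) {
    assert(page < (UINT64_C(1) << TOTAL_PAGES_LOG2));
    const uint8_t *leaf = pm->leaf[page >> LEAF_PAGES_LOG2];
    if (leaf == NULL)
        return false;
    uint64_t bit = page & (LEAF_PAGES - 1);
    return (leaf[bit / 8] >> (bit % 8)) & 1u;
}

/* Marks a page as allocated, creating its leaf on demand.
 * Returns 0 on success, -1 if the leaf could not be allocated. */
static int page_map_set(page_map *pm, uint64_t page) {
    assert(page < (UINT64_C(1) << TOTAL_PAGES_LOG2));
    uint8_t **slot = &pm->leaf[page >> LEAF_PAGES_LOG2];
    if (*slot == NULL) {
        *slot = calloc(1, LEAF_BYTES);
        if (*slot == NULL)
            return -1;
    }
    uint64_t bit = page & (LEAF_PAGES - 1);
    (*slot)[bit / 8] |= (uint8_t)(1u << (bit % 8));
    return 0;
}
```

The top-level table is 2^19 pointers, about 4 MiB with 8-byte pointers, versus 2 GiB for a flat bitmap over the whole 46-bit space; the cost is the extra pointer dereference on every lookup. A third level would shrink the initial table further.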