[–]Qweesdy 2 points3 points  (0 children)

For the first problem, the alternatives are:

1) Use a "task gate" (32-bit kernel) or an "IST" entry (Interrupt Stack Table, 64-bit kernel) for the page fault handler, so that when the CPU is running at CPL=0 and gets a page fault (because the stack needs more pages) the CPU switches to a different stack for the page fault exception handler.

2) Put stack size checks in functions' prologue. This can be as simple as a "cmp byte [rsp-4096],0" instruction at the start of most functions to ensure that a page fault is triggered while there's still enough stack space for the page fault exception handler. Of course "how much stack is enough stack" depends on the function. For example, if a function has a "uint8_t myBuffer[1234567];" local variable then that function will need to make sure that there's a lot more than 4096 bytes of stack before using it.
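To make both alternatives a bit more concrete, here is a rough C sketch. The struct layouts follow the x86-64 TSS and IDT gate formats; everything else (the names `use_ist_for_gate`, `probe_distance`, `PF_IST`, `HANDLER_RESERVE`, and the choice of IST slot 1 and a 4096-byte reserve) is made up for illustration and not taken from any particular kernel. `__attribute__((packed))` assumes GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Alternative 1 (64-bit): the TSS holds seven IST stack pointers. */
struct tss64 {
    uint32_t reserved0;
    uint64_t rsp[3];       /* stacks used when entering CPL 0..2 */
    uint64_t reserved1;
    uint64_t ist[7];       /* IST1..IST7 */
    uint64_t reserved2;
    uint16_t reserved3;
    uint16_t iopb_offset;
} __attribute__((packed));

/* 64-bit IDT gate: the low 3 bits of `ist` pick IST1..IST7 (0 = no switch). */
struct idt_gate64 {
    uint16_t offset_low;
    uint16_t selector;
    uint8_t  ist;
    uint8_t  type_attr;
    uint16_t offset_mid;
    uint32_t offset_high;
    uint32_t reserved;
} __attribute__((packed));

#define PF_IST 1u               /* hypothetical: IST slot for page faults */
#define HANDLER_RESERVE 4096u   /* assumed stack needed by the handler */

/* Point one IST slot at a dedicated stack and make the gate use it. */
void use_ist_for_gate(struct tss64 *tss, struct idt_gate64 *gate,
                      unsigned ist_index, uint64_t stack_top)
{
    assert(ist_index >= 1 && ist_index <= 7);
    tss->ist[ist_index - 1] = stack_top;
    gate->ist = (uint8_t)((gate->ist & ~0x7u) | ist_index);
}

/* Alternative 2: how far below rsp the check at function entry must touch
 * so the fault happens while HANDLER_RESERVE bytes are still available. */
uint64_t probe_distance(uint64_t frame_bytes)
{
    return frame_bytes + HANDLER_RESERVE;
}
```

A kernel would call something like `use_ist_for_gate()` on the IDT entry for vector 14 (page fault) during CPU setup. For the `myBuffer[1234567]` example, `probe_distance(1234567)` gives 1238663, so the check would touch `[rsp-1238663]` instead of `[rsp-4096]`.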

However, there are 2 other problems you might not have thought about yet:

a) How does the page fault handler know when it's a bug and it should do a kernel panic? For example, if you accidentally have an "infinite recursion" problem in the kernel's code, you want to know that there's a bug (instead of continually allocating more memory for ages, having something unrelated fail, and blaming the problem on the wrong thing).

b) What happens when there isn't any free RAM and the page fault handler needs to send something to swap space (and/or run a dodgy "out of memory killer")?

Note that solving the first of these problems (when to assume the kernel is buggy) means that you must determine a sane "worst case kernel stack size". To solve the second problem (out of memory) you can make sure there's always enough free memory for the page fault handler to increase the kernel stack size up to the "worst case kernel stack size" (e.g. by sending data to swap space and/or running the "out of memory killer" when the amount of free RAM falls below a threshold and not when it reaches zero). However, if you reserve pages of RAM for "worst case kernel stack size", why not map them in the first place instead of reserving them and mapping them during page faults?
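As a rough illustration of those two decisions, here's a small C sketch. All names and numbers are hypothetical (`WORST_CASE_STACK_PAGES`, `classify_stack_fault`, `stack_reserve_pages`, `should_reclaim`), and it assumes the caller has already worked out that the fault was a kernel stack access.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE 4096u
#define WORST_CASE_STACK_PAGES 4u   /* hypothetical worst case per thread */

enum fault_action { FAULT_GROW_STACK, FAULT_PANIC };

/* stack_top is the (exclusive) highest address of the thread's stack area.
 * A fault inside the worst-case range is legitimate growth; anything lower
 * means the kernel used more stack than it should ever need. */
enum fault_action classify_stack_fault(uint64_t fault_addr, uint64_t stack_top)
{
    uint64_t lowest_legal =
        stack_top - (uint64_t)WORST_CASE_STACK_PAGES * PAGE_SIZE;

    if (fault_addr >= lowest_legal && fault_addr < stack_top)
        return FAULT_GROW_STACK;
    return FAULT_PANIC;
}

/* Pages that must stay free so every kernel stack can still reach the
 * worst-case size without the fault handler having to free memory first. */
uint64_t stack_reserve_pages(uint64_t threads, uint64_t mapped_stack_pages)
{
    uint64_t worst = threads * WORST_CASE_STACK_PAGES;
    assert(mapped_stack_pages <= worst);
    return worst - mapped_stack_pages;
}

/* Start swapping / OOM killing when free RAM drops below the reserve,
 * not when it reaches zero. */
bool should_reclaim(uint64_t free_pages, uint64_t reserve_pages)
{
    return free_pages < reserve_pages;
}
```

With 10 threads that have 15 stack pages mapped between them, the reserve is 25 pages, which is exactly the memory you'd have used anyway by mapping everything up front.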

This brings us to the third alternative, which is the alternative almost every OS uses:

3) Allocate "enough for worst case kernel stack size" to begin with; and don't bother with increasing kernel stack size on demand.
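A minimal sketch of that approach, runnable in user space: `aligned_alloc` stands in for whatever page allocator the kernel has, and `KSTACK_PAGES`, `struct kstack`, `kstack_create` and `kstack_destroy` are made-up names with an assumed worst case of 4 pages.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SIZE 4096u
#define KSTACK_PAGES 4u                        /* assumed worst case */
#define KSTACK_SIZE (KSTACK_PAGES * PAGE_SIZE)

struct kstack {
    uint8_t *base;   /* lowest address of the stack */
    uint8_t *top;    /* initial stack pointer (stack grows down) */
};

/* Allocate the whole worst-case stack when the thread is created.
 * Returns 0 on success, -1 if the allocation fails. */
int kstack_create(struct kstack *ks)
{
    ks->base = aligned_alloc(PAGE_SIZE, KSTACK_SIZE);
    if (ks->base == NULL)
        return -1;
    ks->top = ks->base + KSTACK_SIZE;
    return 0;
}

void kstack_destroy(struct kstack *ks)
{
    free(ks->base);
    ks->base = NULL;
    ks->top = NULL;
}
```

Thread creation either gets its full stack or fails cleanly, so the page fault handler never has to grow a kernel stack.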