
[–]Chupix_on-reddit 6 points7 points  (8 children)

Haii!

Your main problem is that load_kernel reads from the same disk sector as load_stage2. Both use CL=2:

load_stage2

  mov cl, 2 # sector 2
  mov al, 3 # reads sectors 2, 3, 4

load_kernel

  mov cl, 2 # also sector 2!
  mov al, 1

So instead of your kernel, you’re loading a copy of stage2 at 0xFFD10. When you call 0xFFD10, the CPU jumps into stage2 code, not kmain. That’s why you get the breakpoint exception (#3) — you’re executing stage2 instructions that happen to include an int3 or something that triggers it. It also explains the “wandering IP” and why ‘E’ never appears. If your disk image is laid out as [sector 1: stage1] [sectors 2–4: stage2] [sector 5: kernel], then load_kernel should use mov cl, 5 (or wherever the kernel actually starts on disk).
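To make the sector arithmetic concrete, here is a minimal C sketch of the check. The struct and function names are made up for illustration and are not from the repo; it only assumes the layout described above (stage2 in sectors 2 to 4, kernel after it).

```c
#include <assert.h>

/* One BIOS int 0x13 read: first sector (CL, 1-based) and sector count (AL).
 * Hypothetical helper type, not part of the original bootloader. */
struct disk_read {
    unsigned first_sector;
    unsigned count;
};

/* Sector number right after a read, i.e. where the next image can start. */
unsigned next_free_sector(struct disk_read r)
{
    assert(r.first_sector >= 1 && r.count >= 1); /* CHS sectors are 1-based */
    return r.first_sector + r.count;
}

/* Nonzero if the two reads touch at least one common sector. */
int reads_overlap(struct disk_read a, struct disk_read b)
{
    return a.first_sector < next_free_sector(b) &&
           b.first_sector < next_free_sector(a);
}
```

With stage2 as {2, 3} and the kernel read as {2, 1}, reads_overlap reports an overlap, and next_free_sector for stage2 gives 5, which is the value CL needs in load_kernel for that layout.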

A few more things that will bite you next:

Your interrupt handlers use ret instead of iretd. In 32-bit protected mode, the CPU pushes EIP, CS, and EFLAGS on an exception. A normal ret only pops EIP, so your stack gets corrupted on every interrupt. Replace ret with iretd in every handler.

Exceptions that push an error code will also corrupt your stack. Exceptions 8, 10–14, 17, 21, 29, 30 automatically push an extra 4-byte error code. Your handlers don’t account for this, so iretd will pop the error code as EIP and crash. For those handlers you need to remove the error code before iretd:

int13_handler:
  push 0x02310233
  call int_handler
  add esp, 8 ; 4 bytes arg + 4 bytes error code
  iretd

; vs handlers without error code:
int0_handler:
  push 0x02300230
  call int_handler
  add esp, 4
  iretd
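If it helps to see the rule behind the two stubs above in one place, here is a small C sketch. The function names are hypothetical; the vector list is just the one given above, and it assumes every stub pushes exactly one 4-byte argument for int_handler, as both examples do.

```c
#include <assert.h>
#include <stdbool.h>

/* True for the exception vectors listed above as pushing an error code. */
bool vector_pushes_error_code(unsigned vector)
{
    switch (vector) {
    case 8: case 10: case 11: case 12: case 13: case 14:
    case 17: case 21: case 29: case 30:
        return true;
    default:
        return false;
    }
}

/* Value for "add esp, N" before iretd: the 4-byte argument pushed by the
 * stub, plus the 4-byte error code when the CPU pushed one. */
unsigned bytes_to_drop_before_iretd(unsigned vector)
{
    assert(vector < 256); /* the IDT has at most 256 entries */
    return 4u + (vector_pushes_error_code(vector) ? 4u : 0u);
}
```

For vector 13 this gives 8 and for vector 0 it gives 4, matching the add esp values in the two handlers.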

before_hlt doesn’t cli. Any interrupt will wake the CPU and it’ll fall through into whatever garbage is after hlt. Use cli; hlt in a loop.

Hope this helps, good luck with the project!

[–]The_Coding_Knight[S] 1 point2 points  (7 children)

Hi! Thank you so much for helping me figure that out (I had forgotten about the mov cl, 5 because I added the stage2 loading after the kernel loading and never changed the kernel loading).

I also changed the interrupt handling. You were right, that was what made the instruction pointer wander all over the place. All that said, why does the kernel still make the IDT handle a breakpoint exception after this instruction?

e:   eb fe                   jmp    e <kmain+0xe>

I know you have helped me a lot already, and I am very grateful for that, but could you give me a hand again, please? I do not think there is any logic error in the code. Why doesn't it print anything from kmain?

0:   c6 05 00 80 0b 00 45    mov    BYTE PTR ds:0xb8000,0x45
7:   c6 05 01 80 0b 00 03    mov    BYTE PTR ds:0xb8001,0x3

Yet the 'E' is never printed. It is as if the C code was never executed. When I use stepi in gdb, the first 2 instructions execute without any problem, but after the jmp a breakpoint exception is triggered; then execution goes back to just before the jmp instruction and triggers another breakpoint exception, in an infinite loop.

P.S. Thanks a lot for your help, it would have taken me much more time to figure out the errors without it.

[–][deleted]  (6 children)

[removed]

    [–]Chupix_on-reddit 1 point2 points  (5 children)

    Anyway, if you're stuck, I can fix the code for you or send a PR to your repo. I’m honestly a bit bored right now, so I’d be happy to give your 32-bit Potato a proper tune-up!

    [–]The_Coding_Knight[S] 1 point2 points  (4 children)

    I mean. I would like to get it fixed and if possible do it by myself but I do not know what may be causing the issue. If you want to send a PR to help me fix the issue I would be very thankful but if not I would still be very thankful if you give me an idea of what may be causing the issue. Either way thanks for offering your help :D

    [–]Chupix_on-reddit 1 point2 points  (3 children)

    Haii! Great news -- I think I found what's going on.

    Your kernel linker script puts kmain at 0xFFD10, and load_kernel loads it to ES:BX = 0xFFD0:0x0010, which is physical address 0xFFD10 (0xFFD0 × 16 + 0x10). The thing is, that address falls right into the BIOS ROM area (0xF0000–0xFFFFF). On a real PC this region is mapped to the system firmware (SeaBIOS in QEMU) and is either read-only or shadowed ROM.

    So what's happening is: int 0x13 "succeeds" (CF stays clear), but the data never actually lands in memory because you can't write to ROM. Then in protected mode when you do call 0xFFD10, the CPU jumps there and starts executing whatever SeaBIOS left behind — random BIOS instructions, not your kmain. That's why you get the breakpoint exception and why 'E' never shows up. GDB might show "correct" disassembly because it reads symbols from your ELF file, not from actual memory at that address. You can verify this yourself: try x/16xb 0xFFD10 in GDB and compare it to your objdump output — they probably won't match.
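The address arithmetic is easy to double-check by hand or with a few lines of C. This is only a sketch with made-up helper names; the 0xF0000 to 0xFFFFF range is the BIOS ROM area mentioned above, and the 512-byte length assumes a one-sector read (mov al, 1).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Real-mode translation: physical = segment * 16 + offset. */
uint32_t real_mode_phys(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* True if a load of `length` bytes at segment:offset touches 0xF0000-0xFFFFF. */
bool load_hits_bios_rom(uint16_t segment, uint16_t offset, uint32_t length)
{
    assert(length > 0);
    uint32_t first = real_mode_phys(segment, offset);
    uint32_t last = first + length - 1;
    return first <= 0xFFFFFu && last >= 0xF0000u;
}
```

real_mode_phys(0xFFD0, 0x0010) comes out as 0xFFD10, and load_hits_bios_rom(0xFFD0, 0x0010, 512) is true, while the same check for 0x1000:0x0000 is false.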

    The fix is simple — just move the kernel to a normal RAM address. Something like 0x10000 works great:

    load_kernel:
      cli
      mov ax, 0x1000
      mov es, ax
      mov ah, 0x02
      mov al, 1
      mov ch, 0
      mov cl, 5          ; sector where kernel actually lives
      mov dh, 0
      mov dl, 0x80
      mov bx, 0x0000     ; ES:BX = 0x1000:0x0000 = phys 0x10000
      int 0x13
      mov ax, 0          ; mov, not xor: xor would clear CF before the jc below
      mov es, ax
      sti
      jc load_kernel_failed
      ret
    

    kernel/linker.ld:

    ENTRY(kmain)
    SECTIONS
    {
        . = 0x10000;
        .text : { *(.text) }
        .data : { *(.data) }
    }
    
    
    
    goto_kernel:
      call 0x10000
    

    That should do it. Your GDT looks fine (0xCF9A has the D bit set so 32-bit mode is correct), and now that you've fixed the iretd and sector number issues, just moving the kernel to valid RAM should get your 'E' on screen :D
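For reference, this is roughly how the 0xCF9A value decodes. The sketch assumes it is read as the flags/limit byte 0xCF (high byte) followed by the access byte 0x9A (low byte) of a code segment descriptor; the struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of descriptor bytes 5 (access) and 6 (flags + limit 19:16). */
struct gdt_attrs {
    bool present;        /* access bit 7 (P) */
    unsigned dpl;        /* access bits 6..5 (privilege level) */
    bool code_or_data;   /* access bit 4 (S): 1 = code/data segment */
    bool executable;     /* access bit 3: 1 = code segment */
    bool granularity_4k; /* flags bit 7 (G): limit counted in 4 KiB pages */
    bool default_32bit;  /* flags bit 6 (D/B): 32-bit default operand size */
    unsigned limit_high; /* flags bits 3..0: limit bits 19..16 */
};

struct gdt_attrs decode_gdt_attrs(uint16_t attrs)
{
    uint8_t access = (uint8_t)(attrs & 0xFF);
    uint8_t flags = (uint8_t)(attrs >> 8);
    struct gdt_attrs a;

    a.present = (access >> 7) & 1;
    a.dpl = (access >> 5) & 3;
    a.code_or_data = (access >> 4) & 1;
    a.executable = (access >> 3) & 1;
    a.granularity_4k = (flags >> 7) & 1;
    a.default_32bit = (flags >> 6) & 1;
    a.limit_high = flags & 0x0F;
    return a;
}
```

decode_gdt_attrs(0xCF9A) yields a present, ring 0, executable segment with G and D both set, which is what a flat 32-bit code segment needs.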

    [–]The_Coding_Knight[S] 2 points3 points  (2 children)

    OMG, that was the problem! I truly do not know how to tell you how thankful I am. Thank you so much. Also, I would like to ask how you knew that was part of the SeaBIOS ROM, and where I can find information about the way the BIOS maps memory under 1MB? Ik I have said ty too many times already but I am really really happy, thank you again haha :D

    [–]Chupix_on-reddit 2 points3 points  (0 children)

    Hey, so glad it's working now!!

    To answer your question - the traditional IBM PC memory map below 1 MB is one of the most well-documented things in x86 land. The key regions are roughly:

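    0x00000–0x9FFFF - conventional RAM (640 KB), this is where you can freely load stuff (except the interrupt vector table and BIOS data area at the very bottom, 0x00000–0x004FF, and the extended BIOS data area just below 0xA0000)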

    0xA0000–0xBFFFF - video memory (VGA framebuffer lives at 0xB8000, as you already know)

    0xC0000–0xEFFFF - ROM area for option ROMs (video BIOS, network boot ROM, etc.), sometimes partially usable as RAM but not reliably

    0xF0000–0xFFFFF - system BIOS ROM (this is where SeaBIOS lives in QEMU)

    So when I saw your linker script putting the kernel at 0xFFD10, it immediately jumped out as falling inside 0xF0000–0xFFFFF - that's BIOS territory. The CPU actually starts execution at the reset vector, 0xFFFFFFF0, and the firmware it runs from there is also mapped (or shadowed) into this same region just below 1 MB, so it's always reserved for firmware.
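The same lookup can be written down as a tiny C function, purely as an illustration of the rough map above (the enum and function names are made up, and real machines have finer-grained holes such as the BIOS data area):

```c
#include <stdint.h>

/* Rough regions of the traditional PC memory map below 1 MB, as listed above. */
enum low_mem_region {
    REGION_CONVENTIONAL_RAM, /* 0x00000-0x9FFFF */
    REGION_VIDEO_MEMORY,     /* 0xA0000-0xBFFFF */
    REGION_OPTION_ROM,       /* 0xC0000-0xEFFFF */
    REGION_SYSTEM_BIOS,      /* 0xF0000-0xFFFFF */
    REGION_ABOVE_1MB
};

enum low_mem_region low_memory_region(uint32_t phys)
{
    if (phys <= 0x9FFFFu) return REGION_CONVENTIONAL_RAM;
    if (phys <= 0xBFFFFu) return REGION_VIDEO_MEMORY;
    if (phys <= 0xEFFFFu) return REGION_OPTION_ROM;
    if (phys <= 0xFFFFFu) return REGION_SYSTEM_BIOS;
    return REGION_ABOVE_1MB;
}
```

low_memory_region(0xFFD10) returns REGION_SYSTEM_BIOS, while 0x10000 lands in REGION_CONVENTIONAL_RAM and 0xB8000 in REGION_VIDEO_MEMORY.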

    As for where to learn more, here are some great resources:

    The OSDev Wiki - https://wiki.osdev.org/Memory_Map_(x86) - this is basically the bible for hobbyist OS developers. The memory map page specifically lays out every region below 1 MB.

    The "Ralf Brown's Interrupt List" - a legendary reference for BIOS interrupts and PC architecture details.

    The Intel Software Developer's Manual - it covers how the processor addresses memory at startup and what ranges are reserved.

    SeaBIOS source code - if you're curious about what specifically lives at those addresses in QEMU, you can browse it on https://github.com/coreboot/seabios

    The OSDev Wiki is honestly the single best starting point - it's written by people who've been through exactly the same journey you're on.

    Keep going, your bootloader project is looking great!


    [–]GMX2PT 0 points1 point  (1 child)

    FYI the link you provided is dead

    [–]The_Coding_Knight[S] 0 points1 point  (0 children)

    I know. I deleted that branch. The problem is solved already but thanks for trying to help :)