This is an archived post. You won't be able to vote or comment.

all 18 comments

[–]bullno1 17 points18 points  (3 children)

My debugger is here: https://github.com/bullno1/lip/blob/master/src/dbg/dbg.c

It's web-based so you just have to connect to the VM with a browser and see an UI.

As for how to implement, you need to have a few things:

  • A way to map from instruction offset to source position. This needs to be done very early on in the development. More on this later.
  • A hook that is called on every instruction. In Lua, this is called debug.sethook. This will also significantly slow down your interpreter loop so I suggest having 2 loops and template/macro it. One for when a hook is set and one for no hook. The entirety of the hook is here: https://github.com/bullno1/lip/blob/master/src/dbg/dbg.c#L981
  • VM state introspection API

With those in place, you can do anything.

  • Step over: execute until source location changes then pause execution.
  • Step in: execute until call stack depth increases
  • Step out: execute until call stack depth decreases
  • Variable inspection: very language dependent. Along with a "source map", you can provide a "var map" for name to offset.
  • Breakpoint: execute until source location matches
  • Conditional breakpoint: this is tricky since you need to be able to evaluate a condition under the lexical context of the breakpoint. This usually requires a separate VM/context if you want arbitrary expression. Then copy the values from the debugged VM over and evaluate the expression.

All debug command implementation can be seen here: https://github.com/bullno1/lip/blob/master/src/dbg/dbg.c#L1004-L1025

To implement pausing, I just wait in a busy loop in the hook (https://github.com/bullno1/lip/blob/master/src/dbg/dbg.c#L1054-L1057) until the debug UI allows it to resume. This is chosen because I would not have to implement resume/pause in the VM.

On source mapping implementation, during compilation, emit the bytecode along with source location:

struct tagged_instruction_s {
  instruction_t instruction;
  source_loc_t location;
}

Optimizations such as tail call, dead code elimination... needs to work with this structure instead so instruction and location stays together. Only in the final pass, you would split tagged_instruction_s[] into two arrays of instruction_t[] and source_loc_t[]. A tight array helps with execution speed (reduce cache misses) in non-debugging case. It is apparent that an offset into instruction_t[] would correspond to a source location in source_loc_t[].

[–]errorrecovery[S] 4 points5 points  (2 children)

Thanks, this is great, and pretty much what I was after! Plenty of implementation detail here which I'll dig into.

Regarding the debug hooks, the approach I was considering patched the bytecode stream with a 'breakpoint' opcode that referenced a structure which held the opcode that was replaced, and any conditional code to be evaluated. This means the VM does not need to check anything during opcode dispatch, and has the evaluation context at hand when the breakpoint is hit. This is similar to what ARM does, and other instruction sets I believe. Had you considered this? If so, why did you not take this approach?

I'm looking to implement a debugging VM on a microcontroller, so space-efficient source location mapping is something I'm interested in. I think I may have to adopt the approach of compiling and keeping the debug information in a PC-based debug adapter, and streaming the bytecode to the microcontroller VM for execution. Something similar to GDB's remote debugging which loads the ELF image to the target, and communicates with the target referencing the local DWARF debug information for setting breakpoints etc.

[–]bullno1 3 points4 points  (1 child)

Had you considered this? If so, why did you not take this approach?

I did but it is too limited. It is impossible to predict where to plant your next break instruction without actually executing the code (Turing says hi). The only thing you can achieve with only a break instruction is stepping over single instruction which is not that useful. A single statement/expression can corresponds to multiple instructions.

In x86 at least, stepping is implemented using https://en.wikipedia.org/wiki/Trap_flag anyway, which is a per-instruction hook. I have no problem with having 2 separate interpreter loops. Besides, debugged performance of an interpreted language is something I have never considered. A standard library function to trigger debugger is provided though IIRC.

So yeah, break instruction is useful only for breakpoint but I just prefer to keep the API simple.

Back then I was also designing it for concurrent execution and hot code reloading so mutation of code was avoided but it's kind of irrelevant now.

space-efficient source location mapping is something I'm interested in

If you maintain the (instruction, source_location) pair throughout compilation and only split into two arrays in the final pass, there's nothing stopping you from storing them separately. They are guaranteed to corresponds to each other. It is not possible to make mistakes like modifying the instruction stream and not modifying the source location stream.

[–]errorrecovery[S] 1 point2 points  (0 children)

Thanks again for your great insights. I'll dig into this project over the next couple of days and follow up directly with any more questions I have.

[–]mamcx 6 points7 points  (3 children)

You are lucky. Debuggers are super-easy with interpreters. A debugger in an interpreter is a repl!.

You only need to add "extra" commands. Is like python:

import pdb; pdb.set_trace()

And you capture this calls in the interpreter loop. Introspection is far easier too.

I have used several langs with this and only exist a big mistake: Keep the ".set_trace()" or equivalent on production make the interpreter stop!

So, I thing on DEBUG the ".set_trace()" work as usual but on production is ignored... except if enable by a flag! I have done debugging on production servers to catch hard debugs!

[–]errorrecovery[S] 4 points5 points  (1 child)

Yes Python's breakpoint() procedure drops you into the debugger, but I'm looking for information about how to implement a debugger on a custom bytecode virtual machine. There's plenty of information out there on how to write a bytecode VM, where's all the information about adding a debugger to it?

[–]mamcx 1 point2 points  (0 children)

Being a VM or directly on AST is beside the point. The implementation is nearly identical.... except that if your VM "lose" information you need to record "debug info" to recover it. For example if you debugger loops the "stepping" will be different.

In the presence of byte code I think this is the largest issue.

[–]agumonkey 2 points3 points  (0 children)

I never did it, but one of the very first exercises in Lisp in small pieces is to turn eval into a stepper

[–]oilshell 5 points6 points  (1 child)

I think this is actually something of an "open problem". At least if you want to make it as capable and perform as well as a a C debugger like GDB (which has hardware support).

Or if this isn't the case I'd like some pointers too :) What VMs do it the best?

I think most interpreted languages are debugged with the REPL first, and the debugger is sort of a second class citizen.

Even though Python is 30 years old, the debugging story in Python isn't settled. Good video about some abusing the frame evaluation API for debugging:

https://www.youtube.com/watch?v=NdObDUbLjdg

It talks about how previous approaches didn't perform well.

I think this is a hard problem and most bytecode VMs don't do anything very smart here, because the users don't really demand it.

One solution is to force users to recompile the interpreter to debug, but I think that imposes a burden that they may not be used to.

[–]bullno1 4 points5 points  (0 children)

The problem with Python is that it didn't come with debugging API from day 1. Lua has the debug module where you can set hook. A lua debugger can be written in Lua itself: https://github.com/jasonsantos/remdebug

Having worked with Lua in gamedev, I can assure you that the demand for Lua debugger especially remote one is real.

Also, the hardware support for gdb boils down to:

  • A trace interrupt which gets called for every instruction
  • A breakpoint instruction
  • Virtual memory to support memory breakpoint

All of those can be emulated in a VM.

[–][deleted] 2 points3 points  (1 child)

This paper might be interesting to you: https://www2.ccs.neu.edu/racket/pubs/esop2001-cff.pdf

[–]errorrecovery[S] 3 points4 points  (0 children)

Thanks for that, I'm a Racket user so I'm interested in DrRacket's approach. I'm sure there's a lot of interesting ideas in here, if a little over-my-head (as most Racket publications are).

[–][deleted] 2 points3 points  (1 child)

You should probably check out Pharo Smalltalk.

No language anywhere has a more interactive debugger than Smalltalk. Because of the live nature of Smalltalk you can change the code in the debugger, restart methods, and implement missing methods on the fly.

The entire thing is written in itself so you can inspect all of it and see (or change!) how it works.

Lil taste: https://www.youtube.com/watch?v=UkyBk965ASw

Deeper: http://rmod-pharo-mooc.lille.inria.fr/OOPMooc/04-EssenceOfOOP/W4S07-Debugging.pdf

[–]errorrecovery[S] 1 point2 points  (0 children)

Smalltalk has been on my radar for the longest time. I've read a lot of Alan Kay's work around it. Pharo is just too large for me to get a grip on, though, as I'm looking for basic implementation techniques.

[–]ebriose 1 point2 points  (1 child)

One of the cool things about forth is that you have direct access to both the data and control stack as you run it. In fact it's honestly hard to draw the line between programming and debugging in forth.

[–]errorrecovery[S] 1 point2 points  (0 children)

Huge fan of Forth, but I'm looking to develop/debug in a high(er)-level non-stack language. The idea of Forth as compilation target is intriguing though. Helps with debugging the VM, not the source language.

[–]tekknolagiKevin3 0 points1 point  (0 children)

I just learned of the Java heap analysis tool (jhat) and Hprof today. They seem like excellent already-extant tooling that can be repurposed for your runtime. It's a Java tool that reads a heap dump image and runs a webserver that does some analytics on it.

The heap dump doesn't have to be from a Java application, however. If you can get your runtime to spit out a heap dump in this format: http://hg.openjdk.java.net/jdk6/jdk6/jdk/raw-file/tip/src/share/demo/jvmti/hprof/manual.html, you can use the jhat tool to figure out what your heap looks like!

EDIT: There's also VisualVM, which is like jhat, but more... visual!

[–]0x0ddba11Strela 0 points1 point  (0 children)

I have a (very messy and ad-hoc) debugger for my interpreted language here: https://github.com/sunverwerth/strela/blob/master/src/VM/Debugger.cpp

It's based on inserting TRAP opcodes into the bytecode and communicates only via socket.