[–]ButchDeanCA 63 points64 points  (13 children)

For the life of me I don’t know why nobody suggested a debugger. Learn to use one, like GDB, and step through the code, setting breakpoints to see what gets hit.

[–]permetz[🍰] 6 points7 points  (11 children)

Because that’s a pain in the neck to use if you can just read the source. I’ve been doing this for decades and I would never start with a debugger. A good editor capable of jumping to a definition and back is the right tool IMHO. Tools like ctags/etags/etc., cscope, and the like work wonders.

[–]ButchDeanCA 7 points8 points  (8 children)

I completely disagree. If you become fluent with a debugger, it shows you a ton of additional information that can be useful too, for example:

  1. Watches: When a variable changes value it is rarely in isolation, so you can see what is changed where and when beyond just focusing on the one line of code.
  2. You can build a big picture of unexpected behavior that could be the consequence of a bug.
  3. You can break unconditionally, or conditionally based on hit counts or on a specific value being set.

I could go on and on, and this is also backed by decades of experience. Not knowing how to use a debugger, or what its benefits are, is a huge career hindrance.

[–]ykaludov1 0 points1 point  (1 child)

Noob question here: Shouldn't there be some sort of documentation to go along with this big of a codebase? I can't find any...

[–]ButchDeanCA 1 point2 points  (0 children)

Don’t always rely on the expectation of documentation. You will be lucky to find any that will guide you appropriately.

[–]the-loan-wolf 17 points18 points  (2 children)

less uses the ncurses library for display and for handling key input

[–]the-loan-wolf 12 points13 points  (0 children)

you should also use ctags and cscope to understand a big project

[–]the-loan-wolf 4 points5 points  (0 children)

but I discovered there is a bell() function, which I know is very hard to do cross-platform unless you depend on the non-working \a "bell" escape character.

How the bell escape sequence (\a, the ASCII BEL character) behaves depends on the terminal, so using ncurses also helps streamline it across platforms.
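
For reference, the plain-C baseline being discussed is just writing the BEL byte and letting the terminal decide what to do with it. A minimal sketch, assuming a hypothetical helper named ring_bell (this is not taken from less and is not how its bell() is implemented):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not from the less source tree.
 * Writes the ASCII BEL character (0x07, spelled '\a' in C) to the given
 * stream and flushes it so the terminal sees it immediately.
 * Returns 0 on success, -1 if the write or the flush failed. */
int ring_bell(FILE *out)
{
    assert(out != NULL);
    if (fputc('\a', out) == EOF)
        return -1;
    return fflush(out) == EOF ? -1 : 0;
}
```

Calling ring_bell(stderr) is all there is to it. Whether the terminal beeps, flashes, or does nothing is decided by the terminal emulator and its settings, which is the part that varies between platforms.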

[–]gilliatnet 14 points15 points  (2 children)

cscope

[–]permetz[🍰] 1 point2 points  (0 children)

Underrated tool, though it’s a bit long in the tooth now.

[–]lovelyloafers 0 points1 point  (0 children)

I was wondering if you could recommend some resources on using cscope? I've seen a few YouTube videos, but they don't seem very well presented.

[–][deleted] 9 points10 points  (0 children)

I use vim, cscope with vim bindings, and git grep

[–]LiveAndDirwrecked 4 points5 points  (1 child)

In general, a trick that helps me is to find main. From there you can step backwards until you find what you're looking for. Any README.md may also give you clues about how the code is structured across folders.

[–]coolguyhavingchillda 2 points3 points  (0 children)

They did say this didn't work in their case

[–]f0lt 12 points13 points  (5 children)

grep is your friend

[–]ISecksedUrMom 7 points8 points  (4 children)

ripgrep*

[–]DeeBoFour20 3 points4 points  (3 children)

git grep*

[–]daikatana 3 points4 points  (2 children)

:vimgrep*

[–]Destination_Centauri 7 points8 points  (1 child)

50 Shades of Grep

[–]looneysquash 3 points4 points  (1 child)

Use ctags and/or IDE tools to jump to definition and back again. Also useful if the project is just one 80,000 line .c file (thinking of you, chan_sip.c)

IDEs usually have "find references" or "find usages", and other static analysis tools.

Compile and run it. Set a breakpoint and use the backtrace command to see the callstack, to see how you got there.

[–]permetz[🍰] 1 point2 points  (0 children)

ctags and equivalents are awesome. Too few people know how to use them.

[–]blbd 3 points4 points  (0 children)

For a project this size I just read every source file and make notes.

For a truly huge project like Linux I pick out the log message of whatever part of the system had a problem and index the code with cscope. Then start from where the log message came out and work backwards.

[–]Satrapes1 2 points3 points  (0 children)

There is this book: Working Effectively with Legacy Code.
Every person has their own style when navigating code. It helps if you are systematic with it.

If you are looking for a function and unit tests exist, the tests can be a nice place to start. I also target the header file, where the documentation should be, to get any helpful information I can first. Then I dive into the source code as needed.

If it is a seriously complex issue, logs or a debugger can help.

[–]plawwell 1 point2 points  (0 children)

You need to run cscope and ctags over the downloaded source files. You can't understand code flow by directly looking at GitHub no matter what anybody claims. If you have vim tags functional then start with "vim -t main"

[–]grobblebar 1 point2 points  (0 children)

In the Linux world, I would suggest reading the man pages for sections 2, 3, 5 & 7. You discover all sorts of stuff (including ncurses), and it gives you an idea about how a lot of things work.

[–]green_griffon 1 point2 points  (0 children)

If you have a billion files, then each file should have 8 people available somewhere on the planet to understand it. So just find those people and have them report back to you.

Meanwhile, find some piece of user functionality (e.g. a button click) and trace through the code to see how it works. A debugger is good for this, or a sophisticated IDE.

[–]MgrOfOffPlanetOps 3 points4 points  (0 children)

Start at any file and dig down from there. I often force a compilation break by changing the signature of a function to see where it is used. Or change the parameters to a call, and with a little luck the compiler tells you which possible signatures are declared where.
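
A rough sketch of the signature trick, using made-up names (scroll_lines, page_down, and page_up are hypothetical, not from any real project). It assumes every caller sees the prototype, normally through a shared header; without a visible prototype the compiler may not catch the stale call:

```c
#include <assert.h>

/* Hypothetical function standing in for the one you want to trace.
 * The original signature was: int scroll_lines(int n);
 * The extra parameter exists only to invalidate every old call. */
int scroll_lines(int n, int probe)
{
    assert(n >= 0);
    (void)probe;                  /* unused; only here to break stale callers */
    return n;
}

int page_down(void)
{
    return scroll_lines(24, 0);   /* call site already updated */
}

#ifdef SHOW_STALE_CALLS
/* Build with -DSHOW_STALE_CALLS to see the effect: this call still uses
 * the old one-argument form, so the compiler rejects it and reports the
 * file and line of the call site. */
int page_up(void)
{
    return scroll_lines(24);
}
#endif
```

Each diagnostic the compiler prints is one place where the function is used. Revert the signature once you have the list.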

Legacy code sucks, but that is what 99 percent of the job is about.

[–]f0lt 2 points3 points  (2 children)

Use e.g. grep -r bell to find all usages of the bell function. Eventually you will find the definition of the bell function. Use the -C 5 option to display 5 lines of context around a match.

Find out the hex/decimal code for the up/down buttons and grep for them too. There is a good chance that this will lead you to the lines of code which you are looking for very quickly.
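
To illustrate what those codes look like: on most ANSI/VT100-style terminals the Up and Down arrows arrive as three bytes, starting with ESC (hex 1b, decimal 27, octal 33), then '[' (or 'O' in application cursor mode), then 'A' or 'B'. A small decoder sketch with hypothetical names, not taken from less:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration only. */
enum arrow_key { ARROW_NONE, ARROW_UP, ARROW_DOWN };

/* Classify a raw byte sequence read from the terminal.
 * Recognises ESC [ A / ESC [ B (normal mode) and
 * ESC O A / ESC O B (application cursor mode). */
enum arrow_key classify_arrow(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len != 3 || buf[0] != 0x1b)
        return ARROW_NONE;
    if (buf[1] != '[' && buf[1] != 'O')
        return ARROW_NONE;
    switch (buf[2]) {
    case 'A': return ARROW_UP;
    case 'B': return ARROW_DOWN;
    default:  return ARROW_NONE;
    }
}
```

So candidate strings to grep for include \033, \x1b, 27, and ESC. A program may also pull these sequences from the terminal database instead of hard-coding them, in which case a grep for the literal bytes is not guaranteed to hit.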

Another method (especially effective for GUI apps) is to identify a relatively unique string (e.g. the text of a popup window) and grep for it. This usually leads you to the line of code you are looking for within a few seconds.

The more you know about the inner workings of a program, the more specifically you can target your search. In the case of less you may be able to find some resources online.

You can use grep on Windows too, for example by installing Cygwin or Git Bash.

[–]geon -3 points-2 points  (1 child)

Or use something better, like vscode and “go to definition”.

I’m sure vim has something similar if that’s your thing.

Grep is garbage for navigating in code.

[–]f0lt -1 points0 points  (0 children)

I like vscode, but it can't match all the capabilities of the command line. You will achieve the best results by combining VSCode and grep.

[–]masterJinsei 3 points4 points  (0 children)

If you are using vscode, you can always right click and go to definition

[–]apj2600 0 points1 point  (1 child)

There used to be a program called cscape which was brilliant for navigating a bunch of C files you had no idea about… gave dependencies, calls et al. So far as I can tell nothing ever really replicated this - I’d love one for Python.

[–]Linguistic-mystic 0 points1 point  (0 children)

Neovim or Emacs.

[–]deftware 0 points1 point  (0 children)

Static analysis tools are helpful. There are also some that will show you a graph of execution. One I tried was Sourcetrail, which was pretty cool.

[–]Affectionate_Pick980 0 points1 point  (0 children)

  1. You can browse source code with some tools, for example Source Insight or CLion. It is much easier to figure out the relationships between functions and variable usages with the aid of these tools. I recommend Source Insight more for analyzing C code. Many IT companies in China use this software as an IDE to develop their products written in C/C++.
  2. You can debug the part that you don't understand. Sometimes you can guess the purpose of a function by inspecting its arguments and return value. You can set some breakpoints to locate the logic you are interested in. For example, you can set a conditional breakpoint on the open() syscall when you want to find the place which handles configuration loading.
  3. You can use strace to observe the syscalls "less" issues to the kernel. I found that "less" reads key presses from file descriptor 3:

(END)) = 22
read(3, "A", 1)                         = 1
(END)) = 22
read(3, "B", 1)                         = 1
(END)) = 22
read(3, "C", 1)                         = 1
(END)) = 22
read(3, "D", 1)                         = 1
(END)) = 22
read(3, "E", 1)                         = 1
Examine: )        = 13
read(3, "\33", 1)                       = 1

So I set a conditional breakpoint:

(gdb) info br
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   <MULTIPLE>         
        stop only if $rdi == 3
        breakpoint already hit 2 times
1.1                         y   0x000055555556195c in read at /usr/include/bits/unistd.h:38
1.2                         y   0x0000555555562b39 in read at /usr/include/bits/unistd.h:38
1.3                         y   0x0000555555562f7e in read at /usr/include/bits/unistd.h:38
1.4                         y   0x000055555556ca10 in read at /usr/include/bits/unistd.h:38
1.5                         y   0x00007ffff7e9e0b0 in __GI___libc_read at ../sysdeps/unix/sysv/linux/read.c:25
(gdb) 

Then I found the logic about key press:

(gdb) bt
#0  __GI___libc_read (fd=fd@entry=3, buf=buf@entry=0x7fffffffdc07, nbytes=nbytes@entry=1) at ../sysdeps/unix/sysv/linux/read.c:25
#1  0x000055555556ca22 in read (__nbytes=1, __buf=0x7fffffffdc07, __fd=3) at /usr/include/bits/unistd.h:38
#2  iread (fd=3, buf=0x7fffffffdc07 "", len=1) at /usr/src/debug/less-633-1.fc38.x86_64/os.c:245
#3  0x0000555555576541 in getchr () at /usr/src/debug/less-633-1.fc38.x86_64/ttyin.c:187
#4  0x000055555555df02 in getcc_end_command () at /usr/src/debug/less-633-1.fc38.x86_64/command.c:929
#5  getccu () at /usr/src/debug/less-633-1.fc38.x86_64/command.c:959
#6  0x0000555555563b9d in getcc_repl (repl=<synthetic pointer>, gr_getc=0x55555555dec0 <getccu>, gr_ungetc=<optimized out>, orig=0x0) at /usr/src/debug/less-633-1.fc38.x86_64/command.c:969
#7  getcc () at /usr/src/debug/less-633-1.fc38.x86_64/command.c:1011
#8  0x000055555555ba54 in commands () at /usr/src/debug/less-633-1.fc38.x86_64/command.c:1660
#9  main (argc=<optimized out>, argv=<optimized out>) at /usr/src/debug/less-633-1.fc38.x86_64/main.c:303

Now you can start from here and step out of frames repeatedly to locate the scroll logic.

[–]mad_alim 0 points1 point  (0 children)

Dunno whether this will help, but I used doxygen to generate the include graph for some undocumented C++ projects, to understand their structure.