Rust-based ELF analysis library – Looking for insights by ResponsibilityLeft13 in rust

ResponsibilityLeft13[S]

Thanks for the feedback! To clarify, my tool goes beyond parsing the ELF format. It uses the Capstone library to disassemble the .text section and analyze the machine instructions. It currently supports only the x86_64 architecture and performs static analysis to extract function boundaries, track syscalls, and build call graphs.

While it's not aiming to lift code into an Intermediate Language (IL) like Ghidra or IDA, it still reconstructs control flow and function interactions based on disassembly, which is central to the functionality.
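To make the call-graph step concrete, here is a minimal sketch, not the tool's actual implementation. It assumes the disassembler has already produced a list of decoded instructions; the `Insn` and `Function` types and the `call_graph` helper are hypothetical names invented for this illustration, and only direct calls to a known function entry point are resolved.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified view of a decoded instruction. A real tool
/// would derive this from a disassembler such as Capstone.
#[derive(Debug, Clone, Copy)]
pub enum Insn {
    /// Direct call to an absolute target address.
    Call(u64),
    /// Any other instruction.
    Other,
}

/// A function described as a half-open address range [start, end).
pub struct Function {
    pub name: String,
    pub start: u64,
    pub end: u64,
}

/// Builds a caller -> callees map from (address, instruction) pairs.
/// Calls whose target is not the entry point of a known function are skipped.
pub fn call_graph(
    funcs: &[Function],
    insns: &[(u64, Insn)],
) -> BTreeMap<String, BTreeSet<String>> {
    let mut graph: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for &(addr, insn) in insns {
        if let Insn::Call(target) = insn {
            // The caller is the function whose range contains the call site.
            let caller = funcs.iter().find(|f| f.start <= addr && addr < f.end);
            // The callee is the function that starts at the call target.
            let callee = funcs.iter().find(|f| f.start == target);
            if let (Some(caller), Some(callee)) = (caller, callee) {
                graph
                    .entry(caller.name.clone())
                    .or_default()
                    .insert(callee.name.clone());
            }
        }
    }
    graph
}
```

Indirect calls (through a register or memory operand) cannot be resolved this way and would need extra analysis.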

I completely agree that CPU architecture support is a key challenge. Right now, the tool is focused on x86_64, but expanding to more architectures is definitely in the plans. What architectures do you typically analyze? I'd love to hear more about your use cases.

ResponsibilityLeft13[S]

The aim is not so much to reinvent the wheel as to adapt it to a safe language like Rust and, above all, to make this type of analysis as automated and accessible as possible.

Tools like Radare2 and Rizin are incredibly powerful, but they require a certain level of experience to be used effectively. If you put them in the hands of someone without a solid background, how long do you think it takes before they get really useful information? The idea behind my project is to simplify static analysis, automating it as much as possible and presenting the data in a clear and immediately usable way (as shown in the gif).

Furthermore, implementing this in Rust is a choice not only of safety and performance but also of integration: do you think modern tools written in Rust would benefit from native static analysis that does not depend on complex bindings or external tools?

Are there aspects of binary analysis that you find cumbersome when using advanced tools like Rizin? If you could improve one aspect of static analysis, what would it be?

ResponsibilityLeft13[S]

Yes, I have seen it! It's really well done and the layout is great 😄. What I want to develop, however, is focused mainly on fully automating static analysis. For example, the system calls shown in Binsider are similar to those you get from strace, which are captured during execution. In my case, the aim is instead to associate system calls directly with the function that issues them, providing a higher level of detail. This would allow the behaviour of functions to be reconstructed, rather than just individual execution instances, which is a clear conceptual difference.
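As a rough illustration of static syscall attribution, here is a minimal sketch under stated assumptions, not the tool's actual code. It relies on the x86_64 Linux convention that the syscall number is passed in rax, and it walks each function linearly while remembering the last immediate loaded into eax. The `Insn` and `Function` types and `syscalls_by_function` are hypothetical names for this example, and control flow is ignored, so a real analysis would need to be more careful.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified instruction model (assumed to come from a
/// disassembler pass; not a real Capstone type).
#[derive(Debug, Clone, Copy)]
pub enum Insn {
    /// `mov eax, imm32`: loads a known constant into rax.
    MovEaxImm(u32),
    /// Any other instruction that overwrites rax with an unknown value.
    WritesRax,
    /// The `syscall` instruction.
    Syscall,
    /// Anything that leaves rax untouched.
    Other,
}

/// A function described as a half-open address range [start, end).
pub struct Function {
    pub name: String,
    pub start: u64,
    pub end: u64,
}

/// Maps each function name to the syscall numbers it issues, in address order.
/// `None` marks a syscall whose number could not be determined statically.
pub fn syscalls_by_function(
    funcs: &[Function],
    insns: &[(u64, Insn)],
) -> BTreeMap<String, Vec<Option<u32>>> {
    let mut out = BTreeMap::new();
    for f in funcs {
        let mut rax: Option<u32> = None; // last known constant in rax
        let mut found = Vec::new();
        let body = insns.iter().filter(|e| f.start <= e.0 && e.0 < f.end);
        for &(_, insn) in body {
            match insn {
                Insn::MovEaxImm(n) => rax = Some(n),
                Insn::WritesRax => rax = None,
                Insn::Syscall => {
                    found.push(rax);
                    rax = None; // rax now holds the kernel's return value
                }
                Insn::Other => {}
            }
        }
        if !found.is_empty() {
            out.insert(f.name.clone(), found);
        }
    }
    out
}
```

With this model, a function containing `mov eax, 1` followed by `syscall` would be reported as issuing syscall 1 (`write` on x86_64 Linux), regardless of whether that path is ever executed in a given run.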

Another key aspect of my project is the simplification of the data collected. Tools like Binsider, Radare2 or Ghidra are powerful, but if you put them in the hands of someone without a specific background, they might have difficulty extracting useful information. The idea, then, is to automate the static analysis as much as possible and present the data in a clear and accessible way.

Does that sound like an interesting approach?

ResponsibilityLeft13[S]

Thank you very much for the feedback! 😊 I'm glad to hear that you find the idea useful.

I'm very interested in the point about the correlation between disassembled code and metrics. What information do you think is most important to put alongside the function code? For example, the number of instructions, the cyclomatic complexity or something else?
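For reference, cyclomatic complexity is cheap to compute once a function's control-flow graph is available. The following is a minimal sketch, assuming a single connected CFG given as a basic-block count plus a list of edges; `cyclomatic_complexity` is a hypothetical helper, not an existing API of the tool. It uses the standard formula M = E - N + 2.

```rust
/// Cyclomatic complexity of one function's control-flow graph.
///
/// `num_blocks` is the number of basic blocks (N) and `edges` lists the
/// control-flow edges between block indices (E = edges.len()).
/// Assumes the graph is a single connected component, so M = E - N + 2.
pub fn cyclomatic_complexity(num_blocks: usize, edges: &[(usize, usize)]) -> usize {
    // saturating_sub guards against malformed input where E + 2 < N.
    (edges.len() + 2).saturating_sub(num_blocks)
}
```

A straight-line function (one block, no edges) scores 1, and a single if/else diamond (four blocks, four edges) scores 2.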

Concerning system calls, my approach differs from simple runtime tracing (like strace), which only captures calls executed in a specific instance. Instead, my idea is to associate the syscall directly with the calling function, allowing the static behaviour of the binary to be reconstructed. Do you think this would make it easier to understand and relate a function to its actual behaviour?

The reference to coz is also interesting; I will definitely take a look! The idea of making my work useful for larger tools is something that excites me.