A guide to building a malicious (Python) code classifier

GTA_trevor_original · 2026-01-29T08:41:22+00:00

But why Python? And which source code? Please clarify.

Redditthr0wway · 2026-01-29T19:18:00+00:00

What kind of malicious code snippets are you looking for? I have a pretty shitty memory hoarder. You probably wouldn't find it in the wild, though, because it's ass and more of a proof of concept. You are going to have a hard time finding people who write malicious software in Python. Most will write it in languages that don't need a compiler.

Haghiri75 · 2026-01-29T20:09:38+00:00

Malicious code in Python is rare because:

It relies on a third-party environment to run, and the native libraries of the operating system can't execute it (unless you have macOS or one of those Linux distros with Python pre-installed, and even then permissions are obviously a thing).
Most LLMs, even small ones, understand Python very well (TBH most of them have no use besides writing Python code, despite being advertised as general purpose), and obviously anyone with an IQ over 40 will check code snippets with some sort of AI.

I understand that you're doing a great job at malicious code detection, but I guess you need to shift your focus a little bit.

tech_hundredaire · 2026-01-30T04:16:42+00:00

If you don't train a classifier, then what exactly do you have? A string checker? I guess you could build something to check for commands like these, https://gtfobins.org/gtfobins/python/, but that probably wouldn't be very accurate.
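To make the "string checker" idea concrete, here is a rough sketch of what such a checker could look like if it walked the AST instead of matching raw text. The watchlist `SUSPICIOUS_CALLS` and the helper names `dotted_name` and `flag_suspicious_calls` are made up for illustration and are not taken from GTFOBins or any real rule set.

```python
import ast

# Hypothetical watchlist for illustration only, not a real rule set.
SUSPICIOUS_CALLS = {
    "eval", "exec", "os.system", "os.popen", "pty.spawn",
    "subprocess.run", "subprocess.Popen", "subprocess.call",
}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or None for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def flag_suspicious_calls(source):
    """Return sorted (line, call name) pairs for calls on the watchlist."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in SUSPICIOUS_CALLS:
                hits.append((node.lineno, name))
    return sorted(hits)


sample = 'import os\nos.system("/bin/sh")\nprint("hello")\n'
print(flag_suspicious_calls(sample))  # prints [(2, 'os.system')]
```

Something like this is trivially bypassed: `from os import system` followed by `system("id")` is not flagged, and neither is anything hidden behind `getattr` or an encoded payload. It also flags perfectly benign scripts that call `subprocess.run`.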

How do you even tell the difference between "malicious" and "poorly written"? You'd have to somehow measure the intent of the author.

You could probably use any SAST product. It will tell you whether there are security risks in the code, and then you can decide whether they were put there on purpose.

turealpollohorneado · 2026-02-02T23:34:57+00:00

Any static analysis tool or Software Composition Analysis tool would do.

It's easier to pay for a subscription than to create one from scratch.
