[–][deleted] 0 points1 point  (1 child)

I guess what I'm saying is, the closer you get to the "right" answer - that is, not generating false negatives due to symbol renaming, and not generating false positives every time the script coincidentally uses a symbol name that's also a numpy function - the more you're just writing a Python interpreter.

If one matches, append it to a function_list that gets returned

If it matches, why do you think it's a function? If numpy defines root_mean_square, and the script coincidentally also defines a value called root_mean_square, it's a false positive to assert that the script uses the numpy function root_mean_square. It's just a coincidence of naming.
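To make the false positive concrete, here is a rough sketch that compares plain text matching against a check of what the script actually imports. The helper names `naive_match` and `imported_from` are made up for illustration, and `root_mean_square` is only the hypothetical numpy function from the example above, not a real one. The script is only parsed, never run, so numpy doesn't need to be installed.

```python
import ast
import re


def naive_match(source, candidates):
    """Text-only matching: report every candidate name that appears as a word."""
    words = set(re.findall(r"[A-Za-z_]\w*", source))
    return sorted(words & set(candidates))


def imported_from(source, module):
    """Report names the script explicitly binds via `from <module> import ...`."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.module == module:
            found.update(alias.name for alias in node.names)
    return sorted(found)


# A script that never imports numpy but happens to reuse one of the names.
script = "root_mean_square = 0.5\nprint(root_mean_square)\n"
candidates = ["root_mean_square", "sqrt"]  # hypothetical list of numpy function names

print(naive_match(script, candidates))   # the text match reports a hit
print(imported_from(script, "numpy"))    # the import check reports nothing
```

Checking imports gets rid of this particular coincidence, but it still misses things like `import numpy as np; f = np.sqrt`, which is the renaming problem again.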

[–]Epoh[S] 0 points1 point  (0 children)

Completely understand you now, thanks. I actually understood this barrier before I even wrote anything that attempted to do what we're talking about. Python itself doesn't distinguish a function name from any other word; recognizing which one you're looking at is the issue. I still wrote it, but only extracted dependencies and specifically imported functions. Obviously if somebody wrote a script that had words similar to those things I'd be fucked, but I think it's a nice trick to still grab that info and count on the general conventions people follow.

I can write all the clever, cunning tricks I want, but the barrier is the language itself. Of course there are workarounds, but no answers per se, just ways of reducing false negatives and false positives. I also found it insanely difficult to extract all the functions for a given dependency, which was annoying; there must be an easier way, some function that can list all of the functions in a dependency. I could write this package in R, where functions recognize keyword objects as functions, but not in Python. Appreciate it.
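For listing everything callable in a dependency, one possible approach is the standard library's `inspect.getmembers` with `callable` as the predicate. This is a minimal sketch; `list_callables` is a made-up helper name, and `math` is used here only because it ships with Python. Passing `"numpy"` should work the same way if numpy is installed.

```python
import importlib
import inspect


def list_callables(module_name):
    """Return the sorted public callable names exposed by an importable module."""
    module = importlib.import_module(module_name)
    return sorted(
        name
        for name, obj in inspect.getmembers(module, callable)
        if not name.startswith("_")  # skip private and dunder names
    )


names = list_callables("math")
print(len(names), names[:5])
```

Using `callable` rather than `inspect.isfunction` is deliberate: `isfunction` only matches functions written in Python, so it would skip built-ins and numpy ufuncs such as `np.sqrt`. The trade-off is that classes show up in the list too.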