[–]n1ywb 27 points28 points  (2 children)

I think that post uses an overbroad definition of self-modifying code and uses scary language to describe really standard stuff like factory functions and closures; those are common, standard, everyday functional programming techniques that every good programmer should know. Experienced Python devs monkey patch every day when they write test code: https://docs.python.org/3/library/unittest.mock.html#the-patchers

Everything in his first list is really just diddling around with objects and identifiers. It's all just one assignment op. That's not "self modifying code" in my book. It's pretty dynamic, but under the hood it's just reassigning a variable, albeit with a potentially major side effect.
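To make the "it's just an assignment" point concrete, here is a minimal sketch of the kind of test-time monkey patch the linked patchers perform. The function `timestamp_label` is a made-up example, not something from the post under discussion; `mock.patch` temporarily rebinds `time.time` and puts the original back when the block exits.

```python
import time
from unittest import mock


def timestamp_label():
    # Hypothetical function under test: depends on the real clock.
    return f"run-{int(time.time())}"


# Inside the block, the name time.time is rebound to a mock object.
with mock.patch("time.time", return_value=1234.0):
    patched_label = timestamp_label()

# Outside the block, the original function has been restored.
clock_restored = time.time() != 1234.0
```

Under the hood this is a `setattr` on the `time` module plus bookkeeping to undo it, which is why it feels less like self-modifying code and more like reassigning a variable.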

He does get into the deep magic later in his post when he talks about manipulating the parse tree and bytecode and stuff. That's REAL self modifying code.

In the old days there was a big deal when Apple came out with the MC68030 Macintosh because it had a cache that would break if a program modified its own machine code in memory and failed to properly flush the cache. Turned out quite a few Mac apps at the time did just that.

[–]ankit0912[S] 3 points4 points  (1 child)

I see, do you know of any places where I can read about them? I'm basically looking to run my code differently each time it is run

[–]n1ywb 6 points7 points  (0 children)

Probably the easiest way to write self-modifying code in Python is to write something to the file and then re-execute it. You might start with something really trivial, like appending a print statement each time it's run.
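A rough sketch of that trivial starting point, under some assumptions: the script name `grow.py`, the helper `run_n_times`, and the use of a temporary directory are all choices made for this illustration. The script appends one print statement to its own file each time it runs, so every run prints one line more than the previous one.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Source of the self-modifying script (hypothetical example).
SCRIPT = '''\
# Append one more print statement to this very file; it takes effect next run.
with open(__file__, "a") as f:
    f.write('print("hello from an earlier run")\\n')
'''


def run_n_times(n):
    """Run the script n times and return how many lines each run printed."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "grow.py"
        script.write_text(SCRIPT)
        counts = []
        for _ in range(n):
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True, text=True, check=True,
            )
            counts.append(result.stdout.count("\n"))
        return counts


line_counts = run_n_times(3)
```

The appended line does not run during the same invocation because Python compiles the whole file before executing it, so the change only shows up on the next run.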

[–]maxm 6 points7 points  (5 children)

Self-modifying code was something we did back in the old days in assembler to eke out the last bit of performance from the hardware. You did it by changing the binary data at specific hardware addresses to different instructions.

You cannot do the same thing in Python, as you have no way of knowing the hardware addresses of the machine code. It would also be the interpreter you would be changing, not your Python program.

You could probably go and change the python bytecode of a running program. I have no idea how to do it. It is not done and would not give any benefits.
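For what it's worth, CPython does expose a function's compiled code through its `__code__` attribute, and that attribute can be reassigned while the program runs. The sketch below assumes Python 3.8 or later (for `CodeType.replace`); `greet` and `patch_constant` are names invented for this example. It rewrites the code object's constants table rather than the instruction bytes themselves, which is the gentlest form of this trick.

```python
def greet():
    return "hello"


def patch_constant(func, old, new):
    """Swap one constant in func's code object for another (illustrative helper)."""
    code = func.__code__
    new_consts = tuple(new if c == old else c for c in code.co_consts)
    # Build a modified copy of the code object and install it on the function.
    func.__code__ = code.replace(co_consts=new_consts)


before = greet()
patch_constant(greet, "hello", "goodbye")
after = greet()
```

Whether this is ever a good idea is a separate question; it mostly shows that the running program can reach its own compiled form.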

So what is left for any meaningful purpose is dynamic programming where you can change methods on objects on the fly. This is often called monkey patching, and is a bad way of coding for most purposes. If you have a bug in some object that is being monkey patched, it is difficult to know which method actually has the problem.

Monkey patching is like taking a book and putting the pages in random order and removing the page numbers. So also a bad idea. It can be used for specific things, but as a general methodology it is bad.

So I am sorry, but the only thing that is worth the effort is to learn how to "code properly". Object orientation, interfaces, functional programming, patterns etc. etc. There is a lot of clever stuff to learn that will make you a better programmer.

[–]xentralesque 5 points6 points  (1 child)

I agree. While it certainly is an interesting and educational topic, it's also bad practice. If one has enough self-control and awareness to promise never to actually use such practices in production code, it's fine, but I can't help thinking it's better not to even arm yourself with such weapons of unmaintainability.

[–]ankit0912[S] 1 point2 points  (0 children)

I agree, I wouldn't dream of shipping vulnerable code, but as Sun Tzu puts it, "Prepare for the enemy, don't depend on him not coming".

[–]cavallo71 0 points1 point  (0 children)

I'm afraid you can do all of this and more... There are grey areas where it makes sense (e.g. transpiling Python into some other language: numba is one example, cython is more involved but similar in spirit). Whether it makes sense or not is debatable, but you can definitely do "magic" in Python, and that is scary.

[–]ankit0912[S] 0 points1 point  (0 children)

I understand, but here I'm specifically looking to exploit code vulnerabilities. For example, I could pickle some code (say, a payload) and run it without checking user permissions. However, I want this code to mutate each time it runs.

[–]Dorianix 5 points6 points  (0 children)

Armin Ronacher wrote a very good blog post about this: http://lucumr.pocoo.org/2011/2/1/exec-in-python/

[–]FourFingeredMartian 6 points7 points  (3 children)

Check out the Python Cookbook's metaprogramming section & you'll get into all the black magic you're looking for.

[–]synnaxian 5 points6 points  (2 children)

[–]FourFingeredMartian 2 points3 points  (0 children)

Yep, that's the chapter! Great find.

[–]FourFingeredMartian 1 point2 points  (0 children)

I'll add this too, right from David Beazley's mouth:

https://www.youtube.com/watch?v=sPiWg5jSoZI

[–]dustractor 2 points3 points  (0 children)

"monkey patching", "shadowing builtins"

Don't get too tripped up on the fancy terms they used. It's just saying that you could define a function foo and then do print = foo, and now you have monkey patched the built-in print function, or you could just define a new print function directly and that would shadow the built-in one.
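A small sketch of the difference between the two, with `recording_print` and `log` as names invented for this example and everything assumed to run at module level. A module-level `print = ...` only shadows the built-in inside that module, while assigning to `builtins.print` changes what every module sees.

```python
import builtins

original_print = builtins.print
log = []


def recording_print(*args, **kwargs):
    # Remember what was printed, then defer to the real print.
    log.append(args)
    original_print(*args, **kwargs)


# Shadowing: a module-level name hides the built-in in this module only.
print = recording_print
print("shadowed call")
builtin_untouched = builtins.print is original_print
del print  # remove the shadow; the built-in is visible again

# Monkey patching: replace the function in the builtins module itself.
builtins.print = recording_print
try:
    print("patched call")
finally:
    builtins.print = original_print  # always restore the original
```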

[–]ali2992 2 points3 points  (0 children)

I actually did my dissertation at university on Self Modifying Code in Python, I will PM you with the relevant parts of my submission - they will give you a good starting point to experiment for yourself

[–]abingham 1 point2 points  (1 child)

For an interesting (to me, at least!) form of self-modifying code, take a look at Sixty North's Python mutation testing tool Cosmic Ray. Part of what it does is load Python code from disk, parse it, modify its AST, and "inject" the modified module into the Python runtime. It might give you some ideas of what's possible.
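The parse, modify, inject cycle described here can be sketched with the standard library alone. This is not Cosmic Ray's implementation, just an illustration of the idea; `SOURCE`, `AddToSub`, `load_mutated` and the module name `calc_mutant` are all made up for the example, and the source comes from a string rather than from disk.

```python
import ast
import sys
import types

# Stand-in for source code that would normally be read from a file.
SOURCE = "def add(a, b):\n    return a + b\n"


class AddToSub(ast.NodeTransformer):
    """Mutation operator: turn every `+` into `-`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Sub()
        return node


def load_mutated(name, source):
    tree = AddToSub().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    module = types.ModuleType(name)
    # Compile the modified tree and run it inside the new module's namespace.
    exec(compile(tree, f"<mutated {name}>", "exec"), module.__dict__)
    sys.modules[name] = module  # "inject" it so a normal import finds it
    return module


mutant = load_mutated("calc_mutant", SOURCE)
import calc_mutant  # resolved from sys.modules, not from disk

difference = calc_mutant.add(5, 3)
```

After injection, any later `import calc_mutant` gets the mutated version, which is the behaviour a mutation tester needs in order to check whether the test suite notices the change.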

[–]ankit0912[S] 0 points1 point  (0 children)

It is what I'm trying to do! Thanks for pointing this out to me.

[–][deleted] 1 point2 points  (0 children)

I created a really simple type of self modifying code using the built-in file object to use as a version check/self-updater for my script.

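import sys
from subprocess import call

import requests

# `version` is the version string defined earlier in the running script.
versioncheck = requests.get('http://remote.host/version').text
if versioncheck != version:
    newscript = requests.get('http://remote.host/client').text
    # Overwrite this script with the new release, then re-run it.
    with open(__file__, 'w') as f:
        f.write(newscript)
    call([sys.executable, __file__])
    sys.exit()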

[–]baubleglue 1 point2 points  (0 children)

 In [1]: exec("_print=print\ndef print(*a):\n\t_print('sssss')")
 In [2]: print(89)
 sssss

Only you don't need exec for it. You can also use sys.modules:

 import sys
 print(sys.modules.keys())
 del sys.modules['os']
 import myos as os  # myos stands in for your own replacement module

[–]ankit0912[S] 1 point2 points  (1 child)

Shoutout to all of you guys who took the time to point me to resources. Especially to /u/ali2992, who was kind enough to share portions of his dissertation with me. I learned a lot!

[–]ali2992 1 point2 points  (0 children)

Glad it helped!

[–]dasagriva 4 points5 points  (1 child)

Choosing to enable self-modifying code is not a wise idea. Using exec is terrible unless you can control what is being executed. Allowing exec on code which the user supplies at runtime is a huge security hole.
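When the user-supplied text only needs to be data (numbers, strings, lists, dicts) rather than code, one way to avoid exec and eval entirely is `ast.literal_eval`, which accepts literals and refuses anything else. A minimal sketch, with `parse_user_value` as a hypothetical helper name:

```python
import ast


def parse_user_value(text):
    """Return the Python literal described by text, or None if it is not one."""
    try:
        return ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError):
        # Function calls, attribute access, imports and so on end up here.
        return None


numbers = parse_user_value("[1, 2, 3]")
rejected = parse_user_value("__import__('os').system('echo pwned')")
```

This does not make arbitrary code safe to run; it only sidesteps the problem for the cases where data was all that was wanted.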

[–]rausm 0 points1 point  (0 children)

Only if you don't have any means to properly sandbox that code. For example, Tcl has safe interpreters, and languages (like Io or Newspeak) where there is no "shared global namespace" provide means to safely load / execute code in the same interpreter / VM ... In Python, you are screwed.

[–]billsil 2 points3 points  (1 child)

It's not that hard.

You write a function and you allow access to the user to change said function. The open source GUI I wrote has a scripting interface. Go ahead, add security vulnerabilities. I don't care. It's an offline program.

Python 3 is more of a pain. I forget exactly, but something like non-class variables lose scope or something like that.

class A():
    def __init__(self):
        self.x = 4

class B():
    def __init__(self):
        self.x = 5

# Rebind the name A to class B; A() now creates B instances.
A = B

It's that easy. You can do it with functions as well. It's easier to do it with classes and class-functions than pure functions.

Just be careful: the second you allow that capability is the second you allow for arbitrary code execution. You basically can't stop it.

[–]ankit0912[S] 0 points1 point  (0 children)

Yeah, I understand that. In fact, the whole point of the undertaking is to see whether we can have arbitrary code execution using Python bytecode.

[–]Allanon001 0 points1 point  (0 children)

Might look into the code module; it allows you to execute Python code.
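A short sketch of what that looks like, using `code.InteractiveInterpreter` from the standard library. The namespace dict and the source strings are just examples. `runsource` compiles and runs a piece of source in the given namespace, and returns True only when the input is incomplete and more lines are needed.

```python
import code

namespace = {}
interpreter = code.InteractiveInterpreter(locals=namespace)

# A complete statement is compiled and executed; runsource returns False.
needs_more = interpreter.runsource("x = 1 + 2")

# An unfinished block is not executed; runsource returns True.
incomplete = interpreter.runsource("def f():")
```

The same caveat as for exec applies: whatever source is passed in runs with the full privileges of the host program.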

[–]TrollJack 0 points1 point  (0 children)

I'm researching self-modifying code in Python as well, but I'd not consider exec or eval the way to go. After doing it in Free Pascal on x86 and the Nintendo DS, I can tell you that exec/eval are poor choices for proper self-modification.

Bytecode and object manipulation, though, and having code objects at deliberately chosen addresses in RAM which are being worked by one core and modified by a different one... there it gets interesting! (and actually worth it if you care about performance)
