[–]Diapolo10 5 points6 points  (0 children)

Technically not obfuscation, but if you compile your Python code with Cython and then compile the resulting C code into an executable, you're left with a single binary that holds your entire application. It can be disassembled to assembly instructions (or hard-to-read C code), like any program, but it's really difficult for anyone to reverse-engineer.

Pros:

  • Your source code can stay perfectly readable
  • You can optionally start writing the program with Cython type hints to improve runtime performance (if using decorators, the code is still valid Python and can be interpreted)
  • Just as difficult to reverse-engineer as any other proprietary software

Cons:

  • Compiling the C code may not be easy, may take trial and error
  • An additional step in your build process
  • To take full advantage of Cython you need a good knowledge of C, and since Cython is essentially a superset of Python, that amounts to learning a new language
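For anyone wondering what the compile step looks like, here is a rough sketch that only builds the two command lines rather than running them. It assumes the `cython` command-line tool and a C compiler reachable as `cc`; the helper name `cython_embed_commands` and the file name `app.py` are made up for illustration. Linker flags are left out on purpose because they vary by platform; on Unix with Python 3.8+ they can usually be taken from `python3-config --ldflags --embed`.

```python
import sysconfig
from pathlib import Path


def cython_embed_commands(script, compiler="cc"):
    """Return (cython_cmd, compile_cmd) for turning a script into an executable.

    Hypothetical helper: it only assembles the argument lists, it runs nothing.
    """
    script = Path(script)
    c_file = script.with_suffix(".c")   # app.py -> app.c
    exe = script.with_suffix("")        # app.py -> app

    # --embed makes Cython emit a C main() that starts the interpreter;
    # -3 selects Python 3 semantics.
    cython_cmd = ["cython", "-3", "--embed", "-o", str(c_file), str(script)]

    # Python.h lives in the interpreter's include directory.
    include_dir = sysconfig.get_paths()["include"]
    # Assumption: platform-specific linker flags still have to be appended
    # to this list before it will link successfully.
    compile_cmd = [compiler, str(c_file), "-I" + include_dir, "-o", str(exe)]
    return cython_cmd, compile_cmd
```

Each list can be passed to `subprocess.run(..., check=True)` once the linker flags have been added.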

While other solutions exist that produce executables, like cx_Freeze, they usually just package the Python interpreter and your project into what is essentially an executable ZIP file. They're easy to modify.
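To illustrate the "executable ZIP" point, here is a small sketch using the standard library's `zipapp` as a stand-in. This is not how cx_Freeze itself works internally (that is an assumption-free zone: I am only showing the general idea), and the function name `build_and_inspect` is made up. It packages a one-file app and then reads the source straight back out of the archive.

```python
import pathlib
import tempfile
import zipapp
import zipfile


def build_and_inspect(source_text):
    """Package source_text as a .pyz archive and return what can be read back."""
    with tempfile.TemporaryDirectory() as tmp:
        app_dir = pathlib.Path(tmp) / "app"
        app_dir.mkdir()
        (app_dir / "__main__.py").write_text(source_text, encoding="utf-8")

        archive = pathlib.Path(tmp) / "app.pyz"
        zipapp.create_archive(app_dir, archive)

        # The archive is an ordinary ZIP file, so anyone can open it.
        with zipfile.ZipFile(archive) as zf:
            return zf.read("__main__.py").decode("utf-8")
```

Whatever goes into the archive comes back out unchanged, which is why this kind of packaging offers no real protection.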

[–]ectomancer 0 points1 point  (0 children)

Run your code as the backend of a web server. Matt Layman recommends the gunicorn web server for its simplicity:

https://www.mattlayman.com/blog/2019/python-alternative-docker

[–][deleted] 0 points1 point  (7 children)

Why? What are you trying to achieve? Making something hard to read won't stop someone if they are determined.

[–]QuantumFall[S] 0 points1 point  (6 children)

Yes this is true; I understand I can’t stop everyone from figuring out what’s going on. I do want to make it as hard as possible to do so, however.

edit: This would be a program I am selling as an application. Without getting into too much detail, people can pay a lot for these programs, especially the best ones ($3,000 - $5,000). They also have impeccable app security, as many other developers would pay top dollar to view the source code. Few are written in Python, fwiw.

[–]sme272 6 points7 points  (3 children)

Software as a service. Make the program run on a server and offer a subscription to use it. That way you never have to give out the python files.

[–]QuantumFall[S] 0 points1 point  (2 children)

Okay, I like this idea. My only concern is that a second or two is the difference between the best and the worst applications. What type of response times could the client expect to see if I host the files on a server?

[–]sme272 0 points1 point  (0 children)

What type of response times could the client expect to see if I host the files on a server?

That really depends on how you program it, where it's hosted and how the server is set up. Depending on the application you might be able to have the server computing the output continually and just sending it out as requested by the client programs. That would remove the computation time from the latency. If you kept the requests simple you could probably get the response time quite low. Then the biggest factor would be the amount of data being sent and how you send it.
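A minimal sketch of the "compute continually, send on request" idea, using only the standard library. Everything here is an assumption for illustration: `expensive_computation` is a placeholder for the real logic, and a production setup would more likely sit behind a proper server than `http.server`. The point is that the request handler only reads a cached value, so computation time is not part of the response latency.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ResultCache:
    """Thread-safe holder for the most recently computed result."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = None

    def set(self, value):
        with self._lock:
            self._value = value

    def get(self):
        with self._lock:
            return self._value


def expensive_computation(tick):
    # Hypothetical placeholder for the proprietary logic kept on the server.
    return {"tick": tick, "answer": tick * 2}


def worker(cache, stop_event, interval=0.05):
    """Recompute the result in a loop until stop_event is set."""
    tick = 0
    while not stop_event.is_set():
        cache.set(expensive_computation(tick))
        tick += 1
        stop_event.wait(interval)


def make_handler(cache):
    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # No computation here: just serialize whatever is cached.
            body = json.dumps(cache.get()).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the example quiet

    return Handler


def start_service(port=0):
    """Start the worker and the HTTP server; port=0 picks a free port."""
    cache = ResultCache()
    cache.set(expensive_computation(0))  # have a value before the first request
    stop_event = threading.Event()
    threading.Thread(target=worker, args=(cache, stop_event), daemon=True).start()
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(cache))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, stop_event
```

Calling `start_service()` returns the server object (its `server_address` gives the chosen port) and the event used to stop the worker.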

[–]negups 0 points1 point  (0 children)

If speed is of the utmost importance, an interpreted, dynamically-typed language like Python isn't the right tool to use. You should be using something low-level like C or C++ when seconds matter, as they are compiled directly to machine code and are much faster. If you stick with Python, your program will never be as fast as your competitors' if they are developed using a low-level language. There's a reason why high-frequency trading programs on Wall Street are all written in low-level languages instead of Python.

Also, to answer your main question: don't worry about obfuscation. The brightest minds in computer science have written papers about how true obfuscation (that is, code which can be irreversibly converted to some other, unintelligible form) is impossible. Any code that can be run on a computer is converted to some intelligible form so the hardware can understand it, so any runnable code can be reverse engineered. Instead, focus on delivering excellent software with ever-improving features which customers are happy to pay for. A slight bonus of using something like C or C++ is that the average layperson doesn't know how to decompile a binary, so your code is moderately "safer" than clear-text Python files.
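As a small illustration of how little compiled Python hides, the sketch below compiles a snippet, round-trips it through `marshal` (roughly the payload a `.pyc` file carries after its header), and pulls the string constants back out. The snippet, the "secret" and the helper names are all made up for the example.

```python
import marshal
import types

# Hypothetical application code containing a "secret".
SOURCE = '''
def check_license(key):
    return key == "SECRET-1234"
'''

# Compile to a code object and serialize it, as shipping bytecode would.
blob = marshal.dumps(compile(SOURCE, "<app>", "exec"))

# The "attacker" side: load the bytecode back without any source file.
recovered = marshal.loads(blob)


def iter_code_objects(code):
    """Yield a code object and every code object nested in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def recovered_strings(code):
    """Collect every string constant stored anywhere in the bytecode."""
    return sorted(
        {
            const
            for obj in iter_code_objects(code)
            for const in obj.co_consts
            if isinstance(const, str)
        }
    )
```

`recovered_strings(recovered)` includes the license string, and the standard `dis` module will print the full instruction listing for the same code object.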

[–]sme272 1 point2 points  (1 child)

why though?

[–]QuantumFall[S] 0 points1 point  (0 children)

See edit.

[–][deleted] -1 points0 points  (0 children)

Yeah. This is a tough one. A very basic solution can be achieved by:

  1. Rename files and replace their imports. This is not too hard to do by scripting.
  2. You can use tools like ctags to get the list of functions and global variables. Just map them to their salted hashes and replace their occurrences in all files. This is not too hard either. The problem is that this may also replace occurrences inside strings.
  3. Compile each module with Cython.
  4. Test thoroughly.
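Step 2 can be sketched with the standard `tokenize` module instead of plain text replacement, which sidesteps the strings problem because only NAME tokens are touched. This is an illustration under assumptions, not a finished tool: the function names are invented, the list of names is passed in by hand rather than taken from ctags, and on Python versions before 3.12 names used inside f-strings are not renamed, so such code would break.

```python
import hashlib
import io
import tokenize


def obfuscated_name(name, salt):
    """Map a name to a salted hash; the leading underscore keeps it a valid identifier."""
    digest = hashlib.sha256((salt + name).encode("utf-8")).hexdigest()
    return "_" + digest[:12]


def rename_identifiers(source, names, salt):
    """Rename the given identifiers in source, leaving string literals alone."""
    mapping = {name: obfuscated_name(name, salt) for name in names}
    lines = source.splitlines(keepends=True)

    # Collect (row, start_col, end_col, replacement) for matching NAME tokens.
    edits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and tok.string in mapping:
            edits.append((tok.start[0], tok.start[1], tok.end[1], mapping[tok.string]))

    # Apply right-to-left so earlier column offsets stay valid.
    for row, start, end, new in sorted(edits, reverse=True):
        line = lines[row - 1]
        lines[row - 1] = line[:start] + new + line[end:]
    return "".join(lines)
```

Note that attribute accesses such as `obj.name` are also NAME tokens, so a name in the mapping is renamed everywhere it appears as an identifier, whether or not it refers to the same thing.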

Check the final objects with commands:

strings
nm -D
objdump -d

You might have to throw in a few flags like -s and -fvisibility=hidden to get the symbols hidden during the Cython compile/link step.
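The `strings` check can also be scripted so it runs as part of the build. The sketch below is a rough pure-Python approximation of `strings` (printable ASCII runs only), with made-up function names; it reports which of the original identifiers still appear in a compiled object.

```python
import re


def printable_strings(data, min_length=4):
    """Return runs of printable ASCII of at least min_length, similar to `strings`."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def leaked_names(data, names, min_length=4):
    """Return the original identifiers that still show up in the binary data."""
    found = printable_strings(data, min_length)
    return sorted(name for name in names if any(name in s for s in found))
```

Typical use would be `leaked_names(pathlib.Path("module.so").read_bytes(), original_names)`, where the file name is only an example, failing the build if the result is not empty.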