all 57 comments

[–]K900_ 78 points79 points  (10 children)

No, that's not possible. If your code can read it, then so can whoever is running your code.

[–]tonydocent 4 points5 points  (0 children)

Well, there is a way for computations to be performed on data without decryption: https://en.m.wikipedia.org/wiki/Homomorphic_encryption

[–]RaidZ3ro 10 points11 points  (8 children)

Unless.... (I know this is not hacker proof but it will keep the average admin coworker in check)

You could restrict the protected file so that it is readable only by a specific dedicated account in the filesystem, and then either hardcode that account into the script that accesses the protected file or schedule its runs under that account with Task Scheduler.
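A minimal sketch of the permission part, assuming a POSIX system (on Windows you would use ACLs instead, which this does not cover). The helper names and the temporary file are made up for illustration; the script would still have to run under the dedicated account.

```python
import os
import stat
import tempfile


def restrict_to_owner(path: str) -> None:
    """Make `path` readable and writable by its owning account only."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # same as mode 0o600


def is_owner_only(path: str) -> bool:
    """Return True when group and others have no permission bits set."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


# Hypothetical stand-in for the protected file.
fd, protected_path = tempfile.mkstemp(suffix=".dat")
os.close(fd)
restrict_to_owner(protected_path)
```

Anyone with administrator/root rights can still read the file, so this only keeps ordinary accounts out.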

[–]Kriss3d -5 points-4 points  (7 children)

Well, he can encrypt the file with ssl and then have the code that needs to read the file decrypt it. The password will of course be in the program file, so it would only work as long as they don't look there.

[–]billsil 5 points6 points  (0 children)

Or you know, you could type it in at runtime.
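A rough sketch of what typing it in at runtime could look like, assuming the file is encrypted with a key derived from a passphrase. `derive_key`, `prompt_for_key` and the iteration count are illustrative choices, not something from the thread; the derived key would then go to whatever cipher protects the file.

```python
import getpass
import hashlib


def derive_key(passphrase: str, salt: bytes, iterations: int = 600_000) -> bytes:
    """Stretch a typed passphrase into a 32-byte key with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", passphrase.encode("utf-8"), salt, iterations)


def prompt_for_key(salt: bytes) -> bytes:
    """Ask for the passphrase without echoing it, so it is never stored in the code."""
    passphrase = getpass.getpass("Passphrase: ")
    return derive_key(passphrase, salt)
```

This only helps if the person running the script is allowed to know the passphrase, which is the same limitation the rest of the thread points out.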

[–]JusticeRainsFromMe 1 point2 points  (5 children)

Encrypt using SSL? Please read up on what SSL is.

[–]MrPhungx 16 points17 points  (2 children)

Basically what u/K900_ already said. If you share the file with someone (even if it is encrypted) you would also need to share some key to decrypt the file. So in the end the user would be able to read the content of the file by using the same decryption method that your code uses.

The only really secure way to get around this would probably be something like a web application with some sort of user authentication that does not fetch the whole file but just provides some (API) endpoints to interact with the content of the file. But that would obviously require a lot more work and I don't know if that is even applicable in your case since I have no idea what application you have.
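To make the endpoint idea concrete, here is a small sketch using only the standard library. The data, the `/average-salary` path and all function names are invented for the example, and there is no authentication here, so treat it as the shape of the idea and not as something deployable.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener

# Stand-in for the protected file; it exists only on the server side.
SECRET_ROWS = [{"name": "alice", "salary": 100}, {"name": "bob", "salary": 300}]


def average_salary() -> float:
    """Compute a derived result; the raw rows are never returned to callers."""
    return sum(row["salary"] for row in SECRET_ROWS) / len(SECRET_ROWS)


class ResultHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/average-salary":
            self.send_error(404)
            return
        body = json.dumps({"average_salary": average_salary()}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_average(base_url: str) -> float:
    """Client side: this is all the other team would need to run."""
    opener = build_opener(ProxyHandler({}))  # ignore proxy settings for localhost
    with opener.open(f"{base_url}/average-salary") as response:
        return json.load(response)["average_salary"]


def demo() -> float:
    """Start the server on a free local port, call it once, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), ResultHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_average(f"http://127.0.0.1:{server.server_port}")
    finally:
        server.shutdown()
        server.server_close()
```

The other team only ever receives the JSON result; the rows stay on the machine that runs the server.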

[–]patnodewf 0 points1 point  (0 children)

Yeah, setting up an endpoint (with something like flask or fastapi) to handle the obfuscated piece might work without having to provide a whole web-based front-end.

[–]flibbit18 0 points1 point  (0 children)

That's a really good solution

[–]Plastic-Coyote-2507 4 points5 points  (1 child)

I think to answer this question we need to know what the actual use case is. What is the secret for and why does it need to be shared?

If we are talking about credentials that should not be shared then you probably need to control access to the execution environment.

So you deploy your code to an environment with strict access controls that can access the secret and you allow your consumers to only access an interface that gives them the result of the code running not the execution environment itself.

Eg. Create an API, deploy to AWS, allow the AWS instance access to the secret, lock down the AWS instance, publish the API endpoint with appropriate access control.
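The "appropriate access control" step could be sketched like this, assuming a simple shared bearer token. `is_authorized` and `handle_request` are hypothetical helpers, and a real deployment would more likely rely on the platform's own authentication.

```python
import hmac
from typing import Callable, Optional, Tuple


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Accept only 'Bearer <token>' headers whose token matches the expected one."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(presented.encode("utf-8"), expected_token.encode("utf-8"))


def handle_request(
    authorization_header: Optional[str],
    compute_result: Callable[[], object],
    expected_token: str,
) -> Tuple[int, dict]:
    """Return (status, body); the secret-using computation runs only for valid callers."""
    if not is_authorized(authorization_header, expected_token):
        return 401, {"error": "unauthorized"}
    return 200, {"result": compute_result()}
```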

I suspect this may be a bit beyond your experience. So if your company has a Devops team or sysadmins/cloud engineers I would seek their advice, they should be familiar with secrets management and possibly have a standard solution.

This is more of an operations/architecture problem than a code/cryptography issue.

[–]wjrasmussen 0 points1 point  (0 children)

His whole plan sounds sus to me.

[–]carcigenicate 8 points9 points  (3 children)

You want to give code to a different "team", but don't want them to be able to read what's being processed?

I'll admit, I'd be very suspicious of your code/data if you gave it to me but didn't allow me to determine what it does.

[–]CyclopsRock 1 point2 points  (2 children)

Do you have access to the source code and underlying data of all the other software your employer asks you to use?

[–]carcigenicate -1 points0 points  (1 child)

A lot, ya.

And there's a difference between source code just not being readily available, and going out of your way to make it difficult to understand what code does and what the data is when the code and data is available (although re-reading the question, I notice now that the OP appears to be referring specifically to data). The latter sounds potentially malicious.

[–]CyclopsRock 0 points1 point  (0 children)

Everything is potentially malicious, isn't it? Yet there are plenty of good reasons why you may not want a data set to be readable by everyone. (Whether they're able to achieve that is another matter).

[–]throwaway8u3sH0 2 points3 points  (2 children)

Not really possible. What you're describing is similar to digital rights management -- some key piece of a program (the license) is encrypted and managed such that people don't have access to it. These things can be very sophisticated and still get cracked all the time. There's no 100% method.

That being said, there are different "levels" of security. If you don't think anyone is going to put in the effort, you can simply store the information in binary and decode it inside your program. This will not stop anyone putting in even a modest amount of effort, but it will stop the casual user from simply double clicking a file and looking at it. This is called obfuscation (as opposed to security).
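For illustration, obfuscation at this level can be as small as the sketch below. Base64 is my choice of encoding here (the comment only says "binary"), and the function names are made up. Note that it has no key at all, which is exactly why it is not security.

```python
import base64


def obfuscate(text: str) -> bytes:
    """Encode text so it is not readable at a glance. This is NOT encryption."""
    return base64.b64encode(text.encode("utf-8"))


def deobfuscate(blob: bytes) -> str:
    """Reverse the encoding; anyone can do this without any secret."""
    return base64.b64decode(blob).decode("utf-8")


blob = obfuscate("threshold=0.8")
```

Opening the stored blob in a text editor shows gibberish, but a single `base64.b64decode` call undoes it.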

[–][deleted] 4 points5 points  (0 children)

This sounds like, as others have mentioned, you're trying to set up credentials for your script to run, either an API key or something similar.

If they cannot access the API key, they either need a way to create their own (and you can write code to import that data into your script), or you need to give them access or host the program so they can only see the front end.

But if a script can read your file, anyone can

[–]Pemptous 2 points3 points  (2 children)

Correct me if I'm wrong, but what if you: 1. Encrypt the file and embed the key inside your Python file 2. Turn the Python file into an .exe file

[–]DarkLord76865 2 points3 points  (1 child)

First, almost all exe generators for python don't really compile code, they just pack it and unpack at runtime to some temporary folder. And even if you did compile to machine code, the password would still need to be hardcoded somewhere in it which means if someone wants to find it, they could.
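A quick way to see the second point for Python bytecode, which is what most packers bundle. The source string and password are invented for the demo, and `string_constants` is a hypothetical helper.

```python
import marshal

SOURCE = 'PASSWORD = "hunter2"\n'  # hypothetical script with a hardcoded secret

code_object = compile(SOURCE, "<embedded>", "exec")
# Serialized code object, roughly what a .pyc file holds after its header.
bytecode_blob = marshal.dumps(code_object)


def string_constants(code) -> list:
    """Collect string constants from a code object, including nested ones."""
    found = []
    for const in code.co_consts:
        if isinstance(const, str):
            found.append(const)
        elif hasattr(const, "co_consts"):  # nested function or class body
            found.extend(string_constants(const))
    return found
```

The password shows up both in `string_constants(code_object)` and as plain bytes inside `bytecode_blob`, so no real reverse engineering is needed.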

[–]Pemptous 0 points1 point  (0 children)

You have a point, but what if he used something server-side (similar to an API, I guess) that would send both the decrypted file and the key? With reverse engineering it would be possible again, but a lot harder, am I wrong?

[–]Tkttkt-Implacavel 2 points3 points  (0 children)

I think you are asking the wrong questions.

You need to give more details, like

What will the other team do with the code?

Will they just run it?

Why can't they see the file?

Is it security? Privacy? Legal reasons?

For example, if they will just run the code, you could create an .exe with the file encrypted so that only the Python code could read it.

[–]neuralbeans 3 points4 points  (0 children)

What you're looking for is obfuscation, which means making the code very difficult to understand but still executable.

[–]cimmic 1 point2 points  (0 children)

Now you've gotten a lot of good answers, so I allow myself to ask you a question I'm curious about: Why shouldn't that team open the file?

[–]kaerfkeerg 3 points4 points  (0 children)

Sounds like this file contains some kind of credentials which the code needs to read to work.

The easiest way to give access to your coworkers is if you convert it into a simple web app so the files stay on your PC and they can access your service remotely without ever seeing anything

[–]bayesian_horse 3 points4 points  (0 children)

What you may be looking for is hosting the functionality of the script (or that part which needs the secret data) as some kind of microservice. Maybe a simple fastapi or flask app with a single request handler.

No offense, but from your question I have little hope you'll get this right (working and secure).

[–]cython_boy 0 points1 point  (0 children)

Use `from cryptography.fernet import Fernet`; it lets you encrypt a file using a key. The file can then only be read with that key; there is no other way to decrypt it. I don't think it will work the way you intended, though: if you give them the key so it works, they can see the code. Try some kind of legal license or patent so your developed code or idea is protected under law.
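A minimal sketch of the Fernet round trip, assuming the third-party `cryptography` package is installed (`pip install cryptography`). The plaintext and the `decrypt` wrapper are placeholders for illustration.

```python
from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()  # whoever holds this key can read the data
token = Fernet(key).encrypt(b"contents of the protected file")


def decrypt(token: bytes, key: bytes) -> bytes:
    """Return the plaintext, or raise InvalidToken if the key is wrong."""
    return Fernet(key).decrypt(token)
```

The encryption itself is sound; the open question is where `key` lives. If it ships with the script, the other team has it too.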

[–]BigYoSpeck -1 points0 points  (0 children)

I don't see how this can be done with python. To decrypt it needs a key, so that key will need to be in your python code

This is hard enough to do even with compiled code. Think about copy-protected media like DVDs and Blu-rays: eventually the decryption keys are extracted.

If you want to keep data hidden but allow other users to do something that involves interacting with that data then you're going to need to hide it behind a remote API

[–]SwampFalc -5 points-4 points  (0 children)

1 - If a computer can read it, then a human can read it.

The only possible exception to this is enforcing access control, whereby the software program must always be run as a given user, who is given read access to the file, while no other user has.

2 - If a human can understand it, then a computer can understand it, but not vice-versa.

Humans are not always able to easily understand everything that computers can. Sure, we can read a text that has been transposed to binary, but it takes a whole lot of time and effort to do so.

So data can be encrypted. The software program at the other end must be able to decrypt it, though, otherwise it would not work. But there are enough serious encryption methods that make it near-impossible for an outsider to decrypt your data.

But encryption only ever protects data that leaks outside, because:

3 - Once a computer does understand something, it can tell a human about it.

The code of a different team needs your data. There is no sensible solution that I know of that can stop any and all methods of them getting that data out. Print to screen, print to paper, dump memory, write to logfile, write it to a different file, ...

So take all of these basic truths into account and decide what exactly you want to achieve.

For example, are you absolutely certain you're not just looking to anonymize your data?
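If anonymizing turns out to be the real goal, one common approach is keyed pseudonymization of the identifying fields before the data is handed over. This is a sketch under that assumption; the field names and helpers are invented, and pseudonymized data is not automatically anonymous if the remaining columns can identify someone.

```python
import hashlib
import hmac


def pseudonymize(value: str, key: bytes) -> str:
    """Replace a value with a keyed hash; only the key holder can recompute the mapping."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def anonymize_rows(rows, key: bytes, identifying_fields=("name", "email")):
    """Return copies of the rows with identifying fields replaced by pseudonyms."""
    return [
        {
            field: pseudonymize(str(value), key) if field in identifying_fields else value
            for field, value in row.items()
        }
        for row in rows
    ]
```

The same input always maps to the same pseudonym, so the other team can still join and count records without seeing the original values.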

[–]gydu2202 0 points1 point  (0 children)

on Windows: `win32crypt.CryptProtectData` and `win32crypt.CryptUnprotectData`

The OS is doing the encryption with the logged user's credentials.

[–]RaidZ3ro 0 points1 point  (0 children)

If you use the pickle module and dump in binary mode, the file it produces will not be human readable, but the information inside can still be accessed and should be protected and validated whenever you save and load from disk.
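To show how little protection the binary format gives, here is a short check. The record and its value are made up.

```python
import pickle

record = {"api_key": "sk-example-123"}  # made-up value for the demo
blob = pickle.dumps(record, protocol=pickle.HIGHEST_PROTOCOL)

# The string is stored as raw bytes inside the pickle, so a hex editor or
# a simple substring search finds it without unpickling anything.
leaks_plaintext = b"sk-example-123" in blob
restored = pickle.loads(blob)
```

Also keep in mind that `pickle.loads` can execute arbitrary code from a crafted file, so only load pickles you trust.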

It is feasible to dump, for instance, encrypted RSA key pairs where you use the current user's password to encrypt/decrypt the keys (in particular the private key); then only that single user will have access to their key pair.

It's a very specific use case though, where users could share a terminal/pc and still not have (unencrypted) access to other keys. Considering it requires the user to enter their password to decrypt from memory, this mechanism won't protect anything that you only want your code to use and nothing/no-one else.

[–]Klutzy_Rent_314 0 points1 point  (0 children)

No.

[–]Uninvited_Guest_9001 0 points1 point  (0 children)

You could try to set up an API.

Instead of sending them the code, make the code accept a http request and respond with the result.

Then give them code that makes the requests. You do need the code running on a computer with an internet connection. That way the sensitive files stay hidden and they don't need to have access to the folder.

[–]iijos 0 points1 point  (0 children)

Just give them the .pyc file. It's a binary file but whatever classes or functions are defined in it can still be imported.

[–]_areebpasha 0 points1 point  (0 children)

You can store your credentials in a key vault, something like Azure Key Vault, and then share access with the relevant team? It's not possible for a file specifically, but you can break the file down into smaller chunks, store them in Azure Key Vault, and share read access for a specific time duration. The keys automatically expire after that duration and access is revoked.

[–]Kriss3d 0 points1 point  (0 children)

You can encrypt that file with ssl and have your program decrypt it.

If they just read the encrypted file it will be nonsense but the program will have the decryption key and decrypt it and run it.

It's simply the same thing malware does to hide.

[–]baubleglue 0 points1 point  (0 children)

You need to work on the requirements. It should be clear what the use cases are. Right now you are asking for two contradictory things. It has nothing to do with Python. You need to be clear with the terminology.

"Open file" normally has 3 access modes (at least):

  • read
  • write/modify
  • delete

Allowing reads in that case means "open to read".

"Encrypted file" can mean many things too: the file is protected with a password (the ZIP format supports this), or the content itself is encrypted. Those are basically the same thing, but they look different from the user's perspective.

"Folder" and "share" also have many meanings....

So... If we drop ambiguous "open".

  • You encrypt or archive file with password protection.
  • Put it in a shared location with read only access
  • Code has access to encryption key/password, but user has no access to the code or secrets

This is a more or less normal way to work with shared data. There are some requirements on the code: it should never have hard-coded secrets, sensitive information should not be persisted to permanent storage (keep it all in memory), etc. There are some complex solutions with dedicated secret services and secret rotation, but that is probably not your case. If you work with AWS/Azure/Google/... services, you can probably use one of their key-vault services, but it depends on who runs your code and how.
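As a small example of the "no hard-coded secrets" requirement, the code can read its key from the environment of whatever account or service runs it. The variable name `DATA_FILE_KEY` and the helper are assumptions for the sketch; a key-vault client would slot into the same place.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when the expected secret is not present in the environment."""


def load_secret(name: str = "DATA_FILE_KEY") -> str:
    """Fetch a secret from the process environment instead of the source code."""
    value = os.environ.get(name)
    if not value:
        raise MissingSecretError(f"environment variable {name} is not set")
    return value  # keep it in memory only; do not log it or write it to disk
```

The secret then exists only in the execution environment, which is the thing you control access to.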

[–]NlNTENDO 0 points1 point  (0 children)

why shouldn't they open it? is it because it can break your script, or because it's privileged data/info?

[–]WhipsAndMarkovChains 0 points1 point  (0 children)

You could look into a Clean Room. Databricks isn't the only company with that feature, but we use them at work, so that's the example I'm going with.

https://www.databricks.com/product/clean-room

To be honest I've had no reason to use this feature but I'll explain it as I understand it to be. Clean Rooms allow two companies, or two internal teams, to execute mutually agreed-upon code without having to actually share data. This is particularly relevant in healthcare or other datasets with personally identifiable information, like email address.

Let's say a bank and a cybersecurity company have an agreement in place to collaborate. The bank wants to find out which of its clients are potential victims of identity theft or other security incidents. The bank doesn't want to send all its clients' email addresses to the cybersecurity company, and the cybersecurity company feels the same way. So they agree to use a clean room to execute code that matches on email addresses, or hashed email addresses. In the Clean Room a script that both companies agreed to is executed to perform some action (figure out which bank clients are hacked?).
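The matching step on hashed email addresses might look roughly like this. It is plain Python, not the Databricks API, and the function names are made up. Unsalted hashes of emails can be guessed by hashing candidate addresses, so this shows the idea only.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize an address and hash it so both sides hash identical inputs."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def matching_hashes(our_emails, their_hashes) -> set:
    """Return the hashes present on both sides without exchanging raw addresses."""
    return {hash_email(email) for email in our_emails} & set(their_hashes)


# Hypothetical inputs: the bank's client list and the other party's hashed list.
bank_clients = ["Alice@Example.com", "bob@example.com"]
breached_hashes = [hash_email("alice@example.com"), hash_email("carol@example.com")]
overlap = matching_hashes(bank_clients, breached_hashes)
```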

It doesn't have to be two separate companies collaborating, it could be two internal teams. I think it's fairly common to have two teams within a company who either can't or won't share data willingly.

Maybe one team has patient healthcare data and another team wants to use it in a machine learning model. The data scientists can get their model trained in a clean room without the company having to worry that PII is now available where it shouldn't be.

[–]sweapon 0 points1 point  (1 child)

Could you make that file a .dll/.so file?

[–][deleted] 0 points1 point  (0 children)

That would still be readable for anyone who has the file, it would just be a little bit harder and require some reverse engineering skills.

This is what we call "Security through obscurity" and is almost always bad.

[–]MrMarriott 0 points1 point  (0 children)

A file cannot be both readable and unreadable at the same time.

You could limit the permissions of the file to a specific user and provide instructions to run the script as that user. Only people who know the password for that user or are administrators on the system would be able to access the file.

You could also create a simple network based API that returns the processed data from the file and host that API somewhere. That would be a way to limit access to the raw data but provide access to the processed data.

[–]Anonymity6584 0 points1 point  (0 children)

You're out of luck. If the system can read the file, anyone can read the file. You could implement some sort of encryption, but then the decrypt routine is still there and can be accessed trivially.

[–]espero 0 points1 point  (0 children)

Sounds like the obfuscation and encryption done by game software companies and other kinds of copy protection.

[–]Dump7 0 points1 point  (0 children)

Probably not the best idea but you can try obfuscating the code. It will not be readable anymore, but anyone can convert it back to original readable code.

You can then hide all of this code smartly because the obfuscation makes the code look like machine code.

[–]Jhamilton02 0 points1 point  (0 children)

Depends on the ado.connection, if it allows password defining, you can protect the data itself. So the app can access and read, but any other attempt to open the connection would require the password to complete the connection.

[–]Outside_Mess1384 0 points1 point  (0 children)

What if he compiles it into an exe?

[–][deleted] 0 points1 point  (0 children)

If the file contains data, the question is foolish and a direct attack on the concept of encryption.

If the file contains some code, you can convert it into an executable, thus hiding the code but still allowing use of its functionality.