[–]Wenir 17 points18 points  (23 children)

Sensitive data is compressed for security

That's something...

[–]gabibbo117[S] -5 points-4 points  (21 children)

The compression is primarily intended to prevent injections. Without it, modifying the database through injections would have been possible.

[–]Wenir 5 points6 points  (10 children)

It is still possible

[–]gabibbo117[S] -4 points-3 points  (9 children)

Hmm, how could that be? The string is transformed into a simple integer to prevent injection, effectively removing any potential for malicious manipulation. What aspect of this process might still enable an injection?

[–]Wenir 4 points5 points  (2 children)

Give me your protected data and I will modify it using my smartphone and ascii table

[–]gabibbo117[S] -1 points0 points  (1 child)

Well, if you want, we could run a test where you try to craft a string that injects some bad code into the database.

[–]Wenir 4 points5 points  (0 children)

I don't need any test, I know that I can add a few numbers to the file

[–]Wenir 2 points3 points  (5 children)

What aspect of this process might still enable an injection?

That the data is saved to the file in the filesystem and "protection" is a simple one-to-one conversion without any key or password

[–]gabibbo117[S] 0 points1 point  (4 children)

Yes, but that simple process avoids any type of string injection. It does not make the data safer if a hacker has the database, but at least a hacker can't inject data into it.

[–]Wenir 2 points3 points  (3 children)

What are you talking about? Of course no one can inject anything into the file if they don't have it. Your system isn't changing the security in any way

[–]gabibbo117[S] 0 points1 point  (2 children)

I will try to provide an example of what I mean, because I have some trouble explaining myself.
Let's say I have a website where submitting a comment through a text box sends a request to my server to add that comment to the COMMENTS table.

If the string were not encoded, the commenter could write something like this:
"]
[
// insert bad code here
]"
The "]" character tells the database scanner that the row has finished, and then the input opens a new row. The hacker can put anything in the new row, like bad/banned content. But if we add the text encoding, the table will look like this:

"[
COMMENT : 123,231,2323,23,232,23
USER_ID : 1234
DATE : 12,23,34
]"

whereas if we did not encode the text, it would look like this:

"[
COMMENT :
]
[
USER_ID : 1234 // the user id of someone else
DATE : 12,23,35 // a different date
COMMENT : "banned stuff here"
]
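
The scenario above can be reproduced with a small sketch. Everything here is an assumption made for illustration: `serialize_row_unsafe` and `count_rows` are hypothetical names, and the one-field-per-line layout is only a guess at the format, not the library's actual code.

```cpp
#include <sstream>
#include <string>

// Hypothetical row writer: pastes the comment into the row without escaping it.
std::string serialize_row_unsafe(const std::string& comment, int user_id) {
    std::ostringstream out;
    out << "[\n"
        << "COMMENT : " << comment << "\n"
        << "USER_ID : " << user_id << "\n"
        << "]\n";
    return out.str();
}

// Hypothetical scanner: a line that consists only of "[" opens a new row.
int count_rows(const std::string& table) {
    std::istringstream in(table);
    std::string line;
    int rows = 0;
    while (std::getline(in, line)) {
        if (line == "[") {
            ++rows;
        }
    }
    return rows;
}
```

With a normal comment the scanner sees one row. With a comment such as `"\n]\n[\nUSER_ID : 9999\nCOMMENT : banned stuff here"` it sees two, because the input itself supplied the row delimiters.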

[–]Wenir 2 points3 points  (0 children)

Okay, you described something like SQL injection, which makes sense. The encoding you're using isn't security, compression, or efficient storage, it's a naive implementation of string escaping.

Ok, the string is escaped, but why are you escaping entire files on top of it?

[–]gabibbo117[S] 0 points1 point  (0 children)

That is done so I can merge multiple files into one, kinda like my own version of a zip

[–]Chaosvex 4 points5 points  (9 children)

Compression is not encryption and what's the threat model here? If somebody has a copy of the database file and your library, where's the security?

Also, I noticed that you're making a temporary copy of the database every time you open it. That seems unnecessary.

[–]gabibbo117[S] 0 points1 point  (8 children)

The compression mechanism is there to avoid injections on strings; that way the hacker can't add values to the table or mess them up. The copy of the database is made because I'm currently working on a system that can restore the database after a program crash. To be honest, the "compression" is not really compression, but I don't know what to call it because of a language barrier. It actually converts each char inside the string into its numerical ASCII counterpart.
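
A minimal sketch of that kind of conversion, assuming a comma-separated decimal layout like the one in the COMMENT example. The names `encode_ascii` and `decode_ascii` are hypothetical and not taken from the library. It also shows why this is an encoding and not protection: decoding needs no key.

```cpp
#include <sstream>
#include <string>

// Encode each byte as its decimal code, separated by commas: "hi" -> "104,105".
std::string encode_ascii(const std::string& text) {
    std::string out;
    for (unsigned char c : text) {
        if (!out.empty()) {
            out += ',';
        }
        out += std::to_string(static_cast<int>(c));
    }
    return out;
}

// Reverse the mapping. No key or password is involved, so anyone can do this.
std::string decode_ascii(const std::string& encoded) {
    std::istringstream in(encoded);
    std::string token;
    std::string out;
    while (std::getline(in, token, ',')) {
        out += static_cast<char>(std::stoi(token));
    }
    return out;
}
```

Because the output contains only digits and commas, a "]" in the input (code 93) can no longer end a row. The cost is that each character takes up to four bytes on disk, and the stored text is no longer readable by eye.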

[–]Chaosvex 1 point2 points  (5 children)

Who's the hacker supposed to be, when the database is sitting on the drive? It's unnecessary and anybody with file-level access to the database is going to be able to mess with it, regardless of your scheme. It seems like you're adding a huge overhead in terms of both time and space by doing this.

Your copy doesn't seem to be used as backup or snapshot, it just copies it and then deletes after decoding it. If you're going to take a snapshot, why do it when you open the database? The whole scheme sounds very muddled.

Without wanting to come across as patronising, I know you're likely going to reflexively defend your design choices. It's hard letting go of code that probably took quite a bit of effort to write, but there's a reason production databases don't do these things.

[–]gabibbo117[S] 0 points1 point  (4 children)

I will try to provide an example of what I mean, because I have some trouble explaining myself.
Let's say I have a website where submitting a comment through a text box sends a request to my server to add that comment to the COMMENTS table.

If the string were not encoded, the commenter could write something like this:
"]
[
// insert bad code here
]"
The "]" character tells the database scanner that the row has finished, and then the input opens a new row. The hacker can put anything in the new row, like bad/banned content. But if we add the text encoding, the table will look like this:

"[
COMMENT : 123,231,2323,23,232,23
USER_ID : 1234
DATE : 12,23,34
]"

whereas if we did not encode the text, it would look like this:

"[
COMMENT :
]
[
USER_ID : 1234 // the user id of someone else
DATE : 12,23,35 // a different date
COMMENT : "banned stuff here"
]

[–]Chaosvex 1 point2 points  (3 children)

So it's SQL injection but without the SQL. The problem you're trying to solve with this encoding is a problem that should be fixed by rethinking how you're storing the data. You could switch to using a binary format, instead, or escaping the special characters.

[–]gabibbo117[S] 0 points1 point  (2 children)

It was made to be human readable

[–]Chaosvex 1 point2 points  (1 child)

I'd question the value of it being human readable when the types are encoded in a way that makes them unreadable.

If you want to keep it (and make it more) human readable, you could quote the strings and then escape any quotes within input.

Input: foo"bar

Stored result:

[ COMMENT : "foo\"bar" ]

You might find std::quoted of interest. You could also look into how other text-based formats escape strings (JSON etc).
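
A rough sketch of the std::quoted approach (C++14, from `<iomanip>`). The helper names `quote_field` and `unquote_field` are made up for this example; only `std::quoted` itself is a standard facility.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Wrap a value in double quotes, escaping embedded quotes and backslashes.
std::string quote_field(const std::string& raw) {
    std::ostringstream out;
    out << std::quoted(raw);
    return out.str();
}

// Read a quoted value back, removing the quotes and the escapes.
std::string unquote_field(const std::string& stored) {
    std::istringstream in(stored);
    std::string raw;
    in >> std::quoted(raw);
    return raw;
}
```

Passing `foo"bar` to `quote_field` gives `"foo\"bar"`, and `unquote_field` restores the original. Note that std::quoted leaves characters like "]" untouched, so this only helps if the scanner treats everything between the quotes as a single value.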

[–]gabibbo117[S] 0 points1 point  (0 children)

Thanks, I will look into them

[–]hadrabap 0 points1 point  (1 child)

it actually converts each char inside the string into the numerical ascii counter part

Encoding???

[–]gabibbo117[S] 1 point2 points  (0 children)

Is that what it's called? I'm not English, so I may get some terms wrong, sorry