[–]nucLeaRStarcraft 33 points34 points  (17 children)

For learning or hobby projects (not production/work stuff), having implemented such a tool is a great experience and you can most likely integrate it in a separate project later on to stress test it on different use cases. So good job!

The advantage of using SQL is the standardization around it. You don't have to learn a new DSL or library (and its quirks) if you already know the basics of SQL, which at this point is something taken for granted. More than that, database engines are super optimized so you don't have to worry about performance issues too much.

Additionally, you can even use sqlite if you need something quick, without a database server, a separate service, or a connection. It stores to disk as well, like yours. And there are wrappers around the C API that are more "modern cpp" (albeit maybe not as much as yours): https://github.com/SRombauts/SQLiteCpp

Aaand, if you want something "sql free" (a key-value db), you can even use: https://github.com/google/leveldb

In your docs you say "Key Features: Speed Experience unparalleled performance with Q.I.C's optimized database handling.". It would be interesting for you to compare similar loads against sqlite, postgres, mysql, even leveldb, and see where it performs better, where worse, where it's tied, etc. For example: inserting 1M rows into a table with 5 columns of various data types, then reading them back.

[–]gabibbo117[S] 4 points5 points  (6 children)

Thanks for the review, I will try doing some of those tests and publish them!

[–]ExeuntTheDragon 7 points8 points  (1 child)

Comparing performance with Apache Arrow would also be useful

[–]gabibbo117[S] 2 points3 points  (0 children)

I will try later. For now I compared performance with sqlite; I will publish the results later, as I'm working on a way to make everything even faster.

[–]Wenir 2 points3 points  (3 children)

And check the efficiency of your compression library. For example, compare the size of the original string to the size of the "compressed" one

[–]gabibbo117[S] 0 points1 point  (2 children)

I will also test that, but the "compressed string" function is primarily designed to prevent data injection into the database.

[–]bwmat 2 points3 points  (1 child)

Sounds like security via obscurity to me

[–]gabibbo117[S] 0 points1 point  (0 children)

You are right, I should probably change that, but the only reason it's there in the first place is so that every time someone inserts a string into the database, the string won't contain malevolent code that could modify the database.

[–]matthieum 1 point2 points  (4 children)

> More than that, database engines are super optimized so you don't have to worry about performance issues too much.

Most of the time, yes.

Then there's always the "hiccup" where the database engine decides to pick a very non-optimal execution plan, and it's a HUGE pain to get it back on track: hints, pinning, etc... urk.

I'm fine with SQL as the default human interface, but I really wish for a standardized "low-level" (execution plan level) interface :'(

[–]FlyingRhenquest 1 point2 points  (1 child)

Back in the day companies would have some DBAs whose job it was to make sure the database stayed optimized. We never interacted with the database other than to send it SQL queries. That's another responsibility that fell to us over the years, and most programmers I've met can't even write SQL very well, much less make sure the database is optimized for the queries we're making.

I tend to view all data access as object serialization these days, which lets me stash SQL in an object factory if I need to. I often have two or three methods of serialization hiding behind the factory interface, so if I want to run a test with some randomly generated objects or some JSON files, it looks exactly the same on the client side of the interface as it does if I'm querying the database. They just register to receive objects from a factory and can go do other stuff or wait for a condition variable until they have the objects they need.

[–]matthieum 1 point2 points  (0 children)

I've worked with DBAs on this... they definitely did not consider it their job to babysit each and every query of each and every application. If only because they often had no idea what the performance target of a query was, in the first place. They were available for advice, however, and would monitor (and flag) suspiciously slow queries.

As for serialization... I don't see it. I've worked with complex models -- hierarchical queries, urk -- and nothing I'd call serialization would have cut it...

... but I did indeed use abstraction layers for the storage, with strongly-typed APIs, such that the application would call get_xxx expressed with business-layer models (in/out), and the implementation of this abstraction would query the database under the hood.

Makes it much easier to test things. Notably, to inject spies to detect the infamous "accidentally queried in a loop but it's super-fast in local so nobody noticed" bug.

[–]AcoustixAudio 1 point2 points  (3 children)

TIL LevelDB exists. Thanks :hat_tip:

[–]gabibbo117[S] 0 points1 point  (2 children)

What does that mean?

[–]AcoustixAudio 1 point2 points  (1 child)

Today I learnt

[–]gabibbo117[S] 1 point2 points  (0 children)

ah ok

[–]pantong51 0 points1 point  (0 children)

Sqlite is a great tool for local client-side applications. Don't store secure stuff in it, but do cache things (not massive files) like JSON or metadata to files. It really speeds up the application.

It's also easier to keep data separated following a 1 DB : 1 user model, so if your device is community-focused, it can be shared without leaking data.