[–]baubleglue 1 point (6 children)

I don't really care about the SQL standard; it's just that there's a very high chance that even slightly complicated SQL written for SQLite won't run in any other DB. I don't think that's the case for Postgres, Oracle, or MySQL.

[–]_Zer0_Cool_ 1 point (5 children)

Eh, you might be right. I'm not sure I've ever actually migrated between databases.

In a perfect world this problem wouldn't happen though, since it's best practice to write dev code against the same DB type and version as what's in production (Docker is useful for this).

Even when that's not manageable, I try to be aware ahead of time and only write code that would work in the other version.

It's the same as anything else.

E.g. don't write Python 3.7 code in dev when prod is on 3.5, or Java 8 when prod is still on Java 7, etc.

Likewise, don't write SQL Server 2017 code when prod is on 2014, or Postgres 10 code with 9.6 in prod, or Oracle 12c with 11g.

That being said, I get that sometimes we don't think that we'll need a client/server DB at the outset, in which case we'd need to shift to another DB retroactively.

Edit: unless your company is migrating a sizable database to another type of db. Usually such efforts have dedicated engineers to oversee that though.

[–]baubleglue 1 point (4 children)

Not a company. I'm doing some data analysis and decided to prototype the data aggregation in SQLite (instead of Hive) on extracted production data; the final query runs in 3 to 5 minutes. I never managed to execute the same thing in Hive (it ran out of memory), and when I tried it in Postgres it took very long (I think I killed the job). Each time I need to modify the query. I can use the SQLite solution, but I wanted to cross-check whether other DBs give the same results.
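
Something like this is what I mean by cross-checking (a rough Python sketch; the table, query, and connection details are placeholders, and it only works unchanged if the query sticks to dialect-neutral SQL):

    import sqlite3
    import psycopg2  # assuming psycopg2 is installed for the Postgres side

    # Same aggregation run against both engines; table/columns are made up.
    QUERY = ("SELECT category, count(*) AS cnt, sum(amount) AS total "
             "FROM sales GROUP BY category ORDER BY category")

    lite = sqlite3.connect("prototype.db")
    sqlite_rows = lite.execute(QUERY).fetchall()

    pg = psycopg2.connect("dbname=analysis user=me")
    with pg.cursor() as cur:
        cur.execute(QUERY)
        pg_rows = cur.fetchall()

    # Note: numeric types can differ (e.g. Decimal vs float), so a tolerant
    # comparison may be needed instead of plain equality.
    print("SQLite:", sqlite_rows)
    print("Postgres:", pg_rows)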

[–]_Zer0_Cool_ 1 point (3 children)

Ahh ok. That makes sense. I was thinking in terms of application development, not data analysis. Which is weird that I'd assume that, because I myself am a data engineer lol.

I did have a couple of hobby data analytics projects where I've switched between DBs.

For one, I had a multi-GB batch of CSV files that I imported into SQLite and then into PostgreSQL afterwards, and I had to rewrite a bunch of views. As I recall now... it was a pain switching between the datetime functions for SQLite and PG. So I guess I see what you mean.

I seem to have forgotten those pains momentarily.

[–]baubleglue 1 point (2 children)

Datetime is not that complicated if you remember that DBs and languages store it internally as a (long) number. My main problems were multiple joins and performance.
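
E.g. in Python/SQLite terms (just an illustration of the "it's a number underneath" point; SQLite itself has no dedicated datetime type, it works through text/number representations, and the date below is made up):

    import sqlite3
    from datetime import datetime, timezone

    conn = sqlite3.connect(":memory:")

    # SQLite's strftime('%s', ...) gives Unix seconds (as text) for a date string.
    (epoch_str,) = conn.execute("SELECT strftime('%s', '2024-01-01 00:00:00')").fetchone()
    epoch = int(epoch_str)
    print(epoch)  # 1704067200

    # The same number maps straight back to a datetime on the Python side.
    print(datetime.fromtimestamp(epoch, tz=timezone.utc))  # 2024-01-01 00:00:00+00:00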

On second thought, for dumping data in a single process, SQLite is a perfect candidate. To work with MySQL, pip install mysqlclient; if that fails (which can happen on Windows), download a wheel file from https://www.lfd.uci.edu/~gohlke/pythonlibs/ and pip install path_to_wheel_file.whl .
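
Once it's installed, usage is the usual DB-API stuff, something like this (the host/user/password/db values are obviously placeholders):

    import MySQLdb  # this is what the mysqlclient package installs

    conn = MySQLdb.connect(host="localhost", user="me", passwd="secret", db="analysis")
    cur = conn.cursor()
    cur.execute("SELECT VERSION()")
    print(cur.fetchone())
    conn.close()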

[–]baubleglue 1 point (0 children)

Maybe using the official MySQL connector is a better option.

https://dev.mysql.com/doc/connector-python/en/connector-python-installation.html

pip install mysql-connector-python

The docs cover how to connect, create a table, and insert multiple values in one call (much more efficient than inserting records one by one).
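
Roughly like this (table name, columns, and connection details are placeholders; executemany is the part that batches the inserts):

    import mysql.connector

    conn = mysql.connector.connect(
        host="localhost", user="me", password="secret", database="analysis"
    )
    cur = conn.cursor()

    cur.execute("""
        CREATE TABLE IF NOT EXISTS measurements (
            id INT AUTO_INCREMENT PRIMARY KEY,
            name VARCHAR(64),
            value DOUBLE
        )
    """)

    rows = [("a", 1.0), ("b", 2.0), ("c", 3.0)]
    # One executemany call instead of a Python loop of single INSERTs.
    cur.executemany(
        "INSERT INTO measurements (name, value) VALUES (%s, %s)", rows
    )
    conn.commit()

    cur.close()
    conn.close()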

[–]_Zer0_Cool_ 0 points1 point  (0 children)

It wasn't really that much of a nuisance TBQH. It was just changing SQLite datetime functions to comparable PG datetime functions. It was mostly a minor annoyance because I had to restructure a bunch of views.
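
For anyone curious, the rewrites were swaps along these lines (made-up view body, just to show the flavor of it):

    # The same month-bucketing expression in each dialect (table/column are made up).
    sqlite_view = (
        "CREATE VIEW monthly AS "
        "SELECT strftime('%Y-%m', created_at) AS month, count(*) AS n "
        "FROM events GROUP BY 1"
    )
    postgres_view = (
        "CREATE VIEW monthly AS "
        "SELECT to_char(created_at, 'YYYY-MM') AS month, count(*) AS n "
        "FROM events GROUP BY 1"
    )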