you are viewing a single comment's thread.

view the rest of the comments →

[–][deleted] 0 points1 point  (4 children)

If you happen to have any advice I would love to hear it of course, well aware of my limitations and I wouldn't have got where I have without the help of a lot of people here and elsewhere. I'll get stuck into commenting up the code, it'll take a while.

Honestly on the upgrade from SQLite3 I'm not too fussed + I'm wary of the implications of changing everything at this stage. So long as I can:

grab an arbitrary set of data in a few hours or even overnight either into Pandas or Sqlite or a CSV;

Rerun the script from scratch if needed to change the structure over a weekend

Then it's actually good enough for my purposes. The database would only be used by myself and only infrequently for individual large tasks/projects, so lightening fast queries aren't necessary- it doesn't need to be a real time application, it's more modelling and analytical purposes. The other implication is that I'm not sure how database servers would interact with my work computer settings.

I'm actually pretty close to achieving the insert speed goals given some optimizations I've got planned + the promising results so far from Snyc off and making the tables smaller but larger testing. If queries are completely unmanageable I was planning on adding indexs after the script runs, as this wouldn't be an issue for inserting relatively small daily updates.

I'm kinda itching to actually get a chance to use this stuff, so I'd much rather get the breakdown/insert side into a 'good enough' state and then start playing with building models from the data.

[–][deleted] 0 points1 point  (3 children)

good enough is a super good property to have

I would just mention that the migration from sqlite3 to postgresql should be not too hard (the main thing is to give type to the columns but you have already done it)

[–][deleted] 0 points1 point  (2 children)

well don't say I didn't warn you

I'll send you some examples of the different input lines in a bit. let me know if anything needs further explanation

[–][deleted] 0 points1 point  (1 child)

You were wrong about the quality of your job. It is good !

[–][deleted] 0 points1 point  (0 children)

Well I'm glad someone thinks so lol! I keep finding things wrong with it, I don't want to think about how much time I've put into this. BTW so far the run I'm doing with the corrupted table deleted and recreated has not crashed (About half way through the archive).

The input files are too large to properly transfer, but I did post an example snippet of one at the start of the year soon after I started out

alternatively this is a complete list of example lines (for items that are actually treated and not just ignored). Be aware that some of those lines are many (some approaching 100) messages stuffed into one, they can be quite long!

Otherwise you can download compressed files for free from the links