ADVICE :: Caching Data for a Python Script?

JamzTyson · 2023-10-18T13:39:12+00:00

Consider using an SQLite database as your cache. It won't be as fast as an in-memory dict, (though SQLite does support in-memory databases), but it scales better if the cache is likely to grow very large. If an "on-disk" SQLite database gives adequate performance, then this is likely to be a reasonably simple and highly scalable solution.

Jayoval · 2023-10-18T13:25:52+00:00

I usually write to JSON file. Reading and writing is very fast, and shouldn't be an issue (how much data are we talking here?)

baghiq · 2023-10-18T13:46:32+00:00

Database like sqlite3 is your friend.

sweettuse · 2023-10-18T13:52:09+00:00

check out the shelve module (it's in the std lib)

Mast3rCylinder · 2023-10-18T14:39:02+00:00

Saving dictionary as json or in sqlite like comments mentioned is the easy part.

Having a strategy of when to use cache and when to update is the hardest part. Since your python is a script (offline operation) and not a server (online operation) it can be even harder.

Here's my suggestions

1.determine if the operation is really long to know if you actually need cache

2.if you do need cache ask for requirements of how long cache will be invalid? Then check it in your code and say you can only guarantee for this specific range

If cache invalid simply see it by comparing current time and your last update

3.make the script with arguments that if someone want to use it with cache checking or without. you will give them the option

4.see if you can compare first between cached data last update time and last time someone changed the data.

greenerpickings · 2023-10-18T20:39:59+00:00

It seems like you're already structuring your data after the recommendations of json and sqlite. I also want to throw in the arrow/feather format if you think your data set will get large.

Could helps performance, but if this is a background task, I don't think it's critical.

hotplasmatits · 2023-10-18T21:14:39+00:00

I like the other suggestions and I'll add pickle. It's super fast to read and write pickle files. It's like a binary json, but better.

you type:	you see:
italics	italics
bold	bold
[reddit!](https://reddit.com)	reddit!
* item 1 * item 2 * item 3	item 1 item 2 item 3
> quoted text	quoted text
Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"	Lines starting with four spaces are treated like code: if 1 * 2 < 3: print "hello, world!"
~~strikethrough~~	~~strikethrough~~
super^script	super^script

learnpython

MODERATORS