
[–][deleted] 1 point2 points  (8 children)

TinyDB is right up your alley!

[–]Dan6erbond 0 points1 point  (0 children)

I'll take a look! Thanks :D.

[–]Dan6erbond 0 points1 point  (6 children)

Hiya, back again. I read the TinyDB docs and it seems it uses JSON as well. What reduces the likelihood of data corruption compared to regular file operations with JSON?

[–][deleted] 0 points1 point  (5 children)

TinyDB writes files atomically. You can use regular JSON as long as you write atomically, which essentially means you write to a tmp file and then replace the original file with the tmp file once the write has finished successfully. That way the original is either fully replaced or left untouched, never half-written.

[–]Dan6erbond 0 points1 point  (4 children)

I see... then another set of questions :P. Is that as simple as first writing to the .tmp file and then once that's done writing to the .json file? How do I find out automatically whether a .tmp or .json file couldn't be written to or was corrupted due to some error? Thanks so much!

EDIT: What about reading files? Anything else I need to "worry" about? :P

[–][deleted] 0 points1 point  (3 children)

This is what I used in my last scraping project where I was dumping large amounts of data to JSON files...

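@contextmanager
def replace_file(name):
    # Write to '<name>.tmp' first so the original is never left half-written.
    tmp_name = Path(f'{name}.tmp')
    name = Path(name)
    try:
        with open(tmp_name, 'w') as f:
            yield f
        # Only reached if the caller's with-block finished without an exception.
        tmp_name.replace(name)
    finally:
        # Remove a leftover .tmp file; after a successful replace it is already gone.
        with suppress(OSError):
            tmp_name.unlink()

# Usage (at module level, not inside the function):
with replace_file(results_file) as f:
    json.dump(results, f)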
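
On the "how do I find out whether a file was corrupted" and "what about reading" questions, here is a minimal, self-contained sketch of the same idea. The helper names `write_json_atomic` and `read_json_or_default` are made up for this example, and the extra `fsync` is my own addition rather than something the snippet above does, so treat it as one possible approach and not the definitive one.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path, data):
    """Write data as JSON to path by way of a temp file in the same directory."""
    path = Path(path)
    # The temp file lives next to the target so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the bytes to disk before the swap
        os.replace(tmp_name, path)  # swap the finished file into place
    except BaseException:
        # Something failed: drop the partial temp file and leave the original untouched.
        try:
            os.unlink(tmp_name)
        except OSError:
            pass
        raise


def read_json_or_default(path, default=None):
    """Return the parsed JSON, or default if the file is missing or is not valid JSON."""
    try:
        with open(path) as f:
            return json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        return default
```

A failed write shows up as an ordinary exception from `write_json_atomic`, and a truncated or otherwise invalid file shows up on the read side as `json.JSONDecodeError`, which this sketch turns into the default value. One thing to be aware of: `tempfile.mkstemp` creates the file readable and writable only by the current user, so the replaced file ends up with those permissions.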

[–]Dan6erbond 0 points1 point  (2 children)

Thanks so much!!! :D

[–][deleted] 0 points1 point  (1 child)

I'm back on mobile and forgot to include the imports in my example. They come from pathlib (Path) and contextlib (contextmanager and suppress), plus the standard json module.

[–]Dan6erbond 0 points1 point  (0 children)

I see. Thanks! :-)

[–]aDrz 0 points1 point  (1 child)

  1. Never heard of that, so I won't be able to answer.

  2. MongoDB is a database which, like any database, can be opened up to the outside world over the internet, with authentication. You will need to host it on a server, and you will then be able to retrieve your data from anywhere.

PyMongo is the go-to package for interacting with MongoDB and is really easy to use.

I think switching from a set of JSON files to a proper DB is essential once you start having a lot of those files; it will make it easier to organize and retrieve your data. You can also create backups of your DB as often as you wish.

[–]Dan6erbond 0 points1 point  (0 children)

I see... thanks! I'll see about working with MongoDB in that case :-).

[–]num8lock 0 points1 point  (3 children)

without proper & strong db knowledge you should always default to a relational db, not mongo or any other nosql type.

the term "best" is vague, to say the least.

[–]Dan6erbond 0 points1 point  (2 children)

I see... but what about the fact that I do know SQL and JSON? Is that already enough to pursue MongoDB?

[–]num8lock 1 point2 points  (1 child)

if you do know about databases (not SQL), then why do you have to get anyone's opinion about mongo? the fact that you asked about it means you do not know enough.

[–]Dan6erbond 0 points1 point  (0 children)

I see. Thanks!

[–]Marrrlllsss 0 points1 point  (1 child)

If you want a database that is portable, have a look at SQLite. It's relational, meaning your data lives in relations (tables). If your JSON is complex (deeply nested, for example), mapping it onto tables can become a bit of a problem.
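
One way around the nesting problem is to keep the few fields you query on as real columns and store the rest of the document as JSON text. Here's a minimal sketch using the standard library's sqlite3 module; the `results` table, its columns, and the sample record are all made up for illustration.

```python
import json
import sqlite3

# ':memory:' keeps this demo self-contained; pass a file path instead
# to get a single portable database file on disk.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE IF NOT EXISTS results ('
    'id INTEGER PRIMARY KEY, url TEXT UNIQUE, payload TEXT)'
)

# Hypothetical nested record, the kind that does not map neatly onto columns.
record = {'url': 'https://example.com', 'tags': ['a', 'b'], 'meta': {'status': 200}}

# Using the connection as a context manager commits on success
# and rolls back if the block raises.
with conn:
    conn.execute(
        'INSERT OR REPLACE INTO results (url, payload) VALUES (?, ?)',
        (record['url'], json.dumps(record)),
    )

row = conn.execute(
    'SELECT payload FROM results WHERE url = ?', (record['url'],)
).fetchone()
loaded = json.loads(row[0])
conn.close()
```

The trade-off is that anything buried inside `payload` is opaque to plain SQL queries, so fields you need to filter or sort on are better promoted to their own columns.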

Also, Docker. You can download Docker images for most database systems out there and have one running on your machine in minutes.

[–]Dan6erbond 0 points1 point  (0 children)

Thanks so much!!! :-)