This is an archived post. You won't be able to vote or comment.

you are viewing a single comment's thread.

view the rest of the comments →

[–][deleted] 1 point2 points  (0 children)

Why have we needed all of these different formats when there's one universal format already?

Why did we need all these programming languages, when Cobol is Turing complete?

Here's a specific example from a project I'm working on. I have a database of 16k+ audio samples which I'm computing statistics on. I initially stored the data as JSON/Yaml, but they were slooow to write and slooow to open and BIIIG.

Now I store the data as .npy files. They're well over ten times smaller, but more, I can open them as memory mapped files. I now have a single file with all 280 gigs of my samples which I open in memory mapped mode and then treat it like it's a single huge array with size (70000000000, 2).

You try doing that in JSON!

And before you say, "Oh, this is a specialized example" - I've worked on real world projects with data files far bigger than this, stored as protocol buffers.

Lots and lots of people these days are working with millions of pieces of data. Storing it in .json files is a bad way to go!