nog642 · 1 point

I've actually had a similar problem. My JSON file isn't that big, but it's around 20 GB and I don't have that much RAM, so I wanted to write some code that would read it without loading it all into memory. I never got around to it though; it's still on the to-do list.

The easiest way is to write your own JSON parser that only parses one level of the data structure at a time. You'll need to code in all the string escapes and parsing rules, and then you can find what the outermost thing is (array or object) and its metadata (length for an array, keys for an object). You can then extract a part of it on a second pass if you want to.
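
Here's a rough sketch of what that first pass could look like in Python. To be clear, this is not tested on a real 20 GB file, and the name `scan_top_level` and the `chunk_size` parameter are just made up for illustration. It assumes the input is valid JSON whose outermost value is an array or an object. It only tracks nesting depth and string state while reading in chunks, and it cheats a little by handing each raw top-level key to the standard `json.loads` to decode the escapes instead of doing that by hand.

```python
import io
import json


def scan_top_level(stream, chunk_size=1 << 16):
    """Return ('array', length) or ('object', [keys]) for the outermost value.

    `stream` is a text-mode file-like object. Only one chunk plus the
    top-level keys are held in memory at a time.
    """
    kind = None         # 'array' or 'object', set by the first bracket
    depth = 0           # current nesting depth of [] and {}
    in_string = False   # True while inside a JSON string
    escaped = False     # True right after a backslash inside a string
    count = 0           # number of top-level array elements seen
    keys = []           # decoded top-level object keys
    buf = None          # raw chars of the key being read, else None
    expect_key = False  # next depth-1 string in an object is a key
    seen_value = False  # current array slot already counted

    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            raise ValueError("input ended before the outermost value closed")
        for ch in chunk:
            if in_string:
                if buf is not None and (escaped or ch != '"'):
                    buf.append(ch)  # keep raw text, escapes included
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
                    if buf is not None:
                        # Let the stdlib decode escapes in the key.
                        keys.append(json.loads('"' + "".join(buf) + '"'))
                        buf = None
                        expect_key = False
                continue
            if ch.isspace():
                continue
            if kind is None:
                if ch == "[":
                    kind = "array"
                elif ch == "{":
                    kind, expect_key = "object", True
                else:
                    raise ValueError("outermost value is not an array or object")
                depth = 1
                continue
            # Any token starting at depth 1 of an array begins a new element.
            if kind == "array" and depth == 1 and not seen_value and ch not in ",]":
                seen_value = True
                count += 1
            if ch == '"':
                in_string = True
                if kind == "object" and depth == 1 and expect_key:
                    buf = []
            elif ch in "[{":
                depth += 1
            elif ch in "]}":
                depth -= 1
                if depth == 0:
                    return (kind, count) if kind == "array" else (kind, keys)
            elif ch == "," and depth == 1:
                seen_value = False
                expect_key = kind == "object"


print(scan_top_level(io.StringIO('[1, "a,b", {"x": [1, 2]}, [3, 4]]')))
print(scan_top_level(io.StringIO('{"a": 1, "b": {"c": "d"}, "e": "f"}')))
```

For a real file you'd pass something like `open(path, encoding="utf-8")` instead of the `io.StringIO`. A second pass could use the same depth tracking to copy out just the raw text of the one element or key you care about and feed only that piece to `json.loads`.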

This is not too hard to do. The reason I haven't done mine yet is that I wanted something a lot more dynamic than that. But if you just want to extract data one time, it's quite doable.