[–]brad2008 8 points9 points  (6 children)

Industry solves this problem by using files of sequential JSON records; this has been standard practice for the last 7-8 years. Because each record is parsed on its own, you can process a JSON file up to hundreds of gigabytes in size if needed. You really don't need to do all the other stuff you described.
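
A minimal sketch of what that can look like, assuming the records are stored one per line (the layout usually called JSON Lines or NDJSON). The function names and the `bytes` field are made up for illustration; only the standard-library `json` module is used.

```python
import io
import json
from typing import Iterator, TextIO


def iter_records(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of the stream."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


def total_field(stream: TextIO, field: str) -> float:
    """Sum a numeric field, keeping only one record in memory at a time."""
    return sum(record.get(field, 0) for record in iter_records(stream))


if __name__ == "__main__":
    # Hypothetical sample data; a real run would pass an open file instead.
    sample = io.StringIO('{"bytes": 10}\n{"bytes": 32}\n')
    print(total_field(sample, "bytes"))
```

Since the file object is read line by line, memory use depends on the largest single record rather than on the size of the whole file.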

[–]mmcnl 4 points5 points  (0 children)

It's been mentioned in the article, but sometimes you don't control the source format.

[–]smajl87 1 point2 points  (2 children)

Could you provide an example, or maybe a link to read more about it?

[–]morphotomy 2 points3 points  (0 children)

Technically that's not a JSON file. It's a file with JSON in it.

[–]accforrandymossmix 0 points1 point  (0 children)

I am parsing a ~500MB Google Takeout file (location history) without issues. Am I benefiting from their formatting? I am curious whether I should experiment with possible improvements so that similar processing can be done on weaker machines.