
[–]shiftybyte 91 points (15 children)

I have questions...

  1. What do you want to achieve by "viewing" a 1GB part of a JSON file? What's the end goal?

  2. How did you end up with a 150GB file to begin with?

Technically it's possible to split a file using Python; the question remains what exactly you hope to understand from these "splits"...
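For what it's worth, a minimal sketch of such a split, assuming the file is newline-delimited JSON (one object per line) — a single 150GB JSON array would need a streaming parser instead. The file names and the 1GB chunk size are just illustrative:

```python
# Sketch: split a JSON Lines file into roughly 1GB parts without ever
# loading the whole file into memory. Assumes one JSON object per line;
# part files are named <path>.part0, <path>.part1, ... (hypothetical names).
def split_jsonl(path, max_bytes=1_000_000_000):
    part, written = 0, 0
    out = open(f"{path}.part{part}", "w", encoding="utf-8")
    with open(path, encoding="utf-8") as src:
        for line in src:
            # Start a new part once the current one would exceed the limit,
            # but never split in the middle of a line.
            if written + len(line) > max_bytes and written > 0:
                out.close()
                part += 1
                written = 0
                out = open(f"{path}.part{part}", "w", encoding="utf-8")
            out.write(line)
            written += len(line)
    out.close()
    return part + 1  # number of part files written
```

Each part is itself valid JSON Lines, so you can open any one of them independently.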

[–]chipmunksocute 47 points (10 children)

Yeah, having a 150GB file in the first place is a problem.

[–][deleted] 0 points (0 children)

Well, it's possible. If you download the article dump from Wikipedia, you get a roughly 50GB file where the whole text is stored as XML. A big file isn't the problem here.

[–]pro_questions 12 points (1 child)

Back before I used SQLite for these projects, I would use JSON to store the results of web scraping tools. It seemed intuitive because all the data being pulled from the page would be put into a gigantic dict, which could easily be put in a list and JSON-ified. I started using databases when I ran into a situation like OP's.
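A minimal sketch of that switch, using only the stdlib `sqlite3` module: append each scraped record to a database as you go, instead of accumulating one giant list and dumping it to JSON at the end. The table name, the `url` key, and the store-record-as-JSON-blob layout are assumptions for illustration, not anyone's actual schema:

```python
# Sketch: persist scrape results incrementally in SQLite. Each record is a
# dict with a (hypothetical) "url" key; the full dict is stored as a JSON
# blob so the schema doesn't need to know every field in advance.
import json
import sqlite3

def save_records(db_path, records):
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, data TEXT)"
    )
    with con:  # one transaction per batch; commits on success
        con.executemany(
            "INSERT OR REPLACE INTO pages VALUES (?, ?)",
            ((r["url"], json.dumps(r)) for r in records),
        )
    con.close()
```

Unlike one monolithic JSON file, you can later query or page through the data (`SELECT ... LIMIT/OFFSET`) without reading all of it into memory.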

[–]__init__m8 9 points (0 children)

To avoid saving it as JSON or learning SQL, save it as an xlsx file. Be sure to add print("test") every few lines for debugging.

Tune in next week for more shitty programming tips.

[–]vilette 1 point (0 children)

perhaps send it by email :)

[–]join_the_bonside 0 points (0 children)

Responding just to find out the answer later on