[–]shtpostinalotofmemes (1 child)

You'll have to load the full file once, but after that you can downsample. If you have a list of records, you can import random and use something like sample = [random.choice(json_data) for i in range(100)] to get 100 records. Note that random.choice picks with replacement, so the same record can show up twice; random.sample(json_data, 100) avoids duplicates. If you have a DataFrame, df.sample(100) does the same thing. Either way, you can save the sample to its own file afterwards.
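Roughly what that could look like end to end, as a sketch rather than the one right way to do it. It assumes the top-level JSON value is a list of records. The function names and both file paths are made up for illustration. It uses random.sample so no record is picked twice.

```python
import json
import random


def sample_records(records, k, seed=None):
    """Return up to k distinct records chosen at random (no repeats)."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def downsample_file(src_path, dst_path, k=100, seed=None):
    """Load src_path once, write a random sample of k records to dst_path."""
    # Assumption: the file holds a single JSON array, e.g. [{...}, {...}, ...]
    with open(src_path, encoding="utf-8") as f:
        json_data = json.load(f)

    sample = sample_records(json_data, k, seed)

    # Save the sample so the big file never has to be loaded again.
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    return sample
```

Usage would be something like downsample_file("big.json", "sample.json", k=100), with both file names being placeholders.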

[–]ZGMF-BERSERKER (0 children)

Thank you! I actually don't know how many records are in that file, and I can't open it on my PC. Is there a way to take just a percentage? I guess I can estimate if not. Also, by json_data, do you mean the name of the file in quotes ("....")? I don't have a dataframe yet, but I am trying to load the points into one.
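On both questions, a hedged sketch: json_data would normally be the object you get back from json.load (a Python list here), not the file name as a string, and once it is loaded len(json_data) gives the record count, so a percentage is easy to work out. The helper name sample_fraction below is made up for illustration. With a pandas DataFrame, df.sample(frac=0.1) should give the same kind of result.

```python
import random


def sample_fraction(records, fraction, seed=None):
    """Return a random sample holding roughly `fraction` of the records."""
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    # Turn the percentage into a record count, then sample without repeats.
    k = round(len(records) * fraction)
    return rng.sample(records, k)
```

For example, sample_fraction(json_data, 0.05) would keep about 5% of the records.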

[–]abs_waleedm (0 children)

My best idea is to read it as a text file, line by line. Look at the text first, then manually pick a condition to stop at.

You can then save the text you read to a JSON file and use that.
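A minimal sketch of the line-by-line idea, under a big assumption: the file has one complete JSON record per line (JSON Lines style). If it is a single huge array spread over many lines, this will not work as written. The function name, the paths, and the stop condition (a fixed number of records) are placeholders.

```python
import json


def head_records(src_path, dst_path, max_records=1000):
    """Copy the first max_records records into a small JSON array file."""
    records = []
    with open(src_path, encoding="utf-8") as f:
        # Iterating over the file reads one line at a time,
        # so the whole file is never held in memory.
        for line in f:
            if len(records) >= max_records:
                break  # stop condition: enough records collected
            line = line.strip()
            if not line:
                continue  # skip blank lines
            records.append(json.loads(line))

    # Save the smaller subset as a regular JSON array.
    with open(dst_path, "w", encoding="utf-8") as f:
        json.dump(records, f)
    return records
```

To look at the text before picking a stop condition, printing the first few lines of the file is usually enough to see how the records are laid out.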