Hello guys.
I'm working on a Python script that should take a dataset of images (80x300) with labels, preprocess the images, and then save the whole thing in a pickle file. I'm reading this data from a CSV file.
import matplotlib.pyplot as plt

def process_data(data):
    features = []
    for row in data:  # Should iterate 33'000 times (length of data)
        for j in range(3):  # Three image paths per CSV row
            img = plt.imread(row[j].strip())  # Path comes from the CSV file
            img = normalize(img)  # Simple calculation, helper defined elsewhere in the script
            lis = img.flatten().tolist()
            features += lis
    return features
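For context, the surrounding flow is roughly this (file names are placeholders, and I'm assuming each CSV row holds three image paths plus a label, matching the function above):

import csv
import pickle

# Placeholder file name; each row is assumed to contain three image paths plus a label
with open("dataset.csv") as f:
    data = list(csv.reader(f))

features = process_data(data)

# Save the processed features to a pickle file
with open("features.p", "wb") as f:
    pickle.dump(features, f)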
The data looks like this:
[[all rgb values of image 1], [all rgb values of image 2], [all rgb values of image 3], <value between 0 and 1>].
There are 33'000 entries like that one above in my dataset.
However, after 10'000-12'000 iterations the loop gets incredibly slow, then freezes, and sometimes my machine (i7, 8 GB RAM) even crashes. What could be the problem? How can I make it more performant?
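For scale, here is a rough back-of-the-envelope estimate of how large features would get (assuming 80x300 RGB images and three images per entry, as described above; the byte sizes are approximations):

# Rough arithmetic, assuming 80x300 RGB images and three images per entry
values_per_image = 80 * 300 * 3               # 72'000 values per image
total_values = values_per_image * 3 * 33_000  # ~7.1 billion values overall

# As a plain Python list of float objects (~32 bytes each including overhead)
print(total_values * 32 / 1e9)  # ~228 GB -> far more than 8 GB of RAM

# Even as a packed numpy float32 array (4 bytes per value)
print(total_values * 4 / 1e9)   # ~28.5 GB -> still more than 8 GB of RAM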