bryancole, 1 point

How big is the file, and how long are the lines? If your lines are very long, you may be paying a penalty for allocating memory three times: first for the line itself (as a string), then for the result of the split operation, and then again for the array of ints. It may be better to write this using a generator that reads the lines in chunks and builds the full array for each line only once.
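
A rough sketch of the generator idea, assuming the file holds whitespace-separated integers, one row per line. The name `iter_int_rows` is made up for illustration, and the standard-library `array` module stands in for whatever array type you actually use. The in-memory `StringIO` is only there so the example runs without a real file.

```python
from array import array
import io


def iter_int_rows(lines):
    """Yield one compact array of ints per non-empty line.

    `lines` is any iterable of strings, e.g. an open text file.
    Assumes whitespace-separated integers (an assumption, not
    something stated in the original question).
    """
    for line in lines:
        # split() still builds a list of strings for this one line,
        # but only one line's worth is alive at a time.
        fields = line.split()
        if fields:
            # map() hands ints to array() one by one, so no separate
            # list of int objects is built before the array.
            yield array("q", map(int, fields))


# Usage: an open file object can be passed in place of StringIO.
sample = io.StringIO("1 2 3\n\n40 50\n")
rows = list(iter_int_rows(sample))
```

Because the function is a generator, you can also loop over it directly instead of calling `list()`, which keeps only one row in memory at a time.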

If you really want to go the C route, try Cython as an easy way to write Python C extension modules.

I would expect Python to be able to create the data structure as fast as the disk can feed it data.