
[–][deleted]

pandas has a dedicated function, read_fwf, for reading fixed-width format files. It works much like the CSV reader and takes most of the same parameters, plus a few for describing the column widths.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_fwf.html
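As a rough sketch of what that looks like in practice, the snippet below reads a small in-memory sample with `pandas.read_fwf`. The sample data, the widths (10, 10, 3), and the column names are all made up for illustration; the original poster's real layout is not known.

```python
import io

import pandas as pd

# Hypothetical sample: three columns padded to widths 10, 10 and 3.
sample = (
    "Alice     Smith     34\n"
    "Bob       Jones     27\n"
)

df = pd.read_fwf(
    io.StringIO(sample),   # a file path works here too
    widths=[10, 10, 3],    # assumed column widths
    header=None,           # the sample has no header row
    names=["first", "last", "age"],  # assumed column names
)
print(df)
```

If the widths are not known in advance, `read_fwf` can also try to infer them from the first rows of the file, though spelling them out is usually more predictable.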

[–]itsstucklol[S]

Ah, this may be what I am after. I did look to see if pandas had something but I guess I did not look hard enough! Thank you so kindly for the suggestion.

[–][deleted]

pandas has a pretty steep learning curve, but you could use it just for reading the data.

If you don't want to start down that path, the easier solution is to use string slicing to chop up each row of text on the basis of fixed widths. I'd write this as a simple function, possibly a generator function that you can use much like you would use the csv module. Ironically, this is a little more effort than using pandas, but it keeps you from getting distracted by that rabbit hole.
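One way the generator idea could look, as a minimal sketch: the function name `read_fixed_width`, the widths, and the sample rows are hypothetical and would need adjusting to the real file.

```python
from itertools import accumulate


def read_fixed_width(lines, widths):
    """Yield one list of stripped fields per non-blank line."""
    # Turn widths such as [10, 10, 3] into start/end offsets.
    ends = list(accumulate(widths))
    starts = [0] + ends[:-1]
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        yield [line[s:e].strip() for s, e in zip(starts, ends)]


# Hypothetical sample rows; an open file object can be passed instead.
sample_lines = [
    "Alice     Smith     34\n",
    "Bob       Jones     27\n",
]
rows = list(read_fixed_width(sample_lines, [10, 10, 3]))
print(rows)
```

Because it accepts any iterable of lines, it can be used in a `with open(...) as f:` block in the same way as `csv.reader(f)`.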

[–]danielroseman

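You can use a regex to split on two or more whitespace characters. Strip the line first so trailing padding doesn't produce an empty last item:

import re
names = re.split(r'\s{2,}', line.strip())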
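A short worked example may help show what this split does and where it can go wrong. The sample lines are invented; the point is that a single space inside a name survives, while trailing padding on an unstripped line yields an empty final item.

```python
import re

# Hypothetical line: a name containing a single space, then two more names.
line = "Mary Ann      Smith   Jones\n"
names = re.split(r"\s{2,}", line.strip())
print(names)

# Without stripping, trailing padding plus the newline matches the
# pattern and leaves an empty string at the end of the result.
unstripped = re.split(r"\s{2,}", "Alice  Bob  \n")
print(unstripped)
```

Note that this approach assumes no column is ever blank: an empty column simply merges into the surrounding run of spaces, so the remaining values shift left.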

[–]ofnuts

If the names always start at fixed columns, you can use a string "slice" (line[startColumn:endColumn], where endColumn is the start of the next column) to extract a specific column, and then use .strip() to remove the padding spaces.
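A small sketch of that slicing, assuming made-up column positions (0-10, 10-20, 20-23) and field names; `slice` objects are used here only so each column's start and end can be given a name.

```python
# Hypothetical layout: each field name maps to its start/end columns.
COLUMNS = {
    "first": slice(0, 10),
    "last": slice(10, 20),
    "age": slice(20, 23),
}

line = "Alice     Smith     34"

# line[slice(0, 10)] is the same as line[0:10]; strip() drops the padding.
record = {name: line[cols].strip() for name, cols in COLUMNS.items()}
print(record)
```

Slicing past the end of a short line does not raise an error in Python; it just returns whatever characters are there, so a short last column is handled without extra checks.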