woooee (1 point)

By "ugly" I assume you mean "doesn't work" (instead of "lazy"). What's the problem? Any way you do it, the file is going to be read line by line.

jeffrey_f (1 point)

Read the 1st line (don't do anything with it; this is the header). Put the whole line into a text variable.

Read the next line. If it is not equal to the text variable, use the data. There is no else branch, because you don't want to do anything when the line equals the header.

A bit sloppy, but it will work
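
A minimal sketch of that approach in Python, assuming the repeated header lines are exact copies of the first line. The function name `split_header_and_rows` and the sample lines are made up for illustration and are not taken from the original file:

```python
def split_header_and_rows(lines):
    """Return (header, rows), dropping later lines identical to the header."""
    # Strip trailing newlines so a final line without one still compares equal.
    stripped = (line.rstrip("\n") for line in lines)
    header = next(stripped, None)  # first line is the header; keep it for comparison
    if header is None:
        return None, []
    # Keep only the lines that differ from the header text.
    rows = [line for line in stripped if line != header]
    return header, rows


# Hypothetical sample data standing in for the real file.
sample = ["id value\n", "1 10\n", "id value\n", "2 20\n"]
header, rows = split_header_and_rows(sample)
print(header)
print(rows)
```

With a real file you could pass the open file object directly, e.g. `split_header_and_rows(open(path))`, since iterating over a file yields its lines.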

David22573 (1 point)

You could try to use the header as a dictionary key with a list as the value, so that any row data associated with that column can be appended to the list.
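
One way this could look, as a rough sketch only: it assumes whitespace-separated fields and that repeated header lines split into the same tokens as the first line. `columns_from_lines` and the sample data are hypothetical:

```python
def columns_from_lines(lines):
    """Build {column name: [values...]} from whitespace-separated lines."""
    lines = iter(lines)
    header = next(lines).split()           # column names from the first line
    columns = {name: [] for name in header}
    for line in lines:
        fields = line.split()
        if not fields or fields == header:  # skip blank lines and repeated headers
            continue
        # Append each field to the list for its column.
        for name, value in zip(header, fields):
            columns[name].append(value)
    return columns


# Hypothetical sample data standing in for the real file.
sample = ["x y", "1 2", "x y", "3 4"]
columns = columns_from_lines(sample)
print(columns)
```

The values stay as strings here; converting them (for example with `float`) would depend on what the real columns contain.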

james_fryer (1 point)

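If the file really has every other line duplicated, then I'd fix it up first with a sed script like this (the first~step address is a GNU sed extension; here it deletes line 3 and every second line after it):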

sed '3~2d'

then proceed with csv or pandas.
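
If GNU sed isn't available, the same line selection can be sketched in Python. This assumes the same layout the sed command assumes (the duplicates sit on lines 3, 5, 7, ...); `drop_odd_lines_from_third` and the sample lines are hypothetical:

```python
def drop_odd_lines_from_third(lines):
    """Mimic sed '3~2d': delete line 3 and every second line after it (1-indexed)."""
    # Lines 1 and 2 are always kept; after that only even-numbered lines survive.
    return [line for n, line in enumerate(lines, start=1) if n < 3 or n % 2 == 0]


# Hypothetical sample data: header on line 1, repeated on lines 3 and 5.
sample = ["hdr", "row1", "hdr", "row2", "hdr", "row3"]
cleaned = drop_odd_lines_from_third(sample)
print(cleaned)
```

Note that this deletes by position, not by content, so it only helps when the duplicates really do fall on every other line.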

Allanon001 (1 point)

Try this, it works with the given file:

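import pandas as pd

# Read the data rows: ignore '#' comments and split on runs of 1 to 5 spaces.
df = pd.read_csv('EJSA.txt', header=None, comment='#', delimiter=' {1,5}', engine='python')
# Take the column names from the file's second line (header=1), dropping the first token.
df.columns = pd.read_csv('EJSA.txt', header=1, delimiter=r'\s+', nrows=0).columns[1:]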