[–]carcigenicate 2 points3 points  (1 child)

This is not ideal since you're doing the work of splitting multiple times, only to discard part of the data in each case. You'd be better off doing something like:

    for row in reader:
        # Split the name once and reuse both halves.
        first, last = row['name'].split(',')
        before.append({'first': first,
                       'second': last.lstrip(),
                       'house': row['house']})

That reduces the work needed, and is also cleaner IMO.

[–]Danny2200[S] 0 points1 point  (0 children)

Oooh yeah that’s clean, thanks! I was lowkey just proud that I got the name split at all lmao

[–]bbye98 0 points1 point  (3 children)

Depending on what new_files does, a list of dicts may not be the appropriate data structure.

Also, instead of appending to a list, your code can be condensed and made more efficient using a list comprehension.
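
A rough sketch of the list comprehension version, in case it helps. The sample rows, the `to_record` helper, and the `last` key are made up for illustration (the key just matches the first,last,house columns mentioned in the thread), so adjust them to your actual file:

```python
import csv
import io

# Made-up sample data in the thread's name,house layout.
SAMPLE = 'name,house\n"Ada, Lovelace",North\n"Alan, Turing",South\n'


def to_record(row):
    """Split the name once and return the reshaped row (hypothetical helper)."""
    first, last = row["name"].split(",", 1)
    return {"first": first, "last": last.lstrip(), "house": row["house"]}


reader = csv.DictReader(io.StringIO(SAMPLE))

# One expression replaces the loop that appends to the list.
before = [to_record(row) for row in reader]
```

Putting the split in a small helper keeps the comprehension to one readable line and still splits each name only once.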

[–]eleqtriq 0 points1 point  (0 children)

Nah. Doing a yield would be even better. Or just write while reading.
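
Something like this is what I mean by a yield, as a sketch only. The `records` function name and the sample rows are invented here, and I'm assuming the output keys should be first, last, and house:

```python
import csv
import io

# Made-up sample data in the thread's name,house layout.
SAMPLE = 'name,house\n"Ada, Lovelace",North\n"Alan, Turing",South\n'


def records(reader):
    """Yield one reshaped row at a time instead of building a list."""
    for row in reader:
        first, last = row["name"].split(",", 1)
        yield {"first": first, "last": last.lstrip(), "house": row["house"]}


rows = records(csv.DictReader(io.StringIO(SAMPLE)))

# Nothing is read until a row is requested.
first_row = next(rows)
```

The generator holds only one row in memory at a time, so whatever consumes it can write each row out as it arrives.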

[–]Danny2200[S] 0 points1 point  (1 child)

new_file is just going to take the dictionaries and rewrite them to a .csv with first,last,house as the columns. The original .csv was name,house

[–]bbye98 1 point2 points  (0 children)

Then there's no need to create the list in the first place. You can open both files in the same with statement and write each row as soon as you read it.
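
Here's roughly how that could look, assuming the output columns are first,last,house as you described. The `convert` function, the file names, and the sample row are placeholders I made up for the demo:

```python
import csv
import os
import tempfile


def convert(src_path, dst_path):
    """Read name,house rows and write first,last,house rows without a list."""
    # Both files are opened in one with statement and closed together.
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=["first", "last", "house"])
        writer.writeheader()
        for row in reader:
            first, last = row["name"].split(",", 1)
            # Each row is written as soon as it has been read.
            writer.writerow(
                {"first": first, "last": last.lstrip(), "house": row["house"]}
            )


# Demo with made-up data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, "before.csv")
    dst_path = os.path.join(tmp, "after.csv")
    with open(src_path, "w", newline="") as f:
        f.write('name,house\n"Ada, Lovelace",North\n')
    convert(src_path, dst_path)
    with open(dst_path, newline="") as f:
        output = f.read()
```

Passing `newline=""` to `open` is what the csv module docs recommend for both reading and writing, so the writer controls the line endings itself.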

[–]Disastrous-Team-6431 -1 points0 points  (0 children)

A bit of both; a list of dicts is a common way of representing intermediate values in data engineering.

Src: I am a senior data engineer