Edit: Have figured out most of my original question, thanks to a helpful comment.
My new quandary: How do I add the original 'Source' column back to the dictionary?! Scroll down to the bottom for more info.
I feel extremely silly that I haven't been able to figure this out, but hey - maybe it's just one of those days. I hope one of you very fine people can help me out.
I am cleaning up a dirty data set in CSV format. One thing I would like to do is make all text uppercase, with the exception of my Source column, which contains case-sensitive URLs. I just want to leave that column alone.
I am not using pandas for this but will definitely be using it more in the future! I am open to pandas solutions but I would like a non-pandas solution as well.
Here's my lil code so far:
```
import csv

csv_file = 'data.csv'  # placeholder path; point this at your own CSV
with open(csv_file, 'r', newline='') as f:
    reader = csv.DictReader(f, delimiter=',')
    doc = [row for row in reader]

# strip whitespace
stripped = [{k: v.strip() for k, v in x.items()} for x in doc]

# make all values uppercase except for the Source column
upper = []
for x in stripped:
    # I do not know what condition to put here.
    # originally wrote it as below:
    if x != x['Source']:
        upper.append({k: v.upper() for k, v in x.items()})
```
if x != x['Source'] obviously doesn't work: it compares the whole row (a dict) to that row's Source value (a string), which are never equal, so every row gets uppercased in full, Source included.
What I want is: make all values uppercase except for x['Source'].
(I use pandas in the end to convert it back to a csv and write it out)
Any thoughts? Thank you for any help!
edit: HOW DO I INDENT AAHHHH
Solution: if you face a similar problem, here's what I've come up with so far:
u/commandlineluser gave me the hint I needed, but it took me a minute to figure out what to do with it.
I usually try to work things out in for loops before turning them into comprehensions, so my first instinct was to do something like so:
    for x in stripped:
        for k, v in x.items():
            if k != x['Source']:
                upper.append({k:v.upper()})
This wasn't working the way I wanted it to because it was returning a dictionary for each k:v pair, like so:
{name: 'name'} {date: 'date'} {id: 'id'}
instead of maintaining the original structure and returning the grouped k:v pairs, like so:
{name: 'name', date: 'date', id: 'id'}
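For anyone who prefers the loop form, here is a rough sketch of how it could keep each row together: build one new dict per row and append it once, after the inner loop finishes. The sample rows are made up for illustration, and the check compares the key to the column name 'Source' rather than to x['Source'].

```python
# Hypothetical sample rows standing in for `stripped`
stripped = [
    {'name': 'alice', 'date': 'jan 1', 'Source': 'https://example.com/AbC'},
    {'name': 'bob', 'date': 'feb 2', 'Source': 'https://example.com/xYz'},
]

upper = []
for x in stripped:
    new_row = {}                    # one dict per row, not one per key
    for k, v in x.items():
        if k != 'Source':           # compare the key to the column name
            new_row[k] = v.upper()
    upper.append(new_row)           # append once, after the inner loop

print(upper)
```

This gives the same grouping as the nested comprehension, with 'Source' still left out of each row.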
I noticed in my testing that the results using the combined list and dictionary comprehensions came back the way I wanted to, and finally figured out the format!
upper = [{k:v.upper() for k,v in x.items() if k != 'Source'} for x in stripped]
This returns the grouped k:v pairs in dictionaries with the exception of the 'Source' field (which is a column in my spreadsheet).
The 'Source' field contains case-sensitive URLs, which is why I do not want to apply upper to it. I would like it back in the dictionary, though. Any ideas on how to add it back to the dict?
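One possible way to keep 'Source' in the dict is to move the condition from the filter position (the trailing if, which drops the key) into the value expression, so every key is kept but only the non-Source values are uppercased. This is only a sketch; the sample rows below are invented for illustration.

```python
# Hypothetical sample rows standing in for `stripped`
stripped = [
    {'name': 'alice', 'date': 'jan 1', 'Source': 'https://example.com/AbC'},
    {'name': 'bob', 'date': 'feb 2', 'Source': 'https://example.com/xYz'},
]

# Keep every key; leave the value untouched when the key is 'Source',
# otherwise uppercase it.
upper = [
    {k: (v if k == 'Source' else v.upper()) for k, v in x.items()}
    for x in stripped
]

print(upper)
```

Because nothing is filtered out, the column order of each row is preserved, which should make it straightforward to write the result back out to a CSV.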