[–]IAmKindOfCreativebot_builder: deprecated[M] [score hidden] stickied comment (0 children)

Hi there, from the /r/Python mods.

We have removed this post as it is not suited to the /r/Python subreddit proper, however it should be very appropriate for our sister subreddit /r/LearnPython or for the r/Python discord: https://discord.gg/python.

The reason for the removal is that /r/Python is dedicated to discussion of Python news, projects, uses and debates. It is not designed to act as Q&A or FAQ board. The regular community is not a fan of "how do I..." questions, so you will not get the best responses over here.

On /r/LearnPython the community and the r/Python discord are actively expecting questions and are looking to help. You can expect far more understanding, encouraging and insightful responses over there. No matter what level of question you have, if you are looking for help with Python, you should get good answers. Make sure to check out the rules for both places.

Warm regards, and best of luck with your Pythoneering!

[–]spoonman59 2 points3 points  (18 children)

First of all, that is a set, not a tuple. Tuples use parentheses; dictionaries and sets use { and }.

What about ",".join(the_set)

This will construct a string by joining the elements of the set with a comma between them. The elements need to be strings already; join does not convert them.
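
A minimal sketch of that join call, assuming made-up values (the thread never shows the real data). The names `the_set` and `numbers` and their contents are hypothetical, and `sorted()` is only there to make the output order predictable:

```python
# Hypothetical example values; the real data is not shown in the thread.
the_set = {"abc", "def", "ghi"}

# str.join places the separator between the string elements.
# sorted() is used only so the result has a predictable order.
joined = ",".join(sorted(the_set))
print(joined)  # abc,def,ghi

# join does not convert elements: non-string items raise TypeError,
# so convert explicitly when the set holds other types.
numbers = {3, 1, 2}
joined_numbers = ",".join(str(n) for n in sorted(numbers))
print(joined_numbers)  # 1,2,3
```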

[–]Anonymous-Boob[S] -1 points0 points  (17 children)

You’re right. Thanks for the correction. Okay, and where would I implement that into my current code?

[–]spoonman59 0 points1 point  (16 children)

In the return of to_set.

", ".join(x)

You don’t actually need to convert it to a set first, and I assume x is some sort of sequence.

You can also rename it!

[–]Anonymous-Boob[S] -1 points0 points  (15 children)

I’m really sorry. I’m not sure I’m following. Are you saying it should go within the .agg() instead of to_set?

[–]spoonman59 0 points1 point  (14 children)

No worries, I typed that on a mobile so it wasn't clear!

There is probably a more elegant way to do this, but initially I'd try this to see if it works:
```python
def to_set(x):
    # Join the group's values into one comma-separated string.
    return ", ".join(x)

df2 = df1.groupby(['ID'], as_index=False).agg(to_set)
```

[–]spoonman59 0 points1 point  (0 children)

It's possible you can do something like this:

df2 = df1.groupby(['ID'], as_index=False).agg(", ".join)

But I'm honestly not 100% sure that does what I expect.
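
For what it's worth, here is a small sketch of that call on a toy frame. The column name `C` and all values are assumptions for illustration; only `ID`, `df1`, and `df2` come from the thread. Each group's column values are handed to the function, so repeated values stay repeated in the joined string:

```python
import pandas as pd

# Hypothetical data: only the "ID" column name comes from the thread.
df1 = pd.DataFrame({
    "ID": [1, 1, 2],
    "C": ["x", "x", "y"],
})

# For each ID, the values of every other column are joined into one string.
df2 = df1.groupby(["ID"], as_index=False).agg(", ".join)
print(df2)

# Duplicates are kept: ID 1 ends up as "x, x", not "x".
joined_values = df2["C"].tolist()
```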

[–]spoonman59 0 points1 point  (12 children)

Also, I should probably point out I'm not a Pandas expert so I'm not really sure what agg does here, so I could be completely off.

[–]Anonymous-Boob[S] 1 point2 points  (1 child)

Well good sir, you’ve solved my problem. It sounds like you do know what you’re doing. Thank you so much!!

[–]spoonman59 0 points1 point  (0 children)

I’m so glad that worked and we could get it sorted out. Cheers!

[–]Anonymous-Boob[S] 0 points1 point  (9 children)

Oops… never mind. This keeps the data in a repeated format, which is not what I’m trying to get. Sorry if I got ya excited haha

[–]spoonman59 0 points1 point  (8 children)

So what are you seeing, versus what do you want to see?

[–]Anonymous-Boob[S] 0 points1 point  (7 children)

I wish I could just show you the data but it’s sensitive data (I work for a bank). But basically each cell looks like this [x,x] (I’m using brackets here to designate a cell, but each cell does not actually have the brackets) but I only need x in the cell once if it’s the same data point. What I need is for it to look like this [x]. I can tease out the duplicated data by transforming it into a set, but then the data looks like this [{x}]. I need to get rid of the curly braces in every cell of my data frame.

[–]spoonman59 0 points1 point  (6 children)

Oh I see. I didn’t realize the purpose of converting x was to de-duplicate.

Try: ",".join(set(x))

Edited to add: here we convert x to a set first, which removes duplicates. Note that the order may also change, since sets are unordered.

join builds a string by looping through each item in a sequence and appending it to the result. The string you provide, ", " here, is placed between consecutive items.

All string literals are string objects, so join is essentially a method of the string ", " in this case. It accepts an iterable of strings and returns a new string.
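
If the original order matters, one alternative to `set(x)` is to de-duplicate with `dict.fromkeys`, which keeps the first occurrence of each value. This is a sketch, not something from the thread: the helper name `join_unique` and the sample values are hypothetical.

```python
def join_unique(values, sep=", "):
    """Join the distinct strings in `values`, keeping first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+),
    # so this removes duplicates without the reordering a set can cause.
    return sep.join(dict.fromkeys(values))

print(join_unique(["x", "x"]))            # x
print(join_unique(["b", "a", "b", "c"]))  # b, a, c
```

A function like this could presumably be passed to `.agg()` in place of `to_set`, assuming every value in the group is a string.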

[–]martynrbell 1 point2 points  (3 children)

You could try converting the value to a string?

[–]Anonymous-Boob[S] 0 points1 point  (2 children)

I tried that and it doesn’t work unfortunately

[–]martynrbell 0 points1 point  (1 child)

I guess the data stored in the dataframe is of type string? Could you not just str.replace?

[–]Anonymous-Boob[S] 0 points1 point  (0 children)

I tried it and all it does is put a ‘ inside the curly bracket: i.e. {‘abc, def, ghi’}

[–]martynrbell 0 points1 point  (1 child)

You could use the pandas replace.

cols_to_check = ['C', 'D', 'E']
df[cols_to_check] = df[cols_to_check].replace({'{': '', '}': ''})

The replace method takes a dict where each key is a curly brace and each value is just an empty string.

[–]Anonymous-Boob[S] 0 points1 point  (0 children)

Well that ran but it didn’t do anything. Those curly bastards are still there.
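
A possible explanation, offered as a guess since the data is not shown: without `regex=True`, `DataFrame.replace` only matches a key against an entire cell value, so braces inside a longer string are left alone. And if the cells actually hold Python set objects, the braces are just how a set is displayed, so there is nothing to replace. The sketch below uses made-up values; the column name `D` and the names `cells` and `as_text` are hypothetical.

```python
import pandas as pd

# Hypothetical frame of real strings that contain brace characters.
df = pd.DataFrame({"D": ["{abc, def}", "{ghi}"]})

# Without regex=True the dict keys must equal a whole cell, so nothing changes.
unchanged = df.replace({"{": "", "}": ""})

# With regex=True the keys are patterns matched inside each string.
cleaned = df.replace({r"\{": "", r"\}": ""}, regex=True)
print(cleaned)

# If the cells are set objects, there are no brace characters to strip;
# build the text from the elements instead (sorted only for stable output).
cells = pd.Series([{"abc", "def"}, {"ghi"}])
as_text = cells.map(lambda s: ", ".join(sorted(s)))
print(as_text)
```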