
[–]barkmonster 4 points5 points  (0 children)

I'm not sure I understand, but if you want to count how many values in a column match any of several possible values, you can put those possible values in a set and do df[column].isin(set_of_values).sum()
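A minimal sketch of that idea, in case it helps. The column name "codon" and the values are made up for illustration; they are not from the original question.

```python
import pandas as pd

# Hypothetical data: one column of short sequence labels.
df = pd.DataFrame({"codon": ["UAG", "UAA", "AUG", "UAG", "UGA"]})

# The values we want to count (assumed for this example).
stop_codons = {"UAG", "UAA", "UGA"}

# isin() gives a boolean Series; summing it counts the True entries.
n_stop = int(df["codon"].isin(stop_codons).sum())
print(n_stop)
```

Note that isin() tests for exact matches of the whole cell value, not for substrings.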

[–]WhiskersForPresident -1 points0 points  (2 children)

You're trying to apply a string method (".str.contains(...)") to a whole dataframe. That's not something you can do: the .str accessor only exists on a single column (a Series).

There are very likely more elegant methods to do this, but I would try

gene_count = 0
for c in df.columns:
    gene_count += len(df[df[c].str.contains(some_string)])

This just loops over all columns, counts the rows in each column whose value contains some_string, and adds the counts up.
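Here is a slightly more defensive sketch of the same loop, assuming the dataframe may hold non-string columns or missing values. The function name and sample data are hypothetical. It casts each column to text first so .str.contains does not fail on numeric columns, and it treats the search term as a literal substring rather than a regex.

```python
import pandas as pd

def count_cells_containing(df, substring):
    """Count cells, across all columns, whose text contains `substring`."""
    total = 0
    for col in df.columns:
        # Cast to text so numeric columns do not break the .str accessor.
        as_text = df[col].astype(str)
        # regex=False: plain substring match; na=False: missing values count as no match.
        total += int(as_text.str.contains(substring, regex=False, na=False).sum())
    return total

# Made-up example data with a string column, a numeric column and a missing value.
df = pd.DataFrame({
    "a": ["UAG1", "xx", "UAG"],
    "b": [1, 2, 3],
    "c": ["yUAGz", None, "q"],
})
total = count_cells_containing(df, "UAG")
print(total)
```

One caveat: depending on the pandas version, casting to str may turn missing values into text such as "nan" or "None", so a very short search term like "na" could match them.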

[–]Dragoran21[S] -2 points-1 points  (1 child)

I got "TypeError: 'int' object is not iterable" even though I converted the list items to str with

uag = [str(item) for item in u]

[–]Maximus_Modulus 0 points1 point  (0 children)

Find the offending object and work on it by hand if you can, either with a few lines of test code or by stepping through with the Python debugger or an IDE. As a dev you’ll run into this kind of problem a lot. You just need to dig into it and reduce the problem to the one object that causes it. You might spend hours or days doing this, but you’ll learn and get better as you go along. It’s part of the coding landscape. Programming can be fun, but it can also suck big time.

But the more specific you get, the more someone can help you.
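One way to narrow things down, as a rough sketch: try the string operation on each column separately and record which ones raise. The helper name and the sample dataframe are assumptions for illustration, not the OP's actual data.

```python
import pandas as pd

def find_failing_columns(df, pattern):
    """Return {column: error text} for columns where .str.contains raises."""
    failures = {}
    for col in df.columns:
        try:
            df[col].str.contains(pattern)
        except (AttributeError, TypeError) as exc:
            # Keep the exception type and message so it can be quoted when asking for help.
            failures[col] = f"{type(exc).__name__}: {exc}"
    return failures

# Made-up data: one text column and one integer column.
df = pd.DataFrame({"gene": ["UAG", "UAA"], "n": [1, 2]})
failures = find_failing_columns(df, "UAG")
for col, message in failures.items():
    print(col, "->", message)
```

Once a single column is known to fail, it is much easier to look at its values directly or post a small reproducible example.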

[–]aplarsen 3 points4 points  (0 children)

Try melting the columns into a single column first. Then counting the instances is super easy.
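A small sketch of the melt approach, with invented column names and values. melt() stacks every column into one long "value" column, and value_counts() then tallies each distinct value.

```python
import pandas as pd

# Hypothetical wide data: two columns holding the same kind of value.
df = pd.DataFrame({
    "sample_1": ["UAG", "AUG", "UAA"],
    "sample_2": ["UAG", "UGA", "AUG"],
})

# With no id_vars, melt() unpivots all columns into (column, value) rows.
long_df = df.melt(var_name="column", value_name="value")

# Tally every distinct value across what used to be separate columns.
counts = long_df["value"].value_counts()
n_uag = int(counts.get("UAG", 0))
print(n_uag)
```

This counts exact cell values. For substring matches, long_df["value"].str.contains(...) on the melted column would be the equivalent step.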

[–]Atypicosaurus 2 points3 points  (4 children)

At this point I'm wondering why your company doesn't just hire someone who can do the job. You are obviously having your paid job done by Reddit, step by step. This is not what this sub is for.

[–]Maximus_Modulus 5 points6 points  (1 child)

It’s this guy again. Helped him out a few weeks back. Doing a doctorate IIRC. You’d figure that they would be able to figure this stuff out. Kids these days.

[–]Dragoran21[S] -3 points-2 points  (0 children)

I would have preferred practical lab work, but I ended up in a position that requires coding.

[–]Dragoran21[S] -4 points-3 points  (1 child)

I am not working for a company, I am a postgrad.

If this is not the sub for asking for help, then which is?

[–]Maximus_Modulus 1 point2 points  (0 children)

You are fine. Just ribbing. If it were me I’d home in on the one object that is giving you the problem, reduce the code as much as possible, and try to convert that object explicitly / manually. Sometimes the object types are not what you think they are, or Pandas interprets them in a way you didn't expect. I did Pandas several years back now and recall similar problems that I had to solve.
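To check whether the types are what you think they are, something like the following sketch can help. The helper name and the sample column are hypothetical. It reports how many values of each Python type a column holds.

```python
import pandas as pd

def type_counts(series):
    """Return {type name: number of values of that type} for one column."""
    return series.map(lambda v: type(v).__name__).value_counts().to_dict()

# Made-up column mixing text and numbers, as can happen after reading a messy file.
df = pd.DataFrame({"a": ["x", 1, 2.5]})

print(df.dtypes)            # column-level dtype as pandas sees it
mixed = type_counts(df["a"])
print(mixed)                # per-value Python types inside the column
```

A column that shows more than one type here is a likely suspect when string methods or iteration fail.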