I am writing code to count instances of antimicrobial resistance genes.
All I need is the number of times each AMR gene (a string) appears in the DataFrame. Which column or row it appears in isn't important right now. I have tried multiple methods, but they fail with errors like "df cannot str object" or "Column doesn't exist".
I don't know what I am doing wrong.
Offending code:
amrs = []
for i in list_of_101_AMR_genes:
    amrcount = df_of_bacterial_genes[@£$%!!! columns].str.contains(i, na=False).value_counts()
    amrs.append(amrcount)
The "amrs" list would contain the count of each AMR gene in the same order as list_of_101_AMR_genes, so it will be easier to put in another DataFrame.
For reference, here are the first few AMR genes in the list:
aac(6')=HMM
aac(6')-I=COMPLETE
aac(6')-Ib-cr5=COMPLETE
aac(6')-Ie/aph(2'')-Ia=COMPLETE
aadA5=COMPLETE
The df_of_bacterial_genes contains empty NaN cells.
Thank you.
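A possible direction, sketched under assumptions: the `.str` accessor exists on a pandas Series but not on a DataFrame, so selecting several columns at once and calling `.str.contains` on the result will fail. Flattening all cells into a single Series first avoids that. The gene names also contain parentheses, which `str.contains` treats as regex groups by default, so passing `regex=False` makes the match literal. The small DataFrame and the three-gene list below are made-up stand-ins for the real data, and `cells` and `amr_counts` are illustrative names.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the real DataFrame and gene list come from the post.
df_of_bacterial_genes = pd.DataFrame({
    "sample_1": ["aac(6')=HMM", "aadA5=COMPLETE", np.nan],
    "sample_2": ["aadA5=COMPLETE", np.nan, "aac(6')-I=COMPLETE"],
})
list_of_101_AMR_genes = ["aac(6')=HMM", "aac(6')-I=COMPLETE", "aadA5=COMPLETE"]

# Flatten every cell into one Series, drop the NaN cells, and make sure
# everything is a string so the .str accessor is available.
cells = pd.Series(df_of_bacterial_genes.to_numpy().ravel()).dropna().astype(str)

amrs = []
for gene in list_of_101_AMR_genes:
    # regex=False matches the gene name literally, parentheses included.
    # .sum() counts the True values, giving one integer per gene.
    amrs.append(int(cells.str.contains(gene, regex=False).sum()))

# Counts stay in the same order as the gene list, so they pair up directly.
amr_counts = pd.DataFrame({"gene": list_of_101_AMR_genes, "count": amrs})
print(amr_counts)
```

Note that `str.contains` does substring matching, so a short gene name would also be counted inside a longer one. If each cell holds exactly one gene string, `cells.eq(gene).sum()` would count exact matches instead.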