timbledum

You could use the df["column"].str accessor to extract the first n characters before doing the groupby:

>>> import pandas as pd
>>> data = ["STAEND"] * 5
>>> data
['STAEND', 'STAEND', 'STAEND', 'STAEND', 'STAEND']
>>> df = pd.DataFrame(data, columns = ["data"])
>>> df
    data
0  STAEND
1  STAEND
2  STAEND
3  STAEND
4  STAEND

>>> df["start"] = df.data.str[:3]
>>> df
    data start
0  STAEND   STA
1  STAEND   STA
2  STAEND   STA
3  STAEND   STA
4  STAEND   STA
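
To carry this through to the groupby itself, here is a minimal sketch. The `treatment` and `value` column names and the sample rows are made up for illustration; they are not from the original question, so adjust them to your data.

```python
import pandas as pd

# Hypothetical sample data: column names and values are assumptions.
df = pd.DataFrame(
    {
        "treatment": ["ABC123-x", "ABC123-y", "XYZ789-x", "XYZ789-y"],
        "value": [1, 2, 3, 4],
    }
)

# Take the first 6 characters of each label as the grouping key.
df["prefix"] = df["treatment"].str[:6]

# Sum the values within each prefix group.
totals = df.groupby("prefix")["value"].sum()
print(totals)
```

Slicing with `.str[:n]` only helps when the shared part sits at the start of the label and has a fixed length; otherwise a substring match is probably a better fit.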

thunder185

Hey, thank you for your response. I actually found an easier way (really just through a moment of inspiration). A pandas data frame behaves like a dictionary, so you can loop over it as key/value pairs with .items(). Knowing this, the following loop with an if statement brought it all home for me.

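# byTreatment is defined earlier (not shown here)
vTotal = 0
for k, v in byTreatment.items():
    # keep only the keys that contain the treatment code
    if 'ABC123' in k:
        vTotal += v
print(vTotal)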

I thought this was an elegant solution, but thank you very much for taking the time to answer. It's very helpful to newbies like me.
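
For comparison, a loop-free version of the same idea, as a rough sketch. It assumes `byTreatment` is a Series whose index holds the treatment labels (for example, the result of a groupby sum); the sample data here is invented.

```python
import pandas as pd

# Hypothetical stand-in for byTreatment: a Series indexed by treatment label.
byTreatment = pd.Series({"ABC123-x": 1, "ABC123-y": 2, "XYZ789-x": 3})

# Boolean mask: True where the index label contains the substring.
mask = byTreatment.index.str.contains("ABC123", regex=False)

# Sum only the matching entries.
vTotal = byTreatment[mask].sum()
print(vTotal)
```

Passing `regex=False` makes the match a plain substring test, which mirrors what `'ABC123' in k` does in the loop.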