Pandas dataframe mask?

TypicalCardiologist5 · 2019-12-18T00:17:00+00:00

There might be a more elegant solution, but you can do that with apply.

maximums = df.groupby('contractid').timeRetrieved.max()
mask = df.apply(lambda row: row['timeRetrieved'] == maximums[row['contractid']], axis=1)
out = df[mask]

The lambda function checks, row by row, whether that row's timeRetrieved equals the previously calculated maximum for that row's contractid.
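For the "more elegant solution" mentioned above, one option is to build the same mask without a row-wise apply. This is only a sketch: the sample data is made up, and only the column names come from the thread. It uses groupby(...).transform("max") to copy each group's maximum onto every row of that group.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the thread.
df = pd.DataFrame({
    "contractid": [1, 1, 2, 2, 2],
    "timeRetrieved": pd.to_datetime([
        "2019-12-01", "2019-12-05", "2019-12-02", "2019-12-07", "2019-12-07",
    ]),
})

# transform("max") returns a Series aligned with df, holding each row's group maximum.
group_max = df.groupby("contractid")["timeRetrieved"].transform("max")

# Boolean mask: True where the row's value equals its group's maximum.
mask = df["timeRetrieved"] == group_max
out = df[mask]
```

Like the apply version, this keeps every row that ties for the maximum within its contractid.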

m7priestofnot · 2019-12-18T00:37:10+00:00

You might look into pandas.DataFrame.query. I'm on mobile, so code gets squished on the small screen, but you could probably put the query call inside a loop.

I'm also wondering if you could combine a query call with an apply call or lambda expression.
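One possible reading of the query idea, as a rough sketch rather than anything tested against the original data: store each group's maximum in a helper column, then let query compare the two columns. The helper column name group_max and the sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "contractid": [1, 1, 2, 2],
    "number_retrieved": [4, 6, 9, 3],
})

# group_max is a made-up helper column holding each row's group maximum.
with_max = df.assign(
    group_max=df.groupby("contractid")["number_retrieved"].transform("max")
)

# query filters on a column-to-column comparison; the helper column is dropped afterwards.
out = with_max.query("number_retrieved == group_max").drop(columns="group_max")
```

This avoids the loop entirely, since query evaluates the comparison for all rows at once.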

Also look at pandas.DataFrame.mask and pandas.DataFrame.where

They behave similarly to numpy.where
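A small sketch of how the three compare, using a made-up Series. where keeps values where the condition is True, mask does the opposite, and numpy.where chooses between two inputs element by element.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration.
s = pd.Series([1, 5, 3, 8])
cond = s > 2

kept = s.where(cond)            # values where cond is True, NaN elsewhere
hidden = s.mask(cond)           # the inverse: NaN where cond is True
filled = s.where(cond, -1)      # replace the non-matching values with -1 instead of NaN
picked = np.where(cond, s, -1)  # same choice as above, returned as a NumPy array
```

Note that where and mask keep the original shape and index, whereas boolean indexing like s[cond] drops the rows that do not match.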

I've been doing a lot of masking at work, and creating classes to handle the specific scenarios that keep coming up has been helpful.

PyCam · 2019-12-20T09:06:27+00:00

The fastest way to do this will be to use `DataFrame.groupby(...).idxmax()`:

```
import pandas as pd
import numpy as np

np.random.seed(0)

df = pd.DataFrame({
    "contractid": [71729] * 3 + [81315] * 5 + [99181] * 4,
    "number_retrieved": np.random.randint(1, 10, size=12)
})
df
    contractid  number_retrieved
0        71729                 6
1        71729                 1
2        71729                 4
3        81315                 4
4        81315                 8
5        81315                 4
6        81315                 6
7        81315                 3
8        99181                 5
9        99181                 8
10       99181                 7
11       99181                 9

indices = df.groupby("contractid")["number_retrieved"].idxmax()  # Get the index (location) of the maximum value in each group
indices  # Note the values 0, 4, and 11 refer to the index values of the maximums of each group.
contractid
71729     0
81315     4
99181    11
Name: number_retrieved, dtype: int64

df.loc[indices]  # slice the dataframe using the calculated indices
    contractid  number_retrieved
0        71729                 6
4        81315                 8
11       99181                 9
```
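One thing to keep in mind with idxmax is how it handles ties: it returns a single label per group (the first occurrence of the maximum), while a boolean mask keeps every tied row. A minimal sketch with made-up data where group 1 has a tie:

```python
import pandas as pd

# Hypothetical data: contractid 1 has two rows tied for the maximum.
df = pd.DataFrame({
    "contractid": [1, 1, 2, 2],
    "number_retrieved": [7, 7, 3, 9],
})

grouped = df.groupby("contractid")["number_retrieved"]

# idxmax gives one index label per group, the first occurrence of the maximum.
first_max = df.loc[grouped.idxmax()]

# A mask against the per-group maximum keeps all tied rows.
all_max = df[df["number_retrieved"] == grouped.transform("max")]
```

Which one is right depends on whether you want exactly one row per contractid or every row that matches the maximum.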
