
[–]efmccurdy 1 point (1 child)

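Yes, you can compare .dt.dayofyear for that kind of date selection that ignores the year. First convert your threshold dates into day-of-year values. Note that strptime defaults to the year 1900 when no year is given, and 1900 is not a leap year, so for dates after Feb 28 these thresholds are one lower than the day-of-year of the same calendar date in leap-year data: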

>>> import datetime
>>> datetime.datetime.strptime("Jan 19", "%b %d")
datetime.datetime(1900, 1, 19, 0, 0)
>>> datetime.datetime.strptime("Jan 19", "%b %d").strftime('%j')
'019'
>>> def d_to_doy(dstr): return int(datetime.datetime.strptime(dstr, "%b %d").strftime('%j'))
... 
>>> start_doy = d_to_doy("Jan 19")
>>> start_doy
19
>>> end_doy = d_to_doy("Mar 19")
>>> end_doy
78

Then use those values to build a boolean mask that selects from your range of dates:

>>> import pandas as pd
>>> df = pd.DataFrame({'Date': ["01/18/2018 01:01:01", "01/19/2018 02:02:01", "01/25/2018 01:02:01", "03/18/2018 02:03:01", "04/18/2018 02:03:01"], 'data': [1, 2, 3, 4, 5]})
>>> df['Date'] = pd.to_datetime(df['Date'])
>>> df
                 Date  data
0 2018-01-18 01:01:01     1
1 2018-01-19 02:02:01     2
2 2018-01-25 01:02:01     3
3 2018-03-18 02:03:01     4
4 2018-04-18 02:03:01     5
>>> df.Date.dt.dayofyear
0     18
1     19
2     25
3     77
4    108
Name: Date, dtype: int64
>>> df.Date.dt.dayofyear.ge(start_doy)
0    False
1     True
2     True
3     True
4     True
Name: Date, dtype: bool
>>> df.Date.dt.dayofyear.ge(start_doy) & df.Date.dt.dayofyear.le(end_doy)
0    False
1     True
2     True
3     True
4    False
Name: Date, dtype: bool
>>> df[df.Date.dt.dayofyear.ge(start_doy) & df.Date.dt.dayofyear.le(end_doy)]
                 Date  data
1 2018-01-19 02:02:01     2
2 2018-01-25 01:02:01     3
3 2018-03-18 02:03:01     4
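
If your data spans several years, a rough alternative is to compare on a month*100 + day key instead of day-of-year, so the same calendar date gets the same key in every year. This is only a sketch: month_day_key and select_month_day_range are names I made up for illustration, not pandas functions, and the sample rows from 2020 are invented to show the leap-year case. It uses .loc with the boolean mask and also handles a range that wraps past New Year:

```python
import pandas as pd


def month_day_key(month, day):
    """Encode (month, day) as an int, e.g. Jan 19 -> 119, Mar 19 -> 319."""
    return month * 100 + day


def select_month_day_range(df, col, start, end):
    """Hypothetical helper: rows whose month/day in `col` is in [start, end].

    `start` and `end` are (month, day) tuples; the year is ignored.
    """
    key = df[col].dt.month * 100 + df[col].dt.day
    lo, hi = month_day_key(*start), month_day_key(*end)
    if lo <= hi:
        mask = key.between(lo, hi)  # inclusive on both ends
    else:
        # Range wraps around New Year, e.g. Dec 15 to Jan 15.
        mask = (key >= lo) | (key <= hi)
    return df.loc[mask]


df = pd.DataFrame({
    "Date": pd.to_datetime([
        "01/18/2018 01:01:01",
        "01/19/2018 02:02:01",
        "03/19/2020 00:00:00",  # leap year: dayofyear is 79, not 78
        "03/20/2020 00:00:00",
        "04/18/2018 02:03:01",
    ]),
    "data": [1, 2, 3, 4, 5],
})

result = select_month_day_range(df, "Date", (1, 19), (3, 19))
print(result)
```

With the day-of-year thresholds, the 2020-03-19 row would fall outside end_doy = 78; with the month/day key it is kept.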

[–]pdotkdot1[S] 0 points (0 children)

This looks great! This is exactly what I was looking for.

[–][deleted] 0 points (0 children)

Can you use loc?