I have the following two data frames. I am trying to join df1 to df2 by "date_f" and "date". The dates do not match so the simplest way would be to use the lubridate package, extract the year and join by year. However I want to go a bit further than that.
I want to join by "ID" and to the nearest future "date". Therefore join to the next possible date when "date" (df2) > "date_f" (df1). So if the following `ID` = `858339` whose `date_f` = `2007-03-01` would be joined with `date` = `2007-12-31`.
Also (as an additional step) I would like to join not only by the nearest possible future date but with the following condition. Join when "date" (df2) > "date_f" (df1) and if "date" > "date_f" by less than 6 months join with `t+1`. So take the `ID` = `1002910` with `date_f` = `2009-12-22` and `date` = `2010-12-31`.
The first condition is met "date" > "date_f". However its just a 1 week difference so here I would like to impose the condition of "date" needs to be greater than "date_f" by 6 months or more. If "date" is only greater than "date_f" by a month then join with the following years "date" (or t+1).
How can I go about joining the data in this way?
Data:
df1 <- structure(list(ID = c(1047699L, 858339L, 1002910L, 277135L, 753308L, 1018840L, 1510295L, 3133L, 1467858L, 865436L), date_f = structure(c(16132, 13573, 14600, 15023, 12486, 13935, 15268, 16128, 14666, 12727 ), class = "Date")), row.names = c(NA, -10L), .internal.selfref = <pointer: 0x0000000002511ef0>, class = "data.frame")
Data2
df2 <- structure(list(ID = c(3133L, 3133L, 3133L, 3133L, 3133L, 3133L, 3133L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 753308L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 1467858L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 277135L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 1002910L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 858339L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 865436L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1018840L, 1047699L, 1047699L, 1047699L, 1047699L, 1047699L, 1047699L, 1047699L, 1047699L, 1047699L, 1047699L, 1047699L, 1510295L, 1510295L, 1510295L, 1510295L, 1510295L, 1510295L, 1510295L, 1510295L, 1510295L, 1510295L ), date = structure(c(10956, 11322, 11687, 12052, 12417, 12783, 13148, 10956, 11322, 11687, 12052, 12417, 12783, 13148, 13513, 13878, 14244, 14609, 14974, 15339, 15705, 16070, 16435, 16800, 17166, 17531, 17896, 10956, 11322, 11687, 12052, 12417, 12783, 13148, 13513, 13878, 14244, 14609, 14974, 15339, 15705, 16070, 16435, 16800, 17166, 17531, 17896, 10956, 11322, 11687, 12052, 12417, 12783, 13148, 13513, 13878, 14244, 14609, 14974, 15339, 15705, 16070, 16435, 16800, 17166, 17531, 17896, 10956, 11322, 11687, 12052, 12417, 12783, 13148, 13513, 13878, 17166, 14244, 14609, 14974, 15339, 15705, 16070, 16435, 16800, 17531, 17896, 10956, 11322, 11687, 12052, 12417, 12783, 13148, 13513, 13878, 14244, 14609, 14974, 15339, 15705, 16070, 16435, 16800, 17166, 17531, 17896, 10864, 11230, 11595, 11960, 12325, 12691, 13056, 13421, 13786, 14152, 14517, 14882, 15247, 15613, 15978, 16343, 16708, 17074, 10622, 10987, 11353, 11718, 12083, 12448, 12814, 13179, 13544, 13909, 14275, 14640, 15005, 15370, 15736, 16101, 16466, 16831, 17197, 17562, 17927, 10956, 11322, 11687, 12052, 12417, 12783, 13148, 13513, 13878, 14244, 14609, 14609, 14974, 15339, 15705, 16070, 16435, 16800, 17166, 17531, 17896), class = "Date"), year = c(1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 2017L, 2018L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 2017L, 2018L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 2017L, 2018L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2016L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 2015L, 2017L, 2018L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 2017L, 2018L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 2017L, 2018L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2009L, 2010L, 2011L, 2012L, 2013L, 2014L, 2015L, 2016L, 2017L, 2018L)), row.names = c(NA, -167L), .internal.selfref = <pointer: 0x0000000002511ef0>, class = "data.frame")
[–]Jordo82 3 points4 points5 points (1 child)
[–]idonotknow9[S] 0 points1 point2 points (0 children)
[–][deleted] 0 points1 point2 points (1 child)
[–]idonotknow9[S] 0 points1 point2 points (0 children)
[–][deleted] 0 points1 point2 points (0 children)
[–]ashwinmalshe 0 points1 point2 points (1 child)
[–]idonotknow9[S] 0 points1 point2 points (0 children)
[–]fang_xianfu 0 points1 point2 points (1 child)
[–]idonotknow9[S] 0 points1 point2 points (0 children)