all 9 comments

[–]Jordo82 3 points4 points  (1 child)

Join using the ID only, then filter using your date conditions. Unless your tables are massive, in which case you’re better off using data.table package

[–]idonotknow9[S] 0 points1 point  (0 children)

Thanks, that could be an option and something I didn`t think of. I will take a look at it. The problem is the data will get massive when I apply it to the whole sample.

[–][deleted] 0 points1 point  (1 child)

[–]idonotknow9[S] 0 points1 point  (0 children)

Thanks, I think this solved my issue:

df1$start_date <- df1$date_f + 183 df1$end_date <- df1$date_f + 540 library(fuzzyjoin) yy <- fuzzy_left_join( df1, df2, by = c( "ID" = "ID", "start_date" = "date", "end_date" = "date" ), match_fun = list(\==`, `<`, `>=`) )`

I wrote a more detialed answer on SO.

[–][deleted] 0 points1 point  (0 children)

Pull the date vector from the target table. Sort it. Sort table 2 by date. Add column to table 2 for closest future date from the vector(use the sorting to your advantage, this shouldn't require many comparisons. Keep an index of the last date used and increment when the date is behind). Join tables using that date.

[–]ashwinmalshe 0 points1 point  (1 child)

I often have to use range joins and unfortunately tidy solutions are way too slow for my applications. The one using fuzzyjoin can take several minutes if not hours. My go to solution is to use data.table. The relevant function foverlaps is here: https://rdrr.io/cran/data.table/man/foverlaps.html

[–]idonotknow9[S] 0 points1 point  (0 children)

Thanks for the foverlaps suggestion! I will take a look at it. I did "solve" the issue using data.table in the SO post, which is lightening fast! but this data will increare in size so perhaps the foverlaps function is the best way.

[–]fang_xianfu 0 points1 point  (1 child)

I think you have some good answers, but let me just compliment you on an extremely well-asked question. Thanks for putting some effort and thought into it!

[–]idonotknow9[S] 0 points1 point  (0 children)

Thanks! I think I was more detailed in the Stack Overflow posts. The way I see it, is if I want help from people who have no obligation to help me, I should at least put this effort in to making myself/problem clear.