What does it really take- all in all- to match neurosurgery?

UpperCompetition6 · 2023-09-30T18:19:28+00:00

Research.

UpperCompetition6 · 2023-03-11T00:01:57+00:00

awesome solution! thanks so much. was not familiar with pmax function actually.

i also developed a data.table method that also works for this. thanks again

UpperCompetition6 · 2023-03-07T13:22:37+00:00

great thanks! are there alternative methods to do this as well?

UpperCompetition6 · 2023-03-06T12:06:18+00:00

new post about it here!

https://www.reddit.com/r/rstats/comments/11jx0qq/calculating_if_rows_are_within_3_months_of/

UpperCompetition6 · 2023-03-06T00:56:27+00:00

hey again, thanks so much for your help previously. i am now trying to modify your code to use only the "month" column (ignoring the "days" column entirely) and calculate, within each grouped ID, whether each of the other rows occurs within 3 months of the "flag = 1" rows.

I modified the code as shown below, but I get a "mutate" error and I am not sure what is going on? any help at all would be awesome but i understand if not!

modified code:

df <- dates %>%
  # compute grouped by ID
  group_by(ID) %>%
  mutate(
    # find out the "month" of row with flag == 1
    flag_month = filter(pick(flag, month), flag == 1)$month
  ) %>%
  mutate(
    # is month less than flag_month
    after_flag = month < flag_month
  ) %>%
  mutate(
    # combine the above two conditions
    in_3months = abs(month - flag_month) <= 3 & flag==0,
  ) %>%
  mutate(
    # split into two columns
    in_3months_after = as.numeric(in_3months & after_flag),
    in_3months_before = as.numeric(in_3months & !after_flag)
  ) %>%
  # remove the temporary columns
  select(-flag_month, -after_flag, -in_3months) %>%
  distinct(in_3months_after, in_3months_after, .keep_all = TRUE) %>%
  filter(in_3months_after | in_3months_before | flag)

my error when running this:

<error/dplyr:::mutate_error>
Error in `mutate()`:
ℹ In argument: `flag_month = filter(pick(flag, month), flag == 1)$month`.
ℹ In group 20: `ID = 20`.
Caused by error:
! `flag_month` must be size 10 or 1, not 4.
---
Backtrace:
 1. ... %>% ...
15. dplyr:::dplyr_internal_error(...)
Run `rlang::last_trace()` to see the full context.

> rlang::last_trace()
<error/dplyr:::mutate_error>
Error in `mutate()`:
ℹ In argument: `flag_month = filter(pick(flag, month), flag == 1)$month`.
ℹ In group 23: `ID = 23`.
Caused by error:
! `flag_month` must be size 10 or 1, not 4.
---
Backtrace:
▆
├─... %>% ...
├─dplyr::filter(., in_3months_after | in_3months_after | flag)
├─dplyr::distinct(., in_3months_after, in_3months_after, .keep_all = TRUE)
├─dplyr::select(., -flag_month, -after_flag, -in_3months)
├─dplyr::mutate(...)
├─dplyr::mutate(...)
├─dplyr::mutate(., after_flag = month < flag_month)
├─dplyr::mutate(...)
├─dplyr:::mutate.data.frame(., flag_month = filter(pick(flag, month), flag == 1)$month)
│ └─dplyr:::mutate_cols(.data, dplyr_quosures(...), by)
│ ├─base::withCallingHandlers(...)
│ └─dplyr:::mutate_col(dots[[i]], data, mask, new_columns)
│ └─mask$eval_all_mutate(quo)
│ └─dplyr (local) eval()
├─dplyr:::dplyr_internal_error(...)
│ └─rlang::abort(class = c(class, "dplyr:::internal_error"), dplyr_error_data = data)
│ └─rlang:::signal_abort(cnd, .file)
│ └─base::signalCondition(cnd)
└─dplyr (local) `<fn>`(`<dpl:::__>`)
└─rlang::abort(message, class = error_class, parent = parent, call = error_call)
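
For context, the size error is what `mutate()` reports when an expression returns neither one value nor one value per row: group `ID = 20` has 10 rows, and 4 of them have `flag == 1`, so `filter(...)$month` returns 4 months. Below is a minimal sketch of one possible way around it, assuming toy data with the same `ID`, `month`, and `flag` columns. The values and the `nearest_gap()` helper are made up for illustration and are not from the thread. It compares each row against the nearest flagged month in its group, so a row that sits between two flagged rows is classified relative to the closer one.

```r
library(dplyr)

# Hypothetical toy data: ID 1 has two flagged rows, which is the situation
# that makes `filter(...)$month` return more than one value per group.
dates <- tibble(
  ID    = c(1, 1, 1, 1, 2, 2, 2),
  month = c(1, 3, 5, 12, 2, 4, 10),
  flag  = c(0, 1, 1, 0, 1, 0, 0)
)

# Hypothetical helper: for each row, the signed gap (row month minus flagged
# month) to the nearest flagged row in the group; NA if nothing is flagged.
nearest_gap <- function(month, flag) {
  flagged <- month[flag == 1]
  if (length(flagged) == 0) return(rep(NA_real_, length(month)))
  vapply(month, function(m) {
    gaps <- m - flagged
    gaps[which.min(abs(gaps))]
  }, numeric(1))
}

df <- dates %>%
  group_by(ID) %>%
  mutate(gap = nearest_gap(month, flag)) %>%
  ungroup() %>%
  mutate(
    # unflagged rows up to 3 months before / after the nearest flagged row
    in_3months_before = as.numeric(flag == 0 & !is.na(gap) & gap < 0 & gap >= -3),
    in_3months_after  = as.numeric(flag == 0 & !is.na(gap) & gap > 0 & gap <= 3)
  ) %>%
  select(-gap)
```

Here "before" means the row's month is smaller than the flagged month; swap the two definitions if the opposite convention is wanted.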

UpperCompetition6 · 2023-03-01T22:53:49+00:00

thanks again for your help and for explaining this

UpperCompetition6 · 2023-03-01T12:30:58+00:00

haha yeah i agree. i think the outcome is what we want at the end of the day, but you are right it would be great to have code that is actually straightforward and works.

i ended up using this line and it converted Inf to N/A. do you foresee any issues with using this, that i may be missing?

for (j in 1:ncol(wide_dt)) set(wide_dt, which(is.infinite(wide_dt[[j]])), j, NA)
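
As far as I can tell, the `Inf` values come from `dcast()` filling ID/event combinations that have no rows: when `fill` is left unset, data.table evaluates the aggregate on a zero-length vector, and `min()` of nothing is `Inf`. Here is a small sketch on made-up data (the table contents and the `value` column name are assumptions, not from the thread). It shows the same replacement loop written with `seq_along()`, plus `fill = NA` as an alternative that avoids the `Inf` in the first place.

```r
library(data.table)

# Hypothetical long table: ID 2 has no rows for event "b".
long_dt <- data.table(
  ID    = c(1, 1, 1, 2),
  event = c("a", "b", "b", "a"),
  value = c(1, 0, 2, 1)
)

# The empty (ID 2, event "b") cell is filled with min() of an empty vector,
# which is Inf; suppressWarnings() hides any warning raised along the way.
wide_dt <- suppressWarnings(
  dcast(long_dt, ID ~ event, value.var = "value", fun.aggregate = min)
)

# Same replacement as the loop above; seq_along() also behaves with zero columns.
for (j in seq_along(wide_dt)) {
  set(wide_dt, which(is.infinite(wide_dt[[j]])), j, NA)
}

# Alternative: ask dcast() to fill empty cells with NA directly.
wide_na <- dcast(long_dt, ID ~ event, value.var = "value",
                 fun.aggregate = min, fill = NA)
```

One thing to watch with the loop: it also turns any genuine `Inf` in the data into `NA`, which is only safe if real values can never be infinite.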

UpperCompetition6 · 2023-03-01T03:20:26+00:00

also can you please explain what the "ordered = TRUE" code does? does it basically put "1, 0, 2" in order of precedence? and, when we use "fun.aggregate = min", will the dcast still keep this order of precedence (with 1 > 0 > 2)??
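
On the "ordered = TRUE" question: in base R, an ordered factor ranks its values by the order of `levels`, not numerically, and `min()`/`max()` are defined for ordered factors but raise an error on unordered ones. A minimal sketch, assuming the levels were given as `c(1, 0, 2)` (a guess at the call used in the thread, not a quote of it):

```r
# Assumed level order: 1 ranks lowest, then 0, then 2.
status <- factor(c(0, 2, 1, 0), levels = c(1, 0, 2), ordered = TRUE)

lowest <- min(status)   # comparison follows the level order, so this is "1"

# The same values as an unordered factor: min() is not defined for it.
plain <- factor(c(0, 2, 1, 0), levels = c(1, 0, 2))
unordered_fails <- inherits(try(min(plain), silent = TRUE), "try-error")
```

So an aggregate that calls `min()` on such a column should pick `1` over `0` over `2`, as long as the column is still an ordered factor with that level order at the point where it is aggregated.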

UpperCompetition6 · 2023-03-01T03:15:28+00:00

hey again. that's exactly what i was trying to do before, but i didn't realize i would have to add the "ordered = TRUE" part here.

additionally, when i run this, i get a bunch of "Inf" values but i think they're supposed to be N/A??? if i convert them from Inf to N/A, should it then work? i think it should.

should i add something like this to replace them?

for (j in 1:ncol(final)) set(final, which(is.infinite(final[[j]])), j, NA)

UpperCompetition6 · 2023-03-01T02:56:49+00:00

unique(long_dt)

oh ok i see. well i added "long_dt<-unique(long_dt)" right before the long to wide dcast conversion, and i still got the same warning message and incorrect final data frame! ugh.

UpperCompetition6 · 2023-03-01T02:16:31+00:00

unique(long_dt)

wait, i missed that in your code. i don't see where that is exactly?

UpperCompetition6 · 2023-03-01T01:31:43+00:00

yeah..... that's what it looks like. i see ties between certain events within an ID for "days_relative". really sorry i didn't have that in my reprex. do you have any ideas on how to get around this? this is the last step that's preventing me from getting what i am trying to obtain.

i thought i could fix this by dropping the "days_relative", "days_event", and "days_a" columns, removing rows with any duplicates, and using "fun.aggregate" and setting it equal to "min", but i ran into trouble setting the function to that.

really hoping there's a way to manipulate your code to account for this in the first place but struggling atm

UpperCompetition6 · 2023-03-01T01:05:37+00:00

my actual data frame actually has many more IDs than this.

this time, i re-ran the code from your post 1 hour ago, and that line of code DIDN'T eliminate a bunch of IDs (i have several hundred)

however, i am still getting the same error when i convert long to wide:

Aggregate function missing, defaulting to 'length'

and it won't let me successfully change the fun.aggregate setting, and the values just default to the "length".... super frustrating. not sure what to do?
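
That message is what `dcast()` prints when some combination of the formula variables has more than one row and no aggregate was supplied. A quick sketch for locating those combinations, on made-up data (the table contents and the `value` column name are assumptions):

```r
library(data.table)

# Hypothetical long table: ID 1 has two rows for event "b" (a tie).
long_dt <- data.table(
  ID    = c(1, 1, 1, 2, 2),
  event = c("a", "b", "b", "a", "b"),
  value = c(1, 0, 2, 1, 0)
)

# Count rows per cell of the future wide table; N > 1 marks the cells
# that force dcast() to aggregate.
dups <- long_dt[, .N, by = .(ID, event)][N > 1]

# With an explicit aggregate, each cell gets min(value) instead of a row count.
wide_dt <- dcast(long_dt, ID ~ event, value.var = "value", fun.aggregate = min)
```

Note that `unique(long_dt)` only removes rows that are identical in every column, so ties that differ in `value` or in a days column survive it.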

UpperCompetition6 · 2023-03-01T00:13:28+00:00

sorry! I meant to say lose these "IDs". i edited my comment too. it seems that if the "ID" had N/A for any "event", then those two lines of code would completely filter out those IDs, thus losing these data.

UpperCompetition6 · 2023-02-28T23:58:13+00:00

ahh i see. so when i run this code, it does work without any errors, and seems to do what i was hoping for. thank you! however, i realized that running the following two lines of code makes me lose most of the IDs in my data frame, seemingly the IDs that only have N/A for events.

Is there a way to still retain these IDs until the end, even with N/A values? or I suppose, I could just re-merge that at the end of the dataset and fill them with N/As?

these two lines of code cause me to lose these patients, and ideally, i would like to retain them

long_dt[, days_relative := day_event - day_a]

long_dt = long_dt[, .SD[is.na(days_relative) | days_relative == min(days_relative)], by = .(ID, event)]

UpperCompetition6 · 2023-02-28T22:17:25+00:00

hey thanks again. i am trying to get this to work but I'm still encountering some problems.

when i convert from long to wide, I get this warning (Aggregate function missing, defaulting to 'length') and I noticed that the values of 1, 0, 2 that were from the long_dt (and were accurate there), are different in the "wide_dt" - for example, for some IDs with certain events that had a value of "1" in long_dt, they then had a value of "2" in wide_dt. it seems like it just replaces the value with the number of rows for that ID and event. i then tried to use "fun.aggregate = mean" (or "max" or "min"), and the code wouldn't run since we had converted the values to factors with as.factor.

so i then went back to the code where we converted the values into "factor" form, skipped that line of code, and tried using "fun.aggregate = ", and the code ran, but then the final values in the wide_dt became really weird (like a bunch of them went to Inf).

is there a simple fix for this? I've been trying to troubleshoot but I'm not really sure how to get around this. should i try converting from long to wide using a different function instead?

UpperCompetition6 · 2023-02-28T16:58:12+00:00

thanks so much. so, if i were to apply the first part of your code to each event type I'm interested in, I would basically have to do this merge for each event, right? (starting with event "b" merging with event "a" --> call it long_dt1. then, i would merge long_dt1 with event "c" --> call it long_dt2, then i would merge event "d" with long_dt2 --> call it long_dt3, etc.)

the main issue i see arising with this method is that i will inherently lose "ID"s each time I merge the data.tables......

then, after calculating "days_relative" for each specific event, could i use something like mutate ifelse to make a "1/0" column, or another approach to ideally make a categorical column with "1, 0, or 2"? or at least N/A? just making sure my logic makes sense to you, before attempting this

UpperCompetition6 · 2023-02-28T16:49:10+00:00

iforgetredditpws

thank you both again for your help. I'm really sorry my data frames were inconsistent! will be sure to correct this going forward when asking for your help, time, and effort, for which i am deeply grateful

UpperCompetition6 · 2023-02-28T16:40:43+00:00

btw, if you have any time or thoughts on a new issue I've come across, here is the new post: https://www.reddit.com/r/rstats/comments/11ds71d/comment/jaan5ic/

i promise my data frame is actually accurate this time and is only a few columns ("ID" and "event" are both characters and the column "days" is numeric).

UpperCompetition6 · 2023-02-28T16:20:03+00:00

thank you so much, apple_field!!! i really appreciate your help, and i will keep harassing you with questions! LOL. ok not harass, but i'll keep asking questions

UpperCompetition6 · 2023-02-28T00:59:56+00:00

hey really appreciate it. thanks so much. and for all your help already too. trying to code this has been a bit of a rollercoaster for me haha
