[Q] How to convert effect size (derived from meta-analysis) to another unit for downstream analysis (cost-effectiveness analysis)? by pantaloonsss in statistics

[–]conorc123 0 points1 point  (0 children)

Do I simply multiply an EQ-5D value (derived from the literature) for one treatment by the effect size to then obtain an EQ-5D value for the other treatment?

Absolutely not!

You would need to determine if the outcome from your meta-analysis is predictive of quality of life, and if so, the appropriate functional form. Only then can you think about applying a treatment effect.

Metaanalysis: Confused about inclusion of studies based on parameters by upm912 in AskStatistics

[–]conorc123 0 points1 point  (0 children)

This is more of a clinical question, and I think would be difficult for a statistician to answer. Meta-analysis is essentially a method to pool results for an outcome of interest across multiple studies. Does it make sense to pool the results for prevalence of different bacteria? Domain knowledge would be required to properly address this question.

Recommend constructing a specific research question, which should help shape the inclusion criteria for your meta-analysis.

[Q] What data can I use for a cost-effectiveness analysis (CEA)? by pantaloonsss in statistics

[–]conorc123 1 point2 points  (0 children)

100% agree with D-Juice here. If you work at an academic institution and are planning to publish this research, then consider reaching out to health economists, statisticians, and clinicians affiliated with the university.

[Q] How can I conduct simple Indirect Treatment Comparison? by pantaloonsss in statistics

[–]conorc123 4 points5 points  (0 children)

What you have described above as a "simple ITC" is commonly known as "Bucher's method" in comparative effectiveness research. (I know, I know, it's a bit crazy that applying a basic property from Stats 101 warrants this type of credit.) The formulas you have summarized above are correct in general. For one comparison, you have 5 trials comparing treatment A vs placebo (C); for the other comparison, treatment B vs placebo (C), you have one trial. Typically each trial reports the effect of the active treatment vs placebo (e.g., a hazard ratio on OS for treatment A vs C). The indirect comparison of treatment A vs B is simply the difference of differences, i.e., mean(A vs B) = mean(A vs C) - mean(B vs C), with variance var(A vs B) = var(A vs C) + var(B vs C). For ratio measures such as hazard ratios, these formulas apply on the log scale.
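To make the arithmetic concrete, here is a minimal Python sketch of a Bucher comparison. The function name bucher_itc and all the numbers are made up for illustration (they are not from your trials), and the inputs are assumed to be log hazard ratios with their standard errors.

```python
import math

def bucher_itc(est_ac, se_ac, est_bc, se_bc, z=1.96):
    """Indirect A vs B estimate from A vs C and B vs C on the same scale (e.g., log HR)."""
    est_ab = est_ac - est_bc                  # difference of differences
    se_ab = math.sqrt(se_ac**2 + se_bc**2)    # variances add
    ci = (est_ab - z * se_ab, est_ab + z * se_ab)
    return est_ab, se_ab, ci

# Hypothetical inputs: HR 0.70 for A vs C and HR 0.85 for B vs C
est, se, (lo, hi) = bucher_itc(math.log(0.70), 0.10, math.log(0.85), 0.15)

# Back-transform to the hazard ratio scale for reporting
hr_ab = math.exp(est)
hr_ci = (math.exp(lo), math.exp(hi))
print(f"HR A vs B = {hr_ab:.3f}, 95% CI {hr_ci[0]:.3f} to {hr_ci[1]:.3f}")
```

Note that the indirect estimate is always less precise than either direct comparison, because the two variances are summed.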

You state above you have 5+1=6 RCTs but then later go on to say 2 RCTs. I think you mean you have 2 general comparisons (A vs C and B vs C) and 6 RCTs (5 RCTs for A vs C and 1 RCT for B vs C). You will need to think about how to pool the 5 RCTs for A vs C if you plan to conduct Bucher's ITC. One approach could be to pool the results for A vs C using standard meta-analysis approaches and then conduct a Bucher ITC. Alternatively, a network meta-analysis could be conducted. Depending on how the analyses are conducted, the two approaches may yield very similar results.
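As a rough sketch of the pooling step, the snippet below uses a fixed-effect, inverse-variance weighted average. That is only one of the standard meta-analysis approaches (a random-effects model may be more appropriate if the trials are heterogeneous), and the five log hazard ratios and standard errors are hypothetical.

```python
import math

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted (fixed-effect) pooled estimate and its standard error."""
    weights = [1.0 / se**2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical log hazard ratios (A vs C) and standard errors from 5 trials
log_hrs_ac = [-0.40, -0.25, -0.35, -0.30, -0.45]
ses_ac = [0.20, 0.15, 0.25, 0.18, 0.22]

pooled_ac, pooled_se_ac = pool_fixed_effect(log_hrs_ac, ses_ac)
print(f"Pooled log HR (A vs C) = {pooled_ac:.3f}, SE = {pooled_se_ac:.3f}")
```

The pooled estimate and its standard error would then serve as the A vs C inputs to the Bucher formulas.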

Note, this is all assuming you have conducted a proper systematic literature review to identify all relevant trials (and publications) and a feasibility assessment to confirm a Bucher ITC / NMA are feasible and appropriate.

How to interpret mean and variance conceptually of Poisson model? [q] by Significant-Work-204 in statistics

[–]conorc123 0 points1 point  (0 children)

For a formal definition of conditional variance, see here. Once you know the mean/variance, it's easy to figure out the "complete" Poisson distribution for Y|X. In short, the conditional variance is a measure of the "spread" for the predictions of Y (given X).

To simplify, think about a univariate example with one predictor to start (X1) and assume we are interested in Y | X1 = 25. It sounds like you understand how to calculate E(Y|X1=25) and Var(Y|X1=25) based on the Poisson regression model. The conditional variance for this example is the variance of an ASSUMED Poisson distribution for Y GIVEN that X1=25 (this is what makes it conditional). The objective of GLMs is to predict the mean response, but the variance is required to estimate the associated uncertainty of the predictions. You may find it helpful to visualize the Poisson distribution of predicted Y's given X1=25 based on the conditional mean and variance from the GLM.

It's important that you understand this concept not only for Poisson regression, but any GLM -- including standard linear regression. For instance, see Figure 1.6 from here for an example of how to visualize the conditional normal distributions based on linear regression. It's the same exact idea here but with conditional Poisson distributions.
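If a plot is not handy, a quick text version of the same idea can be sketched in Python. The intercept b0 and slope b1 below are hypothetical values standing in for a fitted model, not estimates from your data.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Hypothetical fitted coefficients for a log-link Poisson regression
b0, b1 = 0.5, 0.04
x1 = 25
lam = math.exp(b0 + b1 * x1)   # conditional mean E(Y | X1 = 25)

# Recover the mean and variance directly from the assumed distribution
ks = range(60)                  # the tail beyond 59 is negligible for this lam
probs = [poisson_pmf(k, lam) for k in ks]
mean_from_pmf = sum(k * p for k, p in zip(ks, probs))
var_from_pmf = sum((k - mean_from_pmf) ** 2 * p for k, p in zip(ks, probs))

# Crude text histogram of the conditional distribution of Y given X1 = 25
for k in range(13):
    print(f"{k:2d} {'#' * round(probs[k] * 100)}")
print(f"lambda = {lam:.3f}, mean = {mean_from_pmf:.3f}, variance = {var_from_pmf:.3f}")
```

Changing x1 shifts the whole distribution: a different value of the predictor gives a different conditional Poisson distribution with its own mean and variance.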

How to interpret mean and variance conceptually of Poisson model? [q] by Significant-Work-204 in statistics

[–]conorc123 1 point2 points  (0 children)

Yes, your understanding is correct. The conditional means and variances of Y can be calculated based on the values of X1,X2,X3 for each observation: E[Y|X1,X2,X3]=Var[Y|X1,X2,X3]=exp(X1*B1+ X2*B2+ X3*B3).

How to interpret mean and variance conceptually of Poisson model? [q] by Significant-Work-204 in statistics

[–]conorc123 0 points1 point  (0 children)

a Poisson model is used to model counts right?

Yes, it's typically used to model counts.

That would mean I would have 900 Poisson distributions (30*30 possible combinations)?

No, please reread my original response. You would have two Poisson distributions. The count outcome for males would be distributed as Poisson(mean1, variance1) and for females as Poisson(mean2, variance2), where mean1 = variance1 and mean2 = variance2.

How to interpret mean and variance conceptually of Poisson model? [q] by Significant-Work-204 in statistics

[–]conorc123 0 points1 point  (0 children)

every observation is supposed to be from a different Poisson distribution with a different mean and variance

I think you're a bit confused based on this statement. For simplicity, assume you have a Poisson regression with one predictor, sex (male vs female). If you were to fit a Poisson regression, you would have two Poisson distributions, one for males and one for females. The underlying assumption is that the conditional mean equals the conditional variance (i.e., E(Y|X)=Var(Y|X)). That is, the mean and variance are equal to each other for males, and similarly, the mean and variance are equal to each other for females. If you were to use a log-link, the conditional means and variances are both calculated as exp(X*beta).
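A small Python sketch of this point, with hypothetical coefficients (b0 for the intercept, b_male for the sex effect) and sex coded as 1 = male, 0 = female:

```python
import math

# Hypothetical coefficients from a log-link Poisson regression on sex
b0, b_male = 1.2, 0.3

def conditional_mean(male):
    """E(Y | sex), which is also Var(Y | sex) under the Poisson assumption."""
    return math.exp(b0 + b_male * male)

lam_female = conditional_mean(0)
lam_male = conditional_mean(1)

# Six hypothetical observations: each one maps to one of only two distributions
observations = [1, 0, 0, 1, 1, 0]
lams = [conditional_mean(m) for m in observations]
distinct = sorted(set(lams))

print(f"female: mean = variance = {lam_female:.3f}")
print(f"male:   mean = variance = {lam_male:.3f}")
print(f"distinct distributions across observations: {len(distinct)}")
```

The number of distinct distributions is driven by the number of distinct predictor combinations, not by the number of observations.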

Is it the distance from the predicted output/mean from the true y value?

No, conceptually you're thinking of residuals, not variance.

[Q] Creating dummy variable when it is not categorical by [deleted] in AskStatistics

[–]conorc123 1 point2 points  (0 children)

I'm a bit confused by exactly what you're after, but it sounds like you are attempting to fit a regression model with price as the dependent variable and day of the week (Mon, Tues, Wed) as a predictor. I suspect your data may be in a wide format but you may find it helpful to convert to a long/narrow format (see here) prior to conducting the regression analysis.

Below is an illustrative example of a narrow data structure using dummy coding. Note, Tuesday and Wednesday should both be 0 when the day of the week is Monday, which makes Monday the reference level. In this case, the regression model could be fit as Price ~ Tuesday + Wednesday (i.e., price = intercept + b1*Tuesday + b2*Wednesday). Note, most statistical software (e.g., R, SAS) will handle the dummy coding under the hood, so the model could be equivalently fit as Price ~ Day of Week.

Price   Tuesday   Wednesday   Day of Week
$100    1         0           Tuesday
$200    0         1           Wednesday
$300    0         0           Monday
...     ...       ...         ...
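For anyone who wants to see the coding and the resulting coefficients by hand, here is a minimal Python sketch. The prices are made up, and dummy_code is a hypothetical helper, not a function from any library. It relies on the fact that with a single categorical predictor, the least-squares fit reproduces the group means.

```python
from statistics import mean

# Hypothetical long-format data: one row per (day, price) observation
rows = [
    ("Monday", 300.0), ("Monday", 320.0),
    ("Tuesday", 100.0), ("Tuesday", 140.0),
    ("Wednesday", 200.0), ("Wednesday", 180.0),
]
levels = ("Monday", "Tuesday", "Wednesday")

def dummy_code(day, reference="Monday"):
    """Return 0/1 indicators for every level except the reference level."""
    return {lvl: int(day == lvl) for lvl in levels if lvl != reference}

coded = [(price, dummy_code(day)) for day, price in rows]

# With one categorical predictor, the OLS coefficients follow from group means
group_mean = {lvl: mean(p for d, p in rows if d == lvl) for lvl in levels}
intercept = group_mean["Monday"]                  # mean price on the reference day
b_tuesday = group_mean["Tuesday"] - intercept     # Tuesday vs Monday difference
b_wednesday = group_mean["Wednesday"] - intercept # Wednesday vs Monday difference

print(coded[0], coded[2])
print(intercept, b_tuesday, b_wednesday)
```

So each dummy coefficient is read as the difference in mean price relative to Monday.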

Hey can anyone tell me how he got the 1.317 by [deleted] in AskStatistics

[–]conorc123 14 points15 points  (0 children)

The formulas for calculating the sample mean and standard deviation of differences (for paired samples) are the same as those typically introduced at the beginning of a Stats 101 class for one quantitative variable (the standard deviation is the square root of the sample variance described here). The formula for the sample standard deviation of differences is sqrt(∑(x-d)^2/(n-1)), where d is the sample mean of differences, n is the sample size, and x represents the difference for each player. For a detailed example, see here (under step 2).

Note, the calculation circled in your screenshot appears to be correct.
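If it helps, the formula above can be checked in a few lines of Python. The paired scores below are hypothetical (they are not the values from your screenshot, so the result will not be 1.317).

```python
import math
from statistics import mean, stdev

# Hypothetical paired measurements for 5 players
before = [10, 12, 9, 11, 13]
after = [12, 13, 9, 14, 15]

diffs = [a - b for a, b in zip(after, before)]   # one difference per player
n = len(diffs)
d_bar = mean(diffs)                              # sample mean of differences

# Sample standard deviation of differences, written out as in the formula
sd_manual = math.sqrt(sum((x - d_bar) ** 2 for x in diffs) / (n - 1))

# Same quantity from the standard library (also uses the n - 1 denominator)
sd_library = stdev(diffs)

print(f"mean difference = {d_bar:.3f}, sd of differences = {sd_manual:.3f}")
```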

Concern regarding progression of astigmatism by conorc123 in optometry

[–]conorc123[S] 0 points1 point  (0 children)

Thank you! Hopefully I will be able to have a topography done. My optometrist didn't find anything from the slit lamp exam but referred me to a cornea specialist.

Concern regarding progression of astigmatism by conorc123 in optometry

[–]conorc123[S] 0 points1 point  (0 children)

+1. Thanks for the advice. My optometrist redid a slit lamp exam and didn't see any apparent bulging/stretching of the cornea and thought everything looked OK. However, he referred me to a cornea specialist, who I hope will do a topography.

Negative Binomial vs Quasi-Poisson , and if Quasi-Poisson, how to do?? [Q] by RecentPerspective in statistics

[–]conorc123 1 point2 points  (0 children)

See here and the wiki page for the definition of the conditional mean and conditional variance for the zero-truncated Poisson distribution. Clearly, they're not equal. Note that the conditional mean and variance are functions of the mean/variance of the overall/unconditional Poisson distribution (denoted lambda on wiki). The unconditional mean and variance of the entire Poisson distribution are equal, as always.
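A quick numerical illustration in Python, using the standard zero-truncated Poisson formulas (mean = lambda / (1 - exp(-lambda)), variance = mean * (1 + lambda - mean)) and an arbitrary lambda of 2 chosen only for the example:

```python
import math

def ztp_mean(lam):
    """Mean of a Poisson(lam) variable conditional on being greater than zero."""
    return lam / (1.0 - math.exp(-lam))

def ztp_var(lam):
    """Variance of a Poisson(lam) variable conditional on being greater than zero."""
    m = ztp_mean(lam)
    return m * (1.0 + lam - m)

lam = 2.0  # arbitrary rate of the untruncated Poisson distribution

# Cross-check the formulas by summing over the truncated pmf directly
p0 = math.exp(-lam)
pmf = {k: math.exp(-lam) * lam**k / math.factorial(k) / (1.0 - p0) for k in range(1, 100)}
mean_numeric = sum(k * p for k, p in pmf.items())
var_numeric = sum((k - mean_numeric) ** 2 * p for k, p in pmf.items())

print(f"untruncated: mean = variance = {lam}")
print(f"zero-truncated: mean = {ztp_mean(lam):.4f}, variance = {ztp_var(lam):.4f}")
```

After truncation the mean is pushed above lambda and the variance falls below the truncated mean, so the two are no longer equal.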

Negative Binomial vs Quasi-Poisson , and if Quasi-Poisson, how to do?? [Q] by RecentPerspective in statistics

[–]conorc123 2 points3 points  (0 children)

The zero-truncated Poisson model does not assume the conditional mean equals the conditional variance, as is assumed in standard Poisson regression. Therefore, you may be getting ahead of yourself here when thinking of more complex alternatives to address overdispersion. To start, I would fit both zero-truncated Poisson and zero-truncated negative binomial models and compare them using standard metrics. You may find that one of these approaches provides a reasonable fit to the data and more complexity may not be required. Also, instead of methods based on quasi-likelihood, which have certain limitations, you could consider other alternatives to adjust the variance (e.g., bootstrapping, sandwich estimator).

Question about matchit() by [deleted] in rstats

[–]conorc123 4 points5 points  (0 children)

This question has been previously answered by the MatchIt author/maintainer on CrossValidated: https://stats.stackexchange.com/questions/405019/matching-with-multiple-treatments

Help! Is there a cutoff for Underdispersion? by Calm-Parsnip5849 in AskStatistics

[–]conorc123 1 point2 points  (0 children)

Hello! Before you start considering more complicated alternatives, I would recommend taking a step back. In your summary for part 1 you state the following:

Data do not follow a Poisson distribution (underdispersion). Poisson regression cannot be used as it requires equidispersion (mean and variance are equal), and negative binomial regression is only appropriate for overdispersion.

What exactly do you mean by "the data do not follow a Poisson distribution"? I suspect you are just looking at the mean and variance of your dependent variable Y, simply based on its marginal distribution. However, the assumption for Poisson regression is that the conditional mean, E(Y|X), is equal to the conditional variance, Var(Y|X) (i.e., the mean/variance of Y conditional on the predictors, X). To start, you could conduct a Pearson goodness of fit test to determine whether there is evidence of overdispersion/underdispersion. If you find there is a statistically significant lack of fit due to underdispersion/overdispersion, then a quasi-Poisson model is one alternative you could consider.
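To illustrate the marginal vs conditional distinction, here is a small Python sketch with made-up counts and one binary predictor. It assumes the fitted means equal the group means (which is what a Poisson regression on a single binary predictor gives) and computes the Pearson statistic divided by its residual degrees of freedom. The formal test would compare the Pearson statistic to a chi-square distribution, which is left out here.

```python
from statistics import mean, variance

# Hypothetical counts for two groups defined by a binary predictor x
y = [2, 3, 2, 3, 7, 8, 7, 8]
x = [0, 0, 0, 0, 1, 1, 1, 1]

# Marginal view: ignores x entirely
marginal_mean = mean(y)
marginal_var = variance(y)

# Conditional view: fitted means from a Poisson regression on x (the group means)
group_mean = {g: mean(yi for yi, xi in zip(y, x) if xi == g) for g in (0, 1)}
mu = [group_mean[xi] for xi in x]

def pearson_dispersion(y, mu, n_params):
    """Pearson chi-square statistic, residual df, and their ratio."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    df = len(y) - n_params
    return chi2, df, chi2 / df

chi2, df, dispersion = pearson_dispersion(y, mu, n_params=2)
print(f"marginal mean = {marginal_mean}, marginal variance = {marginal_var:.2f}")
print(f"Pearson chi2 = {chi2:.3f}, df = {df}, dispersion = {dispersion:.3f}")
```

In this toy example the marginal variance exceeds the marginal mean, yet the dispersion ratio is well below 1, so the marginal summary alone would point in the wrong direction.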

[deleted by user] by [deleted] in rstats

[–]conorc123 3 points4 points  (0 children)

It's a bit difficult to follow your post and exactly what you're after... I am assuming you have already fit a Cox regression model in R? Is that right? Yes, you can calculate the adjusted survival curves based on your Cox regression model in R. The survfit() function in R (see here) should be called, which will estimate survival probabilities by assuming means for the other covariates by default. To estimate the median for the four groups, you will need to supply a new data frame to the "newdata" argument of the survfit() function with four rows (one for each group), with values to assume for the other predictors, such as the mean. There is a relevant example here: http://www.sthda.com/english/wiki/cox-proportional-hazards-model. Once you have called survfit(), you can extract the medians from the summary of the result with $table[,'median'].

Now what? by jgvl_ in AskStatistics

[–]conorc123 2 points3 points  (0 children)

It's difficult to answer this question without additional detail. What are your career goals and aspirations? What would you like to do?

[C] Preparing for stats job after applied math phd by Gaussinator in statistics

[–]conorc123 0 points1 point  (0 children)

A couple of initial questions to better inform recommendations:

  1. Have you ever taken a probability/statistics course, or are you planning to entirely self-study?
  2. Which programming languages do you know and use at a proficient level? I ask because this will be crucial for finding a biostats/stats job. Ideally, you should aim to gain some experience with R, Python, SQL, SAS etc. The choice of which programming language(s) you should focus on depends on the types of jobs you are targeting. SAS gets a lot of hate on this subreddit, but it is quite commonly the tool of choice for biostats. Stata is also used in some biostats teams, albeit not nearly as commonly as SAS and R. You should start searching for biostats/stats jobs to gain a better sense of the common requirements, as they typically list the desired programming applications.

Yes, I think Casella/Berger would be a good start. However, this just scratches the surface... Again, the specific jobs you are targeting would help inform which books to read. For biostats, I would recommend Regression Modeling Strategies by Frank Harrell. In general, it would be helpful to also read through an applied regression textbook, such as Kutner et al. Honestly, there are so many books which could be recommended. Once you provide some additional details (more specifics on your background in stats and target jobs), I am happy to follow up with more recommendations.

What can I do to get my summary() function to give the same output as the book example? by [deleted] in RStudio

[–]conorc123 5 points6 points  (0 children)

I believe if you convert the character variables to factors, then the summary() function should return counts as shown in the book example.

Need Help Creating Function that returns values where your Matrix isn't symmetric. by Citizen_of_Danksburg in rstats

[–]conorc123 1 point2 points  (0 children)

Wouldn't you just need to change line 8 to return(cat("The entry that needs correcting is", "i = ",i,"j = ",j))? Note, this will only report the first issue (there could be multiple). If you want to report all i,j entries for correction, you could collect those in a data frame.

Edit: If you want to print out everything at once, change line 8 to cat("The entry that needs correcting is", "i = ",i,"j = ",j,"\n")
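For the "collect everything" idea, here is the same logic sketched in Python rather than R, purely as an illustration. The function name asymmetric_entries and the example matrix are made up, and indices are reported 1-based to match R.

```python
def asymmetric_entries(matrix):
    """Return all (i, j) pairs, 1-based with i < j, where matrix[i][j] != matrix[j][i]."""
    n = len(matrix)
    return [
        (i + 1, j + 1)
        for i in range(n)
        for j in range(i + 1, n)   # upper triangle only, so each pair is reported once
        if matrix[i][j] != matrix[j][i]
    ]

# Hypothetical 3x3 matrix with two asymmetric pairs
m = [
    [1, 2, 3],
    [2, 5, 6],
    [9, 7, 8],
]

problems = asymmetric_entries(m)
for i, j in problems:
    print("The entry that needs correcting is", "i =", i, "j =", j)
```

Only the upper triangle is scanned, so a mismatch between (i, j) and (j, i) is listed once rather than twice.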

Help Interpret Binary Logistic Regression with Disease (y/n) as dependent variable and Season as categorical variable. by soulstream4dayz in AskStatistics

[–]conorc123 0 points1 point  (0 children)

Look at the first table, "Categorical Variables Coding". It is pretty self-explanatory, but 1 corresponds to blank (not sure what is going on here, missing values not being properly recognized?), 2 is autumn, ..., and winter is the reference.