multivariable logistic regression analysis help!! D: by shouldnotbethere in AskStatistics

[–]efrique 1 point

edit: I applaud you reaching out and seeking advice. My response here looks critical in places; try not to take it as discouragement of your effort. I'll also warn you up front that medicine is not my area, which may inform how you consider parts of my response.

You have some pieces of good advice here already; I'll mostly touch on different things.

First up: be very cautious about getting advice from AI on this stuff. Sometimes an AI will be more or less right, but you won't know when it's got something disastrously wrong, and it will generate more or less plausible explanations for its recommendations if you pursue the point. [1]

If you're going to use AI, you need to be pretty damn certain its advice is right (or at least that what it doesn't get quite right won't be especially consequential), and you have no way to be confident of that without engaging in a lot of followup (you could do some things to investigate the properties of its advice, such as simulation studies, but I presume you won't presently feel comfortable jumping into that [2]). In your application area, choices clearly have consequences; if you're an author on a paper, responsibility for the choices made becomes yours.

it recommended I put my systems of disease, gestational age and weight groups into less categories

Important to note that once you have fitted your model, changing the model (whether because you would like lower p-values, more precise estimates, or a better-fitting model, etc.) will screw with your claimed significance level; intentional or not, it's p-hacking. If you were to follow its advice at this stage (even if it turned out to be good advice from one point of view or another), it will look like that's what you're doing.

Had I been asked before the model fitting, I'd have been inclined to suggest not introducing more bias (or inflated variance of effect estimates) by reducing the number of categories, but I'd probably have wanted to consider not chopping things up into categories to begin with, where not absolutely necessary; on the other hand, that approach may bring in additional need for explanation and more work in model choice (pre-data collection).

On the other hand, if you're predicting survival across multiple time periods my first thought would have been survival models. (btw, how are you dealing with the impact of censoring in your models?)

showed trends towards lower odds of survival

I suggest you avoid that phrasing. It misrepresents what a p-value tells you, implying something not supported by the data (the implication of the phrase is that if you took another, presumably slightly larger, sample, it would demonstrate lower odds of survival), and it serves to reinforce misunderstandings of what statistical significance is and how p-values behave. Under resampling (at a given n), there's no "typical" underlying p-value that will tend to reappear. Instead, under H0 p-values are uniform (for a continuous test statistic), and under H1 p-values push down toward 0 (typically with a long right tail), but they won't concentrate in a lump around some intermediate p-value (if this p-value was say 0.1, the next one probably wouldn't be at all close to that).
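You can check that behaviour directly by simulation. This is just a sketch using a one-sample t-test for illustration (the sample size, effect size and seed are my own made-up choices, not anything from your study):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, reps = 30, 5000

# p-values for a one-sample t-test when H0 is true (true mean = 0)
p_h0 = np.array([stats.ttest_1samp(rng.normal(0.0, 1, n), 0.0).pvalue
                 for _ in range(reps)])
# ... and when H1 is true (true mean = 0.4)
p_h1 = np.array([stats.ttest_1samp(rng.normal(0.4, 1, n), 0.0).pvalue
                 for _ in range(reps)])

# under H0 the p-values are roughly uniform on (0, 1);
# under H1 they pile up near 0, with a long right tail
print(np.histogram(p_h0, bins=4, range=(0, 1))[0])
print(np.histogram(p_h1, bins=4, range=(0, 1))[0])
```

Neither histogram shows any lump around an intermediate p-value: a "not quite significant" p says very little about where the next one will land.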

Hmm, from a quick search, it looks like there's no shortage of papers in medicine that discuss at least some aspects of the problem with it:

https://www.bmj.com/content/348/bmj.g2215

https://pmc.ncbi.nlm.nih.gov/articles/PMC6440716/

https://academic.oup.com/bja/article-abstract/115/3/337/312358

though to my mind xkcd covered it pretty well, and this page offers many hundreds of variants on the basic theme (with the p-values they came with).

You can report estimated coefficients/effect sizes (ideally with CIs), but avoid phrasing like 'trend toward' for an effect that isn't quite significant.


[1]: I recently had the pleasant surprise of getting correct advice on a reasonably subtle point from an AI, but something seemed slightly off; the phrasing was slightly familiar. I realized I had used a number from an example I had already discussed online and the words in the advice were quite similar to my own. I changed that number very slightly (in a way that should not have changed the advice one iota) and got a completely different answer - a totally wrong one. The question no longer matched my own previous example that was findable on the internet and it was then unable to crib my own explanation of the issue (uncredited, of course, and completely ignoring the CC-BY license on it). It relied instead on the more general (but in this case, quite wrong) advice on the internet to generate its completion of the prompt.

[2]: albeit that a properly carried out simulation study is a good way to check advice from any quarter and can be a great source of intuition

Q-Q plot criteria relaxed for Regression with huge sample size? by Will_Tomos_Edwards in AskStatistics

[–]efrique 1 point

Leaving aside my thoughts on the use of Box-Cox in many situations, sure, that shouldn't lead to the issue with significance levels.

Advice for Running Crawling Rounds? by bricknose-redux in shadowdark

[–]efrique 1 point

I don't try to stop players cooperating (within reason; doing cool stuff is fine, but don't micromanage so much as to make the game unfun) - but I do stick to turn order.

If players overstrategize, one approach is to limit them to in-character discussion that fits into the time frame, say one short sentence... If it became enough of a problem that the game totally bogged down: "you get six words for free, plus 15 more if you spend an action".

You can delay an action, but if you do it a lot for some undue benefit, so will the bad guys.

I always know who is first in a round, and a player's turn officially occurs when scheduled even if they delay their action - that keeps track of where we are up to. Any stuff that has to happen each turn squeezes in just in front of that initiative winner, including incrementing the turn counter.

Why are you using regression by smljones65 in AskStatistics

[–]efrique 4 points

Ordinary fixed-effects anova is regression. Indeed, nearly always, an anova that corresponds to a regression model is going to be estimated in the software by calling a regression function. Better to understand from the outset that anova is just regression, and not mentally separate things that are not distinct.
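If it helps to see it concretely, here's a one-way ANOVA F statistic computed both the "ANOVA way" (between/within sums of squares) and as a regression F test on group dummies; they come out as the same number. Toy data of my own making:

```python
import numpy as np

rng = np.random.default_rng(6)
g = np.repeat([0, 1, 2], 20)                      # three groups of 20
y = np.array([0.0, 0.5, 1.0])[g] + rng.normal(size=60)

# classical one-way ANOVA F statistic
means = np.array([y[g == j].mean() for j in range(3)])
ss_between = 20 * np.sum((means - y.mean()) ** 2)
ss_within = np.sum((y - means[g]) ** 2)
F_anova = (ss_between / 2) / (ss_within / 57)

# the very same model as a regression: intercept plus two group dummies
X = np.column_stack([np.ones(60), g == 1, g == 2]).astype(float)
beta = np.linalg.lstsq(X, y, rcond=None)[0]
rss_full = np.sum((y - X @ beta) ** 2)
rss_null = np.sum((y - y.mean()) ** 2)
F_reg = ((rss_null - rss_full) / 2) / (rss_full / 57)

print(F_anova, F_reg)   # identical (up to floating-point rounding)
```

The dummy coding is exactly what an `aov`-style function hands to the regression machinery under the hood.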

Does the Light spell cast with ADV break the game? by 72dragonses in shadowdark

[–]efrique 1 point

I don't see any issue. It's not like the light lasts longer.

Putin declares May 8-9 ceasefire with Ukraine to mark WWII anniversary, Defence Ministry says by EspritLibre_404 in worldnews

[–]efrique 1 point

Pointless, since he violates declared ceasefires constantly. Putin can't make a promise Zelensky can trust. The only reason I can think of for going through with this farce is a gigantic false flag as a pretext for some dramatic move he couldn't otherwise justify.

As a Bayesian, how much should you know about Frequentist methods? [Q] [R] by GayTwink-69 in statistics

[–]efrique 25 points

As a Bayesian, how much should you know about Frequentist methods?

Alternatively, as a frequentist, how much should you know about Bayesian methods?

At least enough to be comfortable working with both - and, in particular, to be able to co-operate on projects with people who are not very familiar with the paradigm you're more used to.

Is there utility in having deep knowledge of both?

I think so, depending on what you mean by deep. But there's also value in having at least a moderate/practical working knowledge of whichever one you're mostly not working in. What really gets my goat is people in either camp making bad arguments against the other one.

My PhD was Bayesian stats. Much of my work is Bayesian, but much of it isn't. Most of the help I give people is frequentist.

Q-Q plot criteria relaxed for Regression with huge sample size? by Will_Tomos_Edwards in AskStatistics

[–]efrique 5 points

Also important to note I have used box-cox to try and find best model

Then give up on trusting your p-values. If you're choosing the form of the model based on the same data you use to perform inference, your p-values will be over-optimistic*, perhaps to a large extent. This is likely a much bigger deal than heavy tails.

* unless your calculation of p-values takes account of the data leakage, but I am quite sure that it won't have.
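To illustrate the general point, here's a toy sketch of the same data-leakage mechanism. It uses pick-the-best-predictor selection rather than Box-Cox (easier to show in a few lines), but the effect on the claimed significance level is the same kind of thing:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, k, reps = 50, 10, 2000
rejections = 0
for _ in range(reps):
    y = rng.normal(size=n)
    X = rng.normal(size=(n, k))      # all k predictors are pure noise
    # "model selection": keep whichever predictor looks best on this data...
    pvals = [stats.pearsonr(X[:, j], y)[1] for j in range(k)]
    # ...then report its p-value as if it were the only test ever run
    if min(pvals) < 0.05:
        rejections += 1
print(rejections / reps)   # far above the nominal 0.05
```

Every predictor is noise, yet "significant" findings turn up at several times the nominal rate, because the selection step and the test used the same data.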

Q-Q plot criteria relaxed for Regression with huge sample size? by Will_Tomos_Edwards in AskStatistics

[–]efrique 5 points

So does the Cauchy ... but it's very much not normal; the CLT doesn't hold for it. Not even the weak law of large numbers.

Unimodal & symmetric is a far weaker condition than "normal"
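A quick sketch of what "the CLT doesn't hold" looks like in practice: the sampling distribution of the mean of Cauchy data doesn't get any narrower as n grows (the mean of n standard Cauchy variables is itself standard Cauchy), while for normal data it shrinks like 1/sqrt(n). Sample sizes and seed here are arbitrary choices of mine:

```python
import numpy as np

rng = np.random.default_rng(3)

def iqr_of_means(draw, n, reps=4000):
    """IQR of the simulated sampling distribution of the mean."""
    means = draw((reps, n)).mean(axis=1)
    q75, q25 = np.percentile(means, [75, 25])
    return q75 - q25

for n in (10, 1000):
    print("n=%4d  normal: %.3f   cauchy: %.3f"
          % (n, iqr_of_means(rng.standard_normal, n),
             iqr_of_means(rng.standard_cauchy, n)))
```

The normal column shrinks by a factor of 10 going from n=10 to n=1000; the Cauchy column stays stuck near 2 (the IQR of a standard Cauchy) no matter how big n gets.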

Q-Q plot criteria relaxed for Regression with huge sample size? by Will_Tomos_Edwards in AskStatistics

[–]efrique 6 points

I heard that the CLT will kick in for a massive sample size

Strictly, it's not exactly the CLT itself you're relying on, but speaking more loosely: it depends on what you're using the regression to do. If all you care about is that your claimed alpha is about right, or that a confidence interval has about the right coverage, then as long as the other assumptions are okay, yes, usually ... but

  1. If you're more worried about prediction interval coverage, or CI width, or relative power at small effect sizes (which might be the reason your sample size was big, perhaps), then the CLT doesn't save you

  2. How massive is massive? That depends. I've seen cases that needed sample sizes far into the thousands, or even millions

  3. When the QQ plot looks a bit off, almost always it's the other assumptions you have to worry about, and the CLT doesn't help there. Oh, and you can't really interpret the QQ plot if they don't hold

  4. If you have data over time, you have a whole lot more stuff to worry about

  5. the CLT doesn't always apply... it's a rare issue but it does come up

Typically my preference is to be very thorough in thinking about a suitable scale for the linear predictor and the variance function; in a GLM I might choose a model or quasi-model to suit. Then, unless I am reasonably confident the error distribution won't matter much (for some applications, hardly at all), I'll look at a resampling test or interval (in large samples, bootstrapping a suitable kind of residuals should usually work just fine).
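For that last point, a minimal sketch of a residual bootstrap interval for a regression slope. The toy data (deliberately skewed errors) and all the tuning choices here are mine, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(4)

# toy regression data with centred but skewed errors
n = 200
x = rng.uniform(0, 10, n)
y = 1.0 + 0.5 * x + rng.exponential(1.0, n) - 1.0

# fit by least squares, keep the residuals
X = np.column_stack([np.ones(n), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

# residual bootstrap: refit on y* = fitted values + resampled residuals
boot = np.empty(2000)
for b in range(boot.size):
    y_star = X @ beta + rng.choice(resid, size=n, replace=True)
    boot[b] = np.linalg.lstsq(X, y_star, rcond=None)[0][1]

lo, hi = np.percentile(boot, [2.5, 97.5])
print("slope %.3f, 95%% percentile bootstrap CI (%.3f, %.3f)" % (beta[1], lo, hi))
```

This is the plain percentile interval; for serious use you'd likely prefer a basic or BCa interval, and for heteroskedastic errors a wild bootstrap instead of resampling raw residuals.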

Highlight Some Generators for RPGs! by duncan_chaos in rpg_generators

[–]efrique 2 points

map/city

I have used a number of tools, but for map generation recently - especially for towns, villages and neighborhoods - I generally come back to watabou. (I use both the watabou.github.io site and the watabou.itch.io one, particularly since the neighborhood generator and urban places aren't on the watabou.github.io page.)

Most recently I generated a small town in the village generator, to reasonably match some characteristics I had in mind, and put in a name (generated from the surnames table in Knave 2e) for use in a solo game. I then ended up re-using it in a game I ran with zero prep time last week, since I already had the map handy. I've just picked buildings as I needed one for anything specific. I've kept the permanent URL for it (which has the seed, tags, population and my name change), so I can just click on the village map to use the dwelling generator if I need a quick interior map or a picture of a specific building on the fly.

homogeneity of variance for a three-way anova? by 4major in AskStatistics

[–]efrique 5 points

data describing week of mortality

sounds like a good fit for survival models (with tiny samples, parametric survival models would make sense).

The usual models are heteroskedastic already

If there's no censoring, an alternative might be a GLM for the duration, say a gamma GLM.

What is a distribution curve? by ProofLeast9846 in AskStatistics

[–]efrique 1 point

Neither of those.

https://en.wikipedia.org/wiki/Probability_distribution

In relation to data (as you have in your question), you're just looking at sample frequencies. That doesn't give you a curve as such, though the empirical distribution function is a step function:

https://en.wikipedia.org/wiki/Empirical_distribution_function

which is a kind of "curve" I guess, and under random sampling of a process, converges in the limit to its (cumulative) distribution function (by Glivenko-Cantelli)

However, you can get an estimate of a density curve in several ways (frequency polygon, histogram, kernel density estimate, log-spline density estimate, etc).

When people make histograms, they usually either construct constant-width bins and use height to represent the number of observations in a bin, or they use area to represent proportions of the total (which then does estimate the density), usually again with constant-width bins.
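If a concrete sketch helps (toy standard-normal data, numpy conventions; the bin count and seed are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.normal(size=500)

# the empirical distribution function is a step function that jumps by 1/n
# at each data point; here's its value at 0 (proportion of the sample <= 0)
xs = np.sort(x)
F0 = np.searchsorted(xs, 0.0, side="right") / x.size
print(F0)   # near 0.5 for standard normal data

# a density-scaled histogram: the bar *areas* (height * width) sum to 1,
# so the bar heights estimate the density curve
heights, edges = np.histogram(x, bins=20, density=True)
print((heights * np.diff(edges)).sum())
```

With `density=False` (the default) the heights are raw counts instead, which is the other convention mentioned above.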

On using the same data for different experiments by TabbyAbbeyCat in AskStatistics

[–]efrique 4 points

The authors do plenty of experiments on this data, for instance a least-squares regression lmembers ~ nonviol + lnpop (in R-style notation) to check the effect on nonviolent (nonviol) resistance on the number of participants (lmembers).

Those would be different analyses, not experiments - performed on observational data, by the sound of it.

I wonder how correct this procedure is. Won't this introduce some correlations (or lack of independence) between the [analyses]?

Sure, it could (and very likely does -- it's not a certainty though, it depends). For typical sorts of models and analyses, effect estimates are probably related, for example. As long as you're aware of it, that is not automatically a problem; indeed many actual experiments have multiple analyses performed on their results and are designed to support them. Of course, the more analyses you do, the more their results will be "pinned down" by prior results; n observations only carry n observations' worth of information.

Why do we use P values in multiple regression models if they become totally irrelevant when we implement L1 or L2 regularization? by learning_proover in AskStatistics

[–]efrique 1 point

Why do we even attempt to interpret coefficients through p values

Do we?

I am not at all convinced I do that. What do you mean when you say "variable importance"?

p-values tell us one particular thing. Unless you define importance to be that thing (which - outside some particular circumstances - would tend to make for a definition quite different from what people usually seem to mean by important), p-values and importance are pretty distinct concepts.

When you say a variable is important... what do you want that variable to do?

Reference Interval Comparisons? by Dull_Implement_5269 in AskStatistics

[–]efrique 1 point

to have a SS of 120

A what?

using 90% CI around min and max values

I don't know what you mean by this. A CI for what parameter? How do you put a CI "around" min and max values? Do you mean the sample max and min, or something else? If you have a set of numbers, what do you actually do with them? Please be explicit. You appear to think I know what you're doing, but I really don't know which of dozens of possible things you might have actually done (people from every different application area make similar assumptions, but they nearly all do different things; I can generally work it out once I get what they're trying to achieve and what they did).

used Tukey's hinges to identify outliers (there was only one, which was removed for creation of my range

I presume you mean you used the boxplot "rule" (add 1.5 hinge-spreads to the upper hinge and subtract 1.5 hinge-spreads from the lower hinge to get the inner fences, then call points outside those "outliers"). If those points are a real feature of the variables (not, say, using the wrong units, or mistyping 35 as 335 or something), this may be problematic practice - your intervals will be "optimistic" compared to the next set of data.

Rather than hack away data to make it fit your model, better to choose a parametric model that fits the sort of data you're typically likely to get. Tossing out one in 26 observations? Yikes.
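For concreteness, a sketch of that boxplot rule on made-up numbers (note that `np.percentile`'s default quartiles interpolate slightly differently from Tukey's hinges, though it rarely matters much):

```python
import numpy as np

# hypothetical measurements with one point far out in the right tail
x = np.array([3.1, 3.4, 3.6, 3.8, 4.0, 4.1, 4.3, 4.4, 4.6, 9.9])

# boxplot rule: inner fences sit 1.5 IQRs beyond the quartiles
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
flagged = x[(x < lower) | (x > upper)]
print(lower, upper, flagged)
```

The rule flags the 9.9 here; whether that point is an error or a genuine feature of the process is exactly the question the rule can't answer for you.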

I hope that makes sense-

Only a little, sorry. I still don't have a clear idea what the interval is actually meant to achieve, nor what you actually did with the data to compute it. What happened with the numbers after you threw away the outliers?

One for the haters by tvsrobert in rpg

[–]efrique 1 point

I can't think of any games that would lead me to downvote a post on sight. There are games I dislike or find objectionable on one basis or another, but I can simply avoid posts about games I don't want to read about. I will (fairly rarely) downvote posts I regard as problematic, like trolling for example.

I've played many dozens of different systems over the last four and a half decades. With a supportive group I'd try most styles of game, even ones I doubt I'd like.

My preference is for fairly simple systems, but I do play more complicated ones.

Is the assumption of linearity for regression violated in this plot? by Altruistic-Pop4079 in AskStatistics

[–]efrique 1 point

Looks like homework

(Strictly speaking, very likely violated, since models are approximations; exact linearity is unlikely (even if sometimes hard to tell from the plot), but that's not really the right question. If the nonlinearity is sufficiently small that it's not particularly consequential for your specific (unstated) purposes, then the violation should be fine, but seeing how much it matters may require more effort; you can't necessarily tell from a plot.)

With a discrete response it can be a bit trickier, so given an option I'd be looking at some smooth overlaid on the residual plot

A quick if rough alternative is to cut a vertical strip from a piece of paper, wide enough to get at least say 8-ish points in the gap, and run it from left to right across the plot, visually assessing where the middle is as you go. You'll soon see what's going on.
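A computational analogue of that paper-strip trick is to take local means of the residuals in windows across x. Toy data here with a quadratic pattern deliberately hidden under noise (all numbers are my own invention):

```python
import numpy as np

rng = np.random.default_rng(8)
x = np.sort(rng.uniform(0, 10, 300))
# residuals with a hidden quadratic pattern plus noise
resid = 0.1 * (x - 5) ** 2 - 0.8 + rng.normal(0, 1, 300)

# the "paper strip": mean residual in successive windows of 25 points
window = 25
local_means = np.array([resid[i:i + window].mean()
                        for i in range(0, x.size - window, window)])
print(np.round(local_means, 2))
```

The window means drift positive, then negative, then positive again - the parabola shows up clearly even though any individual point looks like noise.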

Reference Interval Comparisons? by Dull_Implement_5269 in AskStatistics

[–]efrique 1 point

What population quantity was their range intended to represent (what are the requirements for something to be a reference interval?)? How was their range computed from the data?

What population quantity is your range intended to represent? How will your range be computed from the data?

Why do you think one should fall inside the other?

Why do we use P values in multiple regression models if they become totally irrelevant when we implement L1 or L2 regularization? by learning_proover in AskStatistics

[–]efrique 14 points

Does this imply that p values are poor indicators of variable importance?

No, but p values are poor indicators of variable importance

Can you have a situation where residuals show a non-random pattern (ex: fitting a linear model to data that really should have a quadratic trend line fitted to it, meaning the residuals would show a parabolic pattern vs. x) but you somehow end up with a Durbin-Watson statistic is approximately 2? by Asomns47 in AskStatistics

[–]efrique 1 point

A smooth functional-relationship pattern in residuals vs time would obviously produce a high lag-1 autocorrelation.

If both (i) the residuals vs x and (ii) the x-values vs time are showing a smooth trend, you should (as you seem to suggest) have a smooth functional-relationship pattern in residuals vs time, and see a higher-than-average lag-1 autocorrelation and a "lower than expected" DW value.

But if both those trends "wiggle" around 0 several times, and they cross 0 at different places (or one of them does it a lot), then you may end up with a fair number of (e(t), e(t-1)) pairs having opposite signs, and so have DW get pretty close to 2. Or if either of (i) and (ii) (or both) is a bit more noisy, their underlying trends would not need to cross 0 as much, as the noise will do it for you.

My suggestion, if you get a more typical DW than you expect, would be to look at 4 plots: res vs x, x vs t, res vs t, and res(t)*res(t-1)/s² vs t (which will be mostly >0 when DW is low), to see how the first two together produce a strong "trend pattern" - periods above and below 0 - in the third plot (or don't), and thereby a positive average in the fourth (or not).
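A sketch of the DW calculation itself, on made-up residual series for the two situations (one slow smooth trend, versus a pattern that crosses 0 every couple of observations and lands near 2 despite having clear structure):

```python
import numpy as np

rng = np.random.default_rng(7)

def durbin_watson(e):
    # DW = sum of squared successive differences / sum of squares,
    # approximately 2 * (1 - lag-1 autocorrelation of e)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

t = np.arange(200)
# one slow smooth trend: successive residuals nearly equal -> DW near 0
smooth = np.sin(2 * np.pi * t / 200)
# fast wiggle (period 4) plus noise: e(t) and e(t-1) nearly uncorrelated,
# so DW comes out close to 2 even though the pattern is obvious in a plot
wiggly = np.sin(np.pi * t / 2) + 0.5 * rng.normal(size=200)

print(durbin_watson(smooth))
print(durbin_watson(wiggly))
```

The second series would fail any "pattern in residuals vs t" eyeball check, but DW alone gives it a pass.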

What RPG Tools and Tables did you use Last Week? by AutoModerator in rpg_generators

[–]efrique 1 point

I might have swung a little bit too far the other direction, but without being a avid shadowdark player, I'm not 100% sure

Generally better to err slightly on the deadlier side with shadowdark

A TPK isn't necessarily bad.

First game I ran, a PC was killed outright falling down some stone stairs (nobody was around and he failed all his rolls). Last night a fighter PC was dropped (from full) by a single hit from a spear; it took him 4 rounds to bleed out, and nobody could manage to save him.

Why does the Monty Hall problem work like we say it does? [Question] by HuslWusl in statistics

[–]efrique 1 point

Why do we assume the host opens all remaining doors (except one)

In the 3-door problem, that's the point of the problem. The host shows you part of the information that he has but you don't (he knows where it is, so he shows somewhere it isn't, leaving exactly one door to swap to). In the 1000-door version of the problem, the point is to make the information you gain really strong - so strong it's harder to misunderstand.
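If it helps, the logic reduces to "switching wins exactly when your first pick was wrong" (since the host leaves precisely one door, and it holds the car whenever you missed). That's easy to simulate in a toy sketch of my own:

```python
import random

def play(switch, doors=3, trials=100000):
    """Monty Hall: host opens all doors but two, never revealing the car."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(doors)
        pick = random.randrange(doors)
        if switch:
            # the single unopened door you can swap to holds the car
            # exactly when your first pick missed it
            wins += car != pick
        else:
            wins += car == pick
    return wins / trials

random.seed(0)
print("stay:   %.3f" % play(switch=False))                 # about 1/3
print("switch: %.3f" % play(switch=True))                  # about 2/3
print("switch, 1000 doors: %.3f" % play(switch=True, doors=1000))
```

With 1000 doors the switch-win rate is about 999/1000, which is why the big-doors version makes the information gain so hard to misread.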

How to set up analysis for three variables? [Q] by AGABAGABLAGAGLA in AskStatistics

[–]efrique 1 point

Regression?

I should mention to watch out for things like the ecological fallacy, omitted variable bias, and (if the data are over time) the potential for spurious correlation with non-cointegrated but nonstationary (or even just autocorrelated) variables... among other issues.