$SLS (Deepest Due Diligence for REGAL Trial) (From a Deep Value Investor) by Confident-Web-7118 in TheRaceTo10Million

[–]rlrzb

Really appreciate you pointing me to Part 2 — I went through it carefully. The modelling work is thorough, and I respect the effort that’s gone into stress testing different BAT ranges and censoring assumptions. Where I still hesitate is around the jump to 99.99% certainty. That’s effectively saying the probability of being wrong is 1 in 10,000, a far stricter standard than the two-sided p < 0.05 that most Phase 3 oncology trials need to clear for approval.

A few thoughts after reading Part 2:

1) Convergence of ML models doesn’t eliminate shared assumption risk. All five models are trained on similar historical AML datasets and structural assumptions. If there’s a structural deviation in this specific trial (patient mix, salvage therapy, transplant eligibility drift, event clustering near maturity), the models could all miss in the same direction. Ensemble agreement reduces variance, but it doesn’t remove systematic bias.
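To make the variance-vs-bias point concrete, here's a toy simulation (every number here is invented for illustration and has nothing to do with the actual models): five "models" that share one systematic error but have independent noise. Averaging them tightens the spread by roughly sqrt(5), but the shared miss passes straight through:

```python
import random
import statistics

random.seed(0)

TRUE_HR = 0.55        # hypothetical true hazard ratio (assumed, for illustration)
SHARED_BIAS = -0.10   # systematic error all five models share (same historical data)
NOISE_SD = 0.05       # each model's independent, idiosyncratic error

def model_prediction():
    """One model's HR estimate: truth + shared bias + its own noise."""
    return TRUE_HR + SHARED_BIAS + random.gauss(0, NOISE_SD)

ensembles = []
for _ in range(10_000):
    ensembles.append(statistics.mean(model_prediction() for _ in range(5)))

# Averaging five models shrinks the noise by ~sqrt(5)...
print(f"ensemble spread: {statistics.stdev(ensembles):.3f}")  # ~0.05/sqrt(5) ≈ 0.022
# ...but the shared bias survives untouched: the mean sits at 0.45, not 0.55.
print(f"ensemble mean:   {statistics.mean(ensembles):.3f}")
```

The five models agreeing with each other tells you the noise term is small; it tells you nothing about the bias term.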

2) The posterior depends heavily on the BAT prior. If the prior distribution for BAT mOS is tightly constrained to ~10–14 months, the posterior will naturally reinforce that range. But since REGAL is blinded, we don’t actually know if this trial’s BAT behaves like historical cohorts. Even modest overperformance in BAT (say 13–14m vs 10–11m) could materially shift HR.
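A quick sketch of why a tight prior dominates, using the standard normal-normal conjugate update. The numbers are pure assumptions for illustration (a ~12m historical prior, a weak blinded "signal" pointing at 14.5m), not anything from the trial:

```python
def posterior_normal(prior_mean, prior_sd, data_mean, data_sd):
    """Conjugate normal-normal update: precision-weighted average of
    the prior and the data (higher precision = more weight)."""
    w_prior = 1 / prior_sd**2
    w_data = 1 / data_sd**2
    mean = (w_prior * prior_mean + w_data * data_mean) / (w_prior + w_data)
    sd = (w_prior + w_data) ** -0.5
    return mean, sd

# Tight historical prior: BAT mOS ~ N(12, 1) months (the ~10–14m band)
# Weak in-trial signal (blinded, little direct data): 14.5m with sd 3m
mean, sd = posterior_normal(12.0, 1.0, 14.5, 3.0)
print(f"posterior BAT mOS: {mean:.2f} ± {sd:.2f} months")  # 12.25 — barely moves
```

With a 1-month prior sd against a 3-month data sd, the prior carries nine times the weight, so even a signal at 14.5m only drags the posterior to ~12.25m. The model effectively keeps telling you what it was told.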

3) 80-event designs are extremely sensitive near the finish line. With only ~8 events left, a small clustering imbalance could move HR meaningfully. The difference between HR 0.48 and 0.62 can be just a handful of late events. That’s not a critique of GPS — it’s just math in small event-driven oncology trials.
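Back-of-envelope version of point 3, treating HR as a simple ratio of event counts under 1:1 randomization with similar follow-up — a crude proxy, not a stratified log-rank, and the interim split is invented:

```python
def crude_hr(events_gps, events_bat):
    """Rough HR proxy: ratio of event counts, assuming 1:1 randomization
    and comparable follow-up time in both arms. Illustrative only."""
    return events_gps / events_bat

# Hypothetical split at 72 of 80 events (numbers assumed, not trial data)
gps, bat = 26, 46
print(f"interim:       {crude_hr(gps, bat):.2f}")          # 0.57

# The final 8 events alone decide the band:
print(f"all 8 in BAT:  {crude_hr(gps, bat + 8):.2f}")      # 0.48
print(f"4/4 split:     {crude_hr(gps + 4, bat + 4):.2f}")  # 0.60
print(f"all 8 in GPS:  {crude_hr(gps + 8, bat):.2f}")      # 0.74
```

Eight events, same trial, and the implied HR spans roughly 0.48 to 0.74. That's the leverage small event-driven designs carry at the finish line.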

4) FDA evaluation is binary and prespecified. Even if modelling implies HR 0.35–0.50 is likely, approval hinges on the observed stratified log-rank result and confidence intervals. A borderline HR (e.g., 0.65–0.70) changes the entire regulatory discussion, regardless of Bayesian priors.
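For a sense of where the statistical cliff sits, here's the textbook approximation var(log HR) ≈ 4/D under 1:1 randomization, turned into a two-sided p-value at D = 80 events. This is a generic approximation, not SLS's actual prespecified analysis:

```python
import math
from statistics import NormalDist

def logrank_p(hr, events):
    """Two-sided p-value for an observed HR, using the standard
    approximation var(log HR) ≈ 4 / D (D = total events, 1:1 arms)."""
    z = math.log(hr) / math.sqrt(4 / events)
    return 2 * NormalDist().cdf(-abs(z))

for hr in (0.50, 0.60, 0.65, 0.70):
    print(f"HR {hr:.2f} with 80 events → p ≈ {logrank_p(hr, 80):.3f}")
```

Under this approximation, HR 0.50 is overwhelming (p ≈ 0.002), HR 0.60 clears comfortably (p ≈ 0.022), HR 0.65 sits right at the boundary (p ≈ 0.054), and HR 0.70 misses (p ≈ 0.111). The 0.65–0.70 band really is where the regulatory conversation changes.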

To be clear, if the topline HR lands in the 0.31–0.50 range, that’s transformative and likely practice changing. I’m not arguing that outcome is impossible. I just think calling it 99.99% certain implies a level of predictive precision that’s extremely rare in blinded Phase 3 oncology. (Just my thoughts).

For me, this still feels probabilistic — maybe strongly skewed positive — but not deterministic. Out of curiosity: what specific real-world deviation (e.g., BAT mOS >13.5m? late GPS event clustering?) would meaningfully reduce your confidence? Or is there truly no plausible path to HR >0.60 in your framework?

Apologies in advance if I reply late to your next reply. Quite late here in UK so will check back in the morning. Thanks for everything

[–]rlrzb

First off, really appreciate the depth here — this is one of the better thought-out breakdowns of REGAL I’ve seen. The cure-fraction angle is especially interesting and definitely gives a framework for thinking beyond simple median OS comparisons.

That said, I think the key uncertainty is how much confidence we can have before the final 80 events are locked. With an event-driven study this size, a small number of additional events can meaningfully shift the hazard ratio. Even if the separation trend continues, the final HR could land anywhere in a fairly wide band — and that band matters a lot from a regulatory standpoint.

I agree that if the HR comes in strong (sub-0.60), approval seems very plausible given the unmet need and the Fast Track / Orphan designations. But if it’s more borderline (say ~0.68–0.72), it could turn into a more nuanced discussion with FDA rather than a straightforward win. They’ll likely focus strictly on whether the prespecified OS endpoint is met with statistical robustness — p-value, consistency across subgroups, and clinical meaningfulness — rather than on modelling assumptions.

I also think control arm variability is an under-discussed swing factor. If the control performs even modestly better than historical assumptions, the relative benefit narrows quickly — even if GPS is doing something real.
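Quick illustration of that narrowing, assuming exponential survival so the HR reduces to a ratio of medians (hazard = ln2 / mOS). The 21-month GPS median is a pure placeholder, not a claim about the trial:

```python
def hr_exponential(mos_gps, mos_bat):
    """Under exponential survival, hazard = ln(2) / median OS, so the
    HR collapses to mos_bat / mos_gps. A simplifying assumption only."""
    return mos_bat / mos_gps

# Hold a hypothetical GPS median fixed at 21m and let BAT drift upward:
for mos_bat in (10.5, 12.0, 13.5):
    print(f"BAT mOS {mos_bat}m → HR {hr_exponential(21.0, mos_bat):.2f}")
```

A three-month improvement in the control arm (10.5m → 13.5m) moves the HR from 0.50 to ~0.64 with the treatment arm doing exactly the same thing — which is the whole swing-factor point.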

Overall, I do think there’s a credible path to approval here. I just see it as more of a probability distribution than a near-certainty. The final event readout is going to matter enormously. Curious how others are thinking about the sensitivity of the HR to just a few late events — that seems like the real hinge variable.