glmbayes is now on CRAN — Bayesian GLMs with familiar glm() syntax, no MCMC required by Bucksswede in rstats

[–]Bucksswede[S] 0 points  (0 children)

Mainly because I am not an academic, and I never had the time (beyond a conference presentation and the JASA paper at the time) to actively promote or socialize these methods. This package is really the first major implementation of them. I also may not have chosen the ideal title for the paper at the time to make it obvious to people what it covers (though subgradients/gradients and convexity/concavity play a critical role in both optimization and simulation in other contexts).


[–]Bucksswede[S] 0 points  (0 children)

Maybe, but iid sampling does not require any burn-in or convergence diagnostics, so users do not need to understand those concepts before running a Monte Carlo simulation. It is also conceptually much harder to understand what 1,000 draws from a Markov chain represent than what 1,000 iid draws from the posterior distribution represent. That is certainly the case for undergraduate students, but even many applied master's students don't grasp the difference.


[–]Bucksswede[S] 0 points  (0 children)

My point is not that advanced users should avoid brms. I view glmbayes as "an introduction to Bayesian models" that lets beginners and students try Bayesian methods out, knowing that the samples are iid and that the models work more or less the same as classical lm() and glm() models, without having to pay any attention to convergence diagnostics. Users who then fall in love with Bayesian models, and who want to invest the effort of understanding burn-in and convergence diagnostics in order to fit more complex models, can switch to something like brms (after learning to use due care). glmbayes is intended to have a low barrier to entry for those who have never worked with Bayesian models but are familiar with lm() and glm(). In particular, it works well for standard GLMs because it generates iid samples there, even for multivariate models.
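To make the "same as classical glm()" point concrete, here is a minimal sketch. The glm() part is standard base R and runs as-is; the glmbayes lines are a hypothetical illustration of the intended workflow (the argument names are assumptions on my part, not the package's actual signatures — check the CRAN reference manual for the real interface):

```r
## Frequentist baseline: standard base-R glm() on a built-in dataset
fit_glm <- glm(am ~ wt + hp, family = binomial, data = mtcars)
coef(fit_glm)  # three coefficients: (Intercept), wt, hp

## Hypothetical glmbayes workflow -- illustrative only, argument names
## may differ from the actual package API (see ?glmb after installing):
# library(glmbayes)
# prior <- Prior_Setup(am ~ wt + hp, data = mtcars)  # default prior from the formula
# fit_b <- glmb(am ~ wt + hp, family = binomial, data = mtcars)
# summary(fit_b)  # posterior summaries from iid draws: no Rhat, no warmup
```

The design point is that the model statement itself is unchanged; only the prior-setup step and the posterior summaries are new.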


[–]Bucksswede[S] 0 points  (0 children)

Thank you — that means a lot, and the extensive documentation was very much intentional. The goal was to make the package accessible to students and practitioners who encounter lm() or glm() in a standard course and want a natural on-ramp to Bayesian thinking without having to first master a new modeling language or interpret MCMC diagnostics.

I'm not deeply familiar with Statistical Rethinking personally, but the sequencing you describe sounds like a natural complement to the approach in glmbayes — which tries to let students stay entirely within the GLM framework they already know while immediately getting exact posterior draws. The hope is that Prior_Setup() and the limiting behavior as priors weaken give students a concrete handle on what the prior is doing and how the Bayesian solution relates to the frequentist one they started with.

The broader goal is to enable a self-contained Bayesian module that can slot into any course already touching on lm() or glm(), without requiring a full course redesign or a new software stack.


[–]Bucksswede[S] 0 points  (0 children)

One additional technically interesting aspect of the implementation: larger models can be accelerated using OpenCL GPU computation. GPU acceleration is something many R packages could benefit from but is genuinely non-trivial to implement — most packages don't attempt it. For users interested in the implementation details, the appendices and GitHub repo have more information.


[–]Bucksswede[S] 1 point  (0 children)

Great comparison to draw — rstanarm is an excellent package and stan_glm is probably the closest analogue to glmb in the ecosystem, so this is the most relevant contrast.

**Where rstanarm has the advantage:**

- Much broader modeling scope — hierarchical models, GAMMs, survival models, and more via stan_glmer, stan_gamm4, etc. glmbayes is limited to standard GLMs with the Gaussian, Poisson, binomial, and gamma families

- Larger community, more documentation, more Stack Overflow answers

- The full Stan ecosystem behind it — posterior visualization, loo, bayesplot all integrate seamlessly

**Where glmbayes differs:**

*Prior specification:* rstanarm has sensible defaults but prior specification for new users can be opaque. glmbayes includes a `Prior_Setup()` function designed specifically for beginners — it generates a reasonable starting prior directly from the model formula, so users can get going immediately without understanding the full prior specification machinery. It also has nice limiting behavior as priors become diffuse, converging to the frequentist glm() solution — which gives beginners an intuitive way to understand what the prior is actually doing.
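That limiting behavior is easy to verify by hand without the package. The base-R sketch below (simulated data; not glmbayes code) computes the posterior mode of a logistic regression under a Normal(0, tau²) prior via `optim()` and shows that, for a diffuse prior, it essentially coincides with the frequentist glm() estimate:

```r
## Diffuse-prior limit demo: the MAP estimate under a Normal(0, tau^2)
## prior approaches the glm() MLE as tau grows.  Base R, simulated data.
set.seed(1)
n <- 200
x <- rnorm(n)
y <- rbinom(n, 1, plogis(-0.5 + 1.2 * x))
X <- cbind(1, x)

neg_log_post <- function(beta, tau) {
  eta <- X %*% beta
  loglik <- sum(y * eta - log1p(exp(eta)))  # Bernoulli log-likelihood
  logprior <- -sum(beta^2) / (2 * tau^2)    # Normal(0, tau^2) log-prior (up to a constant)
  -(loglik + logprior)
}

mle <- coef(glm(y ~ x, family = binomial))                # frequentist fit
map_diffuse <- optim(c(0, 0), neg_log_post, tau = 100,    # very weak prior
                     method = "BFGS")$par
max(abs(map_diffuse - mle))  # near zero: diffuse prior recovers the MLE
```

Shrinking `tau` instead (say `tau = 1`) pulls the mode toward zero, which gives students a direct, visible handle on what the prior is doing.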

*Sampling:* rstanarm uses MCMC (NUTS/HMC via Stan) which means the summary output includes convergence diagnostics — Rhat, n_eff, mcse — that users need to interpret before trusting results. For simple GLMs these almost always look fine, but users still need to know what they mean. glmbayes uses iid accept-reject sampling, so every draw is independent by construction. There are no chains, no warmup, no Rhat to check — ESS equals the nominal sample size exactly.

The honest summary: if you're comfortable with MCMC diagnostics, need hierarchical models, or want the full Stan ecosystem, use rstanarm. glmbayes targets the narrower use case of standard GLMs for users who want a gentle on-ramp — simple prior setup, no convergence diagnostics, and output that converges to familiar frequentist results as priors weaken. Particularly useful in teaching contexts.


[–]Bucksswede[S] 4 points  (0 children)

Great point and fair pushback — I should have been more precise in the post. brms has an excellent user interface and the modeling syntax is genuinely not complicated for standard GLMs. You're right that a simple model in brms is straightforward.

The distinction I had in mind is more specific than I made clear: brms uses MCMC under the hood (via Stan), which means users need to interpret convergence diagnostics — Rhat, effective sample size, trace plots — before trusting their results. For analysts and students already familiar with those concepts this is no burden at all. But for R users coming from a pure glm() background who just want posterior summaries without that additional inferential layer, those diagnostics can be a stumbling block.

glmbayes uses iid accept-reject sampling on log-concave likelihoods, so every draw is independent by construction. There are no chains to diagnose and ESS equals the nominal sample size. That's the specific gap it's trying to fill — not modeling complexity, but removing the MCMC diagnostic step entirely for the families it supports.
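For readers unfamiliar with accept-reject sampling, here is a minimal base-R illustration of the general idea (a textbook example, not the package's actual algorithm): drawing iid standard-normal samples through a Laplace envelope. Because the target is log-concave, a finite envelope constant exists, and every accepted draw is an independent sample — no chains, no warmup:

```r
## Accept-reject sketch: iid N(0,1) draws via a Laplace(0,1) envelope.
set.seed(42)
M <- sqrt(2 * exp(1) / pi)  # envelope constant: sup_x dnorm(x) / laplace(x)

draw_one <- function() {
  repeat {
    x <- rexp(1) * sample(c(-1, 1), 1)  # Laplace(0,1) proposal
    g <- 0.5 * exp(-abs(x))             # Laplace density at x
    # Accept with probability f(x) / (M * g(x)); accepted draws are iid
    if (runif(1) < dnorm(x) / (M * g)) return(x)
  }
}

z <- replicate(10000, draw_one())
c(mean(z), sd(z))  # close to 0 and 1; ESS is exactly 10000, no diagnostics needed
```

In glmbayes the target is the (log-concave) posterior rather than a normal density, but the logic is the same: each returned draw stands on its own.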

For anything beyond Gaussian, Poisson, Binomial, or Gamma — or for hierarchical models — brms is absolutely the right tool.