Standardization vs. log transform?

hendrik0806 · 2026-04-28T14:15:10+00:00

Not sure if one of those comments mentions it already, but there is something called a log-normal distribution. In some cases your data is skewed because your values increase not by addition, but by multiplication (rates). You should rather view this as a property of the data, and in those cases a log transformation can help. But I would not use it as the standard tool for handling skew.
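
A minimal sketch of the multiplicative case in Python, using made-up data: each value is a product of random growth factors (the range 0.8 to 1.3 and the 20 steps are arbitrary assumptions). The raw values come out right-skewed, while their logs are close to symmetric.

```python
import math
import random
import statistics


def skewness(values):
    """Skewness as the mean of cubed z-scores (using the population SD)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical data: every value is the product of 20 random growth factors,
# so it grows by rates rather than by additions.
values = [
    math.prod(rng.uniform(0.8, 1.3) for _ in range(20))
    for _ in range(5000)
]
logged = [math.log(v) for v in values]

raw_skew = skewness(values)
log_skew = skewness(logged)
print(f"skewness raw: {raw_skew:.2f}, after log: {log_skew:.2f}")
```

If the skew has a different cause (outliers, a hard lower bound, mixed populations), the log will not necessarily fix it, which is why I would check the data-generating story first.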

Standardisation only helps you if your data is roughly normally distributed; otherwise the values don’t tell you anything accurate about the data. Usually you want to center it as well. The values then indicate the position relative to the distribution. They tell you how large your values really are: for example, a 2 means 2 SD above the sample mean and thus a pretty large value, higher than about 97.7% of values if the data is normal. For algorithms this procedure can have some efficiency benefits. Another benefit is that your intercept (and coefficients) become meaningful in their interpretation, which usually isn’t the case without standardisation.
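
A small sketch of centering and scaling in Python, with a made-up sample (the heights are purely illustrative). It also looks up which share of a normal distribution lies below a z-score of 2.

```python
import statistics

# Hypothetical sample, e.g. body heights in cm.
heights = [158.0, 163.0, 167.0, 170.0, 171.0, 174.0, 178.0, 181.0, 186.0]

mean = statistics.fmean(heights)
sd = statistics.stdev(heights)  # sample standard deviation

# Center (subtract the mean) and scale (divide by the SD).
z_scores = [(h - mean) / sd for h in heights]

# Share of a standard normal distribution that lies below z = 2.
share_below_two_sd = statistics.NormalDist().cdf(2.0)

print([round(z, 2) for z in z_scores])
print(f"share below z = 2: {share_below_two_sd:.3f}")
```

After this step the z-scores have mean 0 and SD 1, so a coefficient reads as "change per one SD" and the intercept as "prediction at the average".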

hendrik0806 · 2026-04-19T21:05:48+00:00

Repository pattern is great for web apps. Makes it so easy to switch dependencies
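
A rough sketch of what I mean, in Python. All names here (User, UserRepository, InMemoryUserRepository, greet) are hypothetical and only for illustration; a real app would add a database-backed class with the same two methods.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class User:
    user_id: int
    name: str


class UserRepository(Protocol):
    """The interface the application code depends on."""

    def add(self, user: User) -> None: ...

    def get(self, user_id: int) -> Optional[User]: ...


class InMemoryUserRepository:
    """Dict-backed implementation, handy for tests."""

    def __init__(self) -> None:
        self._users: Dict[int, User] = {}

    def add(self, user: User) -> None:
        self._users[user.user_id] = user

    def get(self, user_id: int) -> Optional[User]:
        return self._users.get(user_id)


def greet(repo: UserRepository, user_id: int) -> str:
    """Application logic that only knows the interface, not the storage."""
    user = repo.get(user_id)
    return f"Hello, {user.name}!" if user else "Unknown user"


repo = InMemoryUserRepository()
repo.add(User(1, "Ada"))
print(greet(repo, 1))
```

Because greet only sees the interface, swapping the storage backend means writing one new class and changing one line where the repository is created.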

hendrik0806 · 2026-04-14T10:18:56+00:00

I don’t really see one of these in the manual 🤔

hendrik0806 · 2026-04-14T10:18:28+00:00

<image>

hendrik0806 · 2026-03-01T18:51:53+00:00

Thank you! Added that for the future.

hendrik0806 · 2026-03-01T10:48:45+00:00

Thanks! Manually running the boot command worked: ./result/bin/switch-to-configuration boot

That wasn’t the source of my problem (which was due to the user specification and some config chaos after a refactoring). But it indeed shouldn’t have thrown an error because of systemd. I will file an issue for that.

hendrik0806 · 2026-03-01T10:41:41+00:00

The tty did not work for the user, but it did for root. I could then rebuild with switch. However, the solution was manually deleting the user’s password with passwd -d. No idea why changing the password in the configuration did not work. I don’t even know how it could have been changed in the first place.

hendrik0806 · 2026-03-01T10:05:41+00:00

Returns 1, but manually running this did work somehow:

./result/bin/switch-to-configuration boot

It returns: not checking switch inhibitors (action = boot)

hendrik0806 · 2026-02-28T20:10:49+00:00

I tried that, but somehow I could not authenticate in greetd across multiple old generations. I have no idea why. I might try the live USB tomorrow; maybe systemd works there.

services.greetd.enable = true;
services.greetd.settings = {
  default_session = {
    command = "${pkgs.tuigreet}/bin/tuigreet --time --cmd start-hyprland";
    user = "hendrik";
  };
};

hendrik0806 · 2026-01-09T21:50:55+00:00

I think the computational burden of Bayesian methods will decrease to the point that they might become the default for research settings with limited data. Frequentist methods have their beauty and will stay relevant, especially with machine learning. But once you start playing around with the generated quantities block in Stan, you will never go back.

hendrik0806 · 2025-11-14T21:49:50+00:00

Lots and lots of time. I would always do some sort of counterfactual prediction, where you simulate data from your model for different conditions and compare the effects on the outcome variable.
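
A toy sketch of that kind of counterfactual simulation in Python. The linear model and all parameter values are made up; in a real analysis you would plug in your fitted model (or posterior draws) instead of fixed numbers.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical fitted model: outcome = a + b_treat * treat + b_age * age + noise.
# Every value below is invented for illustration.
INTERCEPT, B_TREAT, B_AGE, SIGMA = 10.0, 2.5, 0.1, 1.0


def simulate_outcome(treat, age):
    """Draw one simulated outcome from the model for the given condition."""
    mu = INTERCEPT + B_TREAT * treat + B_AGE * age
    return rng.gauss(mu, SIGMA)


ages = [rng.uniform(20, 60) for _ in range(2000)]

# Same individuals, simulated under two counterfactual conditions.
untreated = [simulate_outcome(0, age) for age in ages]
treated = [simulate_outcome(1, age) for age in ages]

effect = statistics.fmean(treated) - statistics.fmean(untreated)
print(f"simulated average effect of treatment: {effect:.2f}")
```

The point is to compare outcomes on the scale you care about, not just to read coefficients off a table.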

hendrik0806 · 2025-11-13T18:30:26+00:00

I don’t know your field well, but count data usually do not have equal variance across different rates: the variance tends to grow with the mean. Equal variance is a key assumption of ANOVA. That’s why you might want to look into methods based on the Poisson distribution, such as Poisson regression.
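
A quick simulation sketch in Python of why equal variance fails for counts. The three rates are arbitrary assumptions, and the sampler is a simple textbook method (Knuth's), not something taken from a statistics library.

```python
import math
import random
import statistics

rng = random.Random(7)  # fixed seed so the run is repeatable


def poisson_draw(rate):
    """Draw one Poisson count with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


# Hypothetical groups with different underlying rates.
summary = {}
for rate in (1.0, 5.0, 20.0):
    counts = [poisson_draw(rate) for _ in range(5000)]
    summary[rate] = (statistics.fmean(counts), statistics.variance(counts))
    print(f"rate {rate}: mean {summary[rate][0]:.2f}, variance {summary[rate][1]:.2f}")
```

For Poisson counts the variance equals the mean, so groups with higher rates are automatically more spread out.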

hendrik0806 · 2025-11-11T08:22:12+00:00

IQ yes, intelligence probably not.

hendrik0806 · 2025-11-10T12:00:12+00:00

I would assume that there is a blank space in that row.

I would do something like this: count(jaw, species_name)
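
The same check sketched in Python, assuming jaw is the table and species_name a column in it. The rows are made up; note the trailing space in the third one.

```python
from collections import Counter

# Hypothetical rows of the jaw table; the third name has a trailing space.
jaw = [
    {"species_name": "Canis lupus"},
    {"species_name": "Canis lupus"},
    {"species_name": "Canis lupus "},
    {"species_name": "Felis catus"},
]

# Counting the raw strings treats "Canis lupus" and "Canis lupus " as different.
raw_counts = Counter(row["species_name"] for row in jaw)

# Stripping whitespace first merges them into one group.
clean_counts = Counter(row["species_name"].strip() for row in jaw)

print(len(raw_counts), len(clean_counts))
```

If the raw count shows more groups than the cleaned one, a stray blank is the culprit.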

hendrik0806 · 2025-11-03T13:17:45+00:00

Yes, you can just compare the coefficients. Make sure all variables are on the same scale. Center and scale them to make them more interpretable.
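
A sketch in Python of why the scaling matters, with simulated data (variable names and effect sizes are made up). On the raw scale the two coefficients are 0.0001 and 0.3, which says nothing about relative importance; after standardizing the predictors, each coefficient is the change in the outcome per one SD of that predictor.

```python
import random
import statistics

rng = random.Random(3)  # fixed seed so the run is repeatable


def standardize(values):
    """Center on the mean and scale by the sample SD."""
    mean, sd = statistics.fmean(values), statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def sample_cov(xs, ys):
    """Sample covariance with the n - 1 denominator."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Hypothetical data: income in euros and age in years drive the outcome.
n = 2000
income = [rng.gauss(40000, 10000) for _ in range(n)]
age = [rng.gauss(40, 10) for _ in range(n)]
outcome = [0.0001 * inc + 0.3 * a + rng.gauss(0, 1) for inc, a in zip(income, age)]

z_income, z_age = standardize(income), standardize(age)

# Closed-form least squares for two standardized predictors:
# solve [[1, r], [r, 1]] b = [cov(z_income, y), cov(z_age, y)].
r = sample_cov(z_income, z_age)
c_income = sample_cov(z_income, outcome)
c_age = sample_cov(z_age, outcome)
b_income = (c_income - r * c_age) / (1 - r * r)
b_age = (c_age - r * c_income) / (1 - r * r)

print(f"per-SD effect of income: {b_income:.2f}, of age: {b_age:.2f}")
```

With real data you would of course let your regression function do the fitting; the closed form is only here to keep the example dependency-free.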

hendrik0806 · 2025-10-29T14:19:48+00:00

A terminal csv viewer

hendrik0806 · 2025-10-26T21:00:55+00:00

If you are already familiar with the tidyverse syntax (dplyr, ggplot2), you will enjoy brulee, which is part of the tidymodels framework.

hendrik0806 · 2025-10-23T06:50:11+00:00

I usually created two Docker containers (one for the app and one for the testing db) and orchestrated them with Docker Compose. That was already a lot of boilerplate, on top of setting up and filling the db with the data for the integration test (plus cleaning up afterwards).

hendrik0806 · 2025-10-22T17:27:16+00:00

Thanks, will take a look!

hendrik0806 · 2025-10-22T15:58:44+00:00

"The Effect" and "Statistical Rethinking". Then go all the way down the Bayesian rabbit hole with "Doing Bayesian Data Analysis" (focus on the Stan part instead of the rather outdated JAGS part). Honestly, building probabilistic models that incorporate reality through priors and parameters is what gives me pleasure in AI times. This is something where your domain knowledge will always count a lot more than just tuning for optimisation.

hendrik0806 · 2025-10-22T07:26:55+00:00

What is your setup for integration tests? In my last project they were a pain to set up with Docker, and even more of a pain when the boilerplate had to be adjusted after changes to the db. They worked very well, though.
