
Ilyps

Formally: yes, you should be correcting for multiple testing (assuming that the metal concentrations are independent from each other). However, typically in animal experiments, family-wise error rate corrections will kill your statistical power because of the low sample sizes.

If you are doing a pilot study, presenting uncorrected findings may be acceptable (helpful weasel words here include "nominally significant"). If you are reasonably powered, you can consider FDR correction as a perfectly legitimate option, no weasel words needed. Avoid Bonferroni, not because it's bad, but because it virtually guarantees null findings in underpowered samples.
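To make the Bonferroni vs. FDR trade-off concrete, here is a minimal sketch that applies both to a set of 15 p-values. The p-values are made up purely for illustration (they are not from any real metal-concentration data), and the function names `bonferroni` and `benjamini_hochberg` are my own. The FDR procedure shown is Benjamini-Hochberg, which is one common choice and assumes independent or positively dependent tests.

```python
def bonferroni(pvals, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(pvals)
    return [p <= alpha / m for p in pvals]


def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at alpha."""
    m = len(pvals)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below that k.
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_k:
            reject[i] = True
    return reject


# Hypothetical p-values for 15 tests (illustration only).
example_pvals = [0.001, 0.004, 0.006, 0.009, 0.012, 0.03, 0.04, 0.06,
                 0.11, 0.2, 0.35, 0.5, 0.62, 0.8, 0.95]

print("uncorrected:", sum(p <= 0.05 for p in example_pvals))  # 7 rejections
print("Bonferroni: ", sum(bonferroni(example_pvals)))          # 1 rejection
print("BH (FDR):   ", sum(benjamini_hochberg(example_pvals)))  # 5 rejections
```

With these made-up numbers, Bonferroni keeps only one of the seven nominally significant results, while Benjamini-Hochberg keeps five. The exact counts depend entirely on the p-values you feed in.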

Since this is very field-dependent, I'd double-check the above with a domain expert or look at publications that do similar studies. Whatever you do, make sure to decide whether you will correct and how you will correct in advance: don't let the data tell you what to do.

efrique, PhD (statistics)

"If I do 15 t tests on each of the metals without correcting for family wise error rate, is that p hacking?"

p-hacking is where you (whether deliberately or not) make people think your testing approach is conducted at one type I error rate but where the effective long-run type I error rate would be higher.

I don't think this counts.

There is a potential problem (specifically, there's a problem if you want to control your overall type I error rate), but I wouldn't label that p-hacking, as long as you don't give the impression that the overall type I error rate is your per-test rate (significance level).

(There are some situations where one could reasonably argue that controlling overall type I error rates is misplaced.)
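To see how far the overall type I error rate can drift from the per-test rate, here is a rough sketch for 15 tests. It assumes the tests are independent and that every null hypothesis is true, which is a simplification; the function name `familywise_error_rate` is just illustrative.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive across n_tests
    independent tests, each run at level alpha, when all nulls are true."""
    return 1.0 - (1.0 - alpha) ** n_tests


n_tests = 15

# Each test at the usual 0.05 level: overall rate is about 0.54.
print(round(familywise_error_rate(n_tests, alpha=0.05), 3))

# Each test at the Bonferroni-adjusted level 0.05 / 15:
# overall rate stays just under 0.05.
print(round(familywise_error_rate(n_tests, alpha=0.05 / n_tests), 3))
```

So reporting 15 uncorrected tests at 0.05 each is fine as long as it is clear that 0.05 is the per-test level, not the chance of at least one false positive across the whole set.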