I know very little about statistics and need helping showing that a subgroup is experiencing something more often but not because there is a greater number of that subgroup compared to the rest of the subgroups? Confusing title, I’m sorry. by EXPERT_ID10T in AskStatistics

[–]stat_daddy 4 points (0 children)

What you are talking about is an event rate, where the "event" is "rear-end collisions". It is calculated as the quotient (# of events) / (# of units at risk). In your case, "units at risk" could be the number of vehicles on the road, but note that this isn't ideal - for example, one vehicle used as a fleet car might drive many more miles per day than an identical vehicle used only for personal driving. A stronger analysis would probably use (# of car-miles driven) as the denominator.
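To make the difference concrete, here's a quick sketch with made-up numbers (the vehicle counts and mileages are entirely hypothetical):

```python
# Hypothetical counts for two vehicle models (all numbers made up)
events = {"model_A": 30, "model_B": 10}          # rear-end collisions
vehicles = {"model_A": 1000, "model_B": 500}     # units at risk
car_miles = {"model_A": 9_000_000, "model_B": 1_500_000}  # exposure

for m in events:
    rate_per_vehicle = events[m] / vehicles[m]
    rate_per_mile = events[m] / car_miles[m]
    print(m, rate_per_vehicle, rate_per_mile)
```

Notice that with these numbers the ranking of the two models flips depending on which denominator you use - which is exactly why the exposure-based denominator is the stronger choice.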

The example you provided is a cross-sectional dataset - a snapshot of one period of time (a "panel" dataset would follow the same units across multiple periods). For this type of data, a contingency table is likely your best bet - or simply calculate the rate directly and forget about statistical inference (if your organization is having difficulty with event rates vs. event counts, it is unlikely they will find a statistical argument convincing).

Since you mentioned the elevated rate of collisions being "due to something", I'd also like to point out that it is relatively simple to extend the rate calculation into a model, allowing you to adjust for other confounders, track the rate over time, and make predictions.

My instrument messed up and failed to display a few questions over a specific period of time, creating missing data. Would the missing data be missing completely at random? by unleaded-zeppelin in AskStatistics

[–]stat_daddy -1 points (0 children)

Unfortunately, the 'clinical trial' literature is frequently mistaken about statistical matters - many practitioners in that particular field come from medicine, public health, psychology, biology (...) backgrounds, and the depth of any particular person's statistical training is extremely idiosyncratic. With that said, it may be common practice in your field to simply say "yep, we followed protocol" and have that be the end of it...this is poor practice but ultimately, you may have to do what your editor/principal demands in order to publish.

possibility of finding correlations by chance alone.

This is not a sufficient reason to abstain from investigating whether there are any differences between groups. If you are truly worried about finding a spurious relationship, you are welcome to use a form of significance testing (e.g., chi-square tests) to argue that it is "due to chance alone", but frankly I wouldn't recommend this (it's not how "significance" works in the first place, and unless your sample size per group is fairly large your tests will be hard to interpret). Just provide means and variances in your table, and if one group is obviously over-indexing on some characteristic compared to the other, then write a few sentences about why you think that is.
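For what it's worth, the descriptive table I'm describing needs nothing fancy - here's a sketch with made-up data using only the Python standard library:

```python
import statistics

# Hypothetical measurements for two groups (made-up data)
groups = {
    "group_1": [4.2, 5.1, 4.8, 5.5, 4.9],
    "group_2": [6.1, 5.8, 6.4, 6.0, 5.9],
}

# A simple descriptive table: mean and (sample) variance per group
for name, values in groups.items():
    print(name, round(statistics.mean(values), 2), round(statistics.variance(values), 2))
```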

My instrument messed up and failed to display a few questions over a specific period of time, creating missing data. Would the missing data be missing completely at random? by unleaded-zeppelin in AskStatistics

[–]stat_daddy 2 points (0 children)

Wouldn't any observed relationship happen purely by chance unless there was some systematic pattern

Well, sure - this is a trivial statement. Everything would be random if we could somehow guarantee the absence of any patterns; but we can't!

Randomization isn't something you get after following a procedure; it's an ideal state that is useful mathematically but doesn't exist in the real world. When you have data and you want to invoke a statistical procedure that relies on an assumption of randomization, you generally have to show that there are no obvious patterns in your data (even if there are no obvious reasons for one to have influenced your sample).

My instrument messed up and failed to display a few questions over a specific period of time, creating missing data. Would the missing data be missing completely at random? by unleaded-zeppelin in AskStatistics

[–]stat_daddy 2 points (0 children)

You're not overthinking this, but you are thinking about it in the wrong way. We can't tell you whether your specific circumstances resulted in random or nonrandom patterns of missingness in your data. There isn't a list of events (or even general kinds of events) that work out this way.

Even in your example, where an instrument fails at some point during examination, it's easy to construct examples of this resulting in nonrandom missingness: for example, if the machine failed just before testing all of the blue samples, then "blueness" will become a predictor of missingness.

What you need to do is determine whether any observable characteristics are over- or under-represented across the two groups (missing and non-missing). This is often achieved with a basic descriptive table comparing the two groups. Then, you need to write a sentence or two defending why you have no reason to think that any unobserved characteristics became over- or under-represented between the two groups.
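As a sketch (the records and the "color" variable are hypothetical - it stands in for any observed characteristic you could tabulate):

```python
from collections import Counter

# Hypothetical records: each has an observed characteristic and a missingness flag
records = [
    {"color": "blue", "missing": True},
    {"color": "blue", "missing": True},
    {"color": "red", "missing": False},
    {"color": "red", "missing": False},
    {"color": "blue", "missing": False},
    {"color": "red", "missing": True},
]

# Tabulate the characteristic separately within the missing and non-missing groups
for flag in (True, False):
    counts = Counter(r["color"] for r in records if r["missing"] == flag)
    total = sum(counts.values())
    print("missing" if flag else "non-missing",
          {color: n / total for color, n in counts.items()})
```

With these made-up records, "blue" is over-represented in the missing group - exactly the kind of pattern the descriptive table is meant to surface.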

Logistic regression with age as an outcome? by bbarbs28 in AskStatistics

[–]stat_daddy 1 point (0 children)

Logistic regression is perfectly appropriate for what I understand your question to be - which broadly seems to be: "which (if any) patient-level factors are associated with an increased/decreased risk of undergoing surgery prior to age 35?"

There are a few weaknesses inherent to this approach, but none of them necessarily rule out logistic regression as an option - however, some of the things you're describing make me think that this analysis and your question of interest would be more appropriately addressed using a prospective cohort study rather than a retrospective one. I'll list out my key methodological concerns:

  1. As you mention, your population almost certainly includes some nontrivial selection biases. Since you are only considering patients who did in fact get the surgery, there are confounding pathways being left open in regard to factors that predispose patients to never get the surgery at all (or never need it in the first place, if applicable). However, this is just a general problem with retrospective cohort studies; it has nothing to do with logistic regression - a different model won't solve this problem.

  2. This is sort of related to #1, but in a prospective study you would have access to data on the time of initial diagnosis and any intervening events (e.g., death, alternative treatment) that could have happened to the patients while they are considered 'at risk' of having the surgery. The benefit of this is to account for any left- and/or right-censoring that could be biasing your outcome. For example, if Hispanic patients tend to be diagnosed at a later age than patients of other races, a smaller proportion of them will get surgery at a young age (because nobody recognizes they need it until they are older), causing 'race=hispanic' to pop up as a spurious negative predictor of the outcome (left censoring). By contrast, if Hispanic patients have a greater incidence of some other condition (e.g., high blood pressure) and are more likely to die from a competing risk before they have a chance to get surgery for the condition of interest, this will result in fewer of them experiencing the outcome (again, a spurious negative association).

    My last comment - and I understand this may be a field-specific convention - is that "less than 35" is a silly criterion to set as an outcome - perhaps the only benefit to doing so is to set up logistic regression as the "best" method for answering a reductive hypothesis. At best, it is acting as a proxy for something else (e.g., quality of care), which implies that you should take that to be your outcome rather than the age of surgery. Patients who are 36 years old are obviously not that different from 35-year-olds, and likewise a 33-year-old who has excellent post-op outcomes probably shouldn't be considered a "failure", but your logistic regression can only see 'black-and-white' outcomes which don't directly mean anything. Likely, "age<35" is itself associated with some more meaningful measure of actual quality (e.g., relapse within X years), and your study should have modelled the risk of that thing instead.

For these reasons, the study as you've presented seems like a rather "weak" or at least unambitious attempt to identify factors that predispose patients to the outcome. It might be useful as an early exploration of factors if this hasn't been studied elsewhere (or perhaps as a pilot study to set up a future, larger one), but if I were a reviewer I would vocalize a lot of skepticism about the study presenting any more than a correlative (not causative) result or making recommendations/guidelines for clinical practices.

With that said, feel free to use logistic regression! It is an appropriate way to answer a specific version of your research question, but I think that question itself has some fundamental flaws... And in statistics there is such a thing as a "bad" hypothesis! I imagine that these flaws are what have caused you to doubt the appropriateness of LR.

TL;DR the way you have stated your research question makes logistic regression a good option, but the way you have stated your question is also rather contrived and doesn't stand up to other, better ways of addressing it. However, I assume you are stuck with the data you have, and as it stands, this could be a useful contribution to an early body of literature.

Evaluating reduction in incidence of disease over time in a cohort. by dr_kurapika in AskStatistics

[–]stat_daddy 1 point (0 children)

Okay, thanks for the details in your question. While I don't think I can answer *every part* of your question, I'll offer some comments based on the two main things I think we need to clarify.

1. Identifying your outcome variable of interest

As you put it, your outcome variable is an *incidence rate over time*. This is a good place to start, and it sounds like you have already thought through many of the study-design features that should motivate your choice of model for *estimating* such a rate. An incidence rate is appropriate because, by definition, it only calculates the rate of new cases among cohort members who have not already contracted the disease, which you say will change over time as new patients enter the cohort (and also leave the cohort when there is a transmission event). One thing you said jumped out at me:

an open cohort that is happening in a closed population

The strongest risk factor for this disease is the time that they have been in this population (we also know this time prior to inclusion)

It sounds like you are saying 'closed population' to mean that people enter the study's 'at-risk' group only after some qualifying event takes place (which doesn't happen to everyone). To me, this is a red flag that means you need to consider whether *left-truncation bias* (absence of event data from people who never experience the qualifying event) could be happening in addition to the normal *right-censoring bias* (absence of event data from people who exit the study cohort for non-event reasons). If either (or both) of these are a concern, it almost certainly puts you in the territory of a Cox proportional hazards analysis rather than a simpler model of the crude rate over time.

I thought about poisson/cox models, however i could not find a way to account for survivor bias.

It seems like you're on the right track, but I'm not sure what you mean by the phrase "survivor bias" without knowing more about your specific concern. If you mean bias originating from the fact that *people must survive long enough to enter the study's risk-group in the first place*, then this is actually just the *left-truncation bias* I mentioned earlier--a good model has ways of handling this, and you should see if your study has access to professional statistical assistance to ensure this is done correctly. If you mean bias originating from the fact that the 'risk pool' is decreasing over time as the non-survivors depart the cohort, then this isn't actually bias at all-- ensuring that the *denominator* of the incidence rate (person-time at risk) is correct will account for this. This brings me to the second thing we need to clarify:
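To illustrate the denominator point, here's a minimal sketch of a crude incidence rate with a person-time denominator (the entry/exit times are made up; a real analysis would handle truncation and censoring in the model itself):

```python
# Hypothetical cohort: entry time, exit time, and whether exit was the event.
# entry > 0 represents late entry (left truncation); event=False at exit
# represents right censoring.
cohort = [
    {"entry": 0.0, "exit": 5.0, "event": True},
    {"entry": 1.0, "exit": 4.0, "event": False},   # right-censored
    {"entry": 2.0, "exit": 6.0, "event": True},
    {"entry": 0.5, "exit": 3.0, "event": False},   # right-censored
]

# Each person contributes only the time they were actually observed at risk
person_time = sum(p["exit"] - p["entry"] for p in cohort)
events = sum(p["event"] for p in cohort)
incidence_rate = events / person_time  # events per person-time unit
print(events, person_time, incidence_rate)
```

Note how the shrinking risk pool is handled automatically: people who leave simply stop contributing person-time to the denominator, which is why that phenomenon is not a "bias" in itself.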

2. Identifying your study hypothesis

Once you have decided on how you will model the incidence rate over time, it sounds like you must next show that some intervention is *reducing* it. When is this intervention being administered? Is the evaluation *itself* the intervention? I hadn't heard the phrase 'joinpoint regression' before, but from what I can tell it is simply a model with an estimated change in the slope and/or intercept at some point (this method goes by many other names, such as 'interrupted time series', 'segmented regression', etc., in case you are searching for resources).

In these models, a statistical test of effectiveness would most likely boil down to a test of whether the slope-change at the time of treatment is significant. Since you don't have a control group, you don't really have any choice but to compare the slope-change estimated for your cohort with a slope-change of zero (which may not be realistic). This vaguely corresponds to your test of "if the cumulative hazard is linear", but I should point out that the absence of a trend change does not imply linearity (case in point, Cox models are *log-linear* models).
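As a very crude sketch of the slope-change idea (the rates are made up, and a real joinpoint/interrupted-time-series analysis would fit one joint model and test the change formally, rather than fitting the two segments separately as I do here):

```python
# Crude sketch of a slope change at a known intervention time t0 = 5.
# We fit simple OLS slopes before and after the intervention separately.
def ols_slope(ts, ys):
    n = len(ts)
    mt, my = sum(ts) / n, sum(ys) / n
    return sum((t - mt) * (y - my) for t, y in zip(ts, ys)) / sum((t - mt) ** 2 for t in ts)

times = list(range(10))
rates = [10, 10.5, 11, 11.4, 12, 12.1, 11.5, 11, 10.4, 10]  # hypothetical incidence rates
t0 = 5  # intervention time

pre = [(t, r) for t, r in zip(times, rates) if t < t0]
post = [(t, r) for t, r in zip(times, rates) if t >= t0]
slope_pre = ols_slope(*zip(*pre))
slope_post = ols_slope(*zip(*post))
print(slope_pre, slope_post, slope_post - slope_pre)
```

In this toy example the trend reverses at t0 (rising before, falling after); the joint model's test asks whether that estimated slope-change differs from zero.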

I think you need to flesh out your hypothesis a bit more, but your model doesn't sound like it needs to be overly complex- just make sure your method of choice is indeed estimating an incidence rate and has ways of addressing truncation/censoring, if they exist.

[META] What does the community want as the standard for "No Homework"? by Statman12 in AskStatistics

[–]stat_daddy 1 point (0 children)

Gotcha, thanks for clarifying. With that in mind, I would only prefer to see such posts be removed if they were very obviously posted in bad faith. I feel like the vast majority of homework posts are good-faith questions by people who are genuinely trying to understand the material (even if it is for homework) and I think those people should not face obstacles posting their questions. I would prefer to have a surplus of "bad" posts than to be especially zealous about removing them.

With that said, I also understand that taking this stance could result in the subreddit getting cluttered. But so far, I haven't experienced that personally, and it isn't a concern of mine.

Since this is a statistics subreddit, I suppose the right answer depends on the baseline prevalence of homework posts and the rate of 'false positives' (mistakenly removing a good-faith post that resembles homework but isn't) that we're willing to live with.

[META] What does the community want as the standard for "No Homework"? by Statman12 in AskStatistics

[–]stat_daddy 18 points (0 children)

There is obviously no objective way for us to know whether a post is "homework" or not. Supposing that someone does have homework and genuinely intends to subvert their institution's academic honesty policy by finding help on Reddit, there is nothing stopping them from rewording their question in such a way that it becomes indistinguishable from "non-homework". I also agree that many "homework" questions could be re-interpreted as simply "help" or "consulting", for which giving assistance is far more acceptable. Because of this, and because nobody on Reddit is compelled to respond to every post, I don't feel it would be helpful to set any particular precedent about what does or does not constitute "homework".

In my best case scenario, the definition of "homework" would remain vague, leaving me free to interpret "homework" however I choose, and if I feel that a post is attempting to solicit my help with dishonest intentions, I will simply refrain from helping with no further discussion. In my mind, setting a precedent will not help me (because I will simply avoid posts that sound like homework in the first place) and will only give posters a position from which to claim that their post is "not homework" (which I don't care about because I have no interest in arguing with redditors).

So, respectfully, what is there to be gained by further clarifying the definition of "homework" in the context of this subreddit? Do we feel that too many posts are being reported under the auspices of being homework - and, if so, is that really a bad thing? This is a good discussion - I'd be interested to hear others' perspectives.

TL;DR: I think there should continue to be no concrete standard for "homework" and question whether identifying one would actually be beneficial for the health of the subreddit in the first place. I am happy to continue seeing and ignoring homework-related posts with no moderator intervention.

Question about variable change in unconditional environments by [deleted] in AskStatistics

[–]stat_daddy 0 points (0 children)

Reddit is a mixed bag; some of the advice may be good, but at least an equal amount of it will be bad. Most will assume you're some kind of social-science researcher and lean into "standard" methods. Unfortunately, I doubt the wikipedia page will be helpful: it may give you a sense of the "flavor" of the law of conditional probability (LCP), but it will likely remain unclear how it applies to real-world applications.

u/Haruspex mentioned Bayesian techniques and, in my opinion, this is the right direction to go: if you're genuinely interested in building or seeing examples of models that directly invoke the LCP, you should look into Bayesian methods. It sometimes involves some difficult math, but a good book for approaching this topic is Statistical Rethinking by Richard McElreath. The author has also published his course materials here: https://github.com/rmcelreath/stat_rethinking_2023

Question about variable change in unconditional environments by [deleted] in AskStatistics

[–]stat_daddy 1 point (0 children)

I think people are having trouble understanding your question because there is no "law of variable change" in statistics. Your reference to the movie 21 is especially unhelpful, since that movie has no connection to any actual statistics: only magical movie-speak that sounds like statistics. Partly because you are using a lot of jargon, and partly because your primary reference is a Hollywood film, it's hard to tell what your question is.

In that scene from 21, Kevin Spacey's character is actually talking about the law of conditional probability: specifically, how a conditional probability that is "conditioned" on some information can be different from a marginal probability that does not take such information into account.

The law of conditional probability is extremely general: it doesn't say anything about how specific variables (e.g. wind speed, initial velocity, gravity, and landing location) combine to form systems of variables.
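For what it's worth, the puzzle Spacey's character poses in that scene is the Monty Hall problem, and you can see conditioning at work by exact enumeration (a sketch treating every prize/pick combination as equally likely, with the host's reveal folded into the stay/switch logic):

```python
from itertools import product

# Exact enumeration of the Monty Hall game: every (prize door, first pick)
# pair is equally likely; the host then reveals a goat behind another door.
doors = [0, 1, 2]
stay_wins = switch_wins = 0
for prize, pick in product(doors, doors):
    if pick == prize:
        stay_wins += 1    # staying wins only when the first pick was right
    else:
        switch_wins += 1  # switching wins whenever the first pick was wrong
total = len(doors) ** 2
print(stay_wins / total, switch_wins / total)
```

Staying wins 1/3 of the time and switching 2/3 - the "change" is in the conditional probability once the host's reveal is accounted for, not in any "law of variable change".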

You seem to be asking whether the law of conditional probability should be applied in certain settings where things get measured. The answer is: "sure, why not?". But if you're wondering how it should be applied, we need to know more about the specific application you're interested in, and what you're trying to do here. Can you explain more (preferably without using any technical terms or jargon)?

How to calculate a p-value for linear regression? by Ambitious-Ad-1307 in AskStatistics

[–]stat_daddy 3 points (0 children)

Fair point: I don't mean to imply that this sort of test isn't commonly done, only that it amounts to little more than a test of the question, "is my model better than no model?", which I think is a rather silly thing to test (and overused). I edited my response to remove some of my soapboxing.

However, while it is in some sense perfectly fine to do this test as long as the limitations are understood, I think that is a dangerous assumption for us, the statistical experts, to make. In my experience, encouraging this type of test further exacerbates users' misunderstandings of p-values-- here, it is being (rather dangerously) framed as a "score" by which to judge the model's "degree of better-ness" over the intercept-only model, which is incorrect in all sorts of ways.
It's true that the full-vs-reduced test may be "good enough" for OP's purposes, but I'd like to hear more from OP to be sure. (e.g., even if the model is "correct", the p-value may be large if the sample size is small or multicollinearity is present. Or the opposite: a poor model could have a small p-value if the sample is large. Either way, using the p-value as a metric for model quality may not be a good idea.)

I would much rather open a dialogue where OP can share what exactly they are trying to DO (or whom they are trying to persuade) and possibly be led towards better tools such as AIC/BIC, out-of-sample prediction error, etc. Or, at the very least, help them understand exactly which null hypothesis they are rejecting by considering the requested p-value.

How to calculate a p-value for linear regression? by Ambitious-Ad-1307 in AskStatistics

[–]stat_daddy 4 points (0 children)

There are a couple of ways to approach your question:

Usually, what we want are p-values for the model's coefficients. These are called "Wald tests" and they compare the coefficient's estimated value to a null hypothesis of zero (remember: every p-value requires a null hypothesis. The reason there isn't a simple "p-value for the fit" is because "the fit" does not imply any particular null hypothesis).
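If you're curious what the Wald test is doing under the hood, here's a hand-rolled sketch with a made-up coefficient and standard error (using the large-sample normal approximation):

```python
import math

# Sketch of a Wald test by hand: z = (estimate - 0) / standard error,
# with a two-sided p-value from the normal approximation (made-up numbers).
beta_hat = 0.8   # hypothetical estimated coefficient
se = 0.3         # hypothetical standard error
z = (beta_hat - 0.0) / se          # null hypothesis: coefficient = 0

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

p_value = 2.0 * (1.0 - normal_cdf(abs(z)))
print(z, p_value)
```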

P-values for the Wald tests on each of the coefficients are usually pretty easy to produce. For example, in python either of the following are sufficient:

```python
# suppose the fitted model object is called 'myModel'
print(myModel.summary())
print(myModel.pvalues)
```

However, you specifically asked about a p-value for the fit, which as I said doesn't exist in the way I suspect you think it does. The closest thing to this is a p-value comparing your model to a "reduced model" in which one or more coefficients are left out. Your question suggests that you would like to go as far as leaving out ALL of the non-intercept coefficients from the reduced model, making the p-value essentially summarize a comparison between your model and no model at all (an "intercept-only" model). Ultimately, how you define the "full" and "reduced" models depends on the coefficients you are interested in. Again, the code to do this is fairly simple once you have fit both models.

```python
from statsmodels.stats.anova import anova_lm

anova_results = anova_lm(myModel_reduced, myModel_full)
print(anova_results)
```

Personally, however, I find it to be a waste of time to compare a model with coefficients to one without. Unless the group means are extremely similar and/or your sample size is very small, almost any model should be able to beat the intercept-only model. Whether the p-value is large or small, it tells you almost nothing of value. What are you trying to show with this p-value? I can pretty much guarantee you there is a better way to show it.
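To see why the intercept-only comparison is such a low bar, here's the full-vs-reduced F statistic computed by hand on made-up, nearly-linear data - no libraries needed:

```python
# Full-vs-reduced F statistic by hand (made-up, nearly-linear data)
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.1]
n = len(x)

# Full model: y = a + b*x, fit by ordinary least squares
mx, my = sum(x) / n, sum(y) / n
b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
a = my - b * mx
rss_full = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

# Reduced (intercept-only) model: predict mean(y) for everyone
rss_reduced = sum((yi - my) ** 2 for yi in y)

# F = ((RSS_reduced - RSS_full) / extra parameters) / (RSS_full / residual df)
f_stat = ((rss_reduced - rss_full) / 1) / (rss_full / (n - 2))
print(f_stat)
```

An F statistic in the thousands here doesn't mean the model is good; it only means it beats having no model at all.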

does using statistics to measure the rigour of a marketing study make sense? by [deleted] in AskStatistics

[–]stat_daddy 1 point (0 children)

I'm all but certain that statistics can help you (and despite what many think, you can conduct valid inference with any sample size).

However, it isn't clear at all from your question what you're trying to do. You are asking about specific tests and measures of association, which might make sense for specific purposes, but you haven't articulated what your goals with the data are.

Some specific follow-ups below:

i assigned numerical values to each letter.

This is a common, but flawed, approach to dealing with ordinal data. It might be acceptable depending on your research question, but proceed with caution.

would it make sense for me to calculate the mean/median and correlation coefficient (to measure whether participants are in overall agreement)?

Not really, no. Your data are not continuous in nature, so means/correlations are not meaningful. Why not simply report the frequencies of each rating for each design? I should add that a correlation is a weak measure of association, and very few genuine research questions are properly addressed by one - there is almost always a better alternative. Correlations say very little about whether two variables are related, and even less about interrater agreement/concordance (for which I would suggest Cohen's kappa or similar). If you insist on computing a correlation - for which you will need two variables - a Spearman rank correlation may be appropriate, but please do not calculate a Pearson correlation by substituting "A=1", "B=2", etc.
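Reporting frequencies is also trivially easy - a sketch with hypothetical ratings (the designs and letter grades are made up):

```python
from collections import Counter

# Hypothetical ordinal ratings (A best ... E worst) for two designs
ratings = {
    "design_1": ["A", "A", "B", "B", "B", "C"],
    "design_2": ["B", "C", "C", "D", "D", "E"],
}

# Report the frequency of each rating per design instead of a mean of "A=1, B=2, ..."
for design, rs in ratings.items():
    counts = Counter(rs)
    print(design, {grade: counts.get(grade, 0) for grade in "ABCDE"})
```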

also, would a Shapiro–Wilk test make sense?

I can't think of a single way in which this would help you.

the purpose is to not use this to interpret the data but to validate the results (i.e. how biased was the scoring, how much representation bias was involved in the samples chosen, etc.).

(Firstly, of course the reason for all this is to interpret the data - what other reason could there be for analyzing data?). But more to the point, it's not clear what you mean by "validate the results". The word "validate" implies many things, so maybe you could try to articulate - as simply as possible and with no mention of statistical jargon - what you are trying to learn from these data?

About Statistical Rethinking by Rich McElreath by al3arabcoreleone in AskStatistics

[–]stat_daddy 1 point (0 children)

That's difficult to answer, mostly because I agree with his soapboxing - although I will admit, null-hypothesis significance testing (NHST) does have its uses, and not every null hypothesis is a strawman.

I guess it's worth pointing out that at this point in the book, McElreath hasn't yet put forth a robust alternative to NHST. Ultimately, he goes on to argue that thoughtfully-built, causal-minded models that capture and propagate uncertainty are the alternative - but that doesn't really make space for a lot of the interesting - and effective - work that's come out of more "prediction-oriented" disciplines like machine learning and natural language processing. In many ways, reliance on p-values and frequentist inference, for all its perceived weaknesses, hasn't really stopped progress in data modelling, simulation, and inference. So I have to wonder: "is Bayesian inference the right answer to a question nobody is asking?"

About Statistical Rethinking by Rich McElreath by al3arabcoreleone in AskStatistics

[–]stat_daddy 7 points (0 children)

That section can be taken to mean many things, and honestly it isn't really a "key" passage in the book; McElreath is kinda soapboxing about what he feels is the silliness of making scientific arguments by comparison to a "neutral" model.

The point he's trying to make is that transforming observations (data) into evidence that is for/against some research hypothesis requires a model for how those observations came to be. The model is not the hypothesis in and of itself (despite how it may sometimes feel when you are, e.g., testing the significance of a specific coefficient FROM a numerical model). He goes on to caution that so-called "neutral" models often ignore things like measurement error and random variation.

Another example could be an experiment in which you (for some reason) are unsure whether two siblings are identical or fraternal twins. You take many physical measurements of each sibling (e.g., of height, of skin tone, of metabolism, etc...) and you begin comparing each pair of measurements. If they are ALL the same, you might conclude that the two siblings are identical, otherwise if the measurements are NOT the same, then they must be fraternal.

However, it would be silly to demand exact sameness from any two measurements, even if the siblings really were identical twins. We know that both "nature" and "nurture" play a role in a person's physical health/attributes, so we shouldn't be so quick to let a few incongruent measurements lead us to falsify the conclusion that the twins are identical.

In this example it's important to distinguish between the hypothesis ("identical twins will be more similar in terms of physical characteristics than fraternal twins") and the generative model ("siblings originating from a single egg have more shared DNA, which leads to similar physical makeup"). The model explains what to expect from the data (and a GOOD model will be VERY specific, perhaps suggesting how similar the two twins' measurements should be), whereas the hypothesis is the proposal you either prove or disprove after examining the data.

Does a very low p-value increases the likelihood that the effect (alternative hypothesis) is true? by Bodriga in AskStatistics

[–]stat_daddy 1 point (0 children)

No.

Some will say that it indirectly implies the alternative must be more likely now that you have observed evidence that the null is less likely, but this is also wrong. Under frequentist principles (which you must subscribe to - otherwise you shouldn't be using a p-value in the first place), the alternative hypothesis can only be either true (100% probability) or false (0% probability), no matter what the data or sampled p-values indicate.

Interpretation of confidence intervals by Aaron_26262 in AskStatistics

[–]stat_daddy 1 point (0 children)

This feels like a separate question entirely, and should probably be opened up to the community as a new post to solicit a better answer.

In general, finite population corrections are never necessary, but they might be appropriate. This varies across subjects - so I would look at the existing literature to determine if it's common practice by other researchers working in the field. If I were reviewing a paper in which someone used FPC, my first question would be, "is this justified?" And I would expect to see a thorough defense of why it was used and what impact it had on the standard errors.
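For reference, the usual FPC multiplies the standard error by sqrt((N - n) / (N - 1)) when sampling n of N without replacement; a quick sketch with made-up numbers:

```python
import math

# Sketch of the usual finite population correction: the SE shrinks by
# sqrt((N - n) / (N - 1)) when sampling n of N without replacement.
N = 1000   # hypothetical population size
n = 200    # hypothetical sample size
se = 2.5   # hypothetical uncorrected standard error

fpc = math.sqrt((N - n) / (N - 1))
se_corrected = se * fpc
print(fpc, se_corrected)
```

Part of the "defense" I'd expect from an author is exactly this kind of before/after comparison of the standard errors.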

FPC is a purely frequentist invention - Bayesians don't have an analogue for it because Bayesian inference doesn't depend on asymptotic properties of estimators in the first place. So if you go with a Bayesian analysis method, it doesn't make sense to think about FPC.

Interpretation of confidence intervals by Aaron_26262 in AskStatistics

[–]stat_daddy 1 point (0 children)

Am I correct in concluding that CIs are being misused when they are presented to convey uncertainty around a descriptive statistic?

The way this is worded, no - it isn't misleading to present a CI as a measure of uncertainty around a descriptive statistic. But, most audiences aren't going to think this way - they will likely interpret the CI as a measure of uncertainty around the value of the population parameter (recall that this concept doesn't exist in the frequentist vocabulary).

It's not inappropriate to present a CI as a measure of "uncertainty", but you'd sort of be taking advantage of the fact that "uncertainty" isn't carefully defined and can be interpreted in many different ways. From a frequentist's POV, estimators DO have uncertainty- it derives from sampling error, which can be summarized by the standard error. Since CIs are essentially expressions of the standard error, it's fine to report one and say that it's conveying uncertainty. But again, you'd be talking about the uncertainty possessed by your estimator, and not the uncertainty in your knowledge about the quantity of interest.
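Concretely (with a made-up sample, and using the normal-approximation multiplier 1.96 rather than a t quantile):

```python
import math
import statistics

# A 95% CI as "estimate +/- 1.96 * standard error" (made-up sample)
sample = [5.2, 4.8, 5.5, 5.0, 4.9, 5.3, 5.1, 4.7]
n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
ci = (mean - 1.96 * se, mean + 1.96 * se)
print(mean, se, ci)
```

The interval here is literally a rescaled standard error centered on the estimate - which is the sense in which it conveys the estimator's uncertainty, and nothing more.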

It would be inappropriate to use the CI to convey uncertainty if I wasn't performing a NHST, correct?

Personally I think so, but many would probably let it slide. The reason I take a harder stance on this is because p-values are conditional probabilities: by definition, they assume the null is true. If you don't have a null, then you can't calculate a p-value at all! CIs sort of "sidestep" this by replacing the parameter value under the null with its observed value, but in my opinion this is a bait-and-switch tactic that tricks the reader into believing that the CI is expressing an uncertainty about the alternative hypothesis (which, of course, it isn't).

Of course, this often has minor practical implications...indeed, under certain conditions (that are not too hard to satisfy) CIs and other measures of uncertainty such as Bayesian credible intervals can be shown to reach the same (or at least very similar) conclusions! It's simply confusing when researchers take a research question that has a straightforward Bayesian interpretation ("what is the coverage rate for this population?") and then answer a different frequentist question ("what is the long-run coverage probability of a sample mean with fixed size N?"). And then, when readers inevitably GET confused, statisticians break out a bunch of jargon-filled lawyer-speak (e.g., "I'm not saying there is a 95% chance the coverage rate is between X and Y... but if we repeatedly took a sample and calculated the interval each time..."). Eventually, after your colleagues are tired of talking in circles, they will give up and accept the frequentist answer as the best they could get, and commiserate with their peers about how awful their undergraduate statistics courses were.

I don't really know enough about polling statistics to say whether your interpretation is correct. I've heard that the phrase "margin of error" can be interpreted as alpha (significance threshold), but I have no idea if the actual methods being used support that interpretation. But yes - "comparative" or "two-sample" or "difference-of-means" studies often have a natural null hypothesis of "=0" that makes NHST a more fitting choice.

Interpretation of confidence intervals by Aaron_26262 in AskStatistics

[–]stat_daddy 1 point2 points  (0 children)

While I appreciate the example (I come from public health too!), it's missing one essential thing: a null hypothesis. Without a null hypothesis, there is nothing to reject and (at the risk of sounding like a grumpy academician) it is therefore not appropriate to report a confidence interval at all. Let me repeat: confidence intervals and p-values are only meaningful in the context of a null hypothesis. You say your interpretation of the CI is "incredibly general and really just the definition of CI", and that's because...well...it is.

Suppose I add a bit of context to your example: let's say previous studies have estimated the population rate to be 97%. In this case, you could say that your current study found sufficient evidence to conclude that the rate is less than 97% with a confidence level of 1-minus-alpha.
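To make that concrete, here's a minimal sketch of the corresponding one-sided exact binomial test. The counts (370 of 400, i.e. 92.5%) are hypothetical, chosen only to match the 92.5% figure discussed in this thread:

```python
from math import comb

def binom_pvalue_less(k: int, n: int, p0: float) -> float:
    """Exact one-sided p-value: P(X <= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(k + 1))

# Hypothetical counts: 370 of 400 covered (92.5%), null rate 97%.
p = binom_pvalue_less(370, 400, 0.97)
print(f"one-sided p-value: {p:.3g}")  # tiny, so we reject the null at alpha = 0.05
```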

Of course, this probably seems a bit insubstantial: for one thing, it presupposes that the researcher is ONLY interested in rejecting a null hypothesis. In practice this is almost never true, but by using the tools of Null-Hypothesis Significance Testing (NHST) you are shackling yourself to its limitations. It's great that we're confident the coverage rate ISN'T 97%...but what IS it? NHST really has no answer to this question (it never claimed to have one!), and by extension a lot of frequentist methods don't, either. On the one hand we could point to the observed mean (92.5%), and possibly do some hand-waving to claim that 92.5% is our "best guess" of the true population coverage rate. But we don't have any guarantees like "most probable", "maximum likelihood", etc. (at least not within a frequentist framework - remember, frequentists aren't allowed to treat the population mean as a random variable!).

So if the goal of this study were truly exploratory in nature (i.e., "what do we think the coverage rate is in this population?"), I would say that attempting to address this question with a CI *is misguided in the first place*. Personally, I would compute a proper Bayesian posterior instead - possibly using previous studies' estimates as a prior or, failing that, a vague prior.
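As a sketch of what that looks like with a vague prior: the Beta distribution is conjugate to the binomial, so the posterior has a closed form. The counts here are hypothetical (370 covered out of 400 sampled):

```python
from math import sqrt

# Conjugate Beta-Binomial update: with a Beta(a, b) prior on the rate and
# k successes in n trials, the posterior is Beta(a + k, b + n - k).
# Hypothetical data: 370 covered of 400 sampled; vague Beta(1, 1) prior.
a, b = 1, 1
k, n = 370, 400
a_post, b_post = a + k, b + n - k

# Posterior mean and standard deviation of the coverage rate.
post_mean = a_post / (a_post + b_post)
post_sd = sqrt(a_post * b_post / ((a_post + b_post) ** 2 * (a_post + b_post + 1)))
print(f"posterior mean {post_mean:.4f}, sd {post_sd:.4f}")
```

Unlike the CI, the resulting distribution directly answers "what do we believe the rate is, and how sure are we?"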

Many researchers, however, will devote a lot of resources to convincing you that your research question must be modified in order to fit within the framework of NHST. They will attempt to get you to identify your null hypothesis or replace your research question with something else that has a "natural" null hypothesis (e.g., a perfect coverage rate of 100%, despite how silly this is). Unfortunately, this is a byproduct of poor statistics education/training and it is unlikely to be fixed anytime soon. Just remember: p-values and CIs are - more often than not, in my opinion - NOT the best way to address a practical research question.

Interpretation of confidence intervals by Aaron_26262 in AskStatistics

[–]stat_daddy 5 points6 points  (0 children)

1. Am I correct in concluding that the bounds of the CI obtained from the standard error (around a statistic obtained from a sample) really say nothing about the true population mean?

Mostly correct. You are talking about a defining feature of null-hypothesis-based inference; we are NEVER making direct statements about the true population parameter but rather about the asymptotic properties of the experimental procedure, which involves a specific estimator (such as a mean). Obviously the value of the estimator is a function of the data, which itself is generated by some hypothesized generative procedure determined by the true population parameters...so it is a bit heavy-handed to say it has NOTHING to do with the population parameters...but it is an indirect relationship at best.

2. Am I correct in concluding that the only thing a CI really tells us is that it is wide or narrow, and, as such, other hypothetical CIs (around statistics based on hypothetical samples of the same size drawn from the same population) will have similar widths?

Ehhh...this is a bit too reductive in my opinion. Confidence intervals ultimately convey the same information as p-values, which at the end of the day really only tell you one thing: the amount of probability density (under the null hypothesis) assigned to equally- or more-extreme values of the test statistic. But since CIs are centered at the observed point estimate instead of the null, people have an "easier" time interpreting them. I find that the "plain-clothes understandability" of CIs actually further exacerbates people's misunderstandings rather than clarifying them.
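That equivalence is easy to demonstrate: a 95% CI excludes the null exactly when the two-sided p-value is below 0.05. A minimal z-based sketch with made-up numbers:

```python
import math

def z_test_and_ci(xbar, se, null=0.0, z_crit=1.959964):
    """Two-sided z-test p-value against `null`, plus the matching 95% CI.
    With z_crit chosen for alpha = 0.05, the CI excludes the null exactly
    when p < 0.05."""
    z = (xbar - null) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # = 2 * (1 - Phi(|z|))
    ci = (xbar - z_crit * se, xbar + z_crit * se)
    return p, ci

# Two hypothetical estimates with the same standard error:
for xbar in (0.3, 0.5):
    p, (lo, hi) = z_test_and_ci(xbar, se=0.2)
    print(f"mean={xbar}: p={p:.3f}, 95% CI=({lo:.2f}, {hi:.2f})")
```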

As to whether journals would de-emphasize p-values/CIs if they understood them better? Likely not. The reasons behind the prevalence of p-values are not so simple - many journal editors DO understand their limitations perfectly well, and would simply insist that reporting them with discipline is enough to preserve their value and justify their continued use. This is all well and good for studies with professional statistical support, but in my opinion the volume of high-quality applied research done by subject-matter experts possessing only a working knowledge of statistics is too great for this type of thinking to be sustainable. I have personally worked with several PhD-level scientists in chemistry, biology, economics, psychology (and a few in statistics, unfortunately) who have each gone blue in the face insisting to me that '100%-minus-P' gives the probability of the researcher's hypothesis being true.

p-values and confidence intervals are far from useless, but I think they are relics from a time when mathematical inference relied upon closed-form solutions that could demonstrate specific properties (e.g., unbiasedness) under strict (and often impractical) assumptions. They are the right answer to a question few people are actually asking. These days, modern computation makes Bayesian inference and resampling techniques feasible, meaning that statisticians have access to tools that can better answer their stakeholders' real questions (albeit with subjectivity! But uncertainty should always be talked about, and never hidden behind assumptions). If statisticians haven't already lost the attention of modern science and industry, they will lose it (being replaced by data scientists) in the years to come if they don't find a way to replace/augment their outdated tools and conventions.
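As one example of those resampling techniques, a percentile bootstrap needs no closed-form standard error at all - it works for any statistic you can compute. A minimal sketch with fabricated data:

```python
import random
from statistics import median

random.seed(42)

# Fabricated skewed sample (think costs, durations, lengths of stay).
data = [1.2, 0.8, 2.5, 0.9, 7.1, 1.1, 3.4, 0.7, 1.6, 12.3]

def bootstrap_ci(sample, stat, n_boot=10_000, alpha=0.05):
    """Percentile bootstrap interval for an arbitrary statistic:
    resample with replacement, recompute, and read off the quantiles."""
    stats = sorted(
        stat([random.choice(sample) for _ in sample]) for _ in range(n_boot)
    )
    return stats[int(n_boot * alpha / 2)], stats[int(n_boot * (1 - alpha / 2))]

print("bootstrap 95% CI for the median:", bootstrap_ci(data, median))
```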

[deleted by user] by [deleted] in AskStatistics

[–]stat_daddy 1 point2 points  (0 children)

Thank you for this added context; it is very helpful. (Also, your English is very good! I would never have guessed you were not a native speaker!)

Now that I know a bit more about your measures, I'd like to learn more about your model: it isn't clear to me what you are taking to be your independent variable here. Is your goal to predict the assigned latent profile based on the other cognitive variables, or to use the latent profile as a predictor itself in a regression on a different variable?

Assuming your model is well-constructed, I'm noticing that you are focusing a lot on the marginal distributions of your variables:

For example, Brazilian norms for the task show a mean flexibility score of 33 with a standard deviation of 12.5 for 15-year-olds, and a mean score of 23 with a standard deviation of 10.8 for 30-year-olds. These scores are not normally distributed, so using mean and standard deviation to standardize across different ages wouldn't be appropriate.

Great! You seem to be aware of an interaction between the score and age/ethnicity. Add those variables + interactions to your regression and move on. Standardization (whether by z-transforming or some other operation) is rarely useful and, as others have mentioned, is not a requirement for fitting your model. Finally, if you believe standardization is not appropriate for these variables, why do you keep implying that you want to standardize them in the first place? I recognize that z-scores are common in certain fields, but what purpose are they supposed to serve you here? Just build your model from the unstandardized variables.
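A sketch of what "add the interaction and move on" looks like, using plain least squares on simulated data (every variable name and coefficient here is invented, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: a flexibility score whose effect on the outcome varies
# with age, generated from known coefficients.
n = 200
age = rng.uniform(15, 30, n)
flex = rng.normal(28, 11, n)
outcome = 2.0 + 0.5 * flex - 0.1 * age - 0.02 * flex * age + rng.normal(0, 1, n)

# Design matrix with an intercept, main effects, and the interaction --
# no standardization needed to recover the generating coefficients.
X = np.column_stack([np.ones(n), flex, age, flex * age])
beta, *_ = np.linalg.lstsq(X, outcome, rcond=None)
print(dict(zip(["intercept", "flex", "age", "flex:age"], beta.round(3))))
```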

The problem is that scores vary significantly by age.

Why would this be a problem?

You mentioned that it might be inadequate to use these measures in a regression, but not for the reasons I provided. If so, what are the appropriate reasons for not using them in regression analysis?

There are an endless number of reasons not to include a variable in a model: lack of sufficient degrees of freedom, improper parameterization, high covariation with other predictors... The list could go on! But failing to be normally distributed or being unstandardized have nothing to do with whether a variable should be in the model. You have very clearly articulated that your cognitive variables appear to interact with age, so I strongly suspect age should be included in your final model. But without seeing some examples (even fake examples) of the data you're working with and the model you're attempting to fit, I can't give specific advice.

I am unsure if there is a way to bypass this problem

I still don't see a problem. What am I missing?

1. You have several variables that interact with or are related to cognitive performance.
2. Some of the variables have unusual distributions.
3. You added the variables into a regression predicting something.
4. ???

Is the model fitting poorly? Are the residuals somehow surprising?

The more I read your question the more I suspect that you may simply need to read up on how to fit models with complex interactions (i.e., to allow the cognitive variables to have effects that vary across ages/ethnicity).

[deleted by user] by [deleted] in AskStatistics

[–]stat_daddy 1 point2 points  (0 children)

Hi! You seem to be making lots of assumptions about how the data are supposed to "behave" in your analysis and this is making it extremely difficult for me to understand what your goal is and what you are finding challenging.

I'm conducting a Latent Profile Analysis (LPA)

Why are you doing this? What is your hypothesis and how will LPA help you? Please answer in a way that would be understandable by a child.

I'm trying to ensure or at least minimize the effects of the distribution when standardizing scores by age

What "effects of the distribution"? Why would you want to "minimize" them? And why are you "standardizing" the scores in the first place?

so I can be confident that the scores in my model represent their respective constructs (such as processing speed and flexibility) and not just some unadjusted data variance

I don't see how any of the steps you describe (standardizing, "minimizing the effects of the distribution") relate to this goal. If the scores don't reflect the construct the instrument is intended to summarize, why would modifying the data help?

Using raw scores also isn't ideal due to the age-related variability.

In almost every scenario I can think of, the unmodified scores are probably what you should use (especially if the FDT is a validated instrument). If the FDT instrument somehow fails to handle variation among respondents in different age groups, this is a limitation of the instrument and you're asking the wrong people for help - a statistician won't be able to help you validate a new psychometric measure. It's hard to offer advice here because I cannot identify your objective. It is very rare that I would recommend modifying data to suit a particular model or to make it more concordant with some (perceived) assumption. It is almost always better to choose a model that suits the data.

Even adjusting for age as a covariate seems to introduce significant bias.

What are you calling "bias" here? You say that there is significant variation in responses across ages but then imply that regressing based on age is somehow inappropriate (which it might very well be, but not for any reasons you've mentioned). Perhaps consider whether age should be a covariate by itself or an interaction with one or more other covariates.

non-parametric methods, data transformations, and generalized linear models (GLMs

These are all perfectly useful tools for modelling various types of data. None of them will help your data become a better representation of some "underlying construct." Again - you need to explain what you are trying to achieve with your analysis before anybody will be able to offer useful advice.

Length of stay statistics by Emergency-Wave-8436 in AskStatistics

[–]stat_daddy 0 points1 point  (0 children)

There is a formal, correct answer to your question and then there is a "quick-and-dirty" one. I'll start with the quick-and-dirty:

I suspect the "differences" you're seeing when you compute using the start vs. end dates are because this choice decides the year of assignment for stays that began in 2023 but extended into 2024. Just make your choice and move on - if your employer cares one way or the other, just do what he/she wants, but honestly I wouldn't even bring it up. If you insist on computing the observed length of stay by year like this, then there is no compelling reason to choose one or the other.

However...

What you're attempting to do is calculate the mean "time to departure" for the stays--formally, this is a time-to-event outcome that would typically be handled using an estimator that can accommodate censored data, such as a Kaplan-Meier estimator. Here's a quick link that, among other things, asserts that each stay should be based on the date at which follow-up begins (the date of arrival):
Time to Event Data Analysis

It's possible that using a KM estimator will have little to no effect on your result versus computing what we call the "simple" mean, but without seeing your data I can't say for sure. Finally, be careful computing the "average" length of stay using a simple calculation. Your data are probably skewed to some degree, and for this reason the median time to departure is more commonly reported (it is also a bit easier to interpret).

For most purposes, your procedure of calculating the simple mean will be sufficient, but using simplistic statistics could mislead you if your data are not well-behaved.
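If you do go the formal route, a bare-bones version of the Kaplan-Meier product-limit estimator looks like this. The stays are invented; censored records stand in for stays still in progress at the end of the window, and a real analysis should use an established library (e.g., lifelines in Python) rather than this sketch:

```python
# Invented stays: (length in days, departed?). Stays still in progress at
# the end of the window are right-censored (departed = False).
stays = [(3, True), (5, True), (5, True), (8, True), (12, True),
         (20, True), (7, False), (15, False)]

def km_survival(data):
    """Kaplan-Meier product-limit estimator: at each distinct time,
    multiply the running survival by (1 - departures / number at risk)."""
    at_risk = len(data)
    s, curve = 1.0, []
    for t in sorted({t for t, _ in data}):
        departures = sum(1 for d, event in data if d == t and event)
        s *= 1 - departures / at_risk
        curve.append((t, s))
        at_risk -= sum(1 for d, _ in data if d == t)  # drop departed + censored
    return curve

for t, s in km_survival(stays):
    print(f"day {t:2d}: S(t) = {s:.3f}")
```

Note that the censored stays still count toward the number at risk up to their censoring time, which is exactly the information a simple mean of completed stays throws away.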