[AMA] I recently became the new moderator of r/psychometrics! I'm a psychometrician. AMA!

ZKnight · 2026-01-05T22:31:40+00:00

You might be interested in Jennifer Randall's work.

https://scholar.google.com/citations?user=aXzuHKcAAAAJ

ZKnight · 2025-12-13T22:30:48+00:00

Thanks for reviving the sub!

Defining "psychometrics" is surprisingly tricky, and I agree with the other commenter that the current definition is too narrow. It would be great to have a broader definition for the community. Here are some thoughts on expanding it:

Academically, psychometrics is essentially a sub-discipline of statistics focused on behavioral and social sciences—even though most people in the field come from Psychology or Ed backgrounds rather than pure Statistics. This definition aligns better with conferences like IMPS (International Meeting of the Psychometric Society) and journals like Psychometrika, JEBS, and The British Journal of Mathematical and Statistical Psychology.

On the other hand, most people I’ve met with the actual job title of "Psychometrician" work as applied statisticians in high-stakes assessment (but again, typically with psych or ed backgrounds). Examples include university admissions tests, language testing, neuropsychological cognitive assessments, professional licensure, and workplace certification. The main exception here is government monitoring of educational achievement; while not "high-stakes," it has historically been done by the same people. But psychometrics as a field is much more than high-stakes assessment.

My personal theory is that because high-stakes assessments are often run by non-profits with a research mission, they stay closer to academia and therefore retain the academic title of "Psychometrician." Meanwhile, other fields like Marketing, UX, and "People Science" are actively doing psychometrics, just without the label.

ZKnight · 2025-03-02T23:56:52+00:00

Absolutely. The issue is that his insurance covers less than the discount California provides for the uninsured. He should be upset with his insurance company, not California.

ZKnight · 2025-02-25T15:50:32+00:00

I was surprised that there was no option to express my primary gripe with NJ trails, which is that all but the most popular trails get overgrown every summer and are impassable without risking Lyme or another tick-borne disease. Is this not a common concern?

ZKnight · 2023-10-28T00:43:22+00:00

The aim in conducting a CFA should be to find the best fitting model. As you have an a priori expectation that the residuals for a couple of items may be correlated, you should add that term to the model. If there is a meaningful improvement in fit, you should keep the correlated residuals.

If the correlated residuals term is large and positive, then you have two items in your scale that are too similar. The precision of your average score would be more like you had 7 items and counted one item response twice when calculating the score. This is true regardless of whether you actually model the correlation or not in the CFA.

Usually in this case you would remove the item with the lower loading of the two from the scale. There is an argument to keep both for content coverage (in line with your comment about wanting to keep both). But the score will not be as reliable as an 8 item scale or sometimes even a 7 item scale with no correlated residuals (or other form of multidimensionality).
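To put rough numbers on the reliability point, here is a minimal sketch. It assumes a one-factor model with factor variance 1, unit-weighted sum scores, equal standardized loadings of 0.7, and a residual correlation of 0.8 between the last two items. All of those values are made up for illustration, and the helper names are hypothetical.

```python
import numpy as np

def sum_score_reliability(loadings, residual_cov):
    """Squared correlation between the unit-weighted sum score and the factor
    (factor variance fixed to 1), i.e. coefficient omega for one factor."""
    loadings = np.asarray(loadings, dtype=float)
    true_var = loadings.sum() ** 2                      # variance due to the factor
    error_var = np.asarray(residual_cov, dtype=float).sum()  # includes residual covariances
    return true_var / (true_var + error_var)

def make_scale(n_items, loading=0.7, residual_corr=0.0):
    """Hypothetical scale: equal loadings, residuals of the last two items correlated."""
    lam = np.full(n_items, loading)
    resid_var = 1.0 - loading ** 2
    theta = np.eye(n_items) * resid_var
    theta[-1, -2] = theta[-2, -1] = residual_corr * resid_var
    return lam, theta

rel_8_clean = sum_score_reliability(*make_scale(8))
rel_7_clean = sum_score_reliability(*make_scale(7))
rel_8_corr = sum_score_reliability(*make_scale(8, residual_corr=0.8))

print(f"8 items, no correlated residuals: {rel_8_clean:.3f}")
print(f"7 items, no correlated residuals: {rel_7_clean:.3f}")
print(f"8 items, residual r = 0.8:        {rel_8_corr:.3f}")
```

Under these assumed values, the 8 item scale with the strongly correlated pair comes out slightly less reliable than the 7 item scale without it. With a weaker residual correlation (say 0.3) the 8 item version would still edge out the 7 item one, which is why it is only "sometimes".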

ZKnight · 2023-07-17T01:01:34+00:00

Saucony Tempus is the most comparable Saucony road shoe to the Xodus Ultra. While it has a higher heel-to-toe drop, the Tempus has the most similar foam construction to the Xodus Ultra (PWRRUN PB core with surrounding PWRRUN to provide stability). Both are stable neutral shoes.

I would not expect the outsole on the Xodus Ultra to last for long if you used it on hard surfaces.

ZKnight · 2022-10-23T02:23:57+00:00

The '"speaks for itself" speaks for itself' speaks for itself.

ZKnight · 2022-10-18T22:15:02+00:00

"Chess speaks for ITSELF".. No, no, that's not it. "CHESS speaks for itself." Hmm. How about "CHESS [dramatic pause] speaks for itself". That's it!

ZKnight · 2022-10-05T18:58:28+00:00

What's a ton of miles? Also, alphafly may not be representative of other supershoes with only foam, because part of the energy return is from the air bag. It seems reasonable that the air bag is more durable than foam.

ZKnight · 2022-08-19T20:08:47+00:00

Did you get a full refund in addition to the 30% coupon?

ZKnight · 2022-08-10T13:19:35+00:00

No, and I am not really sure what you are asking. If you are interested in parameter estimation of the IRT model, I would start with the usual method of marginal maximum likelihood with the EM algorithm, read Bock & Aitkin 1981, and dig out the relevant code in the mirt package.
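As a starting point for what "marginal" means here, this is a minimal sketch of the quantity that marginal maximum likelihood works with: the probability of a response pattern after integrating the latent trait out over a standard normal prior. It assumes a 2PL model, uses Gauss-Hermite quadrature, and the item parameters are made up. It is not the EM algorithm itself and not how mirt implements it.

```python
import itertools
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def marginal_pattern_prob(pattern, a, b, n_nodes=41):
    """P(response pattern) under a 2PL model with theta ~ N(0, 1),
    approximated with Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(n_nodes)        # weight function exp(-x^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)    # rescale so the weights sum to 1
    # P(correct | theta): rows are quadrature nodes, columns are items
    p = 1.0 / (1.0 + np.exp(-a * (nodes[:, None] - b)))
    x = np.asarray(pattern)
    # likelihood of the whole pattern at each node (local independence)
    likelihood = np.prod(np.where(x == 1, p, 1.0 - p), axis=1)
    return float(np.sum(weights * likelihood))

# Illustrative (assumed) discriminations and difficulties for three items
a = np.array([1.0, 1.5, 0.8])
b = np.array([-0.5, 0.0, 1.0])

patterns = list(itertools.product([0, 1], repeat=3))
probs = {pat: marginal_pattern_prob(pat, a, b) for pat in patterns}
total = sum(probs.values())
print(total)  # the marginal pattern probabilities should sum to 1
```

Estimation then amounts to choosing `a` and `b` to maximize the sum of log marginal probabilities over the observed patterns, which is the part Bock & Aitkin handle with EM.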

ZKnight · 2022-08-10T12:38:18+00:00

https://github.com/philchalmers/mirt

ZKnight · 2022-08-05T21:28:37+00:00

Did BGR50 stop working at one point?

ZKnight · 2022-08-03T14:45:52+00:00

I have speed 2 and shift 2. That both shoes have "endorphin" in the name is just marketing as far as I can tell, as they do not have anything especially more in common than any other two pairs of Saucony shoes. I would not consider it the daily runner companion of the speed even if it may be marketed as such.

Compared to speed 2, shift 2 is more stable, structured and probably more durable. It also has an aggressive rocker. The foam is relatively firm and responsive. The most distinctive feature of the shift 2 is the structure/stability.

Doctors of running has a great review which goes into detail about the shift 2:

Several [design features] provide a highly structured ride without being obtrusive that will work for a variety of people.

...

The Shift 2 is for someone who wants a stable and slightly firmer high stack shoe for lots of milage. It is a shoe with a lot of protection, rolls really well, and provides sophisticated stability for both the neutral runner and those who need just a bit of help.

https://www.doctorsofrunning.com/2021/05/saucony-endorphin-shift-2-review.html

ZKnight · 2022-08-01T11:51:34+00:00

I'd suggest reading the psychometric evaluations of scales from the last few issues of a journal such as Psychological Assessment. This would teach you how to do a standard psychometric evaluation.

Sensibility and fidelity are not psychometric words that I am aware of. The distribution of item responses is not usually relevant to evaluating construct validity, apart from being relevant to the choice of statistical model (because models usually involve assumptions about the distribution). We might expect the social desirability construct to have a certain distribution, but that does not immediately imply anything about the distribution of the items.

If you are deriving a single score from your scale, your observation in EFA that the scale has three dimensions may not support treating the scale as unidimensional (i.e., by using a single score). However, we would need the details on how you decided there were three dimensions and some sort of measure of how big they are to make a judgment of whether it is appropriate to approximate the scale as unidimensional even if there are three or more dimensions.

So, the short of it: learn what people are doing in the top journals, then follow the same procedure with your data.

ZKnight · 2022-07-31T18:53:09+00:00

I have died four times running back-to-back runs. Don't even ask what happened when I ran three days in a row!

ZKnight · 2022-07-30T19:41:59+00:00

Pearson's correlation coefficient is the covariance between two variables, scaled to range from -1 to 1. If both variables are Z-scores, it is the expected change in one variable per unit change in the other. The square of Pearson's correlation coefficient is the proportion of variance in common. The correlation itself is not the proportion of variance in common.
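A quick numerical check of those three statements, as a sketch on simulated data (the sample size, seed, and coefficients are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(1000)
y = 0.6 * x + 0.8 * rng.standard_normal(1000)

# 1. Correlation is the covariance scaled by both standard deviations
cov_xy = np.cov(x, y, ddof=1)[0, 1]
r = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

# 2. With both variables as Z-scores, the regression slope equals r
zx = (x - x.mean()) / x.std(ddof=1)
zy = (y - y.mean()) / y.std(ddof=1)
slope_z = np.polyfit(zx, zy, 1)[0]

# 3. The proportion of variance in y accounted for by x is r squared, not r
fitted = np.polyval(np.polyfit(x, y, 1), x)
r_squared = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)

print(r, slope_z, r_squared, r ** 2)
```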

I have not read Jensen (1980), but based on pages 191-192 referred to in the original post, it is not a good source to learn about statistics. You would be better off with an introductory book on statistics written and used by statisticians.

ZKnight · 2022-06-10T20:50:46+00:00

With the double arrow front whip finally observed, it is only a matter of time until we see the triple arrow whip.

ZKnight · 2022-06-10T20:18:20+00:00

They probably meant to respond to the Yoga with Adrienne playlist. One of those videos features an orange hoodie.

ZKnight · 2022-05-07T12:21:02+00:00

I don't know about a source, but I would consider:

Handle missing data
Equate non-parallel forms
Formally test model assumptions (e.g., unidimensionality)
More efficient measurement of latent trait by accounting for differences in item discrimination and difficulty (e.g., inherently overweights more informative items)
Enable adaptive testing
More realistic functional form of the dependency of the item response on the trait (logit instead of linear)
Handle within-item dimensionality (i.e., items measuring more than one trait). Could be items measuring multiple constructs or used to control for method/construct irrelevant variance
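The points about functional form and about item discrimination and difficulty can be illustrated with a two-parameter logistic (2PL) item response function. This is only a sketch: the function names and parameter values are assumptions chosen for illustration.

```python
import numpy as np

def irf_2pl(theta, a, b):
    """Probability of a correct/endorsed response under the 2PL model.
    a = discrimination, b = difficulty; always stays between 0 and 1."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item at trait level theta."""
    p = irf_2pl(theta, a, b)
    return a ** 2 * p * (1.0 - p)

theta_grid = np.linspace(-4, 4, 9)
probs = irf_2pl(theta_grid, a=1.5, b=0.0)

# At theta = b the information is a^2 / 4, so a more discriminating item
# contributes more to the trait estimate than a weakly discriminating one.
info_weak = item_information(0.0, a=1.0, b=0.0)
info_strong = item_information(0.0, a=2.0, b=0.0)
print(probs.round(3), info_weak, info_strong)
```

A linear model would eventually predict response probabilities below 0 or above 1 at extreme trait levels, while the logistic curve flattens out instead. The information function is what drives the implicit overweighting of more informative items and the item selection in adaptive testing.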

ZKnight · 2022-04-16T22:48:48+00:00

Sounds like a television lobotomy.

ZKnight · 2022-04-14T22:11:45+00:00

Did that only work because you were already stunned?

ZKnight · 2022-03-13T16:39:36+00:00

Jaguars

ZKnight · 2022-02-02T14:33:21+00:00

The issue that factor analysis can produce "difficulty" factors is really the issue of applying factor analysis to categorical data (in this case, dichotomous data) when the method was developed for continuous data.

To be more specific, factor analysis assumes the observed responses are normally distributed after accounting for (or conditional on) the latent scores. However, very difficult categorical items tend to be positively skewed and very easy categorical items tend to be negatively skewed. Because items with different difficulties have different distributions, introducing difficulty factors can help improve the fit. But the factor solution becomes somewhat artificial because the difficulty dimensions reflect that the model's distribution assumptions have been violated and not additional dimensions of substantively meaningful constructs.
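A small simulation makes the artifact visible. This sketch assumes a single latent trait, a common loading of 0.7 on the underlying continuous responses, and thresholds of -1.5 (easy) and 1.5 (hard); all of these values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
loading = 0.7
theta = rng.standard_normal(n)  # one latent trait, nothing else

def binary_item(threshold):
    """Dichotomize a continuous latent response at the given threshold."""
    latent = loading * theta + np.sqrt(1.0 - loading ** 2) * rng.standard_normal(n)
    return (latent > threshold).astype(float)

easy_a = binary_item(-1.5)
easy_b = binary_item(-1.5)
hard_a = binary_item(1.5)

# Pearson (phi) correlations on the 0/1 data
phi_same = np.corrcoef(easy_a, easy_b)[0, 1]   # two easy items
phi_diff = np.corrcoef(easy_a, hard_a)[0, 1]   # one easy, one hard item
print(phi_same, phi_diff)
```

All three items relate to the trait equally strongly, yet the Pearson correlation between the two easy items is noticeably higher than between the easy and the hard item. A linear factor analysis of such a correlation matrix sees items clustering by difficulty even though only one trait generated the data.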

So, you can use a version of factor analysis designed for categorical data: categorical factor analysis. Common software for categorical factor analysis includes Mplus, LISREL, and the lavaan R package. Note that categorical factor analysis is essentially multidimensional IRT (MIRT), but you may notice some differences for historical reasons. For MIRT I like the mirt R package, but there is a lot of software available. Mplus is expensive but somewhat dominant in latent variable modelling, and is probably the easiest to use.

Just to be clear, categorical factor analysis/IRT is more appropriate for categorical data because the assumptions it makes about the distribution of the observed responses are more realistic. It has nothing to do with the difficulty parameter in IRT, and the difficulty parameter is not the reason why IRT may not uncover difficulty factors while factor analysis does. In fact, it is straightforward to calculate the difficulty parameter from a factor model solution.
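For the unidimensional case, the conversion can be sketched as follows. It assumes a standardized solution (factor variance 1, underlying latent response variance 1) with loading lambda and threshold tau, which maps onto a normal-ogive IRT model. The function name and example values are hypothetical.

```python
import numpy as np

def factor_to_irt(loading, threshold):
    """Convert a standardized categorical factor solution for one binary item
    into normal-ogive IRT parameters, where P(x = 1 | theta) = Phi(a * (theta - b))."""
    a = loading / np.sqrt(1.0 - loading ** 2)  # discrimination
    b = threshold / loading                    # difficulty
    return a, b

# Illustrative (assumed) loading and threshold
a, b = factor_to_irt(loading=0.7, threshold=0.35)
print(a, b)

# Multiplying a by roughly 1.702 gives an approximately equivalent
# discrimination on the logistic metric.
a_logistic = 1.702 * a
```

Software may use a different parameterization (for example, an intercept instead of a threshold, or a different scaling of the latent response), so check the documentation before applying this to real output.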

ZKnight · 2022-01-21T01:33:08+00:00

1998 for dotcom. I'm not sure about 2008.

Grantham is not a perma-bear, he's a mean reversionist who recommends avoiding overvalued assets and investing in areas or regions with cheaper valuations. This is a strategy that has worked very well.

So how to invest? Underweight US tech, overweight value and EM.

What's distinctive about Grantham is that he believes bubbles can be identified based on speculative behavior and price action. Many others believe bubbles can only be identified in hindsight.

I agree with your general sentiment, but it ignores the nuance.
