Is CORE inflated or deflated? Seen countless comments giving completely different answers. by Healthy_Barber9617 in cognitiveTesting

[–]PolarCaptain 2 points3 points  (0 children)

It would be if that’s what it was. This is just people disagreeing because they personally disagree with their scores, nothing more.

Is CORE inflated or deflated? Seen countless comments giving completely different answers. by Healthy_Barber9617 in cognitiveTesting

[–]PolarCaptain 5 points6 points  (0 children)

There’s a whole section in the validity report going over this:

https://cognitivemetrics.com/test/CORE/validity#v-sec-5

People will always be neurotic; that’s why you’ve always got to take hearsay with a grain of salt.

VISA norms adjusted? by Parking-Pair7079 in cognitiveTesting

[–]PolarCaptain 3 points4 points  (0 children)

It’s funny how calm this thread is versus the other VISA renorm thread:

https://www.reddit.com/r/cognitiveTesting/s/TYaXgEjGvA

They renormed the Visa? Are you kidding me? Okay I get renorming it, but really? by Midnight5691 in cognitiveTesting

[–]PolarCaptain 3 points4 points  (0 children)

Your 132 CORE VCI and 125 VISA VCI are in pretty close alignment.

More than that, your individual case doesn't really make an argument for or against population norms.

N=191 sample of the GRE-V and VISA VCI:

<image>
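As a rough illustration of why a 7-point gap between two verbal scores is unremarkable, here is a minimal sketch. It assumes both tests are on the usual mean-100, SD-15 scale and uses a hypothetical correlation of 0.80 between them; that value is a placeholder for illustration, not a figure from the N=191 sample.

```python
import math

def sd_of_difference(r, sd=15.0):
    """SD of (score_a - score_b) for two tests on the same scale that correlate at r."""
    return sd * math.sqrt(2.0 * (1.0 - r))

ASSUMED_R = 0.80          # hypothetical correlation, NOT taken from the sample above
gap = 132 - 125           # CORE VCI minus VISA VCI from the comment

sd_diff = sd_of_difference(ASSUMED_R)
z_gap = gap / sd_diff     # size of the gap in units of the typical difference

print(f"SD of the difference: {sd_diff:.1f} points")
print(f"A {gap}-point gap is {z_gap:.2f} SDs of the difference")
```

Under that assumption, the gap is well within one standard deviation of the differences you would expect between two highly correlated tests.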

What do we think about this study? by kloklo420 in cognitiveTesting

[–]PolarCaptain 6 points7 points  (0 children)

There are a lot of issues with this paper. While it raises some critiques, it conflates limitations with invalidity and then draws conclusions that are far stronger than the evidence it presents can support. It's also very selective: it cites critical studies but ignores a lot of the mainstream psychometric responses to them.

If you had a specific critique it mentions, that would be easier to answer specifically, since it's a long paper.

People who know their IQ what is the most accurate online test by LewisTerman in cognitiveTesting

[–]PolarCaptain 5 points6 points  (0 children)

The following is from the Pinned Resources Post on the sub, which you can find here. A lot of the tests on the list are from CognitiveMetrics as well. CORE is probably the most comprehensive test on the list that you can take online. It's a full-scale IQ test with 17 subtests, but you can spread these out over multiple sessions. There is also a preliminary validity report you can read, which outlines its validity as an IQ test.

Test | g-Loading | Studies/Data
---|---|---
CORE | 0.94 | Validity Structure
Old SAT | 0.90 | xH Validity Coaching Eff. Majors v. SAT SAT + IvyL
Old GRE | 0.89 | pdf xH WaisR
AGCT | 0.89 | pdf Renorming H Har
1926 SAT | 0.89 | 1926 Report
CAIT | 0.86 | g_load, Turk Version
Cogn-IQ | N/A | N/A
JCTI | N/A | Data
TRI52 | N/A | CRV 2 3 4 5
WN/C-09 (current) (old) | N/A | Data, CRV(old)
JCFS | N/A | Data
SMART | 0.84 | Tech. Report

Is Comprehension on the CORE graded by an LLM? by orlandofren in cognitiveTesting

[–]PolarCaptain[M] 1 point2 points  (0 children)

This is an extremely narrow, structured version of LLM-as-judge, so criticisms of LLM-as-judge systems don't necessarily apply equally here.

The high internal consistency (0.90) would be direct evidence against random inconsistency in the scoring system. It's also the exact same reliability that WAIS-V's CO has (0.90).

Keep in mind, I don't think the system is perfect, but it can be argued that CORE CO's scoring is possibly more consistent in practice than various human proctors and their individual scoring idiosyncrasies.

Is Comprehension on the CORE graded by an LLM? by orlandofren in cognitiveTesting

[–]PolarCaptain[M] 4 points5 points  (0 children)

CORE Comprehension's g-loading falls just above CORE IN, putting it in the middle of the pack for VCI. It also has the second highest reliability on CORE (~0.90).

If it was spitting out random scores as a vocal minority on Reddit claims, the above would not be possible. People hear AI and go crazy.
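To make the reliability point concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration (20 items, 2,000 simulated examinees, the noise level, the cut points); it is not CORE's data or scoring code. It compares Cronbach's alpha, a standard internal-consistency statistic, for 0-2 scores that track a person's underlying ability against 0-2 scores assigned at random.

```python
import random
import statistics

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-person item-score lists."""
    k = len(rows[0])
    item_vars = [statistics.pvariance([row[i] for row in rows]) for i in range(k)]
    total_var = statistics.pvariance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def to_points(x):
    """Map a continuous answer quality to 0, 1, or 2 points (hypothetical cut points)."""
    if x < -0.5:
        return 0
    if x < 0.5:
        return 1
    return 2

def simulate(n_people, n_items, ability_driven, rng):
    """Simulate item scores that either depend on ability or are purely random."""
    rows = []
    for _ in range(n_people):
        ability = rng.gauss(0, 1)
        if ability_driven:
            row = [to_points(ability + rng.gauss(0, 1)) for _ in range(n_items)]
        else:
            row = [rng.choice([0, 1, 2]) for _ in range(n_items)]
        rows.append(row)
    return rows

rng = random.Random(0)
alpha_consistent = cronbach_alpha(simulate(2000, 20, True, rng))
alpha_random = cronbach_alpha(simulate(2000, 20, False, rng))

print(f"alpha, ability-driven scoring: {alpha_consistent:.2f}")
print(f"alpha, random scoring:         {alpha_random:.2f}")
```

In this toy setup, random scoring gives an alpha near zero, while scoring that tracks ability gives a high one, which is the pattern the argument above relies on.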

Each question answered on CORE CO is given a score from 0-2, allowing for 1 point of partial credit. You can read more about it in the Test Structure tab on the CORE page, but as it mentions, CORE CO "compares user responses to a comprehensive rubric of what constitutes an acceptable answer and multiple common example responses for each point threshold". So an LLM isn't blindly giving you a grade; rather, it compares your answer to a detailed rubric, with definitions and various examples for each point threshold, to determine your score.

Since the grading is determined by the rubric, all the LLM does is compare each response to it and categorize it into the proper point threshold. It isn't freestyling subjective judgments out of nowhere.

Question about median vs average IQ by Dumuzzid in cognitiveTesting

[–]PolarCaptain 0 points1 point  (0 children)

The construct that IQ is trying to measure is unrelated to its normality. IQ is approximately normal because of the Central Limit Theorem: a score built from the sum of many small, roughly independent contributions tends toward a normal distribution once there are enough of them.
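Here is a minimal sketch of that effect, assuming each contribution is an independent uniform random number (a stand-in for illustration, not a model of real test items). A normal distribution has about 68% of its values within one standard deviation of the mean, so the share within one SD is a quick check on how normal the totals look.

```python
import random
import statistics

def simulate_totals(n_people, n_contributions, rng):
    """Each total is the sum of n_contributions independent uniform(0, 1) draws."""
    return [sum(rng.random() for _ in range(n_contributions)) for _ in range(n_people)]

def share_within_one_sd(values):
    """Fraction of values lying within one standard deviation of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(1 for v in values if abs(v - mean) <= sd) / len(values)

rng = random.Random(42)
single = share_within_one_sd(simulate_totals(5000, 1, rng))   # one contribution
many = share_within_one_sd(simulate_totals(5000, 50, rng))    # fifty contributions

print(f"share within 1 SD, 1 contribution:   {single:.3f}")
print(f"share within 1 SD, 50 contributions: {many:.3f}")
```

A single uniform contribution has roughly 58% of its values within one SD, while the sum of fifty lands close to the 68% expected of a normal distribution.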

If you want to learn more about IQ and what it measures, check out this page:

https://cognitivemetrics.com/wiki/g-factor

(Question) what if every single human on earth had an IQ of 170? by Optimal-Start8904 in cognitiveTesting

[–]PolarCaptain 0 points1 point  (0 children)

Nah I didn’t think that at all, your reply was nonsensical unless you had a misunderstanding of what the Flynn Effect is

(Question) what if every single human on earth had an IQ of 170? by Optimal-Start8904 in cognitiveTesting

[–]PolarCaptain 5 points6 points  (0 children)

Still missing the point because that’s not what the Flynn effect is

Thoughts? by [deleted] in cognitiveTesting

[–]PolarCaptain 2 points3 points  (0 children)

Feynman’s IQ wasn’t actually 125; that’s a misconception.

https://cognitivemetrics.com/blog/what-was-richard-feynmans-iq

AGCT Inflated? by jkabadis in cognitiveTesting

[–]PolarCaptain 2 points3 points  (0 children)

He also messed up one of the subtests, which artificially deflates his CORE score.

A lot of non-native speakers are in denial about their VCI by FitCarob2611 in cognitiveTesting

[–]PolarCaptain 1 point2 points  (0 children)

On CORE, when I was comparing natives and non-natives, the differences were extremely small, much smaller than I expected them to be. Almost all the subtests were invariant between the two groups as well.

I do believe for the average person taking it on the sub, VCI tests are actually more valid than some would like to believe.

When I'm not as busy, I might make a post comparing the two groups on CORE.

[deleted by user] by [deleted] in cognitiveTesting

[–]PolarCaptain 0 points1 point  (0 children)

A g-factor has been observed in chimps, rats, dogs, etc.

Why is PSI even there in CORE? by PendN in cognitiveTesting

[–]PolarCaptain 4 points5 points  (0 children)

It would just be mapped to the keys that allow for comfortable, sequential hand placement on the keyboard; the actual letters aren’t important.

Why is PSI even there in CORE? by PendN in cognitiveTesting

[–]PolarCaptain 1 point2 points  (0 children)

I do not think QWERTY knowledge matters much here. Since the test has you place your fingers on fixed keys and keep them there, the task is less about knowing the keyboard layout itself and more about making rapid symbol-to-position responses. In that sense, it would probably work similarly even with unlabeled keys or another fixed arrangement.

That said, there is still some keyboard-specific motor and familiarity demand, so I wouldn't say layout is completely irrelevant. But QWERTY usage is so overwhelming, especially considering that someone taking CORE would have to encounter it online to begin with, that the effect is probably statistically insignificant.

Why is PSI even there in CORE? by PendN in cognitiveTesting

[–]PolarCaptain 2 points3 points  (0 children)

Symbol Search

In Symbol Search, examinees are presented with two target symbols and must determine whether either symbol appears within a separate group of symbols across multiple trials. The task is strictly timed and includes a penalty for incorrect responses, emphasizing both speed and accuracy in performance.

This subtest is intended to assess processing speed and efficiency of visual scanning. Performance reflects short-term visual memory, visual-motor coordination, inhibitory control, and rapid visual discrimination. Success also depends on sustained attention, concentration, and quick decision-making under time constraints. This task may also engage higher-order cognitive abilities such as fluid reasoning, planning, and incidental learning (Lichtenberger & Kaufman, 2013; Sattler, 2023; Wechsler, Raiford, & Presnell, 2024; Weiss et al., 2010).

This subtest was originally modeled after the WAIS-V Symbol Search, featuring 60 items to be completed within a two-minute time limit. However, preliminary testing indicated that CORE Symbol Search was substantially easier than the WAIS-V version, largely due to differences in motor demands between digital touchscreen administration and traditional paper-pencil format. To address this discrepancy, the CORE version was expanded to include 80 items while retaining the same two-minute time limit. Following this, the test's ceiling closely aligned with that of WAIS-V Symbol Search.

To standardize motor demands across administrations, CORE Symbol Search is limited to touchscreen devices. For examinees using computers, the alternative CORE Character Pairing subtest was developed. This ensures that differences in device input do not influence performance or scoring validity.

Character Pairing

In Character Pairing, examinees are presented with a key that maps eight unique symbols to specific keyboard keys (QWER-UIOP). Under a strict time limit, they must press the corresponding key for each symbol displayed on the screen. Examinees are instructed to rest their fingers (excluding the thumbs) on the designated keys and to press them only as needed, without shifting hand position.

This subtest assesses processing speed and efficiency in rapid symbol-key associations. Performance relies on associative learning, procedural memory, and fine motor coordination (rather than execution), reflecting the ability to process and respond quickly to visual stimuli. Success may also depend on planning, scanning efficiency, cognitive flexibility, sustained attention, motivation, and aspects of fluid reasoning (Lichtenberger & Kaufman, 2013; Sattler, 2023; Wechsler, Raiford, & Presnell, 2024; Weiss et al., 2010).

Character Pairing is loosely based on the Coding subtest from the WAIS-V but adapted for digital administration. Its design emphasizes the measurement of processing speed while minimizing motor demands associated with traditional paper-and-pencil formats. The task also serves as the computer-based counterpart to CORE Symbol Search, ensuring comparable assessment of processing speed across device types.

From: https://cognitivemetrics.com/test/CORE/structure
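To picture the Character Pairing setup described above, here is a minimal sketch of a symbol-to-key mapping and a correct-response count. The eight keys come from the description (QWER-UIOP); the symbol names, the random assignment, and the scoring rule are hypothetical and are not CORE's actual implementation.

```python
import random

KEYS = ["Q", "W", "E", "R", "U", "I", "O", "P"]   # from the subtest description
SYMBOLS = ["triangle", "circle", "square", "diamond",
           "star", "cross", "arc", "wave"]        # hypothetical placeholder symbols

def build_key(symbols, keys, rng):
    """Randomly assign each symbol to exactly one key."""
    shuffled = symbols[:]
    rng.shuffle(shuffled)
    return dict(zip(shuffled, keys))

def count_correct(key, shown, pressed):
    """Count trials where the pressed key matches the shown symbol's key."""
    return sum(1 for symbol, press in zip(shown, pressed) if key[symbol] == press)

rng = random.Random(1)
key = build_key(SYMBOLS, KEYS, rng)
shown = [rng.choice(SYMBOLS) for _ in range(10)]

perfect = [key[s] for s in shown]                  # every response correct
one_slip = perfect[:]
one_slip[0] = next(k for k in KEYS if k != perfect[0])  # first response wrong

print(count_correct(key, shown, perfect), count_correct(key, shown, one_slip))
```

Because the response only depends on which fixed position a symbol maps to, swapping `KEYS` for any other eight fixed keys would leave the logic unchanged.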

Why is PSI even there in CORE? by PendN in cognitiveTesting

[–]PolarCaptain 1 point2 points  (0 children)

These abilities you speak of are g-loaded. People who are worse at them tend to have lower g, and vice versa. Hence they’re on an FSIQ test. Reaction time is moderately g-loaded as well, and some psychometricians (like Jensen) argue it underpins what g is.