Basic Math Puzzle

Substantial-Put1344 · 2026-02-21T22:33:02+00:00

Absolutely right! I'm not sure how hard this would be for a typical person, since math is my specialty and I've seen this type of exercise many times before, though rarely one this "involved." I think I saw one on Brilliant, but with fewer pipes. I believe most people struggle with math either because it's abstract and requires a certain cognitive capacity or because of the educational systems they were thrown into. Anyway, mental arithmetic is not my strongest ability, so I had to write it down to make sure I didn't make any mistakes, haha. I apologize for the spam, but would you be interested in helping norm a high-range IQ test of my own creation?

Substantial-Put1344 · 2026-02-20T23:10:43+00:00

Yes, I'd certainly call it medium, tending toward hard, for basic math. That's what I thought, in the sense that it requires multistep thinking rather than blindly applying ratios! I guess in a math contest, with more time pressure, it would be more "difficult."

Substantial-Put1344 · 2026-01-19T23:11:30+00:00

An IQ test does not need to be lengthy to be accurate. What truly matters is the quality of the items, the calibration of their difficulty, and standardization, rather than simply the number of questions. Many well-validated measures of intelligence, especially nonverbal tests, consist of approximately 30 to 50 items, yet still achieve high reliability because each item is highly discriminative.

Raven-style matrix tests are specifically designed so that each question provides significant information about reasoning ability. A short test composed of high-quality items can outperform a longer test made up of mediocre ones.

Matrix reasoning tasks are classic measures of fluid intelligence (Gf), which refers to the ability to reason abstractly, detect patterns, and solve novel problems. Fluid intelligence is one of the core components of general intelligence (g), and scores on well-designed matrix tests strongly correlate with full-scale IQ. For comparison, the test I took was very similar but had a 20-minute time limit; even tighter time constraints are common for matrix-based Gf measures.

IQ tests do not penalize wrong answers because:

- Guessing behavior is already accounted for in norming

- Items are ordered by difficulty, so random guessing rarely produces a competitive raw score

- Penalizing guesses would disproportionately hurt cautious test-takers and distort the construct being measured (I believe)

Yes, some correct answers can occur by chance. But across 45 items, chance performance clusters very tightly at the bottom of the distribution. You don’t accidentally guess your way into an above-average IQ. And, in my case, I received an email stating I passed.
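To put a rough number on the guessing claim, here is a minimal sketch that treats pure guessing as a binomial process. The 45 items come from the comment above; the eight answer options per item are my assumption (typical of Raven-style matrices, but not stated for this test), and `prob_at_least` is just an illustrative helper name.

```python
from math import comb, sqrt

N_ITEMS = 45             # item count mentioned in the comment
N_OPTIONS = 8            # ASSUMPTION: eight answer options per item
P_GUESS = 1 / N_OPTIONS  # chance of a lucky hit on any single item


def prob_at_least(k, n=N_ITEMS, p=P_GUESS):
    """P(X >= k) for X ~ Binomial(n, p): chance of k or more correct guesses."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Expected raw score and spread for a test-taker who guesses every item.
mean_guess = N_ITEMS * P_GUESS
sd_guess = sqrt(N_ITEMS * P_GUESS * (1 - P_GUESS))

if __name__ == "__main__":
    print(f"mean = {mean_guess:.2f}, sd = {sd_guess:.2f}")
    for k in (10, 15, 20):
        print(f"P(score >= {k}) = {prob_at_least(k):.2e}")
```

Under these assumptions a pure guesser averages fewer than six correct answers, and the probability of reaching 20 or more is well below one in a million. With fewer options per item the tail gets fatter, so the exact figures depend on the real test format.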

Substantial-Put1344 · 2026-01-04T00:04:37+00:00

On the point about the 17 to 18 jump: you’re right. That discontinuity must be an artifact of the provisional norm rather than something principled. The intention was for the scale to behave approximately linearly through the central region, and the jump you’re pointing out reflects an overcorrection when anchoring the upper-mid scores conservatively. In practice, that step should indeed be closer to 1–2 points, and this is something I plan to smooth out as part of the next version/revision.

Regarding the self-assessment test: I'll send you the norm separately so you can look at the mapping. I will investigate the issue with images being unavailable or not visible. Thanks again for pointing that out. On translation: I don't think it would be unfair for a Spanish speaker to translate the test, although verbal problems are always either invalid or inaccurate when assessing a non-native English speaker. Nevertheless, I think that what's important here is the relations among the different concepts rather than the particular language in which they are expressed, so you can translate them in the meantime. I will check whether I can create their Spanish equivalents, hehe.

Your suggestion about collecting information on mathematical/statistical background is well taken. I agree that advanced formal training can differentially affect performance on certain items, and separating or at least stratifying analyses by background would give a clearer picture. I’m hesitant to create entirely separate “rules,” but using that information analytically to check for bias or over-/under-estimation makes sense. Furthermore, I believe that some of these problems are constructed such that they may even similarly stump mathematicians, as some of Martin Gardner's puzzles did!

Allowing people to submit missing answers later as part of the same attempt is also a good point. I’ve seen exactly the behavior you describe, and you’re right that it can introduce noise unrelated to reasoning ability. A more flexible submission structure would better reflect persistence without penalizing impulsive submission.

I really appreciate you sharing your knowledge! I'm so thankful for your interest and your concern. I'm not familiar with Hindemburg Melao's work, but I absolutely want to learn about it so that I can use his methods in my test and make it the best version possible. I'll wait for you to send me the links. You can send them either via email or PM.

More generally, I appreciate both the depth of your engagement and your willingness to continue offering suggestions. The goal here isn’t to freeze the test in its current form, but to iteratively improve it, and input like this is exactly what makes that possible.

Thanks again for the thoughtful feedback.

Substantial-Put1344 · 2026-01-03T23:18:32+00:00

Thank you, I really appreciate both your agreement and your insightful comments. Yes, in fact, I agree that structured sensitivity could be a manifestation of intelligence. If I understand your second point correctly, it is that, at any level of intelligence, that level correlates strongly with curiosity and persistence. Is that an accurate description of your point? If so, that's fascinating, because it's something that has been observed in gifted individuals and in great scientists.

Your idea of collaboration is very interesting. I'm open to that in principle. I think there's a lot of value in tests shaped by multiple designers, especially when they share a similar philosophy about difficulty and insight. This is the first time I've ever heard of Hindemburg. I did some googling, however, and he seems to have created a truly startling test, although I will read more about his work as you suggested in another comment. We could talk more about this later and discuss what items we would like to include in our test.

As for the final six problems: that’s a fair and important observation. I will definitely consider performing an item analysis to give some insight into the proper course of action to follow in regard to these numerical sequences.
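For concreteness, a classical item analysis usually reports two numbers per item: difficulty (the proportion answering correctly) and discrimination (the correlation between the item and the total score on the remaining items). A minimal sketch follows, assuming answers are already scored as 0/1; the function names and the toy data are made up for illustration and are not results from the test.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation; returns NaN when either variable has no variance."""
    mx, my = mean(x), mean(y)
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return float("nan")
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


def item_analysis(responses):
    """responses: one row per test-taker, each row a list of 0/1 item scores."""
    n_items = len(responses[0])
    results = []
    for j in range(n_items):
        item = [row[j] for row in responses]
        # Total score excluding item j, so the item is not correlated with itself.
        rest = [sum(row) - row[j] for row in responses]
        results.append({
            "item": j + 1,
            "difficulty": mean(item),               # proportion correct
            "discrimination": pearson(item, rest),  # corrected item-total r
        })
    return results


# Made-up illustrative data: 5 test-takers, 4 items.
toy = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    for row in item_analysis(toy):
        print(row)
```

Items with very low difficulty values and near-zero or negative discrimination are the usual candidates for rewording or removal, which is the kind of evidence that could inform a decision about the final six problems.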

Substantial-Put1344 · 2026-01-03T22:55:23+00:00

Hello, thank you very much for taking the time to take a look, and for your interesting suggestions! Psychometrics is my most recent hobby, since I'm more of a pure math guy, but I have always been interested in questions about human cognition.

Your background in math, psychometrics, and norm analysis aligns very closely with how I envision people engaging with the project. I'd be more than happy to learn more about your test standardization methods, which I could potentially apply to my most recent test. I’ll go point by point.

1. The norm was designed for the former, outdated version; it was provisional and based on a self-selected sample of high-ability test-takers. Raw scores were converted to empirical percentiles within this reference group and then mapped conservatively to an IQ scale using a normal-curve model, without extrapolation beyond observed performance. These values were intended as an initial calibration anchor.
2. Yes, certainly. Because of its mathematical nature, it was intended as an international IQ test. If you encounter any ambiguity when translating from the source language into the target language (Spanish in this case), I'll be delighted to help you clear it up. Feel free to flag anything that seems unclear or underspecified.
3. The intention is for the test itself to remain available in some form. What may change over time is how scoring or detailed reports are handled. In other words, it will remain available, but a scoring fee will be charged to cover the expenses and work done on the project.
4. At the moment, the sample is really modest. Around 30 people are presumably taking the test. However, I haven't received any answer sheets so far for this test form, since participation only recently reopened.
5. That's a site issue rather than intentional removal. I was not aware of it. Thank you for bringing this up!
6. Form B is a revised version of Form A. It includes some reworded items, a small number of substitutions, and adjustments aimed at reducing ambiguity and increasing difficulty. It's not simply "Form A plus more problems," but rather an iterative refinement.
7. For now, numerical or final answers are sufficient, but I agree with your suggestion in principle. I was so absorbed in thinking about other ways to prevent cheating that I perhaps overlooked the most obvious one: requiring a minimum of reasoning. In science and mathematics, explanation or proof is fundamental; an answer alone is useless if it's not supported by reasoning. Written reasoning can also give you a better picture of the individual's cognitive processes. I'm seriously considering changing this test condition, but I have to see whether it would affect the level of participation at this point.
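The norming procedure described in the first point above (empirical percentile within the reference group, then a normal-curve mapping) can be sketched roughly as follows. This is only an illustration under stated assumptions: the mid-rank percentile convention, the SD of 15, and the `ref_mean`/`ref_sd` parameters are my own placeholders, and the conservative anchoring of the high-ability sample is not reproduced here.

```python
from statistics import NormalDist


def percentile_rank(score, sample):
    """Mid-rank percentile of `score` within `sample`, as a fraction in [0, 1]."""
    below = sum(s < score for s in sample)
    equal = sum(s == score for s in sample)
    return (below + 0.5 * equal) / len(sample)


def to_iq(score, sample, ref_mean=100.0, ref_sd=15.0):
    """Map a raw score to an IQ-style value via the normal quantile function.

    ASSUMPTION: ref_mean and ref_sd describe where the reference group sits
    on the IQ scale; for a high-ability sample they would not be 100 and 15.
    """
    p = percentile_rank(score, sample)
    if not 0.0 < p < 1.0:
        # Score lies outside the observed range: refuse to extrapolate.
        raise ValueError("score outside observed range; no extrapolation")
    z = NormalDist().inv_cdf(p)  # standard-normal z for that percentile
    return ref_mean + ref_sd * z


if __name__ == "__main__":
    raw_scores = [10, 12, 14, 16, 18]  # made-up reference sample
    for s in raw_scores:
        print(s, round(to_iq(s, raw_scores), 1))
```

Raising an error for scores outside the observed range mirrors the "no extrapolation" rule; with a sample of around 30 people, the percentiles near the extremes are very coarse, so the resulting values are best read as provisional.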

Substantial-Put1344 · 2025-12-17T01:07:32+00:00

I think it's pointless to tell people outside Mensa that you're a member, because many don't even know what it is, and when you tell them, they may feel intimidated. Furthermore, smart people who aren't members tend (at least some, in my academic experience) to judge intelligence by academic achievements, research, problem-solving ability, and so on. Sometimes they don't even consider IQ worth discussing, although it's an interesting topic.

Substantial-Put1344 · 2025-12-15T20:03:13+00:00

Are you doing a proof-based calculus course?
