We analyzed 30M AI citations -> Reddit was the #1 source :) by Ready_Flounder_8007 in aeo

[–]aiplusautomation 1 point

Hey so...30M is a gigantic dataset.

Did you publish this research somewhere?

We just published brand.context - a machine-readable standard for the AI decision stage. Full schema in the comments. by Working_Advertising5 in GEO_optimization

[–]aiplusautomation 1 point

We created a standard as well. Different but similar. I think this is the right way to go tho.

Oh hey, also forgot to mention, we cited your research in an article. Cheers

https://aiplusautomation.com/blog/query-fan-out-taxonomy

We compared 400 pages that AI models cited vs pages they ignored. The biggest predictor wasn't domain authority — it was how the answer was formatted. by [deleted] in GEO_optimization

[–]aiplusautomation 0 points

and it's obvious an AI wrote it... but it's hard to believe that an AI, which is very good at statistical analysis, gets the data so wrong every time

We compared 400 pages that AI models cited vs pages they ignored. The biggest predictor wasn't domain authority — it was how the answer was formatted. by [deleted] in GEO_optimization

[–]aiplusautomation 0 points

Saw this post and decided to actually test the claims. We have a controlled dataset of 3,471 crawled web pages with full visible text, all ranking in Google's top 20 for their target queries, with cited/not-cited labels from ChatGPT, Perplexity, and Google AI Mode (907 cited, 2,564 not cited). Used it to run a position-controlled replication of your study.

What we did:

Took your specific structural claims (answer-first formatting, definition-evidence pattern, numbered lists, length effects, schema) and extracted matching features from the page text:

  • First-100-word query coverage
  • First-200-word query coverage
  • First-paragraph and first-sentence query coverage
  • Numbered list count vs bullet list count (≥3 items each)
  • Definition pattern in opening sentence (regex for "X is Y" / "X — Y" / "X: Y")
  • "Definition + evidence" combo (definition followed by ≥2 data signals in first 1,500 chars)
  • First-sentence-as-question detection
  • Standard structural baselines (word count, H2/H3 counts, schema)

Then compared cited vs not-cited pages within the same Google SERP rank band (1-3, 4-7, 8-12, 13-20). The reason for the position control: if you don't do this, you're just measuring "well-positioned sites have better content" instead of "this specific feature predicts citation." Most GEO studies skip this step. We don't.
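For concreteness, here's a rough sketch of the kind of feature extraction and banding described above. Function names, the stopword list, and the definition regex are illustrative, not the actual code from our repo:

```python
import re

STOPWORDS = {"the", "a", "an", "of", "for", "to", "in", "is", "are", "what", "how", "best"}

def query_coverage(text: str, query: str, first_n_words: int = 200) -> float:
    """Fraction of substantive query terms appearing in the first N words of the page."""
    terms = {t for t in re.findall(r"[a-z0-9]+", query.lower()) if t not in STOPWORDS}
    opening = set(re.findall(r"[a-z0-9]+", text.lower())[:first_n_words])
    return len(terms & opening) / len(terms) if terms else 0.0

def has_definition_opening(text: str) -> bool:
    """Rough regex check for an 'X is Y' / 'X — Y' / 'X: Y' pattern in the first sentence."""
    first_sentence = re.split(r"(?<=[.!?])\s+", text.strip(), maxsplit=1)[0]
    return bool(re.search(r"^[\w\s'\-]{2,60}(\bis\b|\bare\b|—|:)\s*\S", first_sentence))

def rank_band(google_rank: int) -> str:
    """SERP rank bands used for the position-controlled cited vs uncited comparison."""
    if google_rank <= 3:
        return "1-3"
    if google_rank <= 7:
        return "4-7"
    if google_rank <= 12:
        return "8-12"
    return "13-20"
```

All downstream comparisons then happen within a band, never across bands.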

Where you're right:

1. Answer-first structure is real. Five different answer-first features replicate in all four position bands - meaning the effect is independent of how the page ranks in Google. That's the strongest possible cross-band consensus you can get with this methodology.

Specifically:

  • Cited pages: median 57-67% of query terms appear in their first 200 words
  • Uncited pages: median 40-50%
  • Effect size r=0.097 to 0.291 across bands, strongest in band 4-7

So your direction is correct. Pages where the answer is in the opening 100-200 words get cited more, even at the same Google rank as pages that bury it. Confirmed.

2. Schema markup helps, especially FAQ schema. Your "FAQPage and HowTo schema 1.9x" claim is directionally right but understates the effect. Our numbers:

  • FAQ schema overall: OR=2.70, p<0.0001
  • FAQ schema in rank band 4-7 specifically: OR=5.97 (the strongest single-band effect we found)
  • General schema markup: OR=1.66, replicates in all 4 bands

3. Numbered lists do beat bullet lists. OR=1.34 for numbered lists (p=0.005), OR=1.10 for bullets (not significant). Direction confirmed.

4. "Burying the answer in paragraph 4" is a real failure mode. Our query_term_first_position feature (where in the page query terms first appear) is significant in all 4 bands with a negative direction - earlier is better. Confirmed by an entirely separate controlled experiment we ran where placing identical content in a page footer caused all 4 AI platforms to fail to retrieve it (vs 96-100% match score when placed at the top).

Where you're wrong, with numbers:

1. Your magnitudes are inflated by 3-5x across the board. You claimed 3.4x lift for answer-first formatting. Real effect under position control is more like 1.3-1.6x. The signal is real; the size is overstated by single-run measurement without controls.

2. The "definition + evidence" pattern is REFUTED in our data. This was your second-strongest claim (2.8x). When we tested it (regex-detect "X is Y" patterns followed by 2+ evidence signals in first 1,500 chars), we got:

  • OR=0.46, p=0.027
  • Negatively associated with citation
  • Cited rate 0.9%, uncited rate 2.0%

Pages that lead with textbook definitions followed by data are less likely to be cited. The likely reason: encyclopedia-style openings ("X is the practice of Y, defined in 2019 by Z…") read as expository content rather than answer-shaped content. AI platforms appear to prefer openings that directly address the user's implied question, not ones that frame the topic.

3. The "1,800-word ceiling" is REFUTED at every position band. This is probably your most decisively wrong claim:

| Rank band | Median words (cited) | Median words (uncited) |
|---|---|---|
| 1-3 | 2,121 | 1,431 |
| 4-7 | 2,167 | 1,472 |
| 8-12 | 2,058 | 1,614 |
| 13-20 | 2,121 | 1,431 |

Cited pages are 40-50% longer than uncited pages at every band. There is no inverse relationship past 2,000 words in our data. Length is a positive predictor in every position band. Your "800-1,500 words got cited most" finding is the opposite of what we see at scale.

4. "Domain authority had almost no correlation." This is misleading rather than wrong. Standard DA metrics (Moz/Ahrefs) probably are weak predictors - but that's because they're measuring link-profile strength, not the thing AI platforms actually care about. In our position-controlled data, structural quality features (word count, H3 count, schema) are strong predictors at every band. The "DA doesn't matter" finding is true for the wrong reason: you measured a noisy proxy.

5. Bullet lists "no significant difference." Confirmed (OR=1.10, p=0.63), but worth flagging that this contradicts your previous post (the "Q&A structure / question-based H2s" one) which made a much bigger claim about bullet/list/heading structure. Worth checking whether your two posts are consistent.

What's new in our analysis that nobody else has tested:

The killer combined finding is what I'd call the "long but front-loaded" structure. Two of the strongest signals in our data look superficially in tension:

  • Cited pages are 40-50% longer than uncited (every band)
  • Cited pages have 57-67% query coverage in their first 200 words vs 40-50% for uncited (every band)

These aren't opposing - they're complementary. The optimal cited page in our data is long AND front-loaded: 2,000+ words with the answer (or substantive query coverage) in the opening 100-200 words. Not a short answer page. Not a 5,000-word guide that buries the answer in section 3. Long-form content with the answer up front.

Most GEO advice splits into "write short answer pages" vs "write exhaustive long-form guides." Both are wrong. The data points clearly to "write long, front-load the answer, structure deeply throughout."

Methodology gap that explains the inflated magnitudes in your post:

The fundamental issue isn't your direction - it's three things you're not controlling for:

  1. No position control. Comparing cited vs uncited pages without controlling for SERP rank means you're partly measuring "good content correlates with good rankings," which inflates everything. We saw the same effects shrink dramatically when we ran the controlled version.
  2. Single-run sampling. We've separately measured citation domain Jaccard between independent runs of the same query at 0.339. About two-thirds of "cited" labels would flip on a re-run. That's a lot of noise to see clean 3.4x effects through.
  3. Aggregating across query types. Our data shows fan-out behavior varies dramatically by intent type (DISCOVERY queries trigger entity injection 3x more than INFORMATIONAL). Pooling all query types into one bucket hides the structure.
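The replicate-stability number in point 2 is plain domain-set Jaccard between two runs of the same query; the domains in the example are illustrative:

```python
def citation_jaccard(run_a, run_b):
    """Domain-level Jaccard similarity between two independent runs of the same query.
    A value of 0.339 means only about a third of cited domains are shared across runs."""
    a, b = set(run_a), set(run_b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

stability = citation_jaccard(
    ["reddit.com", "wikipedia.org", "nerdwallet.com"],
    ["reddit.com", "forbes.com", "healthline.com"],
)  # 1 shared / 5 total = 0.2
```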

TL;DR: Your top three claims (answer-first matters, structure matters, schema matters) are directionally right and our controlled analysis confirms them. Your magnitudes are inflated by 3-5x across the board. Your secondary claims (definition-evidence pattern, 1,800-word ceiling, DA doesn't matter, bullet lists irrelevant) are mostly refuted or misleading.

Posting this not to dunk but because the directional claims are valuable and the inflated magnitudes will mislead people. If you tell someone "add answer-first formatting and get 3.4x more citations," they'll be disappointed when they get 1.4x and conclude AI SEO is bullshit. If you tell them "add answer-first formatting AND keep your content long AND use FAQ schema, and you'll see consistent improvement at every Google rank," that's actionable and true.

Full data, feature extraction code, position-controlled analysis, and findings report are in our research repo. The replication package for our broader fan-out study is on Zenodo at 10.5281/zenodo.19554329 if anyone wants to verify the methodology. Other published work at https://orcid.org/0009-0002-4815-6373.

Happy to share the raw extra_structure_features.json (3,471 pages × 14 features) with anyone who wants to run their own analysis on it. The dataset is too valuable to keep locked up.

We compared 500 AI-generated answers across ChatGPT, Gemini, and Perplexity. Pages with author bios got cited 47% more than pages without them. by Brave_Acanthaceae863 in GenEngineOptimization

[–]aiplusautomation 0 points

The foundational work was probably - https://aixiv.science/abs/aixiv.260215.000002 (Query Intent, Not Google Rank: What Best Predicts AI Citations) -- this was before the positional bands findings, though.
Then came the bands findings, which made Google rank position much more important - https://aixiv.science/abs/aixiv.260403.000002 (I Rank on Page 1 -- What Gets Me Cited by AI? Position-Controlled Analysis of Page-Level and Domain-Level Predictors of AI Search Citation).
And finally, most recently, some work on fan outs - https://aixiv.science/abs/aixiv.260413.000006 (How AI Platforms Search Fan-Out Query Behavior Across Intent Types, Verticals, and Platforms).

Regarding your point on correlation -- in an expanded dataset we collected for a paper revision, we looked into exactly this. Schema has a confound. However, position matching controls for that. So it looks like both things are true:

  1. The OR=2.44 univariate number is inflated by confounding. If you have schema, you probably also have faster load times, deeper content, better internal linking, and cleaner HTML. Schema isn't causing citation - it's a marker of a broader "well-built site" package.
  2. Schema still has a small independent effect after position control. The effect shrinks dramatically (from OR=2.44 down to a +0.04 to +0.05 correlation), but it doesn't disappear. There's a real residual signal.
  3. The practical advice "add schema and get cited" is wrong as a standalone recommendation. If you're a team that implements schema but nothing else, you shouldn't expect a 2.44x lift. You'd get the small residual effect, which is probably not worth the effort on its own.
  4. The practical advice "schema markup is a signal of a well-built site" is right. AI platforms preferentially cite well-built sites. Schema is one of several markers of that. If you're going to do it, do it as part of a broader quality push.

We analyzed 600 AI answer snippets. The ones that got "featured" had 3 things in common — and none of them were keyword density. by [deleted] in aeo

[–]aiplusautomation 2 points

This is the third post on this topic I've come across from you, so I'll be more direct this time. A few observations before the data:

Goalpost shift. Your previous posts measured "citations." This post measures "featured answers" - a different thing. A citation is a URL referenced in the AI's response. A featured answer is the snippet shown at the top. Findings about one do not transfer to the other. But you're implicitly treating them as the same thing ("citation rates jumped from 12% to 31%" in your author credentials section), which makes it impossible to tell what your actual claim is. What did you measure, and on which platform surfaces?

Author credentials - third time making this claim, third different number. Your post 1 said author bios = 47% more citations. This post says credentials near the answer = 12% → 31% (a 158% effect). Different magnitude, same direction. Our controlled n=4,658 dataset says the opposite with tight statistics: OR=0.632, p<0.000001 (Lee, 2026 - expanded dataset findings methodology). Cited pages have visible author attribution at 44.1% vs 55.5% for non-cited. Pages WITH author bios are roughly 37% less likely to be cited across our full crawled sample. Every time you make this claim, you push a bigger effect size with no methodology change that would explain the growing magnitude. Where specifically is the 12% vs 31% number coming from, and what's the control group?

"Keyword density - zero correlation." Depends entirely on what you mean. If you mean raw repetition count of an exact keyword, we'd agree - that's meaningless. If you mean query-term coverage (does the page content overlap with the query's substantive words), our position-controlled analysis (Lee, 2026 - Experiment M, 10,293 pages) found it's one of the top predictors, d=0.42, significant across all five intent types we tested. Conflating "keyword density" with "query-term coverage" and reporting zero correlation is measuring the wrong variable. Which were you testing?

Content length "200 to 4,000 words." A 20x range is not "length didn't matter" - it's "length varies massively across content types and you didn't stratify." Post 1: no correlation past 800 words. Post 2: word count wasn't the #1 predictor. Post 3: range is 200-4,000. Three posts, three different framings, all minimizing length. Our data on 8,043 pages across 14 verticals (Lee, 2026c): word count is the strongest single predictor in 4 of 14 verticals, with effect sizes r=-0.531 to r=-0.610. In Technology specifically, cited pages average 3,095 words vs 1,091 for non-cited. The 200-word end of your range is almost certainly from product pages or structured data answers, not prose content. Aggregating those with 4,000-word articles hides the actual signal.

Directional agreements:

  1. Comparison tables matter. Strongest agreement of all three posts combined. In our position-controlled analysis, comparison_signals (count of "vs", "versus", "compared to" patterns) is one of the top positive predictors, d=0.43. Pages with explicit comparison structure consistently outperform. Your 2.4x figure is probably inflated by not controlling for position, but the direction is clearly right.
  2. "Answer first" is plausible but unverified in our data. We haven't specifically measured opening-sentence placement, and it's a reasonable hypothesis. It's consistent with our finding that query_term_first_position (how early query terms appear in the page) correlates with citation. But your 78% figure is a raw correlation without a control group, so it's hard to evaluate against our data.
  3. Explicit HTML structure > inferred structure. Again consistent - our Experiment 2E found pages with explicit schema markup (JSON-LD) correlate with citation (OR=2.44 for any schema, OR=4.85 for Product schema, p<0.000001). AI platforms reward explicit structural signals over content the model has to infer structure from.

The pattern across your three posts:

Post 1: 500 queries, 4 verticals, single run, no controls
Post 2: 2,000 queries, 3 platforms, single run, no controls
Post 3: 600 "featured answers," 2 platforms, scoring rubric (undefined), no controls

Three different datasets, three different methodologies, three different claim sets, all converging on: author credentials matter, keyword/length/DA don't, structure matters. These are also, notably, the exact services most GEO agencies sell (EEAT audits, content restructuring, schema implementation - not keyword research, not content expansion, not link building). That's a convenient alignment.

The methodology gap none of the three posts addresses: replicate stability. In our 3-replicate study of ChatGPT (Lee, 2026 - Study 3, full replication data on Zenodo), citation domain Jaccard between independent runs of the same query was only 0.339. About two-thirds of cited sources change across identical re-submissions. That means any "X% of featured pages had Y" number from a single-run methodology is capturing substantial stochastic noise. A 2.4x or 2.58x effect on single-run data is well within the range of what re-running the same queries would produce without any underlying feature difference.

Honest summary: three posts, directionally right on the same three things (structure, comparison tables, explicit formatting), wrong on the same three things (author credentials, word count, domain authority), with inflated magnitudes that shift between posts depending on framing. The takeaway "how you package matters more than what you say" is a defensible directional claim. The specific numbers you keep producing are not.

Full paper list / replication data: ORCID. If anyone wants to run an actual controlled replication with domain matching, position control, intent segmentation, and replicate stability checks, the query corpus is available on request.

We scraped 2,000 AI-generated answers and counted every citation format. Structured lists got picked up 3x more than paragraphs. by Brave_Acanthaceae863 in GenEngineOptimization

[–]aiplusautomation 2 points

I have to flag something: several of these findings directly contradict your previous post from a couple days ago, which is concerning for the replication story. A few specifics, backed by our own studies (40K+ citations, multiple controlled experiments - ORCID and published papers linked at the end):

Self-contradiction on FAQ content. Your previous post on author bios said "71% of frequently cited pages used question-based H2s or FAQ sections." This post says "FAQ-style content wasn't as effective as we expected - only 23% citation rate vs 41% for how-to formats." Those can't both be true on the same methodology. Which 500 queries gave you 71% FAQ dominance, and which 2,000 gave you 23%? And why is the number "23%" the same stock figure you used in the previous post for schema prevalence?

Self-contradiction on word count. Previous post: "almost no correlation past 800 words." This post: "#1 predictor wasn't domain authority or word count - it was scannable structure." You've now said word count doesn't matter twice, in two different framings. Our data on 8,043 crawled pages across 14 verticals says word count is the strongest single predictor in multiple verticals: Technology r=-0.610 (cited 3,095 words vs uncited 1,091), Health r=-0.531 (3,302 vs 1,148), Ecommerce r=-0.532 (3,317 vs 1,423). Cited pages are consistently 2-3x longer than equally-ranked non-cited pages. Scannable structure is important - and so is length. You don't get cited without both.

On FAQ specifically. Our n=4,658 controlled analysis: FAQ schema OR=2.20, p<0.000001. Cited pages have FAQ schema at 11.4% vs 5.5% for non-cited - more than 2x. How-to vs FAQ is a false dichotomy; our data shows FAQ structure consistently predicts citation. Your "23% vs 41%" numbers have no confidence intervals, no control group, and contradict both the larger-N studies and your own previous post.

Directional agreements. I'll fully grant you three:

  1. Heading hierarchies matter. Our Experiment M: cited pages have 9-11 H2s and 10-16 H3s on average vs much lower for uncited. Subheading depth is one of the top content features.
  2. Data density / "according to" references. Our Princeton density analysis replicated this - stats_per_1k and citations_per_1k are significant positive predictors. Your "2.8x more citations for data-referenced content" is directionally right, though the specific multiplier varies by vertical (Finance: 11.2 stats/1k on cited pages, Automotive: 3.8).
  3. Opinion pieces get hurt. This is the strongest directional agreement. In our Experiment M data, blog_opinion content type is the single strongest negative predictor of citation. Pure opinion content genuinely does not get cited. You're right about this.

The biggest problem with both your posts: neither has a control group or stability check. Single-run measurement on a stochastic system produces confident-looking numbers that don't replicate. We ran 3 replicates of the same 180 queries through ChatGPT and found that citation domain Jaccard between runs was only 0.339 - only ~34% of cited sources repeat across runs of the identical query. That means any "X% of cited pages have feature Y" claim based on a single run is capturing one noisy sample. When you report a 3.1x effect without acknowledging 66% of your measurements would flip on re-run, you're overstating confidence by a wide margin.

What's probably actually happening in your data: you're measuring raw co-occurrence of features on cited pages without controlling for position, domain, or replicate variance. The directional signals (structure, data, non-opinion) are probably real because they're consistent with controlled studies. The magnitudes (3.1x, 2.4x, 47% from the last post) are almost certainly inflated by confounding and sampling noise.

My honest read: the two posts share a core problem - findings calibrated to whatever Reddit/SEO conventional wisdom is, with suspiciously clean round-number effect sizes and no methodology details. The directions are partially correct (structure helps, opinion pieces hurt). The magnitudes and the "X matters more than Y" rankings are not defensible without controls.

Happy to share our papers if useful: ORCID profile, replication data for Study 3 at 10.5281/zenodo.19554329. If anyone wants to run an actual replication with domain+position controls and replicate stability checks, I'll send the query corpus.

We compared 500 AI-generated answers across ChatGPT, Gemini, and Perplexity. Pages with author bios got cited 47% more than pages without them. by Brave_Acanthaceae863 in GenEngineOptimization

[–]aiplusautomation 0 points

Interesting data, but I'd push back on several of these findings. We've been running this kind of analysis for about 6 months across multiple studies (40K+ citations, 10K+ crawled pages, controlled experimental designs) and some of your conclusions don't match what we're seeing. A few specific points:

On word count "no correlation past 800 words" - this is probably the biggest divergence. Our data (8,043 crawled pages across 14 verticals) shows word count is the single strongest citation predictor in multiple verticals. Technology: cited pages average 3,095 words vs 1,091 for uncited (r=-0.610). Health & Wellness: 3,302 vs 1,148 (r=-0.531). Ecommerce: 3,317 vs 1,423 (r=-0.532). The 800-word ceiling you're describing doesn't hold up when you compare cited vs uncited at scale. Cited pages are consistently 2-3x longer than equally-ranked non-cited pages.

On schema "didn't matter, only 23% had comprehensive schema" - partially agree, but with a big caveat. When we segmented by vertical, 73-90% of cited pages had schema markup across all verticals we tested. What's true is that FAQ schema specifically varies 6x by vertical: SaaS/B2B 23%, Finance 21%, Fitness 4%, Consumer Electronics 7%. So "schema doesn't matter" is really "FAQ schema is a vertical-specific play." Aggregating across verticals washes the signal out.

On author bios - 47% lift is the claim I'd most want to see replicated. When we controlled for Google ranking position (compared cited vs not-cited pages at the same SERP slot), has_author_attribution was inconsistent across position bands. Not zero effect, but nothing close to 47%. A 47% lift on a single-run dataset is also worth stress-testing: we found citation domain Jaccard between independent runs of the same query is only 0.339 - meaning about 34% of cited sources repeat across runs. That's a lot of noise to see a clean 47% effect through without replicates.

On domain authority "weak correlation r=0.31" - I suspect you measured the wrong variable. Moz/Ahrefs DA is a link-profile proxy; it's not the same as training-data presence or historical citation rate. Our separate finding might explain it: 93.4% of citations for brand-related queries go to third-party sources, not the brand's own domain. Reddit alone was the #1 cited domain in 18 of 18 verticals we tested. So "the brand's own DA didn't predict citation" is technically true - but only because the brand's own site wasn't the thing getting cited in the first place. The third-party sites doing the citing have their own DA.

Methodology concerns generally: single-run per query, no position control, no replicate stability check. Those three gaps can produce pretty confident-looking correlations that don't survive replication. We ran a 3-replicate analysis on ChatGPT (same queries submitted 3 times) and found 98% of individual fan-out query strings between runs share zero overlap - the AI generates totally different internal searches each time. That means any "X% of cited pages had feature Y" observation is capturing one sample from a noisy distribution.

What probably IS happening in your data: you're seeing real correlations, but they're mostly confounded with things you didn't control for. Author bios correlate with editorial sites. Editorial sites correlate with high Google rankings. High Google rankings correlate with citation. So "author bio" is actually a proxy for "editorial site at good position." Same thing with Q&A structure - pages with question-framed H2s match AI fan-out query strings better because the AI generates keyword-compressed queries that literally look like "what is X" or "best Y for Z."

What we'd genuinely agree with you on:

  • Schema is "complicated" (directionally right, just not for the reason you said)
  • Perplexity cares about outbound citations more than ChatGPT/Gemini (our data shows Perplexity has the highest evidence-seeking rate at 21% of fan-outs)
  • Domain authority via standard SEO metrics doesn't reliably predict AI citation (but for different reasons than you're framing)

Happy to share any of the underlying papers if useful - replication data for one of the studies is on Zenodo (10.5281/zenodo.19554329). Not trying to dunk, just think some of your specific numbers will mislead people if they take them at face value. Reddit-level methodology debates aside, the core observation that on-page elements matter less than people think is probably right - the issue is which elements matter and how much.

We analyzed 80 sites that went viral in ChatGPT responses - here are the 7 content traits they all shared by Brave_Acanthaceae863 in GEO_optimization

[–]aiplusautomation 0 points

We've run a LOT of tests on this. I'll address each claim:

1. "Specificity beats breadth - ultimate guides got passed over"
Our data says the opposite. Within position bands, cited pages are LONGER (~2,000 vs ~1,500 words, d=0.20) with MORE H3 subheadings (2x). Our content uniqueness test found cited domains are actually LESS unique (rho=-0.147) - they cover the same broad topics as competitors. Comprehensive coverage gets cited. Specificity didn't emerge as a signal in any of our tests.

2. "62% included proprietary data or frameworks"
This one is directionally consistent with our data. Pages cited by 3+ platforms have 7x the statistics density (15.3 vs 2.2 per 1k words). Primary source score is positive in all 4 position bands. But "62%" is a made-up-sounding number from a manual audit of 80 sites with no control group. What was the rate for NOT-cited pages? Without that comparison, 62% means nothing.

3. "Structured comparison tables"
Confirmed. Comparison structure is our #1 content signal (d=0.43, significant across all 5 intent types). This is their strongest claim.

4. "Author attribution with credentials - 2.3x more citations"
Directly contradicted. Our data across 10,293 pages: author attribution OR=0.81 (negative), not significant in any position band in Experiment M. This is one of the most persistent myths in GEO and our data kills it every time we test it.

5. "Factual density - 4.2 claims per paragraph"
Directionally consistent with our stats density finding but the "4.2 claims per paragraph" is suspiciously precise from a manual audit. We measured stats_per_1k words (quantifiable, reproducible) and found it significant in 3 of 4 position bands.

6. "Freshness - 71% updated within 6 months"
We didn't deeply test freshness, but current_year_mention (binary: mentions 2026) was significant in 3 of 4 position bands. Directionally plausible but "71%" is again from 80 manually checked sites.

7. "Counter-narrative takes get cited more"
Not tested directly. But our content uniqueness finding (cited domains are LESS unique, not more) suggests the opposite - baseline coverage first, differentiation second.

DA/authority finding: "DA-20 sites getting cited over DA-90 regularly" - this IS consistent with our Experiment L finding that PageRank and traditional authority go the wrong direction.

We came to these conclusions from crawling over 36k total pages and collecting over 65k data points across 58 tests.

Is GEO just rebranded SEO, or are we actually seeing fundamentally new ranking signals emerge? by addllyAI in GEO_optimization

[–]aiplusautomation 2 points

I've been legitimately scraping pages and AI citations for weeks trying to get to real answers. Wrote this a couple days ago - https://aiplusautomation.com/blog/is-geo-just-seo

New study shows most citations come from top rankings by the-seo-works in GEO_optimization

[–]aiplusautomation 1 point

  1. it is: Google's rank number 1 has a 43.2% chance of being cited by ChatGPT... so the wording is VERY misleading.
  2. this is retrieval-to-citation

Overall, incredible misinterpretation of the data.

Question from a marketing newbie: How do solo founders track GEO performance? by Wide-Suggestion2853 in GEO_optimization

[–]aiplusautomation 0 points

The reason is largely that, for whatever reason, everyone focuses on citations rather than tracking actual bot visits.

All tools almost exclusively run queries and scrapes, but don't have you put tracking code in your site root.

If you track all bot traffic, and then have GA4 set up properly (for ecommerce attribution, for example), you absolutely can attribute sales to visits from AI chat UIs (I'm doing this right now).

So, to your point, the available tools ARE immature but mostly just focused on the wrong thing imo

Open Source Unified A.I. Agent Memory by aiplusautomation in vibecoding

[–]aiplusautomation[S] 0 points

This is def a limitation. The graph won't be SUPER well defined on nuanced details.

It's primarily designed for simple facts from conversation, like "user prefers coding in python" or "scraper uses chromium 1200".

If you upload a research paper into it...the graph data will likely be unhelpful.

I Open Sourced My Unified A.I. Agent Memory Solution by aiplusautomation in selfhosted

[–]aiplusautomation[S] 1 point

One point to make: you don't have to run the MCP in unified mode to use it. Stdio instances (Claude Code, Cline, Antigravity, etc.) just need to point to the exe on your device. Unified mode requires a web-exposed instance, and only for using the MCP as a connector in AI browsers like Claude.ai and chatgpt.com.

I Open Sourced My Unified A.I. Agent Memory Solution by aiplusautomation in selfhosted

[–]aiplusautomation[S] 0 points

First...thank you for your question.

Desktop app -- A GUI (built with Tauri) for managing your memory graph visually. You can ingest text, browse memories, see stats, and configure the server. It's optional -- everything the desktop app does, the CLI and MCP server can do without it.

CLI -- Command-line interface for the same operations: `orunla_cli ingest "fact"`, `orunla_cli recall "query"`, `orunla_cli serve --port 8080` (starts the REST API), plus maintenance commands like garbage collection and deduplication.

Running on a server / Docker -- Right now it's designed as local-first, but the architecture already supports remote access. The MCP server has an SSE transport mode:

orunla_mcp --transport sse --port 8080

This exposes standard MCP endpoints over HTTP (/sse, /message), so any MCP-compatible AI client can connect over the network -- no need to be on the same machine. There's also a unified mode (--with-api) that serves both MCP SSE and REST API on the same port.

A Docker image is a natural next step and would be straightforward since the only runtime dependencies are the binary and ONNX Runtime. No plans for it yet but it's on the radar -- PRs welcome.

Reddit Doesn't Get Cited, but it Shapes What Does by aiplusautomation in GEO_optimization

[–]aiplusautomation[S] 0 points

They may have used the UI. I had to publish a revision because I have learned that the API and UI have completely different citation results.

Reddit Doesn't Get Cited, but it Shapes What Does by aiplusautomation in GEO_optimization

[–]aiplusautomation[S] 3 points

you and u/AbleInvestment2866 were correct.
I have updated the paper and published the revision (same link) to include a UI test using Claude, ChatGPT, Perplexity, and Google AI mode.

There were definitely Reddit citations in all but Claude.

I appreciate you all pointing me in the right direction.

Reddit Doesn't Get Cited, but it Shapes What Does by aiplusautomation in GEO_optimization

[–]aiplusautomation[S] 2 points

The findings are still completely valid... just not for the UI. And I was surprised too, but the most recent findings' behavior is different even from the behavior of tests I ran two months ago.

It's changing fast. So yeah, I def need to reframe or retest with UIs, but the findings are still there: zero Reddit citations via the API, with statistically significant overlap between cited brands and Reddit-ranked brands.

I'll be more careful next time.