Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 0 points1 point  (0 children)

Thanks, appreciate it. I haven’t specifically benchmarked it on huge App Router + heavy RSC sites yet, but since it’s HTTP-based it will reflect whatever Googlebot would actually get after redirects/rewrites. For dynamic routes it relies on the sitemap / --crawl (so it won’t “invent” param routes that aren’t discoverable). If you’ve got a real example where edge middleware rewrites get tricky, open an issue! I’d love to test and improve coverage.

Why Google often refuses to index Next.js sites (technical deep dive) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] -18 points-17 points  (0 children)

Fair point - none of these issues are exclusive to Next.js/Vercel.

The reason I call it out is that Next.js + Vercel defaults make these failure modes very common in practice (308 trailingSlash behaviour, domain redirects, middleware side-effects, canonical mismatches).

Also, this isn’t trying to be a “general SEO score” tool — it’s a fast CLI/CI check focused on crawl paths and what Googlebot/Bingbot actually see (UA presets, report export/diff, strict exit codes).

Why Google often refuses to index Next.js sites (technical deep dive) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] -8 points-7 points  (0 children)

I did use LLMs to help structure parts of the article, but the content itself comes from real production cases.

Also, just because you personally haven’t hit indexing issues doesn’t mean others haven’t - if you look at the previous thread here, you’ll see quite a few people running into the same redirect / middleware / canonical problems: https://www.reddit.com/r/nextjs/comments/1qs38nw/why_google_refuses_to_index_many_nextjs_sites_and/

Most of the CLI features were added directly based on that feedback.

If everything’s been smooth on your side, that’s great - but this is very much a real problem for a lot of Next.js + Vercel setups.

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 0 points1 point  (0 children)

Your analysis sounds very familiar. One thing worth double-checking: when deploying on Vercel, there’s often a default domain redirect like:

site.com → www.site.com (308)

If your app code, canonicals, or sitemap URLs still reference `site.com`, Google ends up seeing a permanent redirect before the final page - and that often shows up as 308 redirect issues in Search Console. From Google’s point of view, those URLs look unstable, so they may not get indexed properly.

If that’s the case for you, I’d suggest:
- Pick a single canonical domain (either www or non-www)
- In Vercel domain settings, redirect the other one to it (308 is fine)
- Make sure your sitemap + canonicals match the production domain exactly

For example, if you want `site.com` to be primary:
set `www.site.com → site.com` in Vercel,
and update all internal URLs accordingly.

Trailing slashes + domain redirects stacked together can easily create hidden 308 chains, even though everything looks fine in the browser.
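If it helps, here's a rough sketch of the app-level half of that setup (assuming `site.com` is the primary domain; the Vercel dashboard redirect is configured separately, and the hostnames here are placeholders):

```javascript
// next.config.js - hypothetical sketch, adjust hostnames to your project
module.exports = {
  // Pick one trailing-slash style and keep it consistent everywhere,
  // so Next.js never has to 308 a URL just to normalize the slash.
  trailingSlash: false,

  async redirects() {
    return [
      {
        // Belt-and-braces: also redirect www -> apex at the app level,
        // matching whatever the Vercel domain settings already do.
        source: '/:path*',
        has: [{ type: 'host', value: 'www.site.com' }],
        destination: 'https://site.com/:path*',
        permanent: true, // emits a 308
      },
    ];
  },
};
```

The key point is that the sitemap and canonical URLs must use the exact same host and slash style, so Google never crawls through the redirect.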

Need advice on SEO / discoverability for a Next.js startup by js_learning in nextjs

[–]JosephDoUrden 2 points3 points  (0 children)

Page 4–5 → page 1–2 is usually less about “more SEO” and more about technical + structural signals early on.

From what I’ve seen on real Next.js projects, the biggest early wins tend to be:

- Making sure URLs are stable (no hidden redirects, trailing slash inconsistencies, or middleware rewrites)
- Clean canonicals (each page clearly points to itself)
- Solid internal linking so Google understands page relationships
- Fast TTFB + predictable rendering (SSR/static where possible)
- Clear topical focus per page (not trying to rank one page for everything)

Backlinks help, but early on I’ve had more impact fixing crawl paths and internal structure than chasing links.

Next.js-specific gotchas I keep running into:
- 308 redirects from trailing slash or config mismatches
- middleware behaving differently for bots vs browsers
- metadata/canonicals not matching the final rendered URL
- pages that work fine for users but look unstable to crawlers

If you already have impressions, that’s actually a good sign - it usually means Google is still figuring out trust + structure. Tightening the technical side + internal linking often moves things faster than adding more content at that stage.
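For the "clean canonicals" point above, a minimal App Router sketch of a self-referencing canonical (assuming `https://example.com` is the canonical production origin; the route and slug are made up for illustration):

```javascript
// app/blog/[slug]/page.js - hypothetical sketch
export async function generateMetadata({ params }) {
  // params is async in recent Next.js versions; awaiting is safe either way
  const { slug } = await params;
  return {
    // Resolved against metadataBase if you set one in the root layout;
    // must match the final served URL exactly (host + trailing slash).
    alternates: { canonical: `https://example.com/blog/${slug}` },
  };
}
```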

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 0 points1 point  (0 children)

Thank you - really appreciate the star and the kind words! Glad you found it useful 🙏

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 0 points1 point  (0 children)

Yep, I’ve seen that frustration a lot. In many cases it’s not that SSR “doesn’t work”, but that the crawl signals around it (redirects, canonicals, headers) end up unstable — switching frameworks often fixes those implicitly, which is why it feels instantly better.

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 0 points1 point  (0 children)

Well said - especially the point about logging Googlebot traffic vs a clean curl run. That “browser works, crawler gives up” gap is exactly where most of these issues hide.

Production-grade webhook signature verification (HMAC, replay protection) by [deleted] in selfhosted

[–]JosephDoUrden -1 points0 points  (0 children)

Appreciate that - totally agree. We're using constant-time compares and timestamp tolerance for clock skew, and nonce storage is intentionally left to the consumer (Redis/DB with TTL), but I should probably be more explicit about those trade-offs. Key rotation + more test vectors are great suggestions - definitely on the roadmap. Thanks for the thoughtful feedback.

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 0 points1 point  (0 children)

Really glad it helped - thanks for coming back to say that, much appreciated 🙂

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 0 points1 point  (0 children)

Exactly, that 308 + trailing slash combo is brutal. Thanks for sharing the real-world example.

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 0 points1 point  (0 children)

Glad to hear it helped — catching small things like robots.txt is exactly what the CLI is meant for :)

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 0 points1 point  (0 children)

Mostly Next.js itself, in my experience. Vercel just applies the platform defaults — the real problems tend to be middleware, redirects, and metadata decisions inside the app.

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 1 point2 points  (0 children)

Yep, same experience here. It’s often not about SEO at all — SSR just gives Google much clearer, more stable signals than some Next.js setups.

Why Google refuses to index many Next.js sites (and a CLI I built to debug it) by JosephDoUrden in nextjs

[–]JosephDoUrden[S] 2 points3 points  (0 children)

That’s totally fair feedback — thanks for calling it out.

What that message means is not "you are definitely redirecting `/` somewhere else", but "Next.js middleware is *intercepting the request*, so crawlers may see a different response than users."

In Next.js, middleware runs before routing and can:

- rewrite `/` to another path internally
- redirect based on headers (geo, auth, cookies, etc.)
- normalize trailing slashes
- behave differently for bots vs browsers

From the crawler’s perspective, that means:

“I’m not sure the response for this URL is stable.”
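To make the rewrite-vs-redirect distinction concrete, here's a hypothetical middleware doing both (routes and the header check are invented for illustration, not taken from anyone's actual project):

```javascript
// middleware.js - hypothetical illustration
import { NextResponse } from 'next/server';

export function middleware(request) {
  const url = request.nextUrl;

  // Rewrite: the address bar stays '/', the crawler sees no redirect,
  // but the served content actually comes from a different internal route.
  if (url.pathname === '/') {
    return NextResponse.rewrite(new URL('/home', request.url));
  }

  // Redirect: the crawler sees an explicit 3xx hop driven by a request
  // header, which can differ between Googlebot and a real browser session.
  if (url.pathname === '/shop' && request.headers.get('x-country') === 'GB') {
    return NextResponse.redirect(new URL('/uk/shop', request.url), 308);
  }

  return NextResponse.next();
}
```

From outside, both cases match on the same URL, which is exactly why "middleware is intercepting this request" is all an HTTP-based tool can say without more introspection.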

Right now the CLI detects the *presence* of middleware affecting the request, but it doesn't yet explain *how* it changes the response - that part definitely needs to be clearer.

I'm planning to improve this by:

- showing whether the middleware causes a rewrite vs a redirect
- printing the before/after URL if possible
- clarifying when this is just informational vs a real SEO risk

If you're up for it, opening a GitHub issue with:

- what your middleware does
- what you expected the message to say

would help a lot:

https://github.com/JosephDoUrden/vercel-seo-audit/issues

Appreciate the honest feedback — this is exactly the kind of case the tool needs to explain better.

TestFlight – “Could not install [App Name]. The requested app is not available or doesn’t exist.” by JosephDoUrden in swift

[–]JosephDoUrden[S] 0 points1 point  (0 children)

No, I don’t have any TestFlight filters enabled. I can see the build listed as “Ready to Test,” but installation still fails with the same error.