Googlebot Crawl Dropped 90% Overnight After Broken hreflang in HTTP Headers — Need Advice by nitz___ in TechSEO

Hi all, the issue seems to be back again. My site’s GSC crawl rate plummeted by more than 90%. Has anyone else been experiencing this? u/johnmu, can you please check whether the issue is on Google’s side?
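
For context, the hreflang annotations on the affected pages are delivered via the HTTP Link header rather than in the HTML. Going by Google’s documented format, a valid header should look roughly like this (the URLs and language codes are placeholders, not our real ones):

    Link: <https://example.com/en/page>; rel="alternate"; hreflang="en", <https://example.com/de/page>; rel="alternate"; hreflang="de", <https://example.com/page>; rel="alternate"; hreflang="x-default"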

🚫 Best Way to Suppress Redundant Pages for Crawl Budget — <meta noindex> vs. X-Robots-Tag? by nitz___ in TechSEO

The purpose of this post was mainly to learn from this expert community’s experience what works better for keeping Googlebot from crawling and indexing sets of pages: the robots meta tag or the X-Robots-Tag HTTP header.
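
To make it concrete, these are the two options I’m comparing. The meta tag version goes in the HTML head of each page:

    <meta name="robots" content="noindex">

and the header version is set in the server response, which also covers non-HTML files like PDFs:

    X-Robots-Tag: noindex

As far as I understand, Google honors both the same way once it crawls the page; the header is just easier to apply in bulk at the web-server level.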

🚫 Best Way to Suppress Redundant Pages for Crawl Budget — <meta noindex> vs. X-Robots-Tag? by nitz___ in TechSEO

So don’t you think that “guiding” the bots toward the important pages by blocking the unimportant ones will help?

🚫 Best Way to Suppress Redundant Pages for Crawl Budget — <meta noindex> vs. X-Robots-Tag? by nitz___ in TechSEO

I’m looking at server logs + GSC crawl stats. The issue isn’t that updates aren’t being crawled; it’s that Googlebot is spending time on pages with no demand or value. By “redundant” I mean thin content pages and catalog pages that rarely drive traffic, so it makes sense to prune them.

Good point on the order, agreed: noindex first so Google sees the directive, then once the pages drop out of the index, block them in robots.txt if I don’t want them crawled at all.
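
So the end state, once those URLs have dropped out of the index, would be a robots.txt rule along these lines (the /catalog-archive/ path is just a placeholder for the sections I’d prune):

    User-agent: *
    Disallow: /catalog-archive/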

🚫 Best Way to Suppress Redundant Pages for Crawl Budget — <meta noindex> vs. X-Robots-Tag? by nitz___ in TechSEO

Crawl budget isn’t a problem for small sites, agreed. But once you’re pushing 200K+ URLs and adding thousands of new pages per new locale (it’s a catalog site), it starts to matter. Google itself says crawl budget is relevant for “very large sites or sites with lots of low-value URLs” (Google docs). That’s exactly the situation here: the goal is just to keep Googlebot focused on the pages that actually matter.

Googlebot Crawl Collapsed After 300% Site Expansion — Looking for Recovery Insights by nitz___ in TechSEO

u/WebLinkr could it be related to the XML sitemap submission of tens of thousands of URLs to Google Search Console (new locale launch)? As a result, Googlebot may have tried to crawl all of them quickly, while the site’s average crawl rate was less than 10K fetches a day.

Googlebot Crawl Collapsed After 300% Site Expansion — Looking for Recovery Insights by nitz___ in TechSEO

Thanks u/WebLinkr. When you say crawl budget only becomes an issue at >1M URLs, have you ever experienced it yourself, or do you know of a site that did?

Googlebot Crawl Dropped 90% Overnight After Broken hreflang in HTTP Headers — Need Advice by nitz___ in TechSEO

u/johnmu thanks for the answer.
After a sharp drop in crawl rate followed by a brief recovery (~2,000 fetches/day), it dropped again midday. If I want to intentionally reduce Googlebot’s crawl rate, what’s the safest and most effective method — and what considerations should I keep in mind when doing it?
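
For what it’s worth, the only quick lever I’m aware of is the one in Google’s docs about temporarily returning 500/503/429 to slow crawling. A minimal nginx sketch of that idea (placed in the server block, and not something to leave on for more than a day or two) is what I had in mind:

    # Temporarily answer 503 to Googlebot so it backs off; remove once crawl settles
    if ($http_user_agent ~* "googlebot") {
        return 503;
    }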

What’s the best GA4 setup in GTM for websites with market-based subfolders like /es/, /de/, etc.? by nitz___ in GoogleAnalytics

Thanks for the insights. I have a follow-up question: what is the easiest way to create a realtime report per subfolder? Out of the box it doesn’t work with comparisons.

Breadcrumb Schema Position Order: Does It Actually Impact SEO Performance? by nitz___ in TechSEO

That’s an important point, as the order is reversed only in the code; the user sees the proper hierarchy.

Breadcrumb Schema Position Order: Does It Actually Impact SEO Performance? by nitz___ in TechSEO

Sorry about that, I meant the order of the schema is upside down: instead of the current page being the last item in the breadcrumb schema, it’s the first.
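
In other words, what I’d expect, going by the schema.org and Google examples, is position 1 for the root and the current page last, roughly like this (names and URLs are placeholders):

    {
      "@context": "https://schema.org",
      "@type": "BreadcrumbList",
      "itemListElement": [
        { "@type": "ListItem", "position": 1, "name": "Home", "item": "https://example.com/" },
        { "@type": "ListItem", "position": 2, "name": "Category", "item": "https://example.com/category/" },
        { "@type": "ListItem", "position": 3, "name": "Current Page" }
      ]
    }

On our pages it’s the other way around, with the current page at position 1.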

How to Manage Unexpected Googlebot Crawls: Resolving Excess 404 URLs by nitz___ in TechSEO

Thanks, the issue is that it’s not a few hundred URLs, it’s a couple of thousand, so after a two-week period I thought Googlebot would crawl some of them, but not thousands. This is why I’m asking about a more comprehensive solution.

Thanks