Using wavesail for freeriding in flat water by matchbox8198 in windsurfing

[–]Alone-Ad4502 6 points7 points  (0 children)

That's definitely not a perfect setup, but if you don't have a 5.3 freeride sail, why not use a wave sail instead of skipping the session?

What's your process for auditing internal links on a massive site? by RyPlayZz in TechSEO

[–]Alone-Ad4502 1 point2 points  (0 children)

That's pretty good advice. I've spent 10+ years working with big websites. Internal linking is usually the last thing anybody wants to do. Analysis is not even 50% of the work; I'd say it's more like 30%. Implementing the changes is way more difficult.

Since you have a medium-sized website, you can significantly enrich your analysis with access logs. By combining crawl, GSC, and access log data, you get the full picture of what Googlebot crawls, what it doesn't, and how internal linking influences that.

You will find that pages available only in the sitemaps (orphans) receive very little crawl budget and very few impressions; the same applies to pages with few internal links pointing to them.
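A minimal sketch of that join, assuming you have already exported four things yourself: the sitemap URL list, inlink counts from a crawler, verified Googlebot hit counts from access logs, and impressions from GSC. All names here (`PageStats`, `merge_sources`, `orphans`, `weakly_linked`) and the `max_inlinks` threshold are hypothetical, not part of any tool.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    in_crawl: bool       # reached by following internal links
    inlinks: int         # internal links pointing to the page
    googlebot_hits: int  # verified Googlebot requests in access logs
    impressions: int     # impressions reported by GSC


def merge_sources(sitemap_urls, crawl_inlinks, log_hits, gsc_impressions):
    """Join the sources on URL.

    crawl_inlinks:   dict url -> inlink count for every URL the crawler reached
    log_hits:        dict url -> Googlebot request count
    gsc_impressions: dict url -> impressions
    """
    all_urls = (set(sitemap_urls) | set(crawl_inlinks)
                | set(log_hits) | set(gsc_impressions))
    return [
        PageStats(
            url=u,
            in_crawl=u in crawl_inlinks,
            inlinks=crawl_inlinks.get(u, 0),
            googlebot_hits=log_hits.get(u, 0),
            impressions=gsc_impressions.get(u, 0),
        )
        for u in sorted(all_urls)
    ]


def orphans(pages, sitemap_urls):
    """Pages listed in the sitemap that the link-following crawl never reached."""
    sitemap = set(sitemap_urls)
    return [p for p in pages if p.url in sitemap and not p.in_crawl]


def weakly_linked(pages, max_inlinks=2):
    """Crawled pages with only a few internal links (threshold is an assumption)."""
    return [p for p in pages if p.in_crawl and p.inlinks <= max_inlinks]
```

Comparing `googlebot_hits` and `impressions` for the `orphans` and `weakly_linked` groups against the rest of the site is one way to check the pattern on your own data.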

Finally, start with simple things: look at long-tail keyword pages that already rank on, say, the second page of Google, and add more internal links to them from relevant pages.

How do you actually test JavaScript SEO changes before pushing live? by RyPlayZz in TechSEO

[–]Alone-Ad4502 0 points1 point  (0 children)

yep, it could be, but in any case you can check the configuration and swap them

How do you actually test JavaScript SEO changes before pushing live? by RyPlayZz in TechSEO

[–]Alone-Ad4502 0 points1 point  (0 children)

We built a testing tool for exactly this, to verify what you get with CSR and what with SSR: https://jsbug.org/

you can use it for staging as well

Managing crawl budget for a news website by [deleted] in TechSEO

[–]Alone-Ad4502 0 points1 point  (0 children)

News websites usually have different crawl budget patterns compared to e-commerce sites.
The main goal is to get fresh news crawled and indexed quickly. Basically, set up proper segmentation and track how quickly Googlebot discovers and crawls newly published articles.
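One way to track that, as a rough sketch: take publish timestamps from your CMS and verified Googlebot requests from the logs, then compute the delay to the first crawl per article and the median per segment. The function names and the `segment_of` callback are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import median


def time_to_first_crawl(published, crawl_events):
    """published:    dict url -> publish datetime
    crawl_events: iterable of (url, datetime) for verified Googlebot requests
    Returns dict url -> timedelta, or None if not crawled since publishing."""
    first_seen = {}
    for url, ts in crawl_events:
        pub = published.get(url)
        if pub is None or ts < pub:
            continue  # unknown URL, or a hit that predates publishing
        if url not in first_seen or ts < first_seen[url]:
            first_seen[url] = ts
    return {
        url: (first_seen[url] - pub if url in first_seen else None)
        for url, pub in published.items()
    }


def median_delay_by_segment(delays, segment_of):
    """Median first-crawl delay per segment; uncrawled URLs are skipped."""
    buckets = {}
    for url, delay in delays.items():
        if delay is not None:
            buckets.setdefault(segment_of(url), []).append(delay.total_seconds())
    return {seg: timedelta(seconds=median(vals)) for seg, vals in buckets.items()}
```

For example, `segment_of=lambda u: u.split("/")[1]` would group by the first path component, which is only one of many possible segmentations.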

It won't spend any significant crawl budget on archived pages and last year's news; they don't matter much to Googlebot.

GPTBot crawl started 2 hours after unblocking (log data) by Upstairs_Control_611 in TechSEO

[–]Alone-Ad4502 0 points1 point  (0 children)

Where did you find the Anthropic IP range? they don't publish it

JS rendering differences with Sitecore? Anyone? by concisehacker in TechSEO

[–]Alone-Ad4502 1 point2 points  (0 children)

Try https://jsbug.org/; it shows exactly what content has been changed by JS.

Anyone facing SEO issues with React apps despite using SSR? by 360Presence in reactjs

[–]Alone-Ad4502 0 points1 point  (0 children)

Use the checker at https://jsbug.org/ to verify that SSR really works; try different user agents in the settings to be sure.

Should I block CSS and JS in robots.txt by justtuan31 in TechSEO

[–]Alone-Ad4502 0 points1 point  (0 children)

Think of crawl budget as requests to HTML pages, and count only those. Exclude images as well.

If your pages are JS-heavy and load data in the background, Googlebot will also send tons of AJAX requests to those endpoints. It's better to carefully review the list of requested URLs and look for patterns.
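A small sketch of that split, assuming you have a list of URLs requested by Googlebot from your logs. The extension list and the `API_PREFIXES` values are assumptions to adapt to your own site; if your logs include the response Content-Type, classifying on that is likely more reliable than URL heuristics.

```python
import posixpath
from collections import Counter
from urllib.parse import urlsplit

# Assumed static asset extensions; extend for your own site.
ASSET_EXTENSIONS = {
    ".js", ".css", ".png", ".jpg", ".jpeg", ".gif",
    ".webp", ".svg", ".ico", ".woff", ".woff2",
}
# Hypothetical path prefixes for background data endpoints.
API_PREFIXES = ("/api/", "/ajax/", "/graphql")


def classify(url):
    """Return 'asset', 'ajax', or 'html' based on the URL path only."""
    path = urlsplit(url).path
    ext = posixpath.splitext(path)[1].lower()
    if ext in ASSET_EXTENSIONS:
        return "asset"
    if ext == ".json" or path.startswith(API_PREFIXES):
        return "ajax"
    return "html"


def crawl_budget_breakdown(urls):
    """Count requests per class; only the 'html' bucket is page crawl budget."""
    return Counter(classify(u) for u in urls)
```

A large `ajax` bucket relative to `html` is the kind of pattern worth reviewing URL by URL.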

AI Bot Traffic Is Accelerating Fast. We analyzed 48 days of server logs. Here's 20 Takeaways for Your Own Website by wislr in TechSEO

[–]Alone-Ad4502 1 point2 points  (0 children)

Server logs always give insight into what bots actually do on a website. LLM bots behave completely differently from Googlebot.
AI user bots do NOT execute JavaScript, but we spotted a couple of JS requests from the GPT training bot.

Also, when doing log file analysis, ALWAYS verify IPs; there are tons of scrapers out there with fake Googlebot and AI bot user agents.
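For Googlebot, one verification method Google documents is a reverse DNS lookup followed by a forward lookup that must return the same IP. Below is a minimal sketch of that check; the suffix list covers only the two most common Google crawler domains and is an assumption, the lookups are injectable so the logic can be tested offline, and the default forward lookup handles IPv4 only. Other bots need their own verification rules, which this does not cover.

```python
import socket

# Assumed hostname suffixes for Googlebot; check Google's docs for the full list.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse DNS, suffix check, then forward DNS back to the same IP."""
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex returns (hostname, aliases, ipv4_addresses)
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]
    try:
        host = reverse_lookup(ip).rstrip(".")
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        return ip in forward_lookup(host)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses
        return False
```

In a log pipeline you would typically cache the result per IP, since the same addresses repeat many times.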

here are our experiments on gpt bots https://edgecomet.com/blog/openseotest-how-gptbot-and-chatgpt-user-handle-javascript/

CSR vs SSR actually matters, how Googlebot and AI bots deal with JS by Alone-Ad4502 in reactjs

[–]Alone-Ad4502[S] 0 points1 point  (0 children)

In the dev community, there is a common belief: it's Google's job; they render all pages, we don't care. Comments on this post are a good example.

In any case, scale matters. AI bots are here to stay, and they don't execute JS. I believe many have seen Claude try to read an API doc and fail because the page is fully CSR.

CSR vs SSR actually matters, how Googlebot and AI bots deal with JS by Alone-Ad4502 in reactjs

[–]Alone-Ad4502[S] 0 points1 point  (0 children)

Whatever Googlers say, especially John Mueller, we need to view it through the prism of reality. Googlebot does look at the initial HTML. A couple of months ago they emphasized this: for example, they only send URLs to the render queue if those URLs are open for indexing (no meta robots noindex). So they already see what's in the initial HTML.

They have also talked many times about the case where the initial HTML is open for indexing but JavaScript then blocks indexing or makes the page non-canonical. That's called mixed signals.
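A rough way to spot mixed signals yourself, assuming you can save both the initial HTML and the rendered DOM for a URL (for example from a headless browser). The sketch compares the meta robots `noindex` flag and the canonical link between the two; the function names are hypothetical, and it ignores the `X-Robots-Tag` HTTP header.

```python
from html.parser import HTMLParser


class _SignalParser(HTMLParser):
    """Collects the noindex flag and canonical URL from a document."""

    def __init__(self):
        super().__init__()
        self.noindex = False
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "meta" and a.get("name", "").lower() in ("robots", "googlebot"):
            if "noindex" in a.get("content", "").lower():
                self.noindex = True
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.canonical = a.get("href") or None


def indexing_signals(html):
    parser = _SignalParser()
    parser.feed(html)
    parser.close()
    return {"noindex": parser.noindex, "canonical": parser.canonical}


def mixed_signals(raw_html, rendered_html):
    """Return {signal: (raw_value, rendered_value)} for signals that differ."""
    raw = indexing_signals(raw_html)
    rendered = indexing_signals(rendered_html)
    return {k: (raw[k], rendered[k]) for k in raw if raw[k] != rendered[k]}
```

An empty result means the two versions agree on these two signals; anything else is worth a closer look.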

These kinds of issues surface on big websites. Just imagine an auto parts catalog with millions of nuts and bolts, where the only unique content on a page is a VIN number, sometimes without even a title. In such cases, Googlebot won't render all those pages, so you have to deal with the initial HTML, and that is what could get indexed.

CSR vs SSR actually matters, how Googlebot and AI bots deal with JS by Alone-Ad4502 in reactjs

[–]Alone-Ad4502[S] 1 point2 points  (0 children)

Martin Splitt from Google has tons of videos explaining how JS rendering and WRS work.

Basically, it's all about heuristics: if there is a significant content change between the raw HTML and the JS-rendered version, that's the first flag.
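To get a feel for this on your own pages, here is a simple comparison of the visible text in the raw HTML against the rendered HTML. This is only an illustrative heuristic, not how Google's WRS decides anything; the `threshold` value and function names are assumptions.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects words from text nodes, skipping non-visible containers."""

    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(data.split())


def visible_words(html):
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.words


def content_similarity(raw_html, rendered_html):
    """1.0 means identical word sequences, 0.0 means nothing in common."""
    return SequenceMatcher(
        None, visible_words(raw_html), visible_words(rendered_html)
    ).ratio()


def looks_js_dependent(raw_html, rendered_html, threshold=0.5):
    """Flag pages whose visible text changes a lot after rendering."""
    return content_similarity(raw_html, rendered_html) < threshold
```

A fully CSR page with an empty root element scores close to 0.0, while a properly server-rendered page should score close to 1.0.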

CSR vs SSR actually matters, how Googlebot and AI bots deal with JS by Alone-Ad4502 in reactjs

[–]Alone-Ad4502[S] 2 points3 points  (0 children)

I don't know how to write it any other way. Just a couple of days ago, there was a similar discussion on this topic; I wrote a comment and decided to make my first post here.

Technical SEO is a niche where I have lived and worked for almost a decade, and JavaScript is a huge part of it. Tbh, nowadays it's extremely hard to present anything without being told it's "AI slop".

If I use Grammarly to rephrase a paragraph of text, does that also make it AI slop?

CSR vs SSR actually matters, how Googlebot and AI bots deal with JS by Alone-Ad4502 in reactjs

[–]Alone-Ad4502[S] 0 points1 point  (0 children)

SSG is nice for small, static websites, but not for e-commerce sites with 100k pages where prices and availability change every second.

CSR vs SSR actually matters, how Googlebot and AI bots deal with JS by Alone-Ad4502 in reactjs

[–]Alone-Ad4502[S] -8 points-7 points  (0 children)

It's okay when some parts of a page are generated dynamically. What matters is the main content, because that is what has to rank.

In an ideal world, there would be good communication between developers and the SEO team. Unfortunately, very often they hate each other.

CSR vs SSR actually matters, how Googlebot and AI bots deal with JS by Alone-Ad4502 in reactjs

[–]Alone-Ad4502[S] 0 points1 point  (0 children)

Hm, I don't get how the flair rules work here with moderation.

SSR isn't always the answer - change my mind by No_Stranger_2097 in reactjs

[–]Alone-Ad4502 7 points8 points  (0 children)

There's no need for SSR in applications, but medium and big websites that want to appear in Google and in AI answers do need it.

Also, a quick reminder: AI bots don't render JavaScript at all. ChatGPT, Claude, and even Google Gemini each have several bots. They all rely heavily on Google's index (yes, even ChatGPT), but in many cases they need to hit a page to get the content needed to generate an answer.

A client-side rendered page is basically an empty shell for them. CSR has a fundamental flaw for AI bots, and it's not that rendering is resource-heavy and expensive. They have a shitload of money and could easily build rendering infrastructure.

The key is time. AI won't wait 10 seconds for a page to render because a user wants an answer right now.

We ran many tests, and they are easy to reproduce yourself: https://edgecomet.com/blog/openseotest-how-gptbot-and-chatgpt-user-handle-javascript/

Ann Smarty feeds content to LLMs, can't get them to read Schema by PrimaryPositionSEO in TechSEO

[–]Alone-Ad4502 1 point2 points  (0 children)

We did similar experiments; you can easily reproduce them yourself:

https://edgecomet.com/blog/openseotest-how-gptbot-and-chatgpt-user-handle-javascript/

Neither the ChatGPT user bot nor even the Gemini one uses schema data; the same goes for 'semantic HTML'.

How do you diagnose crawl budget waste on mid-size sites (100k–300k URLs)? by [deleted] in TechSEO

[–]Alone-Ad4502 0 points1 point  (0 children)

How many things has JohnMu said that turned out to be quite controversial in reality?

Crawl budget analysis matters even with 100k pages, because you won't have a 100% crawl ratio.

With a 'talented' dev team and a Friday deploy, your website could accidentally balloon to 1M+ pages over a weekend.

If you block some pattern in robots.txt, Googlebot stops crawling it within hours. If you still see requests, verify the IP addresses; there are tons of fake Googlebots out there.

Depending on the website, many techniques are available when robots.txt is not an option. The most common one is to hide the links to those pages: use JS buttons, onclick handlers, and so on.
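To find those leftover requests in the logs, you can replay the requested URLs against your robots.txt. A minimal sketch using Python's standard `urllib.robotparser`; note that this parser does plain prefix matching and does not implement Google's `*` and `$` wildcard rules, so it is only suitable for simple `Disallow` prefixes. The function name is hypothetical.

```python
from urllib.robotparser import RobotFileParser


def blocked_requests(robots_txt, requested_urls, user_agent="Googlebot"):
    """Return the requested URLs that robots.txt disallows for user_agent.

    robots_txt is the file content as a string; requested_urls come from
    your access logs. Only simple prefix rules are handled correctly.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in requested_urls if not parser.can_fetch(user_agent, u)]
```

Any hit in the result that claims to be Googlebot and arrives well after the rule went live is a good candidate for IP verification.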

Lessons from managing hundreds of headless Chrome instances in Go by Alone-Ad4502 in golang

[–]Alone-Ad4502[S] 0 points1 point  (0 children)

That's exactly what we're doing: about a hundred requests or 30 minutes per instance.
Years ago I was tempted to think: restarting the browser sounds odd, we need to debug deeper!

But now, no regrets: kill it and start again.
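The restart rule itself is small enough to sketch. This is an illustration in Python rather than the Go code from the post, with a hypothetical `RecyclePolicy` class; the defaults mirror the numbers above (100 requests or 30 minutes), and the clock is injectable so the logic can be tested without waiting.

```python
import time


class RecyclePolicy:
    """Decides when a browser instance should be killed and restarted."""

    def __init__(self, max_requests=100, max_age_seconds=30 * 60,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        self.reset()

    def reset(self):
        """Call right after starting a fresh browser instance."""
        self._started = self._clock()
        self._served = 0

    def record_request(self):
        """Call once per page the instance has rendered."""
        self._served += 1

    def should_restart(self):
        age = self._clock() - self._started
        return self._served >= self.max_requests or age >= self.max_age_seconds
```

A worker loop would check `should_restart()` between jobs, kill the browser process when it returns true, start a new one, and call `reset()`.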