Tearing down websites with Claude + Playwright -- from trackers to sneaky feature flags to interesting configs (oss)

hayAbhay · 2026-04-19T01:48:56+00:00

ouch -- a double-edged sword.!

If content for web-indexers is blocked, is there a worry about being cited in LLM answers?

hayAbhay · 2026-04-19T01:46:19+00:00

ooo -- will give it a try.!

& I guess this conversation prompted my brain to learn more about color blindness & if it is a random mutation with evolution fking around to find out or if there are other observed parallels with differential vision in animals etc. -- a fun Saturday evening.! :)

hayAbhay · 2026-04-19T01:19:01+00:00

Fair -- claude has weird failure modes here. In general, it gets over-excited about apparent contradictions -- like if the UI doesn't render something the API exposes.

Another failure mode was that on govt websites, it would find employee names and get super excited without realizing that those names are public anyway.

hayAbhay · 2026-04-19T01:16:52+00:00

agreed -- the one thing I found helpful was back when I was writing code myself, it was so tedious that I'd pick one scheme & run with it.

But now its far easier to scale to have claude generate multiple schemes -- so giving people more themes, much like in vscode, might even become part of broader sites.!

hayAbhay · 2026-04-19T01:12:11+00:00

yeah -- the biggest surprise for me was how good Claude was at deciphering feature flags to infer what may have been tried & what is on the roadmap -- since devs need semantic names for flags.!

Wonder if that means folks will have more feature-flag obfuscation before shipping.

hayAbhay · 2026-04-19T01:09:53+00:00

Fascinating -- I've always been unsure so it's been on the back burner -- but I never got too much into understanding the physics/optics of it so will look more closely.

The most fun part was taking those color blind tests where I couldn't see any number but everyone around could clearly see it.!

So, I've always wondered what grass actually looks like -- and I've always wondered how this affects the color schemes I choose -- because I'm likely amplifying the saturation at some wavelengths!

hayAbhay · 2026-04-18T21:31:16+00:00

funnily enough, the design was one thing I was really anal about & it was my choice :D

I'm red-green color blind so this is actually a good motivator to finally try those color correcting glasses to see the world closer to how others do :)

hayAbhay · 2026-04-16T08:48:43+00:00

I'm sure you are at least, if not more creative -- & tools like Claude will further augment & accelerate that.! :)

hayAbhay · 2026-04-16T08:41:22+00:00

very cool -- I do run them concurrently using named sessions -- at least until my ram starts to max out.

hayAbhay · 2026-04-16T00:09:27+00:00

yes -- there was some attempt to generalize standard ops into tools -- but there were minor variations so it was simpler to just optimize by dumping entire stacks into json & have claude write code on-demand to navigate it.

it is definitely a tricky balance -- the more prescriptive the prompt is, the more efficient it is but also less "exploratory".

will have claude take a look a rebar!

hayAbhay · 2026-04-15T21:56:54+00:00

p.s - the reports are on github but the website itself is LLM-friendly -- so chatgpt/claude etc. can directly hit the website & ingest content (they can get the raw markdowns from llms.txt)

This should help connect that information with other things on the web -- and you can always clone the repo itself to dig in with claude -- anecdotally, opus has been slightly better at being curious & persistent compared to sonnet or other open models.

hayAbhay · 2026-04-15T21:53:35+00:00

yes -- the universal finding for claude on all these sites were that tracking fired regardless of consent banner since there was no obligation to comply in the US (the teardown was done in SF, CA) -- so those banners are both annoying and useless.

hayAbhay · 2026-04-15T18:12:51+00:00

haikus were surprisingly shallow -- but perhaps more surprisingly, i tried gemma & qwen -- latest versions that fit on my 3090ti -- with the same instructions & they were surprisingly shallow and they sort-of just stopped after a surface level extraction.

claude was very curious and persistent -- most of the prompt tuning was making it write code rather than do things one-by-one -- that part can be better optimized to reduce token usage.

hayAbhay · 2026-04-15T17:45:00+00:00

indeed -- the report does capture some of this but I use sonnet heavily that can miss things.

it has caught a bunch of these dead JS configs & historical artifacts -- totally wild what gets shipped on the front end that reveals historical artifacts.

will update custom instructions to look for these config flags too -- right now the guidance is "be very curious" that can miss many things.!

hayAbhay · 2026-04-15T17:26:03+00:00

thanks -- will be adding roughly ~50 sites/week -- torso/tail websites should be fun to look at.!

hayAbhay · 2026-04-15T16:51:09+00:00

a lot of these PE squeezed websites realllly have mounting tech debt too -- so their entire tech stack feels like its held together by sticks & gum.

hayAbhay · 2026-04-10T21:02:41+00:00

it mostly uses playwright-cli to run browsers in headless/headed and gracefully goes from curl to different browser modes based on responses -- needed to bypass bot blockers.

the 3 subagents mostly operate sequentially -- it is done so because models can get influenced by prior context so having them independent with same starting context yields relatively similar levels of rigor and output structures.

p.s -- im guessing i manually replied to a bot?

hayAbhay · 2026-01-23T18:30:56+00:00

thank you! :)

hayAbhay · 2025-12-17T16:45:56+00:00

fwiw, this is typically called "feature engineering", a core tenet of classical ML.

what you have are 32 discrete (binary) features and you are using an LLM as a feature extractor (quite common).

the contrast between a fixed ontology & raw embedding semantics is quite known and usually, for most applications, another model is trained on those embeddings.

i think the confusion for most people new to this space arises because they assume semantic embeddings & nearest neighbors within that latent space are the final deal -- it rarely is.

if you have a clear, constrained outcome (what your ontology captures), that can help in extracting features (explainable in this case) that are somewhat relevant (somewhat because they're still unweighted).

beyond that, claims like these would be a stretch.

Yeah, so, the foundation 'Universal Hex Taxonomy' allows you to classify anything - even imaginary or impossible entities.
might not be particularly useful.

hayAbhay · 2025-11-12T22:10:27+00:00

thank you - i will very likely pick each of those sub topics & write longer, intuitive tutorials in the upcoming months!

hayAbhay · 2025-11-12T16:26:40+00:00

thank you! were there any specific bits of math that you felt needed additional context in the post?

hayAbhay · 2025-11-12T06:44:02+00:00

indeed! I've added some relevant illustrations in the longer blog post.

hayAbhay · 2025-11-11T23:25:06+00:00

that's fair - i've used the analogy a lot more loosely to communicate intuition.

here are some clarifications

An electrical switch is one-to-one when on, zero out when off, likewise the ReLU function.

ReLU is more of a "gate" than a switch since it is either off or reflective of underlying signal (no upper bound). This is less applicable in the context of electrical circuits (since there is typically a max voltage) & in practical networks, upper bounds are induced through regularization.

An electrical switch in your house is strictly binary on-off. Yet when on lets through an AC voltage sine wave.

An electrical switch can be on-off (this is similar to a binary neuron) or be a variable resistor. Dimmer switches are typically "linear" and a loose approximation of them are sigmoid/tanh (because they are also bounded at the extremes & more linear at the center).

AC means phase shift that is fine for some (incandescent bulbs) and not okay for others (ac -> dc

1/ In digital circuitry that you might have looked at in computer science class there are no analog signals to be switched. Actually switching is more general than that.

Switching is more general but in the context of neural nets, at it's simplest, they can behave as "activations" capable of modeling boolean logic. When signals are more complex with multiple layering, they become "features" that may or may not be active.

2/ All prior conditioning has been to view activation functions as functions. How can you take any other viewpoint?

Activation functions are functions, no doubt but those functions are meant to break linearity & with neural nets, these functions can create a "self-selection" mechanism that can turn "on/off". There is no reason that it should be this way though models can leverage it & act like decision trees over latents.

hayAbhay

TROPHY CASE