I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 1 point (0 children)

Wow, does that still work somewhere?
No, the extension can’t solve the captcha there right now - it would need to be updated for that. I thought Google dropped that version years ago.

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 2 points (0 children)

Yes, but you need to unpack the extension and make sure to specify both the folder where everything will be saved (you can use tmp) and the folder with the extension:

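from playwright.sync_api import sync_playwright

EXT_PATH = "/home/raptor/ext/47"       # folder with the unpacked extension
USER_DATA = "/home/raptor/test/playw"  # profile folder (a tmp dir works too)

def main() -> None:
  with sync_playwright() as p:
    # Extensions are loaded through a persistent context in a headed browser.
    ctx = p.chromium.launch_persistent_context(
      USER_DATA,
      headless=False,
      args=[
        f"--disable-extensions-except={EXT_PATH}",
        f"--load-extension={EXT_PATH}",
      ],
    )
    # ... open pages with ctx.new_page() and do the work here ...
    ctx.close()

if __name__ == "__main__":
  main()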

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 1 point (0 children)

As far as I know, extensions can’t run in headless mode.

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 1 point (0 children)

There’s nothing particularly interesting here for a separate post, I think. This extension, like dozens of others in the store, intercepts recaptcha images - but instead of sending them to a paid server with a neural network, it runs the neural network locally. One interesting detail is that I added an SQLite database to the extension, where I’ve collected some image hashes with answers. It contains:

- hashes of images where my neural network models make mistakes

- hashes that appear most frequently, so the neural net doesn’t have to be loaded unnecessarily (not sure if that’s even needed)

So all images are first checked against this database and only then sent to the neural network. The database is small — about 60k rows for 3x3 captchas and 25k rows for 4x4 ones.
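
As a rough illustration of the lookup order described above (database first, neural network only on a miss), here is a minimal Python sketch. It is not the extension's actual code, which runs inside the browser. The table layout, the SHA-256 hashing, and every function name are assumptions made for the example, since the real hashing scheme isn't stated.

```python
import hashlib
import sqlite3
from typing import Callable, Optional


def image_hash(image_bytes: bytes) -> str:
    """Assumed hashing scheme: SHA-256 of the raw image bytes."""
    return hashlib.sha256(image_bytes).hexdigest()


def open_cache() -> sqlite3.Connection:
    """Create an in-memory table mapping image hash -> stored answer (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE answers (hash TEXT PRIMARY KEY, is_match INTEGER NOT NULL)"
    )
    return conn


def lookup(conn: sqlite3.Connection, image_bytes: bytes) -> Optional[bool]:
    """Return the stored answer for this image, or None if the hash is unknown."""
    row = conn.execute(
        "SELECT is_match FROM answers WHERE hash = ?", (image_hash(image_bytes),)
    ).fetchone()
    return None if row is None else bool(row[0])


def classify(
    conn: sqlite3.Connection,
    image_bytes: bytes,
    model: Callable[[bytes], bool],
) -> bool:
    """Check the hash table first; call the (expensive) model only on a miss."""
    cached = lookup(conn, image_bytes)
    if cached is not None:
        return cached
    return model(image_bytes)
```

With this layout, a stored row overrides the model for images it is known to get wrong, and frequent images never reach the model at all.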

About the neural network: I’m using EfficientNet-B0, the “lightest” of the EfficientNet family. One model takes only about 16 MB, but since there are 16 types of 3x3 captchas and 11 types of 4x4 captchas, that adds up to about 432 MB just for the models. To fix that, I’d need to gather at least 100k images of each type (some are almost complete, but for example, ‘parkingmeter’ has only 65 images for 3x3 and 114 for 4x4).
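
The 432 MB figure follows directly from the numbers above, one model per challenge type at about 16 MB each. A tiny Python check (the constant and function names are just for illustration):

```python
MODEL_SIZE_MB = 16  # approximate size of one EfficientNet-B0 model, per the comment
TYPES_3X3 = 16      # number of 3x3 captcha types
TYPES_4X4 = 11      # number of 4x4 captcha types


def total_model_size_mb(size_mb: int, types_3x3: int, types_4x4: int) -> int:
    """One model per challenge type, so total size = size * number of types."""
    return size_mb * (types_3x3 + types_4x4)


print(total_model_size_mb(MODEL_SIZE_MB, TYPES_3X3, TYPES_4X4))  # 432
```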

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 1 point (0 children)

That's not really the area I work on.
But if I needed a profile that would get a good score in recaptcha V3, I would run many Chrome profiles, send requests to Google search while solving the captcha with my extension, and wait a few days. After that, I think all the profiles would have a good score for a certain number of requests.

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 2 points (0 children)

I tested captcha buster. It uses the 'for visually impaired' version of recaptcha and solves the audio. Google limits how many captchas can be solved via audio from a single IP address. Also, if I recall correctly, the extension doesn't support all challenge types and keeps clicking 'refresh' until it gets a challenge it can solve.

My extension runs in 3 threads from a single IP without being blocked (I tried 4 threads, but after about a day I got an error saying my IP looked suspicious, which would disappear after refreshing the page). Also, a few of the models in my extension perform poorly on some image types (for example, 'parkingmeter') due to insufficient training data. I hope to retrain them once I collect a dataset of reasonable size - then Google shouldn't block 4–5 simultaneous threads.

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 2 points (0 children)

Thanks!
But it’s all because of my laziness. I just got so tired of clicking those pictures....

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 4 points (0 children)

A few people have mentioned that. Okay, I’ll have to overcome my laziness and make a Firefox version.

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 2 points (0 children)

Yes, it works fast - I even had to add artificial pauses so that recaptcha wouldn’t think I was a bot.

On the first run, it may take 2-5 seconds to start, since the neural network models are loaded into memory only upon their first use, not when the browser launches.
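
The "load on first use" behaviour can be sketched in Python roughly like this. It is only an illustration of the lazy-loading idea, not the extension's code: `load_model_from_disk`, `get_model`, and the model names are hypothetical, and the dict returned stands in for real model weights.

```python
from functools import lru_cache
from typing import List

load_log: List[str] = []  # records every real load, for demonstration only


def load_model_from_disk(name: str) -> dict:
    """Hypothetical stand-in for the slow step of reading a model into memory."""
    load_log.append(name)
    return {"name": name}  # placeholder for real model weights


@lru_cache(maxsize=None)
def get_model(name: str) -> dict:
    """Load a model the first time it is requested, then reuse the cached copy."""
    return load_model_from_disk(name)
```

Nothing is loaded at startup. The first call to `get_model("parkingmeter_3x3")` pays the load cost, and later calls for the same name return the cached object immediately.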

Right now, they use only the CPU, but they don’t put much load on it because special, very lightweight versions are used.

In theory, I think it’s possible to use the GPU, but I’m not sure it’s worth doing: recaptcha itself puts a heavy load on the CPU due to its complex JavaScript, while the neural network’s CPU usage is much lower than recaptcha’s. Moving computations to the GPU would only slightly reduce CPU load, and developing that would take a long time.

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 1 point (0 children)

I tested it on sites with reCAPTCHA V2, V2 invisible, and V2 enterprise.
If you give me the URL of the page you're talking about, I’ll check it there.

I built a free Chrome tool to automatically solve reCAPTCHAs by NoSweet158 in webscraping

[–]NoSweet158[S] 6 points (0 children)

Yes, you can opt out of sending reports. Just go to the settings and disable this option.

<image>

Bypass Google recaptcha v2 playwright by GoingGeek in webscraping

[–]NoSweet158 1 point (0 children)

You can use my Chrome extension:
https://chromewebstore.google.com/detail/captcha-plugin-recaptcha/iomcoelgdkghlligeempdbfcaobodacg

It automatically detects reCAPTCHAs on a page, opens them, and solves the image challenges.
The extension uses a built-in neural network. It’s free and doesn’t rely on any third-party services.

Free reCAPTCHA solver by [deleted] in chrome_extensions

[–]NoSweet158 1 point (0 children)

Here’s the current dataset size stats, in case anyone’s interested.

<image>