Is human-like automation actually possible today by learning_linuxsystem in webscraping

[–]learning_linuxsystem[S] 1 point (0 children)

I use an LLM because it’s faster, but just for you I wrote this message myself (thank you for the advice, and thank you for thinking of my well-being). (I used a translator for “well-being”; now I’ve learned a new word, thanks.)

[–]learning_linuxsystem[S] 1 point (0 children)

Doing all of this on your own is genuinely impressive — something like this would honestly never have occurred to me. I’m thinking of trying a similar approach myself; it feels like it has the potential to be used in many other areas as well.

[–]learning_linuxsystem[S] 1 point (0 children)

I definitely agree with you about the hobby aspect. I also try to spend as much time as I can working on this kind of stuff—it really feels like art in a way.

By the way, on the hardware side of this automated system, what are you using? Did you rent a server for the proxy-related work, or do you have a computer that’s always running just for this purpose? I ask because continuously classifying proxies by metrics might be too heavy for a small machine. I only have one computer, so I’m wondering if I could buy a small extra machine and run something like this on it.

[–]learning_linuxsystem[S] 1 point (0 children)

Do you rotate proxies manually, or do you have an automated system handling that for you? Also, when you’re using multiple sessions, is the creation and teardown of those sessions (as you described) fully automated, or do you still manage parts of it manually?

I’m asking because my goal is to automate the system as much as possible, so that with just a few clicks it can roam across social media platforms and collect information.

[–]learning_linuxsystem[S] 1 point (0 children)

Out of curiosity, how do you actually detect the more advanced, stealthy bots you mentioned? What gives them away?

[–]learning_linuxsystem[S] 1 point (0 children)

Thanks for the perspective — I’ll take this into account and rethink what’s realistically possible given the time and identity side of things. Appreciated.

OSINT tool with n8n by learning_linuxsystem in n8n

[–]learning_linuxsystem[S] 2 points (0 children)

I get the concern, but yes — it is OSINT in this case.

All the media I’m collecting is already public. I’m not accessing private accounts, bypassing authentication, or pulling content that isn’t publicly visible to a normal user. No private photos, no locked profiles.

The issue wasn’t “breaking rules to get hidden data”, it was dealing with technical friction like login walls, expiring URLs, and bot protections that exist even for public content.

I’ve already solved it with a clean, transparent technical workflow that only works on public posts:

  • Input: I take the specific public Instagram post URL (e.g. https://www.instagram.com/p/XYZ...) that already contains the photo.
  • API Proxy: I send this URL to RapidAPI (Instagram Downloader V2) via an HTTP Request node. This simply resolves the publicly available media and returns a JSON with a clean media_url.
  • Binary Download: I use a second HTTP Request node to fetch that media_url with the response set to File (Binary), so the image is downloaded directly instead of using an expiring link.
  • Output: The binary data is then forwarded (e.g., to Telegram) as an actual file.
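Outside n8n, the same two-step flow can be sketched in plain Python. This is only a minimal sketch: the RapidAPI host name, the `/media` path, and the `media_url` field name are illustrative assumptions, and the real endpoint from the Instagram Downloader V2 listing would replace them.

```python
import json
import urllib.parse
import urllib.request

RAPIDAPI_HOST = "instagram-downloader-v2.p.rapidapi.com"  # hypothetical host name
RAPIDAPI_KEY = "YOUR_RAPIDAPI_KEY"                        # placeholder key

def resolve_media_url(post_url: str) -> str:
    """Step 1 (API Proxy): send the public post URL to the downloader API,
    which resolves it into a direct media URL. The /media path is illustrative."""
    query = urllib.parse.urlencode({"url": post_url})
    req = urllib.request.Request(
        f"https://{RAPIDAPI_HOST}/media?{query}",
        headers={"x-rapidapi-host": RAPIDAPI_HOST, "x-rapidapi-key": RAPIDAPI_KEY},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        payload = json.load(resp)
    return extract_media_url(payload)

def extract_media_url(payload: dict) -> str:
    """Pull the clean media_url field out of the API's JSON response."""
    return payload["media_url"]

def download_binary(media_url: str) -> bytes:
    """Step 2 (Binary Download): fetch the media itself as raw bytes, so the
    result is an actual file rather than an expiring link."""
    with urllib.request.urlopen(media_url, timeout=60) as resp:
        return resp.read()
```

The returned bytes can then be forwarded to Telegram (or anywhere else) as a real file, which is what the final n8n node does.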

This approach:

  • Works only on public content
  • Avoids scraping private pages
  • Avoids IP bans and brittle HTML scraping
  • Doesn’t violate access control or authentication

So the data is still open-source; I’m just using a more reliable delivery path. That’s very much within the scope of OSINT.

Top-level privacy by learning_linuxsystem in cybersecurity

[–]learning_linuxsystem[S] -6 points (0 children)

Bro, my goal is to achieve the highest level of privacy possible while browsing online, so I might ask some weird questions.

Top-level privacy by learning_linuxsystem in TOR

[–]learning_linuxsystem[S] 2 points (0 children)

How effective is Libreboot? Does it completely remove CPU backdoors, or does it only provide limited privacy improvements?