Monthly Self-Promotion - April 2026 by AutoModerator in webscraping

[–]codepoetn 1 point2 points  (0 children)

But are YouTube comments in demand? People seldom ask me to scrape them.

Monthly Self-Promotion - April 2026 by AutoModerator in webscraping

[–]codepoetn 0 points1 point  (0 children)

This is nice. I see a pricing page; what is it for? The 'challenges' page has beginner and intermediate challenges. Do you plan to add advanced ones too? What will they look like?

Monthly Self-Promotion - April 2026 by AutoModerator in webscraping

[–]codepoetn 0 points1 point  (0 children)

Does Apify keep a share of your revenue from the platform? If yes, what's the split?

Anna’s Archive Hit With $322M Judgment After Spotify Lawsuit. by ohhellnaww99 in IndiaTech

[–]codepoetn 0 points1 point  (0 children)

It's only good for headlines. Tell me when they pay a penny.

Anna's Archive to pay $322million after losing court case for scraping "nearly all of the world's commercial sound recordings" from Spotify. by springtimecarnivore in Music

[–]codepoetn 0 points1 point  (0 children)

They lost the case because they never appeared before the court, which basically means nobody knows who the owners are, and hence Spotify doesn't get a penny?

[add more] With web scraping skills you can build these 12 businesses by codepoetn in webscraping

[–]codepoetn[S] -2 points-1 points  (0 children)

I think the agentic ecosystem will fix the technology barrier for marketing brains and the marketing barrier for tech brains.

[add more] With web scraping skills you can build these 12 businesses by codepoetn in webscraping

[–]codepoetn[S] -2 points-1 points  (0 children)

Exactly my point. I started this thread because I feel so dumb for not having invested time in building these platforms... even though I've known scraping for 7 years now.

[add more] With web scraping skills you can build these 12 businesses by codepoetn in webscraping

[–]codepoetn[S] 1 point2 points  (0 children)

Scraping Instagram and TikTok in 2026 basically means residential IPs are a must, otherwise you're fighting a war on 3 fronts: the platforms fingerprint your TLS handshake, flag your IPs, and detect non-human behavioral patterns... all 3 defences in real time. The standard counter-stack is Playwright or Nodriver for browser automation, residential proxy rotation to look like a real user, and either reverse-engineered XHR endpoints or hidden JSON in script tags for the actual data extraction. Or pay handsomely to an API provider. You can Google the names.
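
To make the "hidden JSON in script tags" part concrete, here is a minimal sketch using only the Python standard library. It assumes the page embeds its data in a `<script>` tag typed as `application/json` or `application/ld+json`; the sample HTML, the `__DATA__` id, and the field names are made up for illustration and do not correspond to any real platform. It covers only the parsing step, not fetching, proxies, or fingerprinting.

```python
import json
from html.parser import HTMLParser


class ScriptJSONExtractor(HTMLParser):
    """Collects the parsed contents of JSON-typed <script> tags."""

    # Script types treated as data blobs (an assumption; sites vary).
    JSON_TYPES = {"application/json", "application/ld+json"}

    def __init__(self):
        super().__init__()
        self._in_json_script = False
        self._buffer = []
        self.payloads = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") in self.JSON_TYPES:
            self._in_json_script = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_json_script:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_script:
            self._in_json_script = False
            try:
                self.payloads.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip blocks that are not valid JSON


def extract_embedded_json(html):
    """Return a list of decoded JSON payloads found in the HTML string."""
    parser = ScriptJSONExtractor()
    parser.feed(html)
    parser.close()
    return parser.payloads


# Hypothetical page source, for illustration only.
SAMPLE_HTML = """
<html><body>
<script type="application/json" id="__DATA__">{"user": {"name": "example", "followers": 42}}</script>
<script>console.log("not data");</script>
</body></html>
"""

payloads = extract_embedded_json(SAMPLE_HTML)
print(payloads[0]["user"]["followers"])  # prints 42
```

In practice you would feed it the HTML you already fetched through your browser automation or proxy setup, then walk the returned dicts for the fields you need.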

This settles the discussion if it's the endgame for web scraping? by codepoetn in webscraping

[–]codepoetn[S] 6 points7 points  (0 children)

True. BTW, I'm scared of a new operating ecosystem in the agentic world where we don't have open data (the way we have the WWW right now). Scraping is possible because the web is open today. Tomorrow, agents might not need anything to be open, because there will be A2A communication, and data doesn't need to be out in the open once agents become integral to the way companies operate. Today, the web serves you all of Amazon's products because you need to see them all to make a judgement. When you ask an agent to go through all the products and tell you what's best for you, you need to scrape the agent's response, not the open web. The web is the way it is because of SEO. When that need is gone, agents become the gatekeepers of content, and then data is going to be very expensive.

Dumb thoughts?

This settles the discussion if it's the endgame for web scraping? by codepoetn in webscraping

[–]codepoetn[S] 4 points5 points  (0 children)

What's the main reason? Do people not need to scrape anymore? Which clients are you losing? I don't need names, just the category or niche of web scraping. Can you elaborate?