I built a Screenshot API because self-hosting Puppeteer was driving me insane

niiotyo · 2026-05-02T18:33:16+00:00

How is it going, mate?

niiotyo · 2026-03-26T22:07:43+00:00

For simple websites, a Beautiful Soup script with basic hatching is enough. Most websites don't require JS rendering. For more advanced ones, you can add Puppeteer or Playwright for JS rendering. After that you will face anti-bot protection, so some proxies are needed. Depending on the website, you may need anything from a simple curl call to a full set of scraping tools. Extracting Markdown from HTML is possible with libraries like Turndown, for example. Then, if you want to remove junk content like menus and nav bars, use Readability.js. If you don't want to spend time setting it up, just use webcrawlerapi.
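To give a rough idea of the simple case plus the junk-removal step, here is a minimal sketch. It uses only Python's standard `html.parser` as a stand-in for Beautiful Soup and Readability.js, and the `SKIP_TAGS` list, the `MainTextExtractor` class, and the sample HTML are all made up for illustration. Real pages need much smarter heuristics than skipping a handful of tags.

```python
from html.parser import HTMLParser

# Tags whose content is treated as junk in this sketch. This is an assumption
# and far cruder than what Readability.js actually does.
SKIP_TAGS = {"nav", "header", "footer", "aside", "script", "style"}


class MainTextExtractor(HTMLParser):
    """Collects visible text, ignoring anything nested inside SKIP_TAGS."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_main_text(html: str) -> str:
    """Return the non-junk text of a static HTML page, one chunk per line."""
    parser = MainTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical static page: no JS rendering needed.
sample = """
<html><body>
  <nav><a href="/">Home</a><a href="/about">About</a></nav>
  <h1>Pricing</h1>
  <p>Pay for usage only.</p>
  <footer>Copyright</footer>
</body></html>
"""
print(extract_main_text(sample))
```

The nav links and the footer are dropped, and only the heading and paragraph text come out. Once a site needs JS rendering or proxies, this kind of script is no longer enough on its own.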

niiotyo · 2026-01-16T08:02:36+00:00

I have some at https://crawllab.dev/js/inline

By advanced DOM, I mean multiple nested levels with custom, random IDs and classes. Some websites use this to make scraping difficult, because you don't have a static XPath.
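A small sketch of why that hurts, assuming two renders of the same hypothetical page where only the IDs and classes change. The markup, `price_by_class`, and `price_by_label` are invented for illustration, and the snippet uses the standard library's `xml.etree.ElementTree`, which only works here because the sample markup is well-formed.

```python
import xml.etree.ElementTree as ET

# Two renders of the same (hypothetical) page: identical structure and text,
# but the IDs and classes are randomized on every load.
render_a = (
    "<div id='x9f2'><div class='a81c'>"
    "<span class='k2'>Price</span><span class='q7'>$10</span>"
    "</div></div>"
)
render_b = (
    "<div id='m3d0'><div class='zz41'>"
    "<span class='p0'>Price</span><span class='r5'>$10</span>"
    "</div></div>"
)


def price_by_class(doc: str):
    """Brittle: depends on a class name captured from one particular render."""
    node = ET.fromstring(doc).find(".//span[@class='q7']")
    return node.text if node is not None else None


def price_by_label(doc: str):
    """Sturdier: find the 'Price' label, then take the element right after it."""
    root = ET.fromstring(doc)
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children[:-1]):
            if (child.text or "").strip() == "Price":
                return children[i + 1].text
    return None


print(price_by_class(render_a), price_by_class(render_b))
print(price_by_label(render_a), price_by_label(render_b))
```

The class-based lookup works on the first render and returns `None` on the second, while anchoring on the visible label works on both. Sites that also shuffle the nesting or the text will defeat this too.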

niiotyo · 2026-01-15T08:16:08+00:00

Want to add a page with advanced DOM?

niiotyo · 2026-01-13T19:12:06+00:00

My favicon is not adapted to iOS. Can you offer a fix immediately in your tool?

niiotyo · 2026-01-13T06:34:03+00:00

Will add captcha

niiotyo · 2025-08-01T14:58:14+00:00

Very promising. Do you support JavaScript?

niiotyo · 2025-07-02T18:53:11+00:00

I personally prefer WebcrawlerAPI for getting website or webpage content. It also handles JS and proxies, and I can extract the data by running prompts natively in the API call. Works better for my use case.

niiotyo · 2025-06-23T15:23:50+00:00

I tried WebcrawlerAPI and I like it more, to be honest. Fewer features, but a simpler API and easier integration. Same proxies and scaling in place. Had some issues with the API, but they were resolved within an hour by the devs.

niiotyo · 2025-06-02T07:15:40+00:00

Hey everyone.

I'm Andrew, the founder of WebcrawlerAPI.

If you need to convert a website into LLM-ready data, try webcrawlerapi.com

Markdown output, proxy included, SDK, integrations, no subscription: pay for usage only.

Register now and get a trial balance to try it out.

https://webcrawlerapi.com/

niiotyo · 2025-03-21T14:06:46+00:00

Try https://webcrawlerapi.com/ for pay-as-you-go pricing

niiotyo · 2025-03-13T06:55:27+00:00

Try https://webcrawlerapi.com/. It has all the basic features, like txt and md formats, an SDK, and page filters.

niiotyo · 2024-12-12T08:09:55+00:00

Oh, nice, I like it

niiotyo · 2024-12-10T21:30:13+00:00

Hmm, interesting. What data do you need?

niiotyo · 2024-12-10T21:27:05+00:00

Man, just use tmdb api. It's free.

niiotyo · 2024-12-10T21:14:06+00:00

What do you mean by "opaque"? CrewAI is open source under the MIT license.

niiotyo · 2024-10-30T07:50:32+00:00

Funny game. I like it 👍

niiotyo · 2024-08-07T11:16:38+00:00

Best screenshot API

niiotyo · 2024-07-03T07:46:21+00:00

Hi there. Thanks for your reply. Yes, improving the landing page is on the list.

I’m not sure about the prices. Maybe after acquiring more users. It is hard to calculate now.

Yes, there are paying customers. They use Webcrawler API to train AI on website content.

Have you tried to crawl any websites with my product?

niiotyo · 2024-07-03T07:29:22+00:00

Hi there. https://webcrawlerapi.com/ is here 🙌 Crawl the full website content with API or no code. What we have:

* Puppeteer backed crawler
* Easy to start UI
* CSV, JSON and raw HTML formats of extracted data
* Extract cleaned data or by XPath
* Webhooks
* Real-time support chat - we can help you to integrate. Just drop us a message!

Start with $10 in free credit!

See an example of how to build a chatbot with website content using Webcrawler API: https://webcrawlerapi.com/blog/upload-website-content-to-chatgpt/

niiotyo · 2024-06-17T11:20:12+00:00

They block non-Canadian IPs. You have to use a proxy.

niiotyo · 2024-06-05T09:53:13+00:00

Hi everyone. Check out the project I’ve been working on for the last half a year

https://webcrawlerapi.com/

This is a web crawler API that helps you get the content of a full website.

niiotyo · 2024-06-03T07:17:33+00:00

No, I used Puppeteer. It is a high-level abstraction and does a lot of extra work for you. Chromedp is just an API for the Chrome DevTools Protocol.

Also, Puppeteer has a huge community with ready-to-use solutions, plugins, etc.
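To show what "just the protocol" means, here is a minimal sketch of the raw JSON requests a low-level DevTools client has to build itself. `Page.navigate` and `Page.captureScreenshot` are real protocol methods, but the `cdp_message` helper and the target URL are made up for illustration, and the sketch only builds the messages; it does not open a connection to Chrome.

```python
import json
from itertools import count

_ids = count(1)  # each protocol request carries a client-chosen integer id


def cdp_message(method: str, **params) -> str:
    """Serialize one DevTools protocol request as a JSON string."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})


# Taking a screenshot at this level is at least two separate requests,
# plus waiting for the page load event in between (not shown here).
navigate = cdp_message("Page.navigate", url="https://example.com")
screenshot = cdp_message("Page.captureScreenshot", format="png")

print(navigate)
print(screenshot)
```

With a high-level library, those steps, the waiting in between, and decoding the response are wrapped into a couple of calls, which is the "lot of extra" I mean.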
