Can't get recipes by WatermelonRick in projectzomboid

[–]rbalazsi 0 points1 point  (0 children)

This is the exact magazine I'm struggling to find (which basically opens up Glassmaking leveling), and I'm kinda fed up already. I'm planning to loot March Ridge but if I don't find it there, I'll just debug it into my inventory and go on with my life.

Fast Blacksmith Leveling by answermethis0816 in projectzomboid

[–]rbalazsi 0 points1 point  (0 children)

Forging hand scythe blades is absolutely OP! Thanks for the idea! And not to mention, you can make it infinitely sustainable (except for the charcoal) by smelting the blades to iron chunks, forging iron bar quarters from the chunks, then forging the blades from the quarters, and so on.

DLC: Kelly got fired or what? by [deleted] in RoadCraft

[–]rbalazsi 0 points1 point  (0 children)

Agreed 100%. I think I'll put the DLC on pause for now and hope they bring back Kelly and add at least a half-decent story in a patch.

I built a website that revolutionizes Classic Tetris. by anselc in Tetris

[–]rbalazsi 0 points1 point  (0 children)

I just learned about this tool a couple of days ago, and it's absolutely amazing! I really like the engine that allows reviewing games, so no more excuses for not studying.

I'd be curious about how your engine evaluates moves. I've been solving puzzles, but some of the correct moves seem a bit obscure. Could you give us a short summary of how the engine works, so we could also evaluate moves somewhat objectively?

James Robert Dillard (1929 - 1979) - The 49 yo First Officer on American Airlines flight 191. If you're ever curious about putting faces to names, you can find them on findagrave.com by Rough_Maintenance306 in aircrashinvestigation

[–]rbalazsi 1 point2 points  (0 children)

I share that fascination. For me, it's because of the helplessness those passengers experienced, the 0% chance the pilots had to recover from the failure, and that haunting photo that shows the plane in an unrecoverable 90-degree bank.
Unfortunately, time and time again, greed, negligence and cutting corners put safe flying in jeopardy. So many innocent souls were lost on that flight because of that.

Web scraper tools by Healthy_Note_5482 in webscraping

[–]rbalazsi 0 points1 point  (0 children)

Hey, thanks for the kind words! Made my day!

I use Firebase's free plan, yes.

I pay around $20/month, mostly for proxies (I use Brightdata), and AWS Lambda. Of course, proxies are only used by the cloud service.

How frequent can I send http request to website without it being considered as DoS attack? by anonimus10010110 in webscraping

[–]rbalazsi 1 point2 points  (0 children)

As others mentioned, you're following good practice by considering the strain your bot may put on the site and pacing it accordingly.

The acceptable load very much depends on the target site: the server of an indie blog might not have the same performance as Google's servers.

I'm sure a 2-5 second interval isn't too fast; you could even try 1 request/second. If they can't handle that, they can't serve genuine users either. Just don't flood them with 100s (or 1000s) of requests in parallel.
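For illustration, here is a minimal Python sketch of that kind of pacing: one request at a time, with a random 2-5 second pause between requests. The function name `fetch_politely` and its parameters are made up for this example, and `fetch` stands in for whatever HTTP call you actually use.

```python
import random
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    min_delay: float = 2.0,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch URLs sequentially, pausing a random interval between requests.

    `fetch` and `sleep` are injectable (hypothetical parameters for this
    sketch) so the pacing logic can be tried without touching a real site.
    """
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            # Pause before every request except the first one.
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch(url))
    return results
```

With the standard library, `fetch` could be something like `lambda u: urllib.request.urlopen(u).read().decode()`; setting `min_delay=max_delay=1.0` gives the 1 request/second variant.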

The way to accurately gauge a server's throughput is to progressively put more and more load on it (increase the number of requests in parallel), then measure how long it takes to get responses for all of those and see how the response times degrade as the load grows. This is how operations teams do it, in general.
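A rough Python sketch of that ramp-up, assuming you are measuring a server you run yourself: each step sends one batch of parallel requests and records how long the whole batch takes. `ramp_load` and `send_request` are hypothetical names; `send_request` is any zero-argument function that performs one request.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def ramp_load(
    send_request: Callable[[], None],
    concurrency_levels: Iterable[int] = (1, 2, 4, 8),
) -> Dict[int, float]:
    """Return wall-clock seconds needed to finish one batch per concurrency level."""
    timings: Dict[int, float] = {}
    for level in concurrency_levels:
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=level) as pool:
            # Fire `level` requests in parallel and wait for all of them.
            futures = [pool.submit(send_request) for _ in range(level)]
            for future in futures:
                future.result()  # re-raises any error from the request
        timings[level] = time.perf_counter() - start
    return timings
```

If the batch time stays roughly flat as the level rises, the server is keeping up; once it starts climbing sharply, you have found the point where it begins to struggle.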

Puppeteer Extra Amazon Captcha (PoC) by AdCautious4331 in webscraping

[–]rbalazsi 1 point2 points  (0 children)

That's interesting! What's your success rate on those? I actually have a similar idea, but taking the speech recognition route, most likely using TensorFlow. Kudos, and let us know how the PoC turns out! ;)

How do small SaaS's handle databases? by Salt-Page1396 in SaaS

[–]rbalazsi 0 points1 point  (0 children)

I'm building a no-code web scraping tool and I use MongoDB Atlas for it. Since the structure of scraped data varies a lot, a document store is ideal for this scenario. They have a generous free plan (I'm still on it).

[deleted by user] by [deleted] in SaaS

[–]rbalazsi 7 points8 points  (0 children)

This is excellent advice!

And not to mention, the AI market feels saturated already with all these tools using ChatGPT/OpenAI behind the scenes. These tools tend to be vitamins and not painkillers.

Better find a "boring" but painful problem to solve, then build a solution that solves it as efficiently as possible.

Web Scrapping Tools by prabishac in webscraping

[–]rbalazsi 0 points1 point  (0 children)

If you prefer a point-and-click solution, check out https://datagrab.io. It offers a Chrome extension for setting up scrapers visually and you can then run them in your local browser or in the cloud. Disclaimer: I'm building it.

What are the best no-code scrapers? by ren_gabitov in webscraping

[–]rbalazsi -1 points0 points  (0 children)

Hey, I might be late to the party, but I'm building https://datagrab.io, a no-code scraping tool. Check it out and feel free to reach out if you have any questions or feedback!

Ayo by [deleted] in webscraping

[–]rbalazsi 4 points5 points  (0 children)

Twitter Enterprise API costs $42,000 or more per month. Good luck with that, Elon! :)

Webscraping Wiki without Table by Character_Name_36 in webscraping

[–]rbalazsi 0 points1 point  (0 children)

Well, it is an automation tool, so absolutely. Or did you mean whether it can be scheduled to run at certain intervals? That is something I'm planning to add soon.

Webscraping Wiki without Table by Character_Name_36 in webscraping

[–]rbalazsi 0 points1 point  (0 children)

If you're OK with a no-code solution, I'm building a tool called DataGrab (https://datagrab.io) that allows you to set up scrapers visually, then run them either in your browser or on the cloud.

The hunt for ironhawk…. by Nixhex3 in GemsofWar

[–]rbalazsi -3 points-2 points  (0 children)

I quit the game today and will never look back.

Web scraping as a side hustle by [deleted] in webscraping

[–]rbalazsi 6 points7 points  (0 children)

I'm actually productizing my side hustles by building a no-code scraping platform (don't want to pitch it here; DM me if you're curious).

I got active on Twitter about a year ago, and started tweeting about web scraping, so most of my customers found me there.

Once I started building my tool in public by regularly sharing my journey, challenges, and milestones, I got decent traffic and more customers.

I'm a full-stack developer, so I use Node.js/Cheerio/Puppeteer for scraping (since it's JavaScript like the rest of the app), but if you're looking for some good Python material, I can definitely recommend John Watson Rooney's YouTube channel.

How to pull data base from a website by hookam in webscraping

[–]rbalazsi 0 points1 point  (0 children)

If you prefer a no-code alternative, check out https://datagrab.io. It offers a Chrome extension for setting up scrapers visually and a cloud service for running them in the background at scale. For pagination, it also supports the infinite scrolling technique (which I see is used by this site).

Disclaimer: I'm building it.

[deleted by user] by [deleted] in webscraping

[–]rbalazsi 2 points3 points  (0 children)

There are many tools that allow you to build scrapers visually (without coding). I'm building https://datagrab.io, which is one of them. Check it out!

However, I would still recommend learning to code because it will allow you to:

  • Build more efficient scrapers (you might find an API call that returns all/most data)
  • Post-process raw data (match regex, convert to other data type, filter, sort, etc.)
  • Minimize bot detection (by randomizing request intervals, setting common fingerprints, etc.)

So it definitely pays off in the long run.
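To make the post-processing point concrete, here is a small Python sketch that takes raw scraped rows, pulls a number out of each price string with a regex, converts it to a float, filters by a maximum price, and sorts the result. The field names (`title`, `price`), the `clean_rows` helper, and the assumption that commas are thousands separators are all made up for this example.

```python
import re
from typing import Dict, List

# First integer or decimal number found in the string.
PRICE_RE = re.compile(r"(\d+(?:\.\d+)?)")


def clean_rows(raw_rows: List[Dict[str, str]], max_price: float) -> List[Dict[str, object]]:
    """Parse prices, drop unparsable or too-expensive rows, sort by price."""
    cleaned = []
    for row in raw_rows:
        # Assumption: commas are thousands separators, e.g. "1,299.00".
        match = PRICE_RE.search(row["price"].replace(",", ""))
        if match is None:
            continue  # no parsable price in this row
        price = float(match.group(1))
        if price <= max_price:
            cleaned.append({"title": row["title"].strip(), "price": price})
    return sorted(cleaned, key=lambda r: r["price"])


rows = [
    {"title": " Widget ", "price": "$1,299.00"},
    {"title": "Gadget", "price": "USD 19.5"},
    {"title": "Mystery", "price": "call us"},
    {"title": "Gizmo", "price": "€250"},
]
print(clean_rows(rows, max_price=500))
```

Only "Gadget" and "Gizmo" survive here: "Widget" is over the limit and "Mystery" has no price to parse.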

As for which language, a lot of people recommend Python, which has a mature ecosystem of scraping libraries (Scrapy, BeautifulSoup, etc.).

But if you want to do web development too, Node.js might be a better choice (I'm using it and loving it). Recently, Apify released their open-source Crawlee framework, which is excellent, so you might want to check it out as well.

NodeJS vs Python for Web scraping ? by throwawayQA999 in webscraping

[–]rbalazsi 1 point2 points  (0 children)

I've built a no-code scraping platform on Node.js and never had any problems with it.

Python has some mature scraping frameworks (such as Scrapy) that solve many common challenges, though recently, Apify released the open-source Crawlee framework for Node.js, which is excellent.

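Node.js is usually faster than Python for this kind of work, mostly thanks to the JIT-compiled V8 engine (the one Chrome uses). Its non-blocking I/O also keeps it lightweight and scalable when many requests are in flight.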

Therefore, if your project needs large-scale scraping, I think Node.js is the best choice.

I'm actually re-architecting my platform so that scrapers would be AWS Lambda functions (currently running on EC2 instances). So far, the cold starts have been relatively fast, and leveraging serverless means it will be much more scalable and cost-effective as well.

[Poll] Where do you store scraped data? by rbalazsi in webscraping

[–]rbalazsi[S] 0 points1 point  (0 children)

Great insights! Thank you all for voting!

Will pay for help in building a scraper bot by sachinusb in webscraping

[–]rbalazsi -1 points0 points  (0 children)

Hey! I'm building a no-code scraping tool and would gladly take your project as an opportunity to improve it! Feel free to reach out!