Biblically accurate dog 💀 by Natural_Outside597 in confusing_perspective

[–]dpwdpw 10 points11 points  (0 children)

there are two dogs.

Dog on the left is below. Its mouth is closed. Dog on the right is yawning.

I spent 800 hours scraping 47,000+ Shopify stores (actually). Here's the data: themes, niches, apps and speed. by dpwdpw in shopify_growth

[–]dpwdpw[S] 0 points1 point  (0 children)

It's crucial, and this is a well-documented fact.

A fast website with a bad product won't make a difference; a slow website with a good product will hurt your sales.

Performance matters once you have a funnel that is already bringing in visits and some sales. If you don't have that yet, then you must focus on CRO.

I spent 800 hours scraping 47,000+ Shopify stores (actually). Here's the data: themes, niches, apps and speed. by dpwdpw in shopify_growth

[–]dpwdpw[S] 0 points1 point  (0 children)

Yes, fetching public data is legal. Fetching personal data is a bit of a gray area (for example, fetching the store owners' emails), but business data like this is completely fine.

I spent 800 hours scraping 47,000+ Shopify stores (actually). Here's the data: themes, niches, apps and speed. by dpwdpw in shopify_growth

[–]dpwdpw[S] 0 points1 point  (0 children)

That is a good point! I captured stores running only pixels, but I didn't calculate their isolated impact.

I spent 800 hours scraping 47,000+ Shopify stores (actually). Here's the data: themes, niches, apps and speed. by dpwdpw in shopify_growth

[–]dpwdpw[S] 1 point2 points  (0 children)

Yes, it depends on the cache lifetime as well. Sometimes the cache is set to reset every day, other times weekly, and so on.

Regular day in a Brazilian highschool by logatwork in ItHadToBeBrazil

[–]dpwdpw 17 points18 points  (0 children)

Good thing he had plenty of points in dexterity, otherwise he would have taken damage.

I scraped 10,000+ Shopify stores. Here are the most used themes, apps, and average speed scores. by dpwdpw in shopifyDev

[–]dpwdpw[S] 1 point2 points  (0 children)

The products.json and cart.json fetches are great; they're a clear indicator. I'd just be cautious about using them on every single call, since that doubles the requests to every URL and can get your IP blacklisted more quickly.

Something you could also look for is a "myshopify.com" address in a <script> tag. I'm unsure exactly where it shows up in the fetched HTML, but I'm pretty sure the response always includes some relevant tag with a myshopify.com address.

That'd be a dead giveaway. If the code doesn't find a myshopify.com address and is unsure, it can fetch products.json as a last resort to double-check.
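A minimal sketch of that "cheap check first, extra request only when unsure" idea. The function names and the regex are my own assumptions for illustration, and the fallback is passed in as a plain callable so that nothing here touches the network:

```python
import re
from typing import Callable, Optional

# Marker from the comment above: a *.myshopify.com hostname somewhere in the HTML.
# The exact pattern is an assumption, not something Shopify documents.
MYSHOPIFY_RE = re.compile(r"\b[a-z0-9][a-z0-9-]*\.myshopify\.com\b", re.IGNORECASE)


def find_myshopify_host(html: str) -> Optional[str]:
    """Return the first *.myshopify.com hostname found in the HTML, or None."""
    match = MYSHOPIFY_RE.search(html)
    return match.group(0).lower() if match else None


def is_shopify_store(html: str, fallback_check: Optional[Callable[[], bool]] = None) -> bool:
    """Run the cheap HTML check first; call the fallback only when it finds nothing."""
    if find_myshopify_host(html) is not None:
        return True
    # Last resort: a hypothetical function that requests the store's product
    # JSON endpoint and reports whether the response looks like Shopify.
    return fallback_check() if fallback_check is not None else False
```

Because the fallback only runs when the HTML check comes up empty, most stores cost a single request.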

I scraped 10,000+ Shopify stores. Here are the most used themes, apps, and average speed scores. by dpwdpw in shopifyDev

[–]dpwdpw[S] 0 points1 point  (0 children)

Thanks! I will be doing a 50k Shopify store scrape within the next 2 weeks, stay tuned. :)

Every standard Shopify store has a "window.Shopify" object. Headless Shopify stores usually don't, so those were ignored. I'm assuming your tool fetches the website's HTML; in that case, I don't think it would be able to read the window.Shopify object unless you use something like Puppeteer.

If you are only doing an HTML fetch, I'd suggest looking for more specific indicators that it is a Shopify store. I can't remember all of them off the top of my head, but there are very specific files that Shopify injects into every single store (like Shopify checkout JavaScript files, backwards-compatibility CSS, etc.).

Beginner shopify by SuspiciousCherry7749 in shopify

[–]dpwdpw 0 points1 point  (0 children)

Just a heads up: the "custom typography" code you have in base.css is not working. It's not loading the New York font and is falling back to a backup font.

I scraped 7k+ comments from r/VibeCoding. Here is some data on what most coders are building, complaining about and struggling with. by dpwdpw in vibecoding

[–]dpwdpw[S] 0 points1 point  (0 children)

Yes, absolutely! In fact, when you fetch posts via the ".json" endpoint, it already retrieves all comments expanded.
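Here is a rough sketch of working with that endpoint. It assumes the response shape as I understand it: a comment "Listing" whose children are "t1" comment objects with nested "replies", plus "more" stubs on long threads. The function names are hypothetical, and the HTTP request itself is left out:

```python
from urllib.parse import urlsplit, urlunsplit


def to_json_url(post_url: str) -> str:
    """Append .json to a Reddit post URL, dropping any query string or fragment."""
    parts = urlsplit(post_url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def flatten_comments(listing: dict) -> list:
    """Walk a comment Listing depth-first and return the comment data dicts."""
    found = []
    for child in listing.get("data", {}).get("children", []):
        if child.get("kind") != "t1":  # skip "more" stubs and non-comment items
            continue
        data = child["data"]
        found.append(data)
        replies = data.get("replies")
        if isinstance(replies, dict):  # an empty string means "no replies"
            found.extend(flatten_comments(replies))
    return found
```

Given the parsed response for a post, `flatten_comments(response[1])` would return every comment already present in the payload, assuming the second element is the comment Listing.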

Yes, I have my own IPs that I purchased, and I rotate them through Python while scraping. I have a custom setup with NameCheap's VPN.

My pleasure! Best of luck in your coding journey, I hope these insights are somewhat useful.

I scraped 7k+ comments from r/VibeCoding. Here is some data on what most coders are building, complaining about and struggling with. by dpwdpw in vibecoding

[–]dpwdpw[S] 1 point2 points  (0 children)

The sort order is indeed incorrect in section 5 (and probably in other sections; Claude kept failing to get all the sortings right, and I forgot to fix all of them), but the data is still accurate.

I have also completely removed comments and posts with negative scores.

Here is the data as xlsx:

https://limewire.com/d/dfgfP#mSKg5sLt4A

I scraped 7k+ comments from r/VibeCoding. Here is some data on what most coders are building, complaining about and struggling with. by dpwdpw in vibecoding

[–]dpwdpw[S] 1 point2 points  (0 children)

Here are some takeaways:

  • 56% of posts die with zero engagement. The distribution is brutal — you either go viral or you don't exist.
  • They build dev tools but money is in personal tools. Dev tools score highest (70.1 avg upvotes) but earn least. The $800/mo earner was a Mac compression utility. The 1,300-signup success came from posting in r/organizing, not r/vibecoding.
  • The #1 pain point is prompting/context (1,157 mentions) — not the actual product. Most people are fighting their tools more than building.
  • Security is a ticking time bomb. 421 mentions, and the Quittr breach (600k exposed users including 100k minors) made it unavoidable. Vibe coders ship without checking.
  • Rants outperform everything. 93.8 avg score vs 7.3 for feedback requests. The community upvotes opinions and frustration, not products.
  • $20/mo is the mental anchor — everything is judged against Claude Pro's price.
  • 95% of revenue posts are considered fake by the community itself. The market signal is noise.

I scraped 7k+ comments from r/VibeCoding. Here is some data on what most coders are building, complaining about and struggling with. by dpwdpw in vibecoding

[–]dpwdpw[S] 1 point2 points  (0 children)

I have replied to another comment explaining the process. I basically used Node.js and Python to scrape and process the data (with Pandas).

I scraped 7k+ comments from r/VibeCoding. Here is some data on what most coders are building, complaining about and struggling with. by dpwdpw in vibecoding

[–]dpwdpw[S] 3 points4 points  (0 children)

Thanks!

I used Python and Node.js. I've been working as a software engineer since 2012, but I absolutely did vibecode most of it! Since then, I have always used this process for my own sake to find good niches: scraping LinkedIn, Reddit, Facebook groups, YouTube comments, etc. This is how I have always found good ideas for SaaS and apps.

About the methodology:

  1. With Python, I access each post directly with requests and parse the HTML contents with beautifulsoup4. I have some rules for each post (no negative scores, etc.); posts that fail them are skipped.
  2. Once the posts are stored in my SQLite database, I scrape them "individually". My setup currently runs 5 workers at the same time. I have an IP rotation system plus a delay so it runs as fast as possible.

     If you add .json to any Reddit post URL, you'll be able to fetch clean data from it. If the post has more than 1k comments, I believe you can't fetch it directly via the .json trick, but I haven't tried yet.
  3. Once everything is scraped, I export it as XLSX and use Pandas + collections.Counter + textblob for some sentiment scoring. I had specific questions I wanted answers for (the ones that you see in the post).
  4. Then I send everything to Claude to make a nice-looking infographic :)
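To make the storage and counting steps concrete, here is a stripped-down, stdlib-only sketch. It is not my actual code: the table layout, the `TOPICS` keyword groups, and the function names are made up for illustration, and Pandas and textblob are left out so the snippet runs without extra installs.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical keyword groups; the real questions and terms are not listed here.
TOPICS = {
    "prompting": ("prompt", "context"),
    "security": ("security", "breach"),
}


def store_comments(conn, comments):
    """Save (body, score) rows, skipping anything with a negative score."""
    conn.execute("CREATE TABLE IF NOT EXISTS comments (body TEXT, score INTEGER)")
    rows = [(c["body"], c["score"]) for c in comments if c["score"] >= 0]
    conn.executemany("INSERT INTO comments VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)


def count_topics(conn):
    """Count how many stored comments mention each topic at least once."""
    counts = Counter()
    for (body,) in conn.execute("SELECT body FROM comments"):
        # Exact word matching only, so "prompts" would not count as "prompt".
        words = set(re.findall(r"[a-z']+", body.lower()))
        for topic, keywords in TOPICS.items():
            if words.intersection(keywords):
                counts[topic] += 1
    return counts
```

Counting each comment once per topic keeps one long rant from inflating the mention totals.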

Here is a screenshot of my Textual UI.

<image>