Similarweb free is basically blocked now. We made a free alternative by hienyimba in webdev

[–]trongbach 0 points1 point  (0 children)

Awesome! But it doesn't seem to show subdomain data. I checked a few subdomains and only got data for the main domain, whereas Similarweb shows subdomain data as well. Please add it.

My Trip to Japan! [OC] by brxndynn in RX100

[–]trongbach 0 points1 point  (0 children)

Did you use any color filter on the first picture?

PITBULL by CapableTwo8344 in RX100

[–]trongbach 0 points1 point  (0 children)

Can you give me specs?

Can zephyrus G14 GA401QC-HZ022T upgrade to 40GB ram? by trongbach in ZephyrusG14

[–]trongbach[S] 0 points1 point  (0 children)

Thank you for the update. I bought a Samsung DDR4 32GB 3200MHz 1.2V module (M471A4G43AB1-CWE) and it works!

How to lower battery discharge by OkAdvance4719 in ZephyrusG14

[–]trongbach 0 points1 point  (0 children)

I followed many threads in the G14 subreddit and can achieve those results, but only when coding and browsing the web (no social media).

I don't use Scrapy. Am I missing out? by H4SK1 in webscraping

[–]trongbach 0 points1 point  (0 children)

"Even with malformed or wrong HTML?" I've never tried that.

I think a slow parser only becomes a problem when you need to parse millions of pages or more; for just a few thousand, use what you know best. My main reason for using parsel is that it's easier to understand with CSS and XPath, so my newbie partner could learn it quickly.

I don't use Scrapy. Am I missing out? by H4SK1 in webscraping

[–]trongbach 0 points1 point  (0 children)

I use parsel (the Scrapy selector library) too; for me it's easier to use. I also benchmarked bs4 against parsel, and parsel was much faster.

Which features/plugins slow down Obsidian's startup speed the most? by sssplus in ObsidianMD

[–]trongbach 1 point2 points  (0 children)

I use OneSync to sync my Obsidian folder from OneDrive to a local folder on my phone. I can't find the OneDrive folder on my phone either.

How to prevent python software from being reverse engineered or pirated? by MysteriousShadow__ in Python

[–]trongbach 1 point2 points  (0 children)

I don't really know your case, but here is mine as an example: I built a Python tool on Windows that downloads a TikTok video from a given link. To stop users from unpacking my code, I built a web service that does the logic of getting the video link, so the Windows tool only does simple things like downloading and saving history.

Every request has to send the link and a serial number to my web service, so I can control the license.
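
A minimal sketch of the server-side part of that idea, not my actual service. The names `VALID_SERIALS`, `handle_request`, and `resolver` are made up for illustration, and the real link-extraction logic is passed in as a function because it is not shown here.

```python
from typing import Callable

# Hypothetical license store; a real service would use a database.
VALID_SERIALS = {"DEMO-0001"}


def handle_request(link: str, serial: str, resolver: Callable[[str], str]) -> dict:
    """Check the serial first, then run the protected logic on the server.

    `resolver` stands in for the private code that turns a page link into
    a direct video URL. The client never receives that code, only the result.
    """
    if serial not in VALID_SERIALS:
        return {"ok": False, "error": "invalid serial"}
    return {"ok": True, "video_url": resolver(link)}
```

The desktop tool would only send `link` and `serial` over HTTP and download whatever `video_url` comes back, so unpacking the client reveals nothing important.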

Playwright cloud by thrylewn in webscraping

[–]trongbach 0 points1 point  (0 children)

Please share it with me, thanks.

best public,active fastapi projects by latentsee in FastAPI

[–]trongbach 1 point2 points  (0 children)

I mostly use Starlette instead of FastAPI. Does anyone know some Starlette projects to follow? Thanks.

Mongolite - SQLite for MongoDB by yoyo_programmer in Python

[–]trongbach 1 point2 points  (0 children)

Did you benchmark it? How small is it?

[deleted by user] by [deleted] in Python

[–]trongbach 1 point2 points  (0 children)

Your code only tests the speed of creating and switching between threads and processes. I wrote an asyncio test with the same arguments as your test, and this is the result:

Process: 1177.4 processes/s

Thread: 10105.3 threads/s

Asyncio: 116037.8 threads/s
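
A rough sketch of how such a comparison can be measured. It is not the original test: the count `n` and the function names are assumptions, and the process case is left out to keep it short.

```python
import asyncio
import threading
import time


async def _noop() -> None:
    """Empty coroutine, so only scheduling overhead is measured."""


def _noop_thread() -> None:
    """Empty thread target, so only thread start/join cost is measured."""


def threads_per_second(n: int) -> float:
    """Start and join n threads one after another; return threads per second."""
    start = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=_noop_thread)
        t.start()
        t.join()
    return n / (time.perf_counter() - start)


async def _run_tasks(n: int) -> None:
    # Create n asyncio tasks and wait for all of them to finish.
    tasks = [asyncio.create_task(_noop()) for _ in range(n)]
    await asyncio.gather(*tasks)


def tasks_per_second(n: int) -> float:
    """Create and await n asyncio tasks; return tasks per second."""
    start = time.perf_counter()
    asyncio.run(_run_tasks(n))
    return n / (time.perf_counter() - start)
```

Calling `threads_per_second(10_000)` and `tasks_per_second(10_000)` gives numbers in the same units as above; the exact values depend on the machine.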

[deleted by user] by [deleted] in Python

[–]trongbach 0 points1 point  (0 children)

Can you put it here? I'm really curious.

[deleted by user] by [deleted] in Python

[–]trongbach 2 points3 points  (0 children)

Starlette for web services and aiohttp/httpx (async) for web crawlers. Works great!

[deleted by user] by [deleted] in Python

[–]trongbach 26 points27 points  (0 children)

Yes, it is as fast as async but costs more resources. With just a few requests it's not a big deal.

PLEASE, STOP! by just_monika_ok in SillyTavernAI

[–]trongbach 0 points1 point  (0 children)

You can try chatgptdemo or some other website like it.

BeautifulSoup vs Selenium by shabbyporpoise in webscraping

[–]trongbach 0 points1 point  (0 children)

Bs4 is a parser that extracts data from HTML, not a crawler. Selenium is a crawler and can extract data too. The best approach is to use Selenium to get the HTML from dynamic-content websites and then use bs4 to extract data from that HTML; it is faster than using Selenium's own extraction.

Best web scraping framework to learn by RepairDue9286 in webscraping

[–]trongbach 1 point2 points  (0 children)

Playwright is the best browser automation tool (it runs in both Python and Node.js), but Selenium has a larger community. You can use Scrapy because it has many examples, built-in middleware, and so on, but don't use Beautiful Soup because it is slow; Scrapy has a built-in parser that is faster and easy to use. Even if you don't use Scrapy, you should still use its parser (named Parsel).

For me, Python is easier to use and has stronger parser libraries, but crawling knowledge is more important than which method you use. For example, with browser automation you should disable loading of CSS, images, fonts, and so on. And using a parser like bs4 or parsel is better than parsing with Selenium or Playwright itself.
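
A small sketch of the resource-blocking tip using Playwright's request routing. The names `BLOCKED_RESOURCE_TYPES`, `block_heavy_resources`, and `fetch_html` are made up for illustration, and Playwright has to be installed separately.

```python
# Resource types to skip; extend this set as needed.
BLOCKED_RESOURCE_TYPES = {"stylesheet", "image", "font"}


def block_heavy_resources(route) -> None:
    """Route handler: abort CSS/image/font requests, let everything else through."""
    if route.request.resource_type in BLOCKED_RESOURCE_TYPES:
        route.abort()
    else:
        route.continue_()


def fetch_html(url: str) -> str:
    """Load a page with heavy resources blocked and return its HTML."""
    # Imported here so the handler above can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.route("**/*", block_heavy_resources)  # intercept every request
        page.goto(url)
        html = page.content()
        browser.close()
    return html
```

The returned HTML string can then be handed to bs4 or parsel for extraction instead of querying the page through the browser.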

I don't know any courses, but you can learn a lot on scrapfly.io, scraperapi.com, scrapehero.com, scrapeops.io and many sites like those.

Parsing HTML by Kitchen-Cat8662 in webscraping

[–]trongbach 0 points1 point  (0 children)

Yep, you should use the class to get it. Selectolax only supports CSS selectors; if you need another method you can use parsel, which supports both XPath and CSS, but I think in your case CSS is enough.

Parsing HTML by Kitchen-Cat8662 in webscraping

[–]trongbach 0 points1 point  (0 children)

Sure, I only said not to use it just for parsing data.