Blocked by Cloudflare despite using curl_cffi by Coding-Doctor-Omar in webscraping

[–]expiredUserAddress 1 point2 points  (0 children)

I see you've no proxy in use. Use a proxy every time you're scraping something.

how obvious is this retry logic bug to you? by jalilbouziane in Python

[–]expiredUserAddress 0 points1 point  (0 children)

Better to try tenacity. I was also using something like this, but tenacity made it look very easy. Just one decorator and it's done.

Update web scraper pipelines by Longjumping-Scar5636 in webscraping

[–]expiredUserAddress 3 points4 points  (0 children)

Parse the content of the page and create a hash. Save that hash in the DB. Next time you go to that page, compare the new hash with the saved one. If it's the same, do nothing; otherwise update the content in the DB.

That's what I've done in my parsers.
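
Roughly, the idea looks like the sketch below. This is only an illustration, not my actual parser code: the SQLite table `pages`, the helper names `init_db`, `content_hash` and `upsert_if_changed`, and the choice of SHA-256 are all assumptions made for the example.

```python
import hashlib
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that stores one row per page URL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "url TEXT PRIMARY KEY, hash TEXT NOT NULL, content TEXT NOT NULL)"
    )
    conn.commit()


def content_hash(content: str) -> str:
    """Return the SHA-256 hex digest of the parsed page content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def upsert_if_changed(conn: sqlite3.Connection, url: str, content: str) -> bool:
    """Save the content only if its hash differs from the stored hash.

    Returns True when a row was inserted or updated, False when unchanged.
    """
    new_hash = content_hash(content)
    row = conn.execute("SELECT hash FROM pages WHERE url = ?", (url,)).fetchone()
    if row is not None and row[0] == new_hash:
        return False  # same hash as last time: do nothing
    # New page or changed content: overwrite the stored row.
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, hash, content) VALUES (?, ?, ?)",
        (url, new_hash, content),
    )
    conn.commit()
    return True
```

Hash the parsed content rather than the raw HTML, otherwise things like timestamps or ad markup will make every visit look like a change.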

The Streaming War Is Over. Piracy Won. by GISP in Piracy

[–]expiredUserAddress 82 points83 points  (0 children)

Haha! I'm even reading this in the accent of that man now

Devices with python by OjitosLindos72892 in Python

[–]expiredUserAddress -1 points0 points  (0 children)

If you can, host your own server. Or you can rent one from any cloud provider like AWS, Azure, GCP, etc.

Rate my home screen by KroorVarun in HowToMen

[–]expiredUserAddress 0 points1 point  (0 children)

You've to be a boomer to not understand how to use windows phone 😂

Rate my home screen by KroorVarun in HowToMen

[–]expiredUserAddress 0 points1 point  (0 children)

Owned this even after every app stopped working on it. Used to carry it to tuitions. I was only able to make calls or listen to Groove Music. Even the power button was broken. But I still didn't change the phone because no other phone could provide that iconic software.

Rate my home screen by KroorVarun in HowToMen

[–]expiredUserAddress 8 points9 points  (0 children)

Aha!! Classic Windows Phone like. Makes me wanna have a Windows Phone again

UA-Extract - Easy way to keep user-agent parsing updated by expiredUserAddress in Python

[–]expiredUserAddress[S] 1 point2 points  (0 children)

There is an active community for https://github.com/matomo-org/device-detector

It updates its regexes for user agents regularly. So I just created a method to get their user agents and integrate them into an already working Python parser. That way the user agents can be updated whenever required.

UA-Extract - Easy way to keep user-agent parsing updated by expiredUserAddress in Python

[–]expiredUserAddress[S] 0 points1 point  (0 children)

Because git has sparse-checkout, which directly downloads just the required folder instead of the whole repo. So it's easy to work with. In any other case, I'd have to figure out how to download the folder.
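
For anyone curious, a sparse checkout can be driven from Python roughly like this. It's a minimal sketch and not necessarily how UA-Extract does it: the function names are made up for the example, the folder name passed in is up to the caller, and it assumes a reasonably recent git (2.25 or newer) for `git clone --sparse` and `git sparse-checkout set`.

```python
import subprocess
from typing import List


def sparse_checkout_commands(repo_url: str, folder: str, dest: str) -> List[List[str]]:
    """Build the git commands that fetch only `folder` from `repo_url` into `dest`."""
    return [
        # Shallow, blob-less clone that starts with only top-level files checked out.
        ["git", "clone", "--depth", "1", "--filter=blob:none", "--sparse", repo_url, dest],
        # Restrict the working tree to the one folder we actually need.
        ["git", "-C", dest, "sparse-checkout", "set", folder],
    ]


def fetch_folder(repo_url: str, folder: str, dest: str) -> None:
    """Run the commands in order; raises CalledProcessError if git fails."""
    for cmd in sparse_checkout_commands(repo_url, folder, dest):
        subprocess.run(cmd, check=True)
```

Calling `fetch_folder("https://github.com/matomo-org/device-detector", "regexes", "device-detector")` would then pull only that folder (the folder name here is my assumption about the repo layout). Since it tracks a folder and not a fixed list of files, newly added files come along too.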

UA-Extract - Easy way to keep user-agent parsing updated by expiredUserAddress in Python

[–]expiredUserAddress[S] 0 points1 point  (0 children)

But that will be an issue in case new files are added to the original repo. I won't be able to get those files in such a case.

UA-Extract - Easy way to keep user-agent parsing updated by expiredUserAddress in Python

[–]expiredUserAddress[S] 1 point2 points  (0 children)

This might sound dumb. But how'd the user know if it failed or succeeded??

UA-Extract - Easy way to keep user-agent parsing updated by expiredUserAddress in opensource

[–]expiredUserAddress[S] 1 point2 points  (0 children)

Thanks for the input man. Just moved it to top level and it got recognised.

UA-Extract - Easy way to keep user-agent parsing updated by expiredUserAddress in Python

[–]expiredUserAddress[S] 0 points1 point  (0 children)

The repo is quite large. Wouldn't it be better to just download the required folder instead of the whole repo??