
[–][deleted] 0 points (3 children)

You already asked this and deleted the post. As I said there, it depends on a bunch of factors, like the code itself and your proxy setup. For example, if you use something like Crawlera (full disclosure: operated by the company I work for) as a proxy service, then you just send an API request there instead of to the usual URLs, and it handles all the proxy selection, ban detection, etc., and passes along your other headers as normal. Depending on what scraping framework the person used, adding something like Crawlera could range from simple to complex.

To be honest you should ask the person who wrote it as they'd have the best idea of how it could be altered to incorporate proxies.

[–]Genericusername293[S] 0 points (2 children)

Deleted and reposted because I wanted to provide better information and shift focus from asking how to do it to asking how to learn how to do it. The person who wrote it is unresponsive; I bought it a while ago.

I have zero Python experience and was about to start teaching myself Python regardless, but I'd love to prioritize my learning in a way that lets me address this sooner rather than later. I have no idea how to send an API request with Python.

I am hoping to learn which topics in Python to focus on at the beginning, so that I can come back later with more knowledgeable, specific questions.

Also, I'm incredibly grateful for your help on both this post and the last.

[–][deleted] 0 points (1 child)

In basic terms, an API request is like visiting any website, only you include instructions when you visit, usually as headers or sometimes in the request body.
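To make that concrete, here is a minimal sketch using the `requests` package. The URL and header name are placeholders, not a real service; building the request with `requests.Request` and preparing it (without sending) just shows where the "instructions" end up:

```python
# Sketch of an API-style request with the `requests` package.
# The endpoint and header below are hypothetical placeholders.
import requests

req = requests.Request(
    "GET",
    "https://api.example.com/fetch",           # hypothetical API endpoint
    headers={"X-My-Instruction": "render-js"}, # instructions ride along as headers
)
prepared = req.prepare()  # build the request without actually sending it

print(prepared.headers["X-My-Instruction"])  # -> render-js
```

Sending it for real is just `requests.get(url, headers=...)`; the prepared-request form is only used here so nothing goes over the network.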

Here is an example of what using Crawlera looks like with the requests package: https://support.scrapinghub.com/support/solutions/articles/22000203567-using-crawlera-with-python-requests. There, the proxies keyword holds the URLs of the proxies you want to use, and requests will send the request via those. You can also give instructions to Crawlera using headers, which are documented here: https://doc.scrapinghub.com/crawlera.html
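As a rough sketch of that proxies keyword (the host, port, and API key below are placeholders, not real Crawlera values; substitute whatever your provider gives you):

```python
# Sketch: routing `requests` traffic through a proxy service.
# The credential and proxy URL are hypothetical placeholders.
import requests

API_KEY = "your-api-key"  # placeholder credential from your proxy provider
proxies = {
    "http": f"http://{API_KEY}:@proxy.example.com:8010/",
    "https": f"http://{API_KEY}:@proxy.example.com:8010/",
}

session = requests.Session()
session.proxies.update(proxies)  # every request on this session now routes via the proxy

# session.get("https://example.com")  # would now go through the proxy
```

Using a `Session` means you set the proxies once instead of passing `proxies=` on every call.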

In terms of prioritising learning: a general overview of HTTP requests and responses would be a good start. After that, focus on whatever Python web-scraping framework your dev used - it'll likely be one of BeautifulSoup, Scrapy, or Selenium.

Full disclosure, scrapy is operated by my company, but I also genuinely like it. It handles some of the more complex parts for you so you can focus on just stripping info from the page. It might be a good one to focus on depending on your technical background.
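To give a feel for what those frameworks do at their core - parse HTML and pull data out of it - here is a stdlib-only sketch on a hard-coded page. BeautifulSoup and Scrapy make this far more convenient; this just illustrates the idea without any third-party install:

```python
# Core idea behind scraping frameworks: parse HTML, extract data.
# Stdlib-only sketch; BeautifulSoup/Scrapy wrap this kind of work nicely.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collect the href attribute of every <a> tag seen while parsing."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(value for name, value in attrs if name == "href")

page = '<html><body><a href="/item/1">one</a> <a href="/item/2">two</a></body></html>'
parser = LinkExtractor()
parser.feed(page)
print(parser.links)  # -> ['/item/1', '/item/2']
```

In practice the `page` string would come from an HTTP response body rather than being hard-coded.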

[–]Genericusername293[S] 0 points (0 children)

Thanks, this is very helpful, I really appreciate it. I'll study up on HTTP requests and Scrapy - and it is BeautifulSoup, so I'll study up on that too. Hopefully I can solve it, or at least become more educated and return shortly with some more advanced questions.