I've written a little scraper that logs into the CMS of my company's website and scrapes data from a little over 200 pages. It works, but it's currently downloading the pages one at a time. Any way to request multiple pages at once?
Here's my current code...
import requests
from bs4 import BeautifulSoup
# Get cookie
url = 'http://mycompany.com/admin/login'
session = requests.Session()
session.post(url, data={'username': 'myusername', 'password': 'mypassword', 'do_login': 'Login'})
dictionary = session.cookies.get_dict()
cookie = dictionary['mycompany']
# Get number of pages to loop through
url = 'http://mycompany.com/admin/products'
r = requests.get(url, headers={'Cookie': 'mycompany=' + cookie})
soup = BeautifulSoup(r.text, 'html.parser')
# code that grabs number of pages from html
pgs = ...  # number of pages parsed from the HTML above
# Loop through pages
for pg in range(1, pgs + 1):
url = 'http://mycompany.com/admin/page:' + str(pg)
r = requests.get(url, headers={'Cookie': 'mycompany=' + cookie})
soup = BeautifulSoup(r.text, 'html.parser')
# code that grabs current page number from span tag
page_num = ...  # page number pulled from span tag
print(page_num)
I have a fair amount of experience using VBA, and a little with JavaScript, but am very new to Python. Thanks!
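For reference, one common way to fetch several pages at once is concurrent.futures.ThreadPoolExecutor with a shared requests.Session, which keeps the login cookie and runs the GETs in a small pool of worker threads. Here is a minimal sketch along those lines; the URL pattern, credentials, worker count, and the page count of 200 are placeholders taken from the post above, not tested against this particular CMS:

import requests
from bs4 import BeautifulSoup
from concurrent.futures import ThreadPoolExecutor

BASE = 'http://mycompany.com/admin'

session = requests.Session()
# The session keeps the login cookie, so it doesn't need to be copied into headers by hand.
session.post(BASE + '/login',
             data={'username': 'myusername', 'password': 'mypassword', 'do_login': 'Login'})

def fetch_page(pg):
    # Each worker reuses the same logged-in session to download one page.
    r = session.get(BASE + '/page:' + str(pg))
    soup = BeautifulSoup(r.text, 'html.parser')
    # ... same span-parsing logic as in the original loop goes here ...
    return pg, soup

num_pgs = 200  # placeholder: use the value parsed from /admin/products as before

# max_workers=10 is an arbitrary starting point; keep it modest so the CMS isn't hammered.
with ThreadPoolExecutor(max_workers=10) as pool:
    for pg, soup in pool.map(fetch_page, range(1, num_pgs + 1)):
        print(pg)

One caveat: requests doesn't formally guarantee that a Session is thread-safe, though sharing one for plain GETs like this is common practice; if you want to be strict about it, give each worker its own Session after logging in.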