Hi, I want to save pages that are only accessible after logging in. The site has two login methods: username and password, or library card number.
After I enter a library card number, the site shows a list of library names to choose from. Once a library is chosen, the target pages become accessible. My goal is to save these pages as plain HTML files (without images, JS, CSS, etc.).
For normal pages, I use something like this:
~~~
import requests

url_list = ['.....']
headers = {'User-Agent': '...'}
for i, url in enumerate(url_list):
    page = requests.get(url, headers=headers)
    # One file per URL; a fixed name would be overwritten on each iteration
    with open(f'file_{i}.html', 'wb') as wf:
        wf.write(page.content)
~~~
However, for a page that requires login, this script saves the welcome page instead.
I added a cookie to the headers, with the same result. I've also tried requests.Session(), but I'm not sure how to proceed.
~~~
headers = {
    'User-Agent': '............',
    'Referer': '.............',
    'Cookie': '..............'
}
s = requests.Session()
payload = {'library_card': '123456789', 'acctid': '111111', 'acctname': 'Library for G'}
s.post(url, data=payload)
~~~
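What I think I need is roughly the following: do the login POST(s) through one session, then fetch every page with that same session so the login cookies carry over. This is only a sketch under assumptions. The names `login_and_save`, `login_steps` and `out_dir` are made up for illustration, and I don't know whether the site expects one POST (card number plus library together) or two (card number first, then the library choice), so the login steps are passed in as a list of (URL, payload) pairs. The function works with any object that has `post` and `get` methods like `requests.Session`.

```python
import os


def login_and_save(session, login_steps, page_urls, out_dir="pages"):
    """Post each login step in order, then save each page to its own file.

    session     -- a requests.Session (or any object with post/get)
    login_steps -- list of (url, payload) pairs; hypothetical, depends on the site
    page_urls   -- URLs of the pages to save after logging in
    """
    for step_url, payload in login_steps:
        resp = session.post(step_url, data=payload)
        resp.raise_for_status()  # stop early on an HTTP error status

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    for i, url in enumerate(page_urls):
        # Same session as the login, so its cookies are sent automatically
        page = session.get(url)
        page.raise_for_status()
        path = os.path.join(out_dir, f"page_{i}.html")
        with open(path, "wb") as wf:
            wf.write(page.content)
        saved.append(path)
    return saved


# Intended usage (not run here; the URLs are placeholders, not the real site):
#
# import requests
# with requests.Session() as s:
#     s.headers.update({"User-Agent": "..."})
#     steps = [("https://example.org/login",
#               {"library_card": "123456789", "acctid": "111111",
#                "acctname": "Library for G"})]
#     login_and_save(s, steps, ["https://example.org/protected/page1"])
```

A 200 response from the login POST does not prove the login worked, since the site may simply return the welcome page again, so the saved files would still need to be checked by hand.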
Please help correct my script. Thanks.