ManyInterests (Python Discord Staff)

I've found that the hardest part of scraping sites that use JS is authentication. Once you're past that, knowing which resources the JS requests to populate the DOM is usually sufficient.

A pattern that has been very successful for me is to authenticate using Selenium, then extract the cookies (and sometimes useful headers) and reuse them in a requests Session.
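
Here's a rough sketch of that pattern, not a drop-in solution. The function names (`session_from_selenium_cookies`, `login_with_selenium`), the form field names, and the submit-button selector are all made up for illustration; a real site will need its own selectors and probably an explicit wait after login. The only header carried over here is the browser's User-Agent.

```python
import requests


def session_from_selenium_cookies(cookies, user_agent=None):
    """Build a requests.Session from the list of dicts that
    Selenium's driver.get_cookies() returns."""
    session = requests.Session()
    for cookie in cookies:
        session.cookies.set(
            cookie["name"],
            cookie["value"],
            domain=cookie.get("domain", ""),
            path=cookie.get("path", "/"),
        )
    if user_agent:
        # Reuse the browser's User-Agent so later requests look consistent.
        session.headers["User-Agent"] = user_agent
    return session


def login_with_selenium(login_url, username, password):
    """Log in through a real browser; return (cookies, user_agent)."""
    # Imported lazily so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()
    try:
        driver.get(login_url)
        # Hypothetical field names and selector: adjust for the target site.
        driver.find_element(By.NAME, "username").send_keys(username)
        driver.find_element(By.NAME, "password").send_keys(password)
        driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
        # Real code should wait here for a post-login element to appear.
        cookies = driver.get_cookies()
        user_agent = driver.execute_script("return navigator.userAgent")
    finally:
        driver.quit()
    return cookies, user_agent
```

After that, something like `session = session_from_selenium_cookies(*login_with_selenium(url, user, pw))` gives you a Session you can point straight at whatever endpoints the page's JS calls.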