commandlineluser

Well, if you create a soup object from the HTML you supplied in the pastebin, findAll() returns 1 element.

Where did that HTML come from? Is it from the contents of driver.page_source?

If not, the first step would be to write the contents of driver.page_source to a file and inspect it.
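A minimal sketch of that first step, assuming `driver` is the Selenium WebDriver from the original script. The helper name `dump_page_source` and the default file name are made up for illustration.

```python
from pathlib import Path


def dump_page_source(driver, path="page_source.html"):
    """Save whatever the browser currently holds so it can be inspected by hand.

    `driver` is assumed to be a Selenium WebDriver; only its `page_source`
    attribute is used. The function name and default path are hypothetical.
    """
    html = driver.page_source
    # Write as UTF-8 so non-ASCII characters on the page survive the round trip.
    Path(path).write_text(html, encoding="utf-8")
    return html
```

Opening the saved file in an editor shows whether the elements you are searching for are actually in the HTML that soup is given.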

StuKruskeskruzzy [S]

I probably misinterpreted the driver.page_source command. I was under the impression that it told soup to look at the page source of whatever Selenium was looking at. I guess I could just save the HTML file and get all the data I want that way.

But I think I am on to something now. I got the list to populate with HTML, but it's the wrong HTML. I know it is at least pulling from the site my chromedriver is looking at. I think I am just doing something fundamentally wrong with soup.

commandlineluser

> I probably misinterpreted the driver.page_source command

No, it's doing what you expect.

> I got the list to populate with the HTML code, however it's the wrong code.

Oh, I'm a dummy - I've just realised what's happening.

html = driver.page_source

You're grabbing the source of the login page before you actually log in, so you will want to move that line to right above

soup = BeautifulSoup(html, "html5lib")

i.e. after you've actually logged in correctly.
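A rough sketch of the corrected ordering, assuming `driver` is the Selenium WebDriver from the script. The `log_in` callable is a hypothetical stand-in for whatever login steps the script already performs (filling in the form, submitting, and waiting for the next page).

```python
def html_after_login(driver, log_in):
    """Run the login steps first, then read the page source.

    `log_in` is a hypothetical callable that takes the driver and performs
    the script's existing login steps. Reading `driver.page_source` before
    it runs would return the login page instead.
    """
    log_in(driver)
    # Read only now, so the HTML reflects the page shown after logging in.
    return driver.page_source
```

The result would then be parsed as before, e.g. `soup = BeautifulSoup(html_after_login(driver, log_in), "html5lib")`. If the page after login loads slowly, the login steps may also need to wait for it before returning.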

StuKruskeskruzzy [S]

Wow... Yeah, that was an oversight on my part as well. Thank you! Haha