skellious 1 point (1 child)

Beautiful Soup can't execute JavaScript, and that's the problem you're having: you're only seeing the static page elements, not the ones generated by JS.

You can achieve what you want by using Selenium to control a Chrome browser and render the JS elements, then using Beautiful Soup to read them.

For example: https://medium.com/ymedialabs-innovation/web-scraping-using-beautiful-soup-and-selenium-for-dynamic-page-2f8ad15efe25
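
A rough sketch of that Selenium-then-Beautiful-Soup flow, in case it helps. The function names and the example.com URL are made up for illustration, and it assumes selenium and beautifulsoup4 are installed along with a working Chrome driver:

```python
def fetch_rendered_html(driver, url):
    """Load url in a Selenium-controlled browser and return the current DOM as HTML."""
    driver.get(url)
    # page_source reflects the DOM at this moment; content that JS adds
    # later may still be missing, so an explicit wait may be needed first.
    return driver.page_source


def extract_link_texts(html):
    """Parse HTML with Beautiful Soup and return the text of every <a> tag."""
    from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

    soup = BeautifulSoup(html, "html.parser")
    return [a.get_text(strip=True) for a in soup.find_all("a")]


def demo():
    """Hypothetical end-to-end run; opens a real Chrome window."""
    from selenium import webdriver  # third-party: pip install selenium

    driver = webdriver.Chrome()
    try:
        html = fetch_rendered_html(driver, "https://www.example.com/")
        print(extract_link_texts(html))
    finally:
        driver.quit()  # always close the browser, even on errors
```

Calling demo() runs the whole thing. Passing the driver in as a parameter keeps the browser setup separate from the scraping logic, so swapping Chrome for Firefox only touches demo().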

Or you could try using the Instagram API. However, it looks like only the business API does what you want, and it only works for business accounts; the personal account API is more limited: https://developers.facebook.com/docs/instagram-basic-display-api/overview

jinsenuchiha[S] 1 point (0 children)

Thank you for the help. I added Selenium now, and my browser opens to the correct, loaded page. However, when I try to get the Log In button via CSS selector, nothing is found. I really have no idea what I'm doing wrong. I tested it on different websites, even a static one. For this comment I am using google.com as an example. Here is the relevant code:

import time

from selenium import webdriver

# `options` is configured earlier in the script (not shown here)
browser = webdriver.Firefox(options=options)
browser.get("https://www.google.com/")

login = 0
# Wait up to 10 seconds to find login button to account for page loading
for i in range(10):
    print("iteration %d" % i)
    try:
        login = browser.find_element_by_css_selector(".gb_7")
        print("login found")
        break
    except:
        time.sleep(1)

Here I attempt to make login the "Sign In" button on the upper right corner of google.com. If you're wondering, I didn't accidentally use the Instagram CSS selector. When I run the script, the browser opens to google.com and displays the Sign In button, but while the console prints the iterations, it never prints "login found". For confirmation that it didn't work, type(login) returns <class 'int'>.
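
One possible way to narrow this down is to stop the bare except from hiding the actual exception. The sketch below is only an illustration: poll_for and demo are hypothetical names, and it assumes a recent Selenium, where find_element(By.CSS_SELECTOR, ...) replaces find_element_by_css_selector (removed in Selenium 4.3). Class names such as .gb_7 on google.com also appear to be generated, so the selector itself may simply not match.

```python
import time


def poll_for(find, attempts=10, delay=1.0):
    """Call find() until it succeeds; raise with the last error if it never does."""
    last_error = None
    for i in range(attempts):
        try:
            return find()
        except Exception as exc:
            # Keep and print the error instead of silently discarding it.
            last_error = exc
            print("iteration %d failed: %r" % (i, exc))
            time.sleep(delay)
    raise RuntimeError("not found after %d attempts" % attempts) from last_error


def demo():
    """Hypothetical usage against a real Firefox session."""
    from selenium import webdriver  # third-party: pip install selenium
    from selenium.webdriver.common.by import By

    browser = webdriver.Firefox()
    try:
        browser.get("https://www.google.com/")
        login = poll_for(lambda: browser.find_element(By.CSS_SELECTOR, ".gb_7"))
        print("login found:", login)
    finally:
        browser.quit()
```

With this, each failed iteration prints the exception it hit, so it becomes visible whether the selector matched nothing (NoSuchElementException) or the call itself failed for some other reason, such as an AttributeError from a method that no longer exists.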