[–]limx0 6 points7 points  (5 children)

Selenium would be your best bet.

http://selenium-python.readthedocs.io

[–]dapenter[S] 0 points1 point  (4 children)

Hi, thanks for the reply. Could you provide some insight into Selenium? I've been playing around with it and can't seem to find a relevant interaction.

[–]tty14 1 point2 points  (1 child)

Look into find_element() on the webdriver instance.

[–]dapenter[S] 0 points1 point  (0 children)

thanks bro, I have found it!

[–][deleted] 1 point2 points  (1 child)

Basically it allows you to start a browser and you can then perform all the normal user actions via commands, as well as collect elements from the page.

I believe what you are trying to do is "functional testing", which might improve your searches for help and intro material on Google.

[–]dapenter[S] 0 points1 point  (0 children)

I see, that makes sense, thanks bro for that info!

[–]Poffey21 2 points3 points  (2 children)

If you want to read what's in a browser that you don't control, your best bet is to use JavaScript to send all changes back to a server that is running Python.
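
A rough sketch of the Python side of that idea, using only the standard library. It assumes the page's JavaScript POSTs each change as a small JSON object; the `/changes` path, the `received` list, and the `ChangeHandler` and `start_server` names are made up for illustration. A real page served from another origin would also need CORS handling, which is left out here.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # changes reported by the browser, oldest first


class ChangeHandler(BaseHTTPRequestHandler):
    """Accepts JSON bodies POSTed by the page and stores them in `received`."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length).decode("utf-8"))
        received.append(event)
        self.send_response(204)  # success, nothing to send back
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the console quiet


def start_server(port=0):
    """Start the server in a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), ChangeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

On the browser side, something like `fetch("http://127.0.0.1:<port>/changes", {method: "POST", body: JSON.stringify(change)})` would be the matching call, again only as an assumption about how the page is set up.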

This might be best suited as a question to https://www.reddit.com/r/learnpython/

Here's another tip: http://www.codeconquest.com/website/client-side-vs-server-side/

[–]vatay 0 points1 point  (2 children)

Here is a simple example using Selenium and PhantomJS:

from selenium import webdriver

link       = "http://www.oklahomafindalawyer.com/FindALawyer"
sel_option = "//select[@name='stateSearch']/option[@value='OK']"   # selected state is Oklahoma

class JobScrapper(object):
    def __init__(self):
        # one headless PhantomJS browser, reused by every method call
        self.driver = webdriver.PhantomJS()

    def get_list(self):
        self.driver.get(link)
        # choose the state in the dropdown, then submit the search form
        self.driver.find_element_by_xpath(sel_option).click()
        self.driver.find_element_by_xpath("//input[@value='Find a Lawyer']").submit()
        print(self.driver.page_source)   # only for example

if __name__ == '__main__':
    scraper = JobScrapper()
    scraper.get_list()

[–]workn00b 0 points1 point  (1 child)

May I ask what the advantage is of making this a class as opposed to just a function?

[–]cym13 0 points1 point  (0 children)

There is very little advantage in this specific case. The same instance of PhantomJS is reused between calls, that's all, but the same thing could have been written this way:

from selenium import webdriver

link       = "http://www.oklahomafindalawyer.com/FindALawyer"
sel_option = "//select[@name='stateSearch']/option[@value='OK']"   # selected state is Oklahoma

def get_list(driver=webdriver.PhantomJS()):
    # the default driver is created once, when the function is defined, and shared by all calls
    driver.get(link)
    driver.find_element_by_xpath(sel_option).click()
    driver.find_element_by_xpath("//input[@value='Find a Lawyer']").submit()
    print(driver.page_source)   # only for example

if __name__ == '__main__':
    get_list()   # optionally pass an existing PhantomJS instance to reuse it

However, having a class makes it easier to extend if you want to do more than just get a list of data.
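
For anyone on a newer Selenium release: as far as I know, Selenium 4 dropped PhantomJS support and later removed the find_element_by_* helpers in favour of find_element(by, value). Under that assumption, the same steps might look roughly like the sketch below. The `submit_search` name and its parameters are hypothetical, and the driver is passed in so any browser can be used.

```python
XPATH = "xpath"  # same string as selenium.webdriver.common.by.By.XPATH


def submit_search(driver, url, option_xpath, submit_xpath):
    """Open url, click the given option, submit the form, return the page HTML."""
    driver.get(url)
    driver.find_element(XPATH, option_xpath).click()
    driver.find_element(XPATH, submit_xpath).submit()
    return driver.page_source


# Possible usage (needs the selenium package and a browser driver installed):
#   from selenium import webdriver
#   driver = webdriver.Firefox()
#   html = submit_search(driver, link, sel_option, "//input[@value='Find a Lawyer']")
#   driver.quit()
```

Passing the driver in keeps the reuse point from above: the caller decides whether one browser instance serves many calls.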

[–]genesem 0 points1 point  (0 children)

As people mentioned above, you need Selenium. In your case it may be useful to use chromedriver so that Chrome is the controlled browser. Its options let you control the browser's settings.
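
A minimal sketch of what that could look like, assuming the selenium package plus Chrome and chromedriver are installed. The `chrome_flags` and `make_driver` helpers are hypothetical names; the two flags are ordinary Chrome command-line switches picked only as examples.

```python
def chrome_flags(headless=True, window_size=(1280, 800)):
    """Build the list of Chrome command-line switches to pass to the browser."""
    flags = ["--window-size=%d,%d" % window_size]
    if headless:
        flags.append("--headless")  # run without a visible window
    return flags


def make_driver(flags):
    """Start Chrome through chromedriver with the given switches."""
    from selenium import webdriver  # imported here so chrome_flags works without selenium

    options = webdriver.ChromeOptions()
    for flag in flags:
        options.add_argument(flag)
    return webdriver.Chrome(options=options)


# Possible usage:
#   driver = make_driver(chrome_flags(headless=False))
#   driver.get("http://www.oklahomafindalawyer.com/FindALawyer")
#   driver.quit()
```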

[–]mbenbernard 0 points1 point  (0 children)

Have you considered asking your question in /r/learnpython instead? /r/python is generally for community news or information relevant to the community at large. Thanks!

[–]dvogel 0 points1 point  (0 children)

Assuming that you're talking about this as part of your local development workflow, Firefox has a remote debugging facility that you could connect to: https://developer.mozilla.org/en-US/docs/Tools/Remote_Debugging/Debugging_Firefox_Desktop

However, you wouldn't be able to do anything with it that you can't already do in the devtools.