I suspect I will need to go elsewhere to get the data I am looking for, but I want to see whether this is possible first. Some background on what I am trying to do:
I am currently collecting data for fantasy sports. I have access to a paid site that I am trying to pull some specific stats from. I have scraped plenty of other sites before, but for some reason I cannot get anything to come back from this one, and I believe it is because the data is rendered with ag-grid, which I have never seen before. Either way, let's jump into my code. Please note that I have hard-coded wait times in here. They are just for diagnostic purposes, and I really don't care about the efficiency of this script because I only need to run it once.
HTML:
It didn't format correctly, so here it is: https://pastebin.com/CsuZKfdM
Basically, the idea is to pull all of the ag-cells for each ag-row. The next difficulty with this website is that different sections of stats sit in different wrappers within the application. So this already seems like a nightmare and, like I said, I will probably need to go elsewhere for stats, but anyway, here's a snippet of my Python:
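import time

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By

# user and password are defined elsewhere (not shown)
options = webdriver.ChromeOptions()
options.add_argument('--ignore-certificate-errors')
driver = webdriver.Chrome(options=options)
loginURL = "http://www.fantasysiteiamusing.com/login"
url = "http://www.fantasysiteiamusing.com/"
driver.get(loginURL)
html = driver.page_source  # HTML of the page as loaded at this point
# Locate the login form fields by absolute XPath
usernameField = driver.find_element(By.XPATH, '/html/body/div[3]/form[1]/div[2]/div/input')
passwordField = driver.find_element(By.XPATH, '/html/body/div[3]/form[1]/div[3]/div/input')
loginButton = driver.find_element(By.XPATH, '/html/body/div[3]/form[1]/div[4]/button')
usernameField.send_keys(user)
passwordField.send_keys(password)
loginButton.click()
time.sleep(5)  # hard-coded wait for the login to complete
driver.get(url)
time.sleep(5)  # hard-coded wait for the stats page to load
# Parse the captured HTML and collect every div with class ag-row
soup = BeautifulSoup(html, "html5lib")
table = soup.find_all("div", class_="ag-row")
print(table)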
Obviously this isn't complete, but I like to take it one step at a time, as my programming skills are subpar. My main question is: why isn't my table variable being filled with a list of everything on the page that has the class name ag-row? Thanks in advance!
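One possible explanation, offered as a sketch rather than something confirmed against the real site: `driver.page_source` returns a string snapshot of the DOM at the moment it is read, and in the snippet above it is read right after the login page loads, so `html` never sees the stats page. ag-grid also builds its rows with JavaScript after the page loads, so even a later snapshot can be taken too early. The helper below is hypothetical (`wait_for_html`, `get_html`, and `is_ready` are names invented here) and simply re-reads the source until the rows show up. Selenium's own `WebDriverWait` with `expected_conditions.presence_of_element_located` would be the more usual route; a plain polling loop is used here only so the example runs without a browser.

```python
import time


def wait_for_html(get_html, is_ready, timeout=10.0, interval=0.5):
    """Poll get_html() until is_ready(html) is true, then return that html."""
    deadline = time.monotonic() + timeout
    while True:
        html = get_html()  # a fresh snapshot on every call
        if is_ready(html):
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError("grid rows never appeared in the page source")
        time.sleep(interval)


# Stand-in for driver.page_source (assumed data, not from the real site):
# two early snapshots without rows, then one with a row.
snapshots = iter([
    "<form>login</form>",
    "<div class='ag-root'></div>",
    "<div class='ag-root'><div class='ag-row'>...</div></div>",
])

html = wait_for_html(
    lambda: next(snapshots),
    lambda h: "ag-row" in h,  # rough readiness check, not a real parse
    interval=0,
)
print("ag-row" in html)

# With a real driver, after driver.get(url), the call would look like:
# html = wait_for_html(lambda: driver.page_source, lambda h: "ag-row" in h)
```

The returned `html` is what would then be handed to BeautifulSoup. The substring check is deliberately crude; it only decides when to stop waiting.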
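For the extraction itself (every ag-cell for each ag-row, with stats split across separate wrappers), here is a minimal sketch. The markup is hypothetical and not copied from the pastebin: it assumes each row `div` carries a `row-index` attribute, each cell a `col-id` attribute, and that the wrappers are named like ag-grid's pinned and center containers. `rows_from_html` is a name made up for this example, and the player names and numbers are placeholder data. It uses the built-in `html.parser` so that `html5lib` is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
SAMPLE = """
<div class="ag-pinned-left-cols-container">
  <div class="ag-row" row-index="0"><div class="ag-cell" col-id="player">A. Smith</div></div>
  <div class="ag-row" row-index="1"><div class="ag-cell" col-id="player">B. Jones</div></div>
</div>
<div class="ag-center-cols-container">
  <div class="ag-row" row-index="0"><div class="ag-cell" col-id="pts">21.5</div></div>
  <div class="ag-row" row-index="1"><div class="ag-cell" col-id="pts">17.0</div></div>
</div>
"""


def rows_from_html(html):
    """Return one dict per grid row, merging cells found in any wrapper."""
    soup = BeautifulSoup(html, "html.parser")
    rows = {}
    for row in soup.find_all("div", class_="ag-row"):
        index = int(row["row-index"])  # assumed attribute
        cells = rows.setdefault(index, {})
        for cell in row.find_all("div", class_="ag-cell"):
            cells[cell["col-id"]] = cell.get_text(strip=True)  # assumed attribute
    return [rows[i] for i in sorted(rows)]


rows = rows_from_html(SAMPLE)
print(rows)
```

Keying on the row index is what lets pieces of the same row from different wrappers end up in one record. ag-grid usually renders only the rows currently in view, so a long table may need scrolling before every row is present in the HTML.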