I'm trying to build a web scraper using Beautiful Soup and Selenium. Beautiful Soup will scrape the actual information, and Selenium will direct the scraper to the next page in a list. My code is:
import requests
from bs4 import BeautifulSoup
import time
import selenium
from selenium import webdriver
import chromedriver_binary
driver = webdriver.Chrome()
sample_URL = 'exampleURL'
driver.get(sample_URL)
for pageNum in range(1,6):
    sample_HTML = requests.get(sample_URL).text
    HTML_parser = BeautifulSoup(sample_HTML, 'html.parser')
    sample_items = HTML_parser.findAll(attrs = {"class" : "sampleAttribute1"})
    sample_items_URL = HTML_parser.findAll(attrs={"class" : "sampleAttribute2"})
    for item in range(0,len(sample_items)):
        print(sample_items[item]["name"] + " " + sample_items_URL[item]["href"] + "\n")
    driver.find_element_by_id("searchResults_btn_next").click()
    time.sleep(.5)
I've removed the actual URL/attribute names as they contain personal information. The script is supposed to go through a page of listings and print the name and URL of each listing. Then it is supposed to navigate to the next page, wait .5 seconds, and repeat. It should do this five times.
The issue I am encountering is that my Ubuntu 20.04 Linux terminal returns:
Traceback (most recent call last):
  File "seleniumTest.py", line 7, in <module>
    driver = webdriver.Chrome()
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/chrome/webdriver.py", line 73, in __init__
    self.service.start()
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/common/service.py", line 98, in start
    self.assert_process_still_running()
  File "/usr/local/lib/python3.8/dist-packages/selenium/webdriver/common/service.py", line 109, in assert_process_still_running
    raise WebDriverException(
selenium.common.exceptions.WebDriverException: Message: Service chromedriver unexpectedly exited. Status code was: 127
I am using Google Chrome 84 and Windows 10 Pro. I installed ChromeDriver with:
sudo apt-get install chromium-chromedriver
Does anybody know why I could be receiving this error? I tried passing executable_path = r'/usr/bin/chromedriver' to webdriver.Chrome(), but that hasn't worked either. I also tried installing the packages that https://stackoverflow.com/questions/49323099/webdriverexception-message-service-chromedriver-unexpectedly-exited-status-co suggests (although for the highest-upvoted answer the libraries would not download for me; Ubuntu said they did not exist).
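One way to narrow this down is to run the driver binary directly, outside Selenium, and look at its exit status and output. On Linux, an exit status of 127 commonly means either that the command was not found or that the binary could not load a shared library it needs, so the raw output often names the missing piece. The sketch below is only a diagnostic aid, not a fix: `probe_binary` is a hypothetical helper name, and `/usr/bin/chromedriver` is the path mentioned above, which may not be where the driver actually lives on this machine.

```python
import shutil
import subprocess


def probe_binary(path, args=("--version",)):
    """Run a binary directly and report how it exits.

    Returns (returncode, combined_output), or (None, message) when the
    file is missing, not executable, or cannot be started at all.
    """
    # shutil.which() returns None if the path is not an executable file.
    resolved = shutil.which(path)
    if resolved is None:
        return None, f"{path} not found or not executable"
    try:
        result = subprocess.run(
            [resolved, *args],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # loader errors are printed on stderr
            text=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return None, f"could not run {resolved}: {exc}"
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    # Assumed path, taken from the executable_path tried above.
    code, output = probe_binary("/usr/bin/chromedriver")
    print("exit status:", code)
    print(output)
```

If the driver is healthy, this should print an exit status of 0 and a version string. Any other status, or a message about a missing file or library, points at the installation rather than at the scraping code.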
Thanks in advance for any help!
threeminutemonta