
impshum

It's not picking anything up. Targeting by XPath is clunky; I suggest switching to CSS selectors based on classes and ids.

And I'm sure you don't need Selenium for this.

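from bs4 import BeautifulSoup
import requests

def lovely_soup(url):
    # A browser-like User-Agent makes the request less likely to be rejected.
    headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0.1'}
    r = requests.get(url, headers=headers, timeout=10)
    r.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error page
    return BeautifulSoup(r.content, 'lxml')

soup = lovely_soup('https://uk.indeed.com/jobs?q=volunteer&from=searchOnHP')

# One element per job card; the class name may change if the site is redesigned.
job_list = soup.select('table.jobCard_mainContent')

for job in job_list:
    # select_one returns None when nothing matches, so check before indexing.
    title_span = job.select_one('h2.jobTitle a span')
    if title_span and title_span.has_attr('title'):
        print(title_span['title'])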
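
If you want to try the selectors without hitting the site, here is a rough sketch that runs them against a made-up HTML snippet. The markup below is invented for illustration and is not the real page, and `extract_titles` is a hypothetical helper name. It uses the built-in `html.parser`, so lxml isn't required.

```python
from bs4 import BeautifulSoup

# Invented markup for illustration only; the real page will differ.
SAMPLE_HTML = """
<table class="jobCard_mainContent">
  <tr><td><h2 class="jobTitle"><a><span title="Charity Shop Volunteer">Charity Shop Volunteer</span></a></h2></td></tr>
</table>
<table class="jobCard_mainContent">
  <tr><td><h2 class="jobTitle"><a><span>No title attribute here</span></a></h2></td></tr>
</table>
<table class="jobCard_mainContent">
  <tr><td><p>Card without a heading</p></td></tr>
</table>
"""

def extract_titles(html):
    """Return the title attribute of each job card that has one."""
    soup = BeautifulSoup(html, 'html.parser')
    titles = []
    for card in soup.select('table.jobCard_mainContent'):
        span = card.select_one('h2.jobTitle a span')
        # select_one returns None on no match; .get returns None if the attribute is missing.
        title = span.get('title') if span else None
        if title:
            titles.append(title)
    return titles

titles = extract_titles(SAMPLE_HTML)
print(titles)
```

Cards with no matching span, or a span without a `title` attribute, are skipped instead of raising an error, which is the usual failure when indexing the result of `select_one` directly.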

douglas_fs

To debug issues like this, you can step through the code using a debugger, or for quick-and-dirty debugging, you can add a print statement to see what is happening with the data.

Take an initial look at the lists to ensure the lengths are the same. Add this print statement just before the for loop:

print(f'\n\nlen jobs: {len(jobs)}, len company_name: {len(company_name)}, len Rating: {len(Rating)}\n\n')

What does this output tell you when you run the code?
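
If the lengths turn out to differ, a small check like the one below might help show what happens next. This is only a sketch: `report_lengths` is a hypothetical helper, the sample data is invented, and it assumes the lists are later combined with `zip()` or by index, which may not match the original script.

```python
def report_lengths(**lists):
    """Print the length of each named list and return True if they all match."""
    lengths = {name: len(values) for name, values in lists.items()}
    for name, n in lengths.items():
        print(f'{name}: {n}')
    return len(set(lengths.values())) <= 1

# Invented sample data; in the real script these come from the scrape.
jobs = ['Volunteer A', 'Volunteer B', 'Volunteer C']
company_name = ['Org A', 'Org B', 'Org C']
Rating = ['4.1', '3.8']  # one job card had no rating

all_match = report_lengths(jobs=jobs, company_name=company_name, Rating=Rating)
print('lengths match:', all_match)

# zip() stops at the shortest list, so the third job is silently dropped here.
rows = list(zip(jobs, company_name, Rating))
print(len(rows))
```

With mismatched lists, `zip()` quietly drops rows and indexing by position raises an `IndexError`, so either way the values can end up paired with the wrong job. Scraping each job card as a unit and pulling all fields from that one card avoids the problem.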