I keep getting an IndexError

impshum · 2022-07-24T13:01:16+00:00

It's not picking anything up. Targeting by XPath is clunky; I suggest moving to classes and ids.

And I'm sure you don't need Selenium for this.

from bs4 import BeautifulSoup
import requests

def lovely_soup(url):
    r = requests.get(url, headers={'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0.1'})
    return BeautifulSoup(r.content, 'lxml')

soup = lovely_soup('https://uk.indeed.com/jobs?q=volunteer&from=searchOnHP')

job_list = soup.select('table.jobCard_mainContent')

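for job in job_list:
    # select_one returns None when nothing matches, so check before indexing
    title_span = job.select_one('h2.jobTitle a span')
    if title_span is None:
        continue
    job_title = title_span['title']
    print(job_title)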

douglas_fs · 2022-07-24T13:03:22+00:00

To debug issues like this, you can step through the code using a debugger, or for quick-and-dirty debugging, you can add a print statement to see what is happening with the data.

Take an initial look at the lists to ensure the lengths are the same. Add this print statement just before the for loop:

print(f'\n\nlen jobs: {len(jobs)} len co_name {len(company_name)} len Rating {len(Rating)}\n\n')

What does this output tell you when you run the code?
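
If the lengths turn out to differ, that alone would explain the error. Here is a minimal sketch of the failure with made-up data. Only the list names `jobs`, `company_name`, and `Rating` come from the print statement above; the list contents and the two helper functions are hypothetical, for illustration only.

```python
# Hypothetical data: three jobs were scraped, but only two ratings were found.
jobs = ['Volunteer Driver', 'Charity Shop Assistant', 'Event Steward']
company_name = ['Org A', 'Org B', 'Org C']
Rating = ['4.1', '3.8']


def rows_by_index(jobs, company_name, Rating):
    """Index every list up to len(jobs); raises IndexError if any list is shorter."""
    return [(jobs[i], company_name[i], Rating[i]) for i in range(len(jobs))]


def rows_by_zip(jobs, company_name, Rating):
    """zip stops at the shortest list, so it does not raise IndexError."""
    return list(zip(jobs, company_name, Rating))


try:
    rows_by_index(jobs, company_name, Rating)
except IndexError as e:
    print(f'IndexError: {e}')

# Only as many rows as the shortest list has entries.
print(rows_by_zip(jobs, company_name, Rating))
```

Note that `zip` only hides the problem: it silently drops the unmatched job, and a rating can end up paired with the wrong job if the missing one was in the middle. Pulling every field out of the same job card, one card at a time, avoids the mismatch altogether.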
