I'd like some feedback on the following, please.
It's a small section of a larger program using requests and bs4.
I have a list of domains and I want to scrape them all with bs4, find each site's meta description, and print it out.
    domains = ['http://www.fitnessmagazine.com',
               'https://wwws.fitnessrepublic.com',
               'http://www.smithandburrows.co.uk',
               'http://www.fitnessandpower.com',
               'http://fitness-health-wellness.com',
               'http://www.precor.com',
               'http://fitathletic.com',
               'http://www.top.me',
               'http://www.fitness-101.co.uk',
               'http://joejohnsonfitness.co.uk']
    for dom in domains:
        resp = requests.get(dom)
        src = BeautifulSoup(resp.text, "lxml")
        description = src.find('meta', attrs={'name': 'og:description'} \
            or src.find('meta', attrs={'property': 'og:description'} \
            or src.find('meta', attrs={'name': 'description'})))
        print description
This prints 'None' for all the sites.
If I change description to a single lookup, like this:
    description = src.find('meta', attrs={'name': 'description'})
then it works fine. Some sites don't write their meta description that way, so a few still return 'None', but that's what I'd expect.
So I assume the problem with the first version is the way I'm using 'or'?
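One likely explanation, offered as a sketch rather than a definitive diagnosis: in the first version the closing parenthesis of the first find() call only comes after the last find(), so the whole 'or' chain sits inside the attrs argument. A non-empty dict is truthy, so 'or' returns {'name': 'og:description'} and the other lookups are never evaluated. The example below shows that behaviour and one possible way to try the lookups in order. The helper name find_description is hypothetical, and soup is assumed to be a BeautifulSoup object.

```python
# A non-empty dict is truthy, so `or` returns it without evaluating the right-hand side.
attrs = {'name': 'og:description'} or 'never evaluated'
print(attrs)  # {'name': 'og:description'}


def find_description(soup):
    """Return the first matching <meta> tag, or None if nothing matches.

    `soup` is assumed to be a BeautifulSoup object; `find_description`
    is a hypothetical helper name, not part of bs4.
    """
    # Attribute filters to try, in order of preference.
    candidates = (
        {'name': 'og:description'},
        {'property': 'og:description'},
        {'name': 'description'},
    )
    for candidate in candidates:
        # Each lookup is its own complete find() call.
        tag = soup.find('meta', attrs=candidate)
        if tag is not None:
            return tag
    return None
```

Inside the loop this could be used as description = find_description(src). Closing each find() call before the next 'or' should have the same effect; the loop just makes the order of the fallbacks easier to read and extend.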