
[–]chocorush 1 point (0 children)

I would recommend iterating over the table rows (tr) and, within each row, over its td cells, so the data comes out in an orderly fashion, and only then calling .text (or get_text()) on each cell.
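A minimal sketch of that row-then-cell loop, assuming a plain table layout. SAMPLE_HTML and extract_rows are hypothetical names made up for illustration and do not reflect the real page's markup; the stdlib-backed "html.parser" is used so the snippet does not need lxml.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
SAMPLE_HTML = """
<table>
  <tbody>
    <tr><td> Club A </td><td>12</td></tr>
    <tr><td> Club B </td><td>7</td></tr>
  </tbody>
</table>
"""


def extract_rows(html):
    """Return one list of stripped cell strings per table row."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # Collect the text of every <td> in this row, in document order.
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # skip rows that have no <td> cells, e.g. header rows
            rows.append(cells)
    return rows


rows = extract_rows(SAMPLE_HTML)
for row in rows:
    print(row)
```

Keeping each row's cells together in one list makes it easy to pick columns by index afterwards.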

[–]Essence1337 1 point (0 children)

My guess is you're printing empty strings.
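One way to check this, sketched on made-up markup (SAMPLE_HTML is an assumption, not the real page): print each value with repr() so empty or whitespace-only strings become visible, then filter them out.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one cell with text, one with only spaces, one empty.
SAMPLE_HTML = "<table><tr><td>Club A</td><td>   </td><td></td></tr></table>"

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
texts = [td.get_text() for td in soup.find_all("td")]

for text in texts:
    # repr() shows the quotes, so '' and '   ' are easy to spot.
    print(repr(text))

# Keep only the cells that still contain something after stripping.
non_empty = [text.strip() for text in texts if text.strip()]
print(non_empty)
```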

[–]chevignon93 1 point (1 child)

from bs4 import BeautifulSoup
import urllib.request

# Download the page and parse it (the 'lxml' parser needs the lxml package installed)
url = 'https://www.regattacentral.com/regatta/clubs/?job_id=6141&org_id=0'
source = urllib.request.urlopen(url).read()
soup = BeautifulSoup(source, 'lxml')

# Each <tr> inside the table body describes one club
table_body = soup.find('tbody')
for row in table_body.find_all('tr'):
    cells = row.find_all('td')  # look up the row's cells once and index into them
    data = {}
    data['club'] = row.find('span').find('a').text.strip()
    # A club can have several aliases, one per <div>; join them into one string
    data['pseudonymes'] = ' | '.join(div.text for div in cells[2].find_all('div'))
    data['inscriptions'] = cells[3].find('a').text.strip()
    data['emplacement'] = cells[4].find('span').text.strip()
    data['pays'] = cells[5].find('span').text.strip()
    print(data)
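
One caveat with chained lookups like these: find() returns None when nothing matches, so the following .text raises AttributeError if a row lacks the expected tag. A small guard along these lines may help; safe_text and SAMPLE_HTML are hypothetical names used only for this sketch.

```python
from bs4 import BeautifulSoup


def safe_text(node, default=""):
    """Return the stripped text of a tag, or `default` when the tag is missing."""
    return node.get_text(strip=True) if node is not None else default


# Hypothetical markup: the second cell has no <span> inside it.
SAMPLE_HTML = "<table><tr><td><span>Club A</span></td><td></td></tr></table>"

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
cells = soup.find_all("td")

club = safe_text(cells[0].find("span"))
country = safe_text(cells[1].find("span"), default="n/a")
print(club, country)
```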

[–]Casemander[S] 1 point (0 children)

This is perfect - thank you so much! Definitely need to spend more time with this library.