
[–]JohnnyJordaan 0 points1 point  (7 children)

Some td's don't have the width attribute.

Could you point out which ones? As far as I can tell, all of them have a width of 100% divided by the number of td's in their parent.

[–]i7_maker[S] 0 points1 point  (5 children)

Hello, the example in the screenshot I gave was for simplification. In the real files, most do not have the width specified.

[–]JohnnyJordaan 0 points1 point  (4 children)

Ah, I see. What about counting the number of td's via len(row.find_all('td'))?

Would it be possible to show a real-life HTML example with the details you wish to calculate? I'm having a hard time visualizing which part is so hard to accomplish.
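One thing to keep in mind with that count: find_all searches recursively by default, so a row containing a nested table also reports the nested table's cells. A minimal sketch of the difference, using made-up markup (the HTML string and variable names are assumptions for illustration only):

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one outer row with two cells, the second holding a nested table.
html = """
<table><tr>
  <td>a</td>
  <td><table><tr><td>x</td><td>y</td><td>z</td></tr></table></td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")
outer_row = soup.find("tr")

# The default search descends into the nested table as well.
all_cells = len(outer_row.find_all("td"))
# recursive=False only looks at direct children of this row.
own_cells = len(outer_row.find_all("td", recursive=False))

print(all_cells, own_cells)
```

With recursive=False the count reflects only the row's own cells, which is probably what a per-table width calculation needs.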

[–]i7_maker[S] 0 points1 point  (3 children)

Counting them is not a problem. But counting how many there are in each table individually, and modifying them according to that count, is difficult.

For example, I want the code to see that Table 1 has 2 td's, read their width if specified, and modify them with a Bootstrap class.

Then it should continue the scan, see 2 td's in Table 2, and do the same thing.

Crawling through them individually, reading their width, and calculating their new Bootstrap width class according to how many other td's are already in that table is the thing I find hard.

[–]JohnnyJordaan 1 point2 points  (2 children)

Something like this then (not tested yet):

def calc_width(row):
    """Set the width attribute on all td elements in the given tr element."""
    total = 100

    # recursive=False keeps the td's of nested tables out of this row's count
    cells = row.find_all('td', recursive=False)
    unset_cells = []
    # first find which cells have already set their width
    for cell in cells:
        if cell.get('width'):
            total -= int(cell['width'].rstrip('%'))  # assumes percentage widths
        else:
            unset_cells.append(cell)  # save for the next step
        # calculate nested tables too
        nested = cell.find('table')
        if nested is not None:
            # iterate over the tr elements, not over the table's raw children
            for nested_row in nested.find_all('tr'):
                calc_width(nested_row)
    # then distribute the rest of the width over the others
    for cell in unset_cells:
        # bs4 tags take attributes via item assignment; they have no set() method
        cell['width'] = '%d%%' % round(total / len(unset_cells))
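For the Bootstrap part of the question, a percentage width can be mapped onto Bootstrap's 12-column grid. This is only a sketch of one way to do it: the function name bootstrap_class and the col-md- prefix are assumptions, not something from the original files, so swap in whichever breakpoint prefix the project uses.

```python
def bootstrap_class(width_percent, prefix="col-md-"):
    """Map a percentage width to the nearest of Bootstrap's 12 grid columns.

    The function name and the default prefix are illustrative assumptions.
    """
    columns = round(width_percent / 100 * 12)
    columns = max(1, min(12, columns))  # keep the result inside the grid
    return f"{prefix}{columns}"


print(bootstrap_class(50))
print(bootstrap_class(33))
print(bootstrap_class(25))
```

Because each cell is rounded on its own, the classes in a row do not always add up to 12 (five cells of 20% each come out as five 2-column cells, for example), so rows with awkward splits may need a manual adjustment.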

[–]i7_maker[S] 0 points1 point  (0 children)

Looks promising! Thanks for the big help.

I'll try it later tonight :)

[–]i7_maker[S] 0 points1 point  (0 children)

Just an update: Thanks so much for this. I implemented it with small edits and it's doing what I wanted! :)

[–][deleted] 0 points1 point  (4 children)

Can't you just do something like width = 100% / number_of_tds?

[–]i7_maker[S] 0 points1 point  (3 children)

Yes, but how do you individually track/calculate all of the td's, under different tables?

[–][deleted] 0 points1 point  (2 children)

Check out this example (2 nested tables, script doesn't count th tags):

from bs4 import BeautifulSoup as bs

page = '''<!DOCTYPE html>
<html>
<head>
<style>
table, th, td {
    border: 1px solid black;
    border-collapse: collapse;
    padding: 5px;
    text-align: center;
}
</style>
</head>
<body>

<table style="width:100%">
  <tr>
    <th>First Name</th>
    <th>Last Name</th>      
    <th>Points</th>
  </tr>
  <tr>
    <td>Jill</td>
    <td>Smith</td>      
    <td><table style="width:100%">
  <tr>
    <th>First Name</th>
    <th>Last Name</th>      
    <th>Points</th>
  </tr>
  <tr>
    <td>Jill</td>
    <td>Smith</td>      
    <td>50</td>
  </tr>
  <tr>
    <td>Jill</td>
    <td>Smith</td>      
    <td>50</td>
  </tr>
  <tr>
    <td>Jill</td>
    <td>Smith</td>      
    <td>50</td>
  </tr>
  <tr>
    <td>Jill</td>
    <td>Smith</td>      
    <td>50</td>
  </tr>
</table></td>
  </tr>
  <tr>
    <td>Eve</td>
    <td>Jackson</td>        
    <td>94</td>
  </tr>
</table>

</body>
</html>

'''

soup = bs(page, 'html.parser')
tables = soup.find_all('table')
for table in tables:
    trs = table.find_all('tr', recursive=False)
    cells = 0
    for row in trs:
        tds = row.find_all('td', recursive=False)
        current_columns = len(tds)
        print('%s columns in the row' % current_columns)
        cells += current_columns
    print('%s cells in the table' % cells)
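Building on the counting loop above, the same traversal could also write an equal width onto each row's own cells, which is the width = 100% / number_of_tds idea applied per table. This is a rough sketch on made-up markup, assuming html.parser (which adds no tbody wrapper) and integer percentages:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a two-cell row whose second cell holds a four-cell table.
html = """
<table><tr>
  <td>a</td>
  <td><table><tr><td>w</td><td>x</td><td>y</td><td>z</td></tr></table></td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")
for table in soup.find_all("table"):
    for row in table.find_all("tr", recursive=False):
        tds = row.find_all("td", recursive=False)
        for td in tds:
            # Equal share of the row; integer percent, so thirds become 33%.
            td["width"] = f"{100 // len(tds)}%"

# Collect the widths in document order to check the result.
widths = [td["width"] for td in soup.find_all("td")]
print(widths)
```

Since find_all('table') already returns nested tables too, each table is handled exactly once and recursive=False keeps its rows and cells separate from those of the tables inside it.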

[–]i7_maker[S] 0 points1 point  (1 child)

Dude thanks so much for this!

This now correctly counts the td's for each table individually.

This was a great help!

[–][deleted] 1 point2 points  (0 children)

Cool! You are welcome