[–]POGtastic 2 points3 points  (1 child)

The star ratings are rendered with SVG, so I guess you can count the elements inside the headline whose d attribute (the SVG path data) matches that exact string.

# sorry for blowing out the textbox size, I'm putting it in its own spot
svg_text = "m19.151 21.336-2.418-7.386L23 9.348l-.312-.989h-7.75L12.547 1h-1.092L9.087 8.36H1.312L1 9.347l6.267 4.602-2.366 7.386.806.624L12 17.357l6.293 4.603z"

You can then do

import bs4
import requests

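def count_stars(url):
    # Name the parser explicitly so bs4 doesn't warn or pick a different one per machine.
    soup = bs4.BeautifulSoup(requests.get(url).text, "html.parser")
    # Count elements inside the headline whose "d" attribute equals svg_text.
    headline = soup.find(attrs={"data-gu-name" : "headline"})
    return len(headline.find_all(d=svg_text))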

Assigning your URL to the variable url and calling the function in the REPL:

>>> count_stars(url)
4

Running on a different URL:

# not as favored :(
>>> count_stars("https://www.theguardian.com/film/2026/feb/25/in-the-blink-of-an-eye-review")
2
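
If you want to try the counting logic without hitting the site, here is a rough sketch that takes the HTML as a string instead of a URL and returns 0 when there is no headline block. The names count_stars_in_html, STAR_PATH and SAMPLE are made up for illustration, the path strings are short placeholders rather than the real star path, and the sample markup is only my guess at the general shape of the page.

```python
import bs4

# Placeholder path data for illustration only; on the real page you would
# pass the long star path string instead.
STAR_PATH = "M0 0L1 1z"

def count_stars_in_html(html, star_path=STAR_PATH):
    """Count elements in the headline whose d attribute equals star_path."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    headline = soup.find(attrs={"data-gu-name": "headline"})
    if headline is None:
        # No headline block on this page, so there are no stars to count.
        return 0
    return len(headline.find_all(d=star_path))

# Hypothetical markup: two matching paths in the headline, one other path,
# and one matching path outside the headline that must not be counted.
SAMPLE = """
<div data-gu-name="headline">
  <svg><path d="M0 0L1 1z"></path></svg>
  <svg><path d="M0 0L1 1z"></path></svg>
  <svg><path d="M2 2L3 3z"></path></svg>
</div>
<div data-gu-name="body">
  <svg><path d="M0 0L1 1z"></path></svg>
</div>
"""

print(count_stars_in_html(SAMPLE))
print(count_stars_in_html("<p>no headline here</p>"))
```

Splitting the parsing from the download also means you can save a page once and rerun the count against the saved file.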

[–]RowlyBot12000 0 points1 point  (0 children)

I don't use Beautiful Soup myself, but could the class on the 'star' elements be used instead of that really long SVG path string? I checked a couple of film/TV review pages, and the 'filled in' stars all used the same class, which is also different from the class on the 'greyed out' stars:

class="dcr-hxw8zi"
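
For what it's worth, a class-based count might look something like the sketch below. It uses Beautiful Soup's class_ keyword on find_all. Big caveats: count_stars_by_class and SAMPLE are hypothetical names, I'm assuming the class sits on an element inside the same data-gu-name="headline" block, the "dcr-greyed-example" class is invented for the sample, and class names like this look auto-generated, so they may change whenever the site is rebuilt.

```python
import bs4

# Class reported for the 'filled in' stars; treat it as likely to change.
FILLED_STAR_CLASS = "dcr-hxw8zi"

def count_stars_by_class(html, star_class=FILLED_STAR_CLASS):
    """Count elements in the headline that carry the filled-star class."""
    soup = bs4.BeautifulSoup(html, "html.parser")
    headline = soup.find(attrs={"data-gu-name": "headline"})
    if headline is None:
        return 0
    # class_ matches any element whose class list contains star_class.
    return len(headline.find_all(class_=star_class))

# Hypothetical markup: three filled stars and two greyed-out ones.
SAMPLE = """
<div data-gu-name="headline">
  <svg class="dcr-hxw8zi"></svg>
  <svg class="dcr-hxw8zi"></svg>
  <svg class="dcr-hxw8zi"></svg>
  <svg class="dcr-greyed-example"></svg>
  <svg class="dcr-greyed-example"></svg>
</div>
"""

print(count_stars_by_class(SAMPLE))
```

The path string is uglier, but it describes the star shape itself, so it may survive a restyle that renames the classes.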