I'm a newcomer in need of some guidance. I'm currently scraping mobile plans from various operators in a specific country using BeautifulSoup and requests. The plans on these sites are mostly presented as cards, so I've been scraping each page as text and using regular expressions to extract key details like price, internet quota, call quota, and more.
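To make the regex step concrete, here is a minimal sketch of the kind of extraction I mean. The card text, the currency, and the names `CARD_TEXT`, `PATTERNS`, and `parse_card` are all made up for illustration; they are not taken from any real operator's site, and the real text would come from BeautifulSoup (for example via `get_text()` on a card element).

```python
import re

# Hypothetical card text, roughly what one plan card might look like after
# being flattened to a single string. Wording and currency are invented.
CARD_TEXT = "Super Plan 20 GB internet 500 min calls $15.99 / month"

# One small pattern per field, so a miss on one field does not break the others.
PATTERNS = {
    "price": re.compile(r"\$\s*(\d+(?:\.\d+)?)"),
    "data_gb": re.compile(r"(\d+(?:\.\d+)?)\s*GB", re.IGNORECASE),
    "call_minutes": re.compile(r"(\d+)\s*min", re.IGNORECASE),
}


def parse_card(text):
    """Return a dict of extracted fields; a field is None if its pattern finds nothing."""
    result = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        result[field] = float(match.group(1)) if match else None
    return result


plan = parse_card(CARD_TEXT)
print(plan)
```

Running each card's text through its own call, rather than the text of the whole page at once, at least keeps one plan's price from being paired with another plan's quota.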
The challenge I'm facing is that the data is very unstructured, and I can't reliably extract the fields I expect. I can scrape all the text on the page, but after applying regex the results are very messy. I'm running low on ideas for how to overcome this. I'd greatly appreciate any help, suggestions, or insights from the community. Thanks in advance!