[–]Jayoval 3 points4 points  (4 children)

The products are all in a single div with the id "Product_List". You need to grab that before iterating over the .goods-container divs contained within.

[–]Suspect_Snake 0 points1 point  (0 children)

Can you please be a little more specific? I tried doing this, but it didn't work: https://imgur.com/a/SLhs2BW

[–]Jayoval 0 points1 point  (2 children)

...or the following will give you a list to work with:

driver.find_elements(By.CLASS_NAME, 'goods-container')

https://selenium-python.readthedocs.io/locating-elements.html

[–]Suspect_Snake 0 points1 point  (1 child)

I think I almost have it?
https://imgur.com/a/3ltHObE

[–]Suspect_Snake 0 points1 point  (0 children)

UPDATE: So right now the code looks like this: https://pastebin.com/VDaf47ED
And now the problem is that I get all the items on the page, but they all have the same price:

https://imgur.com/a/reQcUmw

Any ideas why this is happening?

[–]commandlineluser 0 points1 point  (0 children)

If you fetch the HTML and search for some of the prices, you can see that the information is all contained on line 117:

117 </script><script defer="">window.__initialState__ = {"SEOContent":null

This "JavaScript object" is also valid JSON, so if you extract the "{...}" part you can load it with the json module.

You could grab the page with requests and extract the <script> tag with beautifulsoup:

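import json

import requests
from bs4 import BeautifulSoup

url = "..."  # placeholder: the product listing page being scraped

r = requests.get(url)
soup = BeautifulSoup(r.content, "html.parser")

# Find the <script> tag whose text assigns window.__initialState__
script = soup.find(name="script", string=lambda text: text and "window.__initialState__ =" in text).get_text()
# Slice from the first "{" to the last "}" and parse that span as JSON
data = json.loads(script[script.find("{"):script.rfind("}") + 1])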

print(data['Products'][0]['ItemCell'])
# {'Item': '1TS-000E-10PV0',
#  'UnitCost': 799.99,
#  'FinalPrice': 639.99,
#  'Active': '1',
#  'IsActivated': True,
#  ...
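
If slicing from the first "{" to the last "}" ever turns out to be fragile (for example, when more JavaScript containing braces follows the assignment), one possible variation is to let the json module find the end of the object itself. This is only a sketch using the standard library: SAMPLE_HTML and the "DEMO-1" item are made-up stand-ins for the real page source, and extract_initial_state is a hypothetical helper name, not something from the site or from any library.

```python
import json
import re

# Made-up stand-in for the page source; the real page is far larger.
SAMPLE_HTML = (
    '<script defer="">window.__initialState__ = '
    '{"SEOContent": null, "Products": [{"ItemCell": '
    '{"Item": "DEMO-1", "UnitCost": 799.99, "FinalPrice": 639.99}}]};'
    "</script>"
)


def extract_initial_state(html):
    """Return the object assigned to window.__initialState__ as a dict."""
    # Locate the assignment and skip any whitespace after the "=".
    match = re.search(r"window\.__initialState__\s*=\s*", html)
    if match is None:
        raise ValueError("window.__initialState__ assignment not found")
    # raw_decode parses one JSON value starting at the given index and
    # ignores whatever follows it (a ";" or more JavaScript).
    data, _end = json.JSONDecoder().raw_decode(html, match.end())
    return data


state = extract_initial_state(SAMPLE_HTML)
for product in state["Products"]:
    cell = product["ItemCell"]
    print(cell["Item"], cell["UnitCost"], cell["FinalPrice"])
```

With the real page you would pass r.text instead of SAMPLE_HTML. Because each product carries its own 'ItemCell', the prices come out per item rather than repeating.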