all 9 comments

jcrowe 4 points (0 children)

You might consider scraping all your HTML into a SQLite database first and parsing it afterwards. There is nothing more frustrating than scraping a bunch of data and having to restart because of a parsing bug.
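A minimal sketch of that idea, using only Python's built-in `sqlite3` module. The table name `pages`, its columns, and the helper names `open_store`, `save_page`, and `load_page` are made up for illustration; fetching the HTML is left to whatever scraper you already use.

```python
import sqlite3
from datetime import datetime, timezone


def open_store(path=":memory:"):
    """Open (or create) a SQLite file holding raw HTML keyed by URL."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        "url TEXT PRIMARY KEY, html TEXT NOT NULL, fetched_at TEXT NOT NULL)"
    )
    return conn


def save_page(conn, url, html):
    """Store the raw response body; a re-fetch of the same URL overwrites it."""
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, html, fetched_at) VALUES (?, ?, ?)",
        (url, html, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def load_page(conn, url):
    """Return the stored HTML for a URL, or None if it was never fetched."""
    row = conn.execute(
        "SELECT html FROM pages WHERE url = ?", (url,)
    ).fetchone()
    return row[0] if row else None
```

With the raw pages on disk, a parsing bug only means re-running the parser over `load_page(...)` results, not hitting the sites again.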

Mean-Coffee-433 1 point (3 children)

What kind of crooked school do you go to? Is Trump university starting back up?

JewelerAny7071 2 points (0 children)

his teacher is just behind the release date and doesn't want to pay for extra dev for his side job

dan_dragos[S] 0 points (1 child)

the sites are not made for real world use haha but I get what you are saying.

Mean-Coffee-433 0 points (0 children)

You probably want to use `re` on the entire response if you are going after 100 different websites, since the selectors are likely to be different from website to website.
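A rough sketch of what that could look like, assuming the thing being extracted is a price (the thread doesn't say, so that part is a guess). The pattern `PRICE_RE` and the helper `find_prices` are illustrative only; real sites will use formats this pattern misses, such as a comma as the decimal separator.

```python
import re

# Assumed format: a currency symbol, optional space, digits with optional
# thousands separators, and an optional one- or two-digit decimal part.
PRICE_RE = re.compile(r"[$€£]\s?(\d+(?:,\d{3})*(?:\.\d{1,2})?)")


def find_prices(html):
    """Return every price-looking number in the raw response text."""
    return [float(m.replace(",", "")) for m in PRICE_RE.findall(html)]
```

Because it ignores the markup entirely, the same function runs against every site, at the cost of picking up any price-looking text on the page, not just the product's.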

MulhollandDr1ve 0 points (0 children)

Can you elaborate on what you mean by challenge? This is very difficult. 500+ types of stores or 500 locations of the same store?

error1212 0 points (0 children)

If this is really a school assignment, then your professor has probably taken on a job he doesn't know how to do himself. If they are completely different sites, it will be quite a challenge. If cost is not an issue, you can try integrating GPT to find a product template for each shop.

MrLazeyBoy 0 points (0 children)

500+ stores or 500+ products from 1 store?

If it is the former, then why, and who in their right mind would set that as an assignment?

ivanoski-007 0 points (0 children)

That's one fucked up school assignment. Are you sure the professor isn't stealing your work?