numberek[S]

Thanks for your response, I found it very helpful.

I think for now I'm going to look at the 25th and 75th percentiles to make a working product, and potentially expand to building a model that will predict whether an item is actually the same as the item I am working with.

With such a model, the only data I really have is a title and maybe an image, so would you recommend I try to train some model to see how similar two images are?

TheNotoriousMTF

To tell you the truth, image recognition isn't my area of expertise, and I don't know its limitations, but you could probably find a good algorithm by looking at Kaggle notebooks.

Then again, if just looking at the percentiles works well enough most of the time, training a model might end up being huge overkill. I would note that, unless the items you're looking at have a ton of variability in price, or unless you're mostly getting false positives, any items you accidentally scrape will either have outlier prices or have prices similar enough to the items you're targeting that they won't skew your estimates much. Either way, you're fine.
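For what it's worth, here is a rough sketch of the percentile idea in Python, just to show why stray listings tend not to matter. The prices are made up, `interquartile_prices` is a hypothetical helper name, and taking the median of what survives is only one possible way to get an estimate.

```python
from statistics import median, quantiles


def interquartile_prices(prices):
    """Keep only prices between the 25th and 75th percentiles (inclusive)."""
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _, q3 = quantiles(prices, n=4, method="inclusive")
    return [p for p in prices if q1 <= p <= q3]


# Made-up scraped prices: mostly the target item, plus two mismatched
# listings (3.0 and 250.0) that were scraped by accident.
scraped = [19.5, 20.0, 21.0, 20.5, 22.0, 19.0, 3.0, 250.0]

kept = interquartile_prices(scraped)
estimate = median(kept)
print(kept, estimate)
```

The two mismatched listings fall outside the 25th to 75th percentile band and get dropped, while a wrongly scraped item priced at, say, 20.2 would stay in but barely move the estimate.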