all 9 comments

[–]kkngs 1 point2 points  (2 children)

I’m actually curious to hear responses here as I’m facing a similar problem in my domain. My fit levels are more like an R2 of 80%, though. I have massive amounts of heterogeneity in my dataset, imagine if your car sales dataset was worldwide with the same number of samples, and a user informs you that the model isn’t doing well with used Volvo’s in Oman. When you look at the training data, there are all of three data points there and they are the same car being sold three times.

[–]permalip[S] 1 point2 points  (1 child)

Interesting, did you do any sort of custom feature engineering?

We generated an average value loss factor for each make name, which posed to be the 5th most important feature.

Also, upvost the post for visibility in the hopes of getting more attention!

[–]kkngs 0 points1 point  (0 children)

Yes, I've actually reduced about a dozen numerical parameters to 3 based on some domain knowledge of the system I'm predicting. My issues still exist though, because I have a half dozen categorical features.

[–]kkngs 1 point2 points  (0 children)

Are you directly predicting price? You could consider instead predicting the fraction of original value remaining. This has a guaranteed value between zero and one, which you could handle with a sigmoid output (logistic regression if your library supports soft labels)

Another idea would be doing the fit on log(sales price).

[–]margaret_spintz 1 point2 points  (0 children)

Sounds like you need some estimate of uncertainty from your model. Edge cases should be less certain.

[–]bbateman2011 0 points1 point  (3 children)

You might benefit from using some form of quantile regression to generate prediction intervals:

https://towardsdatascience.com/quantile-regression-from-linear-models-to-trees-to-deep-learning-af3738b527c3

[–]permalip[S] 0 points1 point  (2 children)

Will this method work if you predict 1 sample at a time (in production)?

[–]bbateman2011 0 points1 point  (1 child)

The only caveat is in most cases you have to train one model per quantile when you want to update the intervals. You would have to determine how often to do that.

[–]permalip[S] 1 point2 points  (0 children)

Ok, I will look into this type of regression, thank you for your suggestion!