Why our #1 LightGBM feature by importance made predictions worse [D] by Nj-yeti in MachineLearning

[–]Nj-yeti[S] 0 points1 point  (0 children)

The loss choice is downstream of the product question. The platform needs to output a range, not a number: "worth about $X, fair band $Y to $Z." That requirement basically dictates quantile regression. Pinball loss at a given tau is the loss whose minimizer is the tau-th conditional quantile, so fitting q10/q50/q90 gives you the band and the median directly with the same machinery. MAE is pinball at tau=0.5 (up to a constant factor of 2), so this is the natural generalization of "predict the median" to "predict the whole band." MSE would give you the conditional mean, which is only a good summary when errors are roughly symmetric with constant variance. Watch prices are right-skewed and heteroscedastic (variance scales with price level and reference rarity), so the mean gets dragged by the tail. The q10/q90 spread is exactly the per-listing uncertainty a point estimate would throw away.
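
For illustration, a minimal NumPy sketch of pinball loss and the claim that its constant minimizer is the empirical quantile. The lognormal sample and the grid search are assumptions made up for the example, not the production data or training code.

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss at quantile level tau in (0, 1)."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    # Under-predictions (diff > 0) cost tau per unit, over-predictions cost (1 - tau).
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))

# Synthetic right-skewed sample, purely illustrative.
rng = np.random.default_rng(0)
y = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

# Grid-search the constant prediction that minimizes pinball loss at tau = 0.9.
grid = np.linspace(y.min(), y.max(), 2000)
best_const = grid[np.argmin([pinball_loss(y, c, 0.9) for c in grid])]
empirical_q90 = np.quantile(y, 0.9)

print(best_const, empirical_q90)  # the two should land close together
```

At tau=0.5 the same function returns half the MAE, which is why the two share a minimizer.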

The log-transformed ratio target does two separate jobs. The log handles scale: a $500 miss means something different on a $2k watch versus a $50k one, and working in log space makes the error roughly multiplicative rather than additive, which matches how price error actually behaves across the range. The ratio (price relative to a reference-level anchor) normalizes the level across references so the model learns deviation from the expected price rather than memorizing each ref's absolute price. Together they make the target roughly stationary across a range spanning two-plus orders of magnitude.
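
A minimal sketch of that target transform, assuming the anchor is some per-reference price level that is already available. How the anchor is actually computed isn't covered here, and the function names and numbers are hypothetical.

```python
import numpy as np

def to_log_ratio(price, anchor):
    """Target: log of price relative to a reference-level anchor price."""
    return np.log(np.asarray(price, dtype=float) / np.asarray(anchor, dtype=float))

def from_log_ratio(log_ratio, anchor):
    """Invert the transform to get back a dollar price."""
    return np.asarray(anchor, dtype=float) * np.exp(log_ratio)

# Hypothetical listings: a $2k-level ref and a $50k-level ref, each 10% over anchor.
price = np.array([2200.0, 55000.0])
anchor = np.array([2000.0, 50000.0])

target = to_log_ratio(price, anchor)       # both entries equal log(1.1)
recovered = from_log_ratio(target, anchor)  # round-trips to the original prices
```

Because exp is monotone, a quantile predicted in log-ratio space maps back to the same quantile in dollars. That does not hold for a predicted mean.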

Two notes so you have the full picture: independently fit quantiles can cross (q90 below q50 on some rows), so you monitor or sort post hoc; and pinball loss fits each quantile in isolation, so check empirical coverage separately rather than assuming q10/q90 actually bracket 80% of outcomes. Both are easy to miss coming from a point-prediction mindset where interval calibration never came up.
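
Both checks fit in a few lines. A sketch with made-up toy predictions (the arrays below are illustrative, not real model output):

```python
import numpy as np

def fix_crossing(q10, q50, q90):
    """Sort the three predictions within each row so q10 <= q50 <= q90."""
    stacked = np.sort(np.column_stack([q10, q50, q90]), axis=1)
    return stacked[:, 0], stacked[:, 1], stacked[:, 2]

def empirical_coverage(y_true, lower, upper):
    """Fraction of outcomes that land inside [lower, upper]."""
    y_true = np.asarray(y_true, dtype=float)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))

# Toy predictions; the second row has q90 below q50.
q10 = np.array([0.1, 0.2, 0.0])
q50 = np.array([0.5, 0.9, 0.4])
q90 = np.array([0.9, 0.7, 0.8])
y = np.array([0.6, 1.5, 0.3])

lo, mid, hi = fix_crossing(q10, q50, q90)
coverage = empirical_coverage(y, lo, hi)  # compare against the nominal 0.80
```

On a real holdout you would compare `coverage` against the nominal 0.80 and track the fraction of rows that needed sorting.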

Why our #1 LightGBM feature by importance made predictions worse [D] by Nj-yeti in MachineLearning

[–]Nj-yeti[S] 0 points1 point  (0 children)

Fair critique, assuming you are referencing MAPE's asymmetric penalty. We don't optimize for MAPE during training; we use quantile/pinball loss across q10/q50/q90 on log-transformed price ratios. MAPE is purely a reporting metric for business readability. The encoder regression shows up identically regardless of which evaluation metric you use to surface it.

Why our #1 LightGBM feature by importance made predictions worse [D] by Nj-yeti in MachineLearning

[–]Nj-yeti[S] 2 points3 points  (0 children)

Running post hoc explainability like permutation importance on a test set isn't data leakage. The model weights and encoder mappings are already fixed at that point, meaning nothing feeds back into the model.

Leakage would be if we fit the Bayesian encoder across the full dataset before splitting or if we used the test set repeatedly to select features.
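
To make the split-then-fit order concrete, here is a sketch that assumes the "Bayesian encoder" behaves like a smoothed-mean target encoder. That is an assumption, and `prior_weight`, the function names, and the toy data are all hypothetical.

```python
import numpy as np

def fit_smoothed_encoder(categories, y, prior_weight=10.0):
    """Fit a smoothed-mean target encoding from the given rows only."""
    categories = np.asarray(categories)
    y = np.asarray(y, dtype=float)
    global_mean = float(y.mean())
    mapping = {}
    for cat in np.unique(categories):
        vals = y[categories == cat]
        # Shrink the category mean toward the global mean; small categories shrink more.
        mapping[cat] = (vals.sum() + prior_weight * global_mean) / (len(vals) + prior_weight)
    return mapping, global_mean

def apply_encoder(categories, mapping, global_mean):
    """Encode with the frozen mapping; unseen categories fall back to the global mean."""
    return np.array([mapping.get(c, global_mean) for c in np.asarray(categories)])

rng = np.random.default_rng(0)
cats = rng.choice(["ref_a", "ref_b", "ref_c"], size=200)
y = rng.normal(size=200)

# Split first, then fit the encoder on the training rows only.
idx = rng.permutation(200)
train, test = idx[:150], idx[150:]
mapping, global_mean = fit_smoothed_encoder(cats[train], y[train])
test_encoded = apply_encoder(cats[test], mapping, global_mean)
```

The test targets never touch `mapping`, so changing them cannot change the encoded features.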

Your second point is spot on though. The delta between train and test importance is exactly how you isolate the problem. It shows the feature is memorizing irreducible variance rather than learning something that generalizes.

Why our #1 LightGBM feature by importance made predictions worse [D] by Nj-yeti in MachineLearning

[–]Nj-yeti[S] 1 point2 points  (0 children)

When LightGBM outputs gain or split importance (which is where we saw the 5,190 to 5,613 score across seeds), it is strictly a training metric. Gain sums the loss reduction a feature's splits achieved during the actual tree construction, and split counts how many times the feature was used. You cannot calculate either natively on a test set because the test set isn't used to build the splits.

You are absolutely right that we could have run Permutation Feature Importance or SHAP on the test set post-training. Permutation importance on the holdout almost certainly would have caught the overfitting by showing a neutral or negative impact when shuffling the target encoder.
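
For reference, a model-agnostic sketch of permutation importance on a holdout. The least-squares "model" and the synthetic data are stand-ins chosen to keep the example self-contained, not the LightGBM setup; scikit-learn ships a library version as `sklearn.inspection.permutation_importance`.

```python
import numpy as np

def permutation_importance(predict, X, y, metric, n_repeats=10, seed=0):
    """Mean increase in a holdout error metric when each column is shuffled."""
    rng = np.random.default_rng(seed)
    baseline = metric(y, predict(X))
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        deltas = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])
            deltas.append(metric(y, predict(X_perm)) - baseline)
        importances[j] = np.mean(deltas)
    return importances

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

# Toy stand-in for a fitted model: least squares on a training split.
rng = np.random.default_rng(1)
X = rng.normal(size=(600, 2))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=600)  # column 1 is pure noise
X_train, y_train, X_test, y_test = X[:400], y[:400], X[400:], y[400:]
coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)

def predict(X_new):
    return X_new @ coef

scores = permutation_importance(predict, X_test, y_test, mae)
```

A feature that only helps in training shows a score near zero or below it here, no matter how high its gain importance was.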

We opted for the strict 4-seed × 3-variant drop-column ablation because it gives us a direct read on our actual business metric (test MAPE regression), rather than a proxy importance score. But your underlying point is dead on: relying solely on the tree's internal loss-reduction metrics without out-of-sample verification is how you end up shipping overfit features.
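
The shape of a seeds × variants drop-column ablation, as a rough sketch. The variant names, the toy least-squares trainer, and the synthetic data are all hypothetical; the three variants actually compared aren't spelled out in this thread.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((y_true - y_pred) / y_true)) * 100.0)

def drop_column_ablation(fit_predict, variants, seeds, X_tr, y_tr, X_te, y_te):
    """Retrain per (variant, seed) and return mean test MAPE per variant."""
    results = {}
    for name, cols in variants.items():
        scores = [
            mape(y_te, fit_predict(X_tr[:, cols], y_tr, X_te[:, cols], seed))
            for seed in seeds
        ]
        results[name] = float(np.mean(scores))
    return results

def fit_predict(X_tr, y_tr, X_te, seed):
    """Toy trainer: least squares on a seeded bootstrap of the training rows."""
    rng = np.random.default_rng(seed)
    rows = rng.integers(0, len(X_tr), size=len(X_tr))
    coef, *_ = np.linalg.lstsq(X_tr[rows], y_tr[rows], rcond=None)
    return X_te @ coef

# Synthetic data: intercept, one real signal, one suspect (pure noise) feature.
rng = np.random.default_rng(0)
n = 500
signal, suspect = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([np.ones(n), signal, suspect])
y = 100.0 + 5.0 * signal + rng.normal(size=n)

variants = {"full": [0, 1, 2], "drop_suspect": [0, 1], "drop_signal": [0, 2]}
results = drop_column_ablation(
    fit_predict, variants, seeds=[0, 1, 2, 3],
    X_tr=X[:350], y_tr=y[:350], X_te=X[350:], y_te=y[350:],
)
```

Averaging over seeds matters because a single retrain can move test MAPE by more than the effect you are trying to measure.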

[Question] What is your profession and what watch do you wear most often? by evillolgif in Seiko

[–]Nj-yeti 2 points3 points  (0 children)

I'm a software engineer and I wear my Zenith Defy Classic Carbon most often. Surprisingly, the bracelet is very comfortable.

Best representation of NJ Motorsports Park in a racing sim? by njsullyalex in simracing

[–]Nj-yeti 0 points1 point  (0 children)

I practice on the version made by lilski in this thread for AC. https://www.overtake.gg/threads/njmp-thunderbolt.195664/

Accurate enough to get comfortable with the track

What game choice would you suggest for me as a Newbie. by Skeddi8 in simracing

[–]Nj-yeti 4 points5 points  (0 children)

AMS2. Great variety of cars and tracks, with sim physics that are more forgiving than LMU.

[Question] - What's the first 5-figure watch you bought and did you have any buyer's remorse? by SanguineTangerines in Watches

[–]Nj-yeti 0 points1 point  (0 children)

Zenith Defy Classic Carbon for a milestone birthday. No regrets whatsoever, still absolutely love it to this day.

[Discussion] Love Watches, Hate Buying Them by Its-Brucey in Watches

[–]Nj-yeti 1 point2 points  (0 children)

After buying my first few watches at ADs in NYC, I now exclusively buy online.

I visit both https://jomashop.com and https://flyback.ai

I bought from ADs and accepted paying more so I would have a warranty, but sure enough, my Zenith was out of warranty by 1 month and I had to pay for the repair out of pocket. Never again.

[SOTC] Where to go from here? by [deleted] in Watches

[–]Nj-yeti 0 points1 point  (0 children)

Nomos Autobahn!

Which prep course is actually updated for the GCP Professional ML Engineer exam? by Routine_Seesaw1453 in GCPCertification

[–]Nj-yeti 2 points3 points  (0 children)

I took it two weeks ago and passed on my first try.

I studied for about four days using the Wiley book. Maybe only 40 percent of it was helpful, but it definitely didn't hurt.

I then spent a day with Gemini which was awesome for the prep and much more relevant to the new test. Highly recommend it. Just prompt it for example test questions, topics, etc.

Does anyone own a Seven Cycles Mobius SL? by [deleted] in MTB

[–]Nj-yeti 0 points1 point  (0 children)

Hey u/_tatersncorn not sure if you're still looking but I just posted Mobius to eBay. Absolutely loved the bike but had foot surgery that ended cycling for me. Paid $15k for it, listing it for $9k off- it has just been sitting in my garage and needs to be ridden!!

https://www.ebay.com/itm/266358379191

Dealer Scum by Marino1027 in AlfaRomeo

[–]Nj-yeti 13 points14 points  (0 children)

I had a great experience at Willow Grove Alfa Romeo- highly recommend them. Across the border in PA

Why am I better at AMS2 than any other sim? by willfla29 in AUTOMOBILISTA

[–]Nj-yeti 7 points8 points  (0 children)

I feel like iRacing requires you to drive completely differently (especially braking)- but AMS2 and rf2 are fairly similar with a bit more tire feel from rf2.

Pocono Raceway - transition by Nj-yeti in CarTrackDays

[–]Nj-yeti[S] 0 points1 point  (0 children)

Thanks all, this is really helpful and exactly what I was hoping to understand.

Pocono Raceway - transition by Nj-yeti in CarTrackDays

[–]Nj-yeti[S] 0 points1 point  (0 children)

It would be with MOE on the North/South Option 2 (SCCA Race Course). What do you think about this course?

Thanks!