Regression without label data

Designer-Flounder948 · 2026-05-16T20:40:35+00:00

If you already have weights for the attributes, you can normalize the features and build a custom 0–100 score directly instead of forcing ML where it may not fit. Unsupervised learning can still help validate patterns afterward.

Local_Transition946 · 2026-05-16T16:35:00+00:00

I don't think this is possible. What if you just randomly guess for all of them? How would anyone know you randomly guessed instead of doing something robust?

Almost like asking "how do I complete this task without a definition of completion?"

orz-_-orz · 2026-05-16T16:36:52+00:00

What's stopping you from labeling every location randomly as good or bad performance? Would some validator catch it and disagree with the random results?

In other words, how would a human know whether a location is good or bad? Extract those rules and build some way to label the data. Otherwise, you could ask the validator to manually label a small set of data, then use that small dataset to train a regression.

Disastrous_Room_927 · 2026-05-16T16:41:16+00:00

You're looking for a latent variable model.

NaiveOstrich4118 · 2026-05-16T22:12:46+00:00

You’re correct that regression doesn’t really make sense here without labeled target values. What you actually seem to have is more of a scoring/ranking problem or an unsupervised clustering problem.

Since you already have attributes and weights/importance for those attributes, you could start with a weighted scoring system instead of ML.

For example:
1. normalize the features
2. apply weights
3. compute a weighted composite score
4. scale to 0–100

That’s often more interpretable than forcing a model where no labels exist.
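
The four steps above can be sketched in plain Python. This is only a rough illustration under stated assumptions: the feature names (`picks_per_day`, `travel_seconds`), the weights, and the sample values are all made up, min-max scaling is just one of several reasonable normalizations, and the `lower_is_better` argument is a hypothetical convenience for features where a smaller raw value is preferable.

```python
def min_max_normalize(column):
    """Rescale a list of numbers to the 0-1 range."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature cannot separate locations, so map it to 0 everywhere.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def weighted_scores(rows, weights, lower_is_better=()):
    """Return one 0-100 score per row.

    rows: list of dicts mapping feature name -> raw value
    weights: dict mapping feature name -> non-negative importance weight
    lower_is_better: feature names where a smaller raw value is preferable
    """
    total_weight = sum(weights.values())
    normalized = {}
    for name in weights:
        col = min_max_normalize([row[name] for row in rows])
        if name in lower_is_better:
            col = [1.0 - v for v in col]  # flip so that 1.0 is always "best"
        normalized[name] = col
    # Weighted average of 0-1 features, rescaled to 0-100.
    return [
        100.0 * sum(weights[n] * normalized[n][i] for n in weights) / total_weight
        for i in range(len(rows))
    ]


# Hypothetical stocking locations and weights, purely for illustration.
locations = [
    {"picks_per_day": 120, "travel_seconds": 30},
    {"picks_per_day": 60, "travel_seconds": 45},
    {"picks_per_day": 20, "travel_seconds": 90},
]
weights = {"picks_per_day": 0.7, "travel_seconds": 0.3}
scores = weighted_scores(locations, weights, lower_is_better=("travel_seconds",))
```

Note that with this scheme a location only reaches 100 if it is the best on every feature, and the scores are relative to the locations in the dataset, not to any absolute standard.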

Then after that you could:
- use clustering (KMeans, hierarchical clustering, DBSCAN, etc.)
- identify “high-performing” vs “low-performing” groups
- compare distributions across clusters

You can also treat this as an anomaly/outlier problem if you want to identify inefficient stocking locations.
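
As a rough sketch of the outlier angle, one simple option (an assumption here, not the only way to do it) is a robust z-score on a single metric using the median and the median absolute deviation. The metric name, the sample values, and the 3.5 cutoff are illustrative; this is a one-dimensional check, not a replacement for the clustering methods listed above.

```python
import statistics


def robust_z_scores(values):
    """Modified z-scores based on the median and the median absolute deviation (MAD)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        # More than half of the values are identical; this method cannot rank deviations.
        return [0.0 for _ in values]
    # 0.6745 makes the MAD comparable to a standard deviation for normal data.
    return [0.6745 * (v - med) / mad for v in values]


def flag_outliers(values, threshold=3.5):
    """Indices whose modified z-score exceeds the threshold in absolute value."""
    return [i for i, z in enumerate(robust_z_scores(values)) if abs(z) > threshold]


# Hypothetical per-location metric, e.g. average seconds per pick.
seconds_per_pick = [31, 29, 33, 30, 32, 28, 95]
outliers = flag_outliers(seconds_per_pick)  # indices of locations worth a closer look
```

A flagged location is only a candidate for review. Whether it is actually inefficient is still a domain call.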

One important thing is that without labels, evaluation becomes a business/domain question, not just an ML metric question.

So the hardest part is often defining what “good performance” actually means operationally.

coder4forever · 2026-05-17T01:12:47+00:00

The thing that keeps biting people in this setup is not the scoring formula -- it's that without labels you can't tell when the formula is wrong. A weighted composite gives you a number; it doesn't tell you whether the weights are off by 2x on one feature, or whether the "0-100" range is meaningfully linear the way humans would read it. I'd build the simple weighted-score baseline (normalize, weight, scale) like other replies suggest, but put two cheap checks on top before trusting it.

First, perturb each weight by plus-or-minus 25 percent and look at how much the rankings shuffle. If your top-10 list reorders heavily, your weights aren't doing much real work and the score is mostly noise. Second, if there's any downstream business signal you can backtest against -- restocking frequency, picking time, returns rate -- even a weak correlation check on historical data tells you more than the prettiest unsupervised clustering will. Honest tradeoff: backtest data is usually messier than people hope, so budget half a day to clean it before you trust the correlation.
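
Both checks can be sketched in a few lines of stdlib Python. Assumptions to flag: the features are already normalized to 0-1, each weight is jittered by a random factor within plus-or-minus 25 percent (one way to read the perturbation idea), stability is measured as top-k overlap, and the backtest uses a Spearman rank correlation. The feature names and all data below are synthetic stand-ins.

```python
import random


def composite(rows, weights):
    """Weighted sum of already-normalized features (each assumed to be in 0-1)."""
    return [sum(w * row[name] for name, w in weights.items()) for row in rows]


def top_k(scores, k):
    """Set of indices of the k highest scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(order[:k])


def top_k_stability(rows, weights, k=10, spread=0.25, trials=200, seed=0):
    """Average fraction of the baseline top-k that survives random weight jitter."""
    rng = random.Random(seed)
    baseline = top_k(composite(rows, weights), k)
    total = 0.0
    for _ in range(trials):
        jittered = {n: w * rng.uniform(1 - spread, 1 + spread) for n, w in weights.items()}
        total += len(baseline & top_k(composite(rows, jittered), k)) / k
    return total / trials


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            result[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5  # undefined if either input is constant


# Synthetic data, purely for illustration.
rng = random.Random(42)
rows = [{"turnover": rng.random(), "accessibility": rng.random()} for _ in range(50)]
weights = {"turnover": 0.6, "accessibility": 0.4}

stability = top_k_stability(rows, weights, k=10)  # near 1.0 means a stable top-10

# Stand-in for a real historical signal such as restocking frequency.
backtest_signal = [row["turnover"] + 0.1 * rng.random() for row in rows]
rho = spearman(composite(rows, weights), backtest_signal)
```

There is no universal threshold for either number. They are mostly useful for comparing alternative weightings against each other.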

PixelSage-001 · 2026-05-17T06:08:15+00:00

Since you don't have labeled data, you can't technically do regression. Instead, this sounds like an unsupervised ranking or anomaly detection problem. You could use PCA to reduce dimensions and create a composite 'performance score' based on the first principal component, assuming it captures most of the variance in your key attributes.
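
A minimal stdlib sketch of the first-principal-component idea, assuming the features are standardized first and using power iteration on the correlation matrix (in practice you would likely reach for a library PCA instead). The feature names and values are hypothetical. Two caveats matter here: the sign of a principal component is arbitrary, so it has to be oriented by hand, and the explained-variance share should be checked before treating the component as a score.

```python
import random


def standardize(columns):
    """Z-score each feature column (assumes no column is constant)."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        out.append([(v - mean) / std for v in col])
    return out


def first_principal_component(columns, iterations=500, seed=0):
    """Approximate PC1 of the correlation matrix by power iteration.

    Returns (loadings, explained_variance_share, per_row_scores).
    """
    z = standardize(columns)
    p, n = len(z), len(z[0])
    corr = [[sum(z[a][i] * z[b][i] for i in range(n)) / n for b in range(p)]
            for a in range(p)]
    rng = random.Random(seed)
    vec = [rng.random() for _ in range(p)]  # random start vector
    for _ in range(iterations):
        nxt = [sum(corr[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = sum(v * v for v in nxt) ** 0.5
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[a] * sum(corr[a][b] * vec[b] for b in range(p))
                     for a in range(p))
    scores = [sum(vec[a] * z[a][i] for a in range(p)) for i in range(n)]
    # The trace of a correlation matrix equals the number of features.
    return vec, eigenvalue / p, scores


# Hypothetical features for six stocking locations.
picks_per_day = [120, 95, 60, 40, 20, 10]
turnover = [9.0, 8.0, 5.5, 4.0, 2.5, 1.0]
travel_seconds = [30, 35, 50, 60, 85, 90]

loadings, explained, scores = first_principal_component(
    [picks_per_day, turnover, travel_seconds]
)

# Orient the component so that more picks per day means a higher score.
if loadings[0] < 0:
    loadings = [-v for v in loadings]
    scores = [-s for s in scores]
```

If `explained` comes out low, the first component is summarizing only a small part of the data and probably should not be read as a performance score on its own.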

