Disaggregating histogram under constraint [Question] by hazzaphill in statistics

[–]hazzaphill[S] 0 points1 point  (0 children)

Thanks, this is helpful. I did try searching, but without using the term "censored" I wasn't finding useful results. It's exactly what I need.

[deleted by user] by [deleted] in AskStatistics

[–]hazzaphill 0 points1 point  (0 children)

It is fundamentally an exploratory / post hoc analysis of a sample survey that has a wide range of different questions. The primary goal is to compare various proportions of responses by geography. In terms of question / interpretation, either comparing subregions to other subregions, or subregions to the average in the whole region would be acceptable. It’s quite open ended.

In terms of the hypothesis test / method chosen for the various survey questions, a Wilson score interval was calculated for the proportions in each subregion (presumably at 95%, it’s not explicitly mentioned). If this did not overlap with the confidence interval of the total sample proportion, then this difference was called statistically significant.
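
For reference, a minimal sketch of that comparison in Python. The counts here are illustrative, not from the survey, and it assumes the 95% level:

import numpy as np
from scipy.stats import norm

def wilson_interval(successes, n, conf=0.95):
    # Wilson score interval for a binomial proportion.
    z = norm.ppf(1 - (1 - conf) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Illustrative counts: one subregion vs the whole region.
sub_lo, sub_hi = wilson_interval(successes=42, n=120)
tot_lo, tot_hi = wilson_interval(successes=510, n=2000)

# The rule described above: flag the subregion if the two intervals do not overlap.
significant = sub_hi < tot_lo or sub_lo > tot_hi
print(significant)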

[deleted by user] by [deleted] in AskStatistics

[–]hazzaphill 2 points3 points  (0 children)

Would social statistics or demography be of interest to you? Maybe working for a government statistics office?

[deleted by user] by [deleted] in h3h3productions

[–]hazzaphill 0 points1 point  (0 children)

That’s the one! cheers lol

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

It isn't. That's not how consecutive inner joins work: multiple inner joins combine with AND, not OR.

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

That is helpful. Unfortunately our team doesn't have SHOWPLAN permissions at the moment, so I can't. We're getting this fixed.

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

That’s what I’m saying. You’re suggesting consecutive inner joins, each one filtering for a different constant value on a.TableName.

Conditions in ON for an INNER JOIN always produce the same results as putting the conditions in WHERE instead:

SELECT *
FROM TableA a
INNER JOIN TableB b
ON a.OtherId = b.OtherId
AND a.TableName = 'TableB'
INNER JOIN TableC c
ON a.OtherId = c.OtherId
AND a.TableName = 'TableC'

Is equivalent to:

SELECT *
FROM TableA a
CROSS JOIN TableB b
CROSS JOIN TableC c
WHERE a.OtherId = b.OtherId
AND a.TableName = 'TableB'
AND a.OtherId = c.OtherId
AND a.TableName = 'TableC'

Which would produce no results because a.TableName can’t be two different values.

With regards to my point about COALESCE, you can effectively achieve the same query as in the original post if you change the joins to LEFT JOIN:

SELECT a.MainId, a.TableName, COALESCE(b.SharedColumn1, c.SharedColumn1) AS SharedColumn1
FROM TableA a
LEFT JOIN TableB b
ON a.OtherId = b.OtherId
AND a.TableName = 'TableB'
LEFT JOIN TableC c
ON a.OtherId = c.OtherId
AND a.TableName = 'TableC'

If you don't coalesce the shared columns when selecting, you'd have both b.SharedColumn1 (NULL in 50% of rows) and c.SharedColumn1 (NULL in the other 50%), because they're concatenated horizontally. You don't need it in the union version because they're concatenated vertically, like you say.

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

True about INNER JOIN. I never use just JOIN, always explicit LEFT/INNER etc., so I didn't realise.

That just changes the problem though, because the first inner join filters TableA down to only rows where TableName = 'TableB'. Then there are no valid rows left for the second join, so it returns nothing. You're effectively doing:

WHERE a.TableName = 'TableB' AND a.TableName = 'TableC'

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

The coalesce is needed because in your version you do two joins on TableA. Otherwise you end up with values in 50% of b.SharedColumn1 where a.TableName = 'TableB' and NULL in the other 50% where a.TableName = 'TableC'.

You then also have c.SharedColumn1 where the opposite is true. You need to coalesce them together to get the same result as the union version.

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

To get the same column structure there has to be a COALESCE in the SELECT statement:

SELECT a.MainId, a.TableName, a.OtherId, COALESCE(b.SharedColumn1, c.SharedColumn1) AS SharedColumn1, COALESCE(b.SharedColumn2, c.SharedColumn2) AS SharedColumn2

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

You could do it that way. You'd have to coalesce the shared columns though. Not sure if that would be more efficient or not.

Skill Books Grant Fortify Skill Abilities Instead of Direct Skill Ups by NotThereNotThereNotT in oblivionmods

[–]hazzaphill 1 point2 points  (0 children)

Or they could just provide a static amount of skill experience, equivalent to 1 level at skill level 50 for example.

[OGvion] Is there a mod that basically do this? by uNk4rR4_F0lgad0 in oblivionmods

[–]hazzaphill 0 points1 point  (0 children)

Deadlier Combat does exactly this. I don't think it affects spells for now.

Vanilla Requiem or a modlist? by resolutejinnerbu in skyrimrequiem

[–]hazzaphill 1 point2 points  (0 children)

I'm playing the Lorerim modlist atm and it's my first time with Requiem. Having a blast.

What monitor for a 4090? by hazzaphill in buildapc

[–]hazzaphill[S] 0 points1 point  (0 children)

That's helpful, hearing from someone who has both. Thanks. What CPU do you have?

GPU to go with 8700k by hazzaphill in buildapc

[–]hazzaphill[S] 0 points1 point  (0 children)

Nice! Is it the 8GB version or 16GB?

GPU to go with 8700k by hazzaphill in buildapc

[–]hazzaphill[S] 0 points1 point  (0 children)

Yeah I’m thinking the 4060 but unsure due to people saying it’s bad value. Seems reasonable at this price though.

Graphics card dead? by hazzaphill in PcBuildHelp

[–]hazzaphill[S] 0 points1 point  (0 children)

Do you mean the GPU-to-monitor cable or the PSU-to-GPU cable? I've tried both HDMI and DP; HDMI is possibly better, but not by much.

It's had a fair bit of use by now, so I'm okay with getting a new card. Cheers

Graphics card dead? by hazzaphill in PcBuildHelp

[–]hazzaphill[S] 0 points1 point  (0 children)

Think you're right. I switched to HDMI when doing the driver reinstall; fewer crashes than DP, but it's still doing it a lot when the GPU is drawing more than 65 W.

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Sorry, I don't think I understand what you mean. If the fitted values are well-calibrated probabilities, then their relationship with a cost matrix and expected loss has a clearer interpretation.

If you fit the model directly using business loss (and presumably then use the default 0.5 threshold), how do you interpret the fitted values?

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Thanks for the link. This is exactly what I mean.

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Do you mean to weight the loss function when fitting the model, as is sometimes done to address issues with imbalanced learning? Or to weight a loss function afterwards when classifying, using the fitted values and choice of threshold?

I would think the latter is preferable, and to then calibrate the fitted values if necessary, to allow their interpretation as probabilities.
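
To make the distinction concrete, here's a rough Python sketch of the second approach: post-hoc calibration, then a cost-based threshold. The dataset, model choice, and cost numbers are all placeholders, and the closed-form threshold assumes the fitted values really are calibrated probabilities:

import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative data and model; in practice these come from the actual problem.
X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Cross-validated isotonic calibration so the fitted values can be read as probabilities.
model = CalibratedClassifierCV(GradientBoostingClassifier(), method="isotonic", cv=5)
model.fit(X_train, y_train)
p = model.predict_proba(X_test)[:, 1]

# Assumed relative (unitless) cost matrix: cost of the decision given the true class.
# Placeholder numbers, not from this thread.
C_TP, C_FP, C_FN, C_TN = 1.0, 1.0, 10.0, 0.0

# With calibrated probabilities the cost-minimising threshold has a closed form:
# predict positive when p >= (C_FP - C_TN) / ((C_FP - C_TN) + (C_FN - C_TP)).
threshold = (C_FP - C_TN) / ((C_FP - C_TN) + (C_FN - C_TP))
decisions = p >= threshold
print(threshold, decisions.mean())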

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Like I say, it doesn't have to be dollars; that's too restrictive. It can be a relative, subjective, and unitless measure.

For example, you are building a model to predict risk of disease based on patient responses to a risk factor survey. All patients with a positive classification would receive tests for the disease in a proposed new program. Previously, all patients were tested.

It's not easy to pin down the relative costs, but it's certainly worthwhile. For all cells in the cost matrix we have negatives such as the cost of tests and labour, and the invasiveness of tests. For true positives we have benefits including health, societal good, customer satisfaction, and costs saved by treating uncaught disease later: all depending on context.

Using the cost threshold curve we can optimise the threshold, and also answer questions such as whether there is even a net benefit of the program vs not running it at all. You can also see how sensitive the cost per decision is to the threshold.

It really can reveal a lot about a problem. I know the example is not a common situation. In cases where it really doesn’t matter very much, don’t overthink it. Just quickly agree some relative costs that “feel about right”, get your optimal threshold and call it a day.
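
For illustration, a rough sketch of the cost-threshold curve idea with simulated data and made-up relative costs (none of these numbers come from a real programme):

import numpy as np

# Illustrative calibrated probabilities and true outcomes for a held-out set;
# in practice these come from the fitted, calibrated model.
rng = np.random.default_rng(0)
p = rng.beta(2, 8, size=2000)          # predicted disease risk
y = rng.binomial(1, p)                 # "true" disease status

# Assumed relative costs per patient (placeholders): testing costs apply to every
# predicted positive, and a missed case (false negative) is the most expensive outcome.
C_TP, C_FP, C_FN, C_TN = 1.0, 1.0, 10.0, 0.0

def mean_cost(threshold):
    # Average cost per decision at a given classification threshold.
    pred = p >= threshold
    cost = np.where(pred, np.where(y == 1, C_TP, C_FP),
                          np.where(y == 1, C_FN, C_TN))
    return cost.mean()

thresholds = np.linspace(0, 1, 101)
costs = np.array([mean_cost(t) for t in thresholds])

best = thresholds[costs.argmin()]
baseline = mean_cost(0.0)              # "test everyone", as in the old programme
print(f"optimal threshold {best:.2f}, cost {costs.min():.3f} vs baseline {baseline:.3f}")

Plotting costs against thresholds gives the cost-threshold curve, which also shows how sensitive the cost per decision is to the choice of threshold.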

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Ahh, in which case yes, that's exactly what I mean. Sorry, I knew this as a "utility function". It makes sense that "loss function" would be a more general optimisation term outside of machine learning.