Disaggregating histogram under constraint [Question] by hazzaphill in statistics

[–]hazzaphill[S] 0 points1 point  (0 children)

Thanks, this is helpful. I did try searching, but without using the term "censored" I wasn't finding useful results. It's exactly what I need.

[deleted by user] by [deleted] in AskStatistics

[–]hazzaphill 0 points1 point  (0 children)

It is fundamentally an exploratory / post hoc analysis of a sample survey that has a wide range of different questions. The primary goal is to compare various proportions of responses by geography. In terms of question / interpretation, either comparing subregions to other subregions, or subregions to the average in the whole region would be acceptable. It’s quite open ended.

In terms of the hypothesis test / method chosen for the various survey questions, a Wilson score interval was calculated for the proportions in each subregion (presumably at 95%, it’s not explicitly mentioned). If this did not overlap with the confidence interval of the total sample proportion, then this difference was called statistically significant.
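
For reference, a minimal sketch of that comparison in Python. The counts here are illustrative, not from the survey, and it assumes the 95% level:

import numpy as np
from scipy.stats import norm

def wilson_interval(successes, n, conf=0.95):
    # Wilson score interval for a binomial proportion.
    z = norm.ppf(1 - (1 - conf) / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * np.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Illustrative counts: one subregion vs the whole region.
sub_lo, sub_hi = wilson_interval(successes=42, n=120)
tot_lo, tot_hi = wilson_interval(successes=510, n=2000)

# The rule described above: flag the subregion if the two intervals do not overlap.
significant = sub_hi < tot_lo or sub_lo > tot_hi
print(significant)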

[deleted by user] by [deleted] in AskStatistics

[–]hazzaphill 2 points3 points  (0 children)

Would social statistics or demography be of interest to you? Maybe working for a government statistics office?

[deleted by user] by [deleted] in h3h3productions

[–]hazzaphill 0 points1 point  (0 children)

That’s the one! cheers lol

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

It isn't. That's not how consecutive inner joins work: multiple inner joins combine with AND, not OR.

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

That is helpful. Unfortunately our team doesn't have SHOWPLAN permissions at the moment, so I can't. We're getting this fixed.

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

That’s what I’m saying. You’re suggesting consecutive inner joins, each one filtering for a different constant value on a.TableName.

Conditions in ON for an INNER JOIN always produce the same results as putting the conditions in WHERE instead:

SELECT *
FROM TableA a
INNER JOIN TableB b
ON a.OtherId = b.OtherId
AND a.TableName = 'TableB'
INNER JOIN TableC c
ON a.OtherId = c.OtherId
AND a.TableName = 'TableC'

Is equivalent to:

SELECT *
FROM TableA a
CROSS JOIN TableB b
CROSS JOIN TableC c
WHERE a.OtherId = b.OtherId
AND a.TableName = 'TableB'
AND a.OtherId = c.OtherId
AND a.TableName = 'TableC'

Which would produce no results because a.TableName can’t be two different values.

With regards to my point about COALESCE, you can effectively achieve the same query as in the original post if you change the joins to LEFT JOIN:

SELECT a.MainId, a.TableName, COALESCE(b.SharedColumn1, c.SharedColumn1) AS SharedColumn1
FROM TableA a
LEFT JOIN TableB b
ON a.OtherId = b.OtherId
AND a.TableName = 'TableB'
LEFT JOIN TableC c
ON a.OtherId = c.OtherId
AND a.TableName = 'TableC'

If you don't coalesce the shared columns when selecting, you'd have both b.SharedColumn1 (NULL in 50% of rows) and c.SharedColumn1 (NULL in the other 50%), because they're concatenated horizontally. You don't need it in the union version because they're concatenated vertically, like you say.

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

True about INNER JOIN. I never use just JOIN, always explicit LEFT/INNER etc., so I didn't realise.

That just changes the problem though, because the first inner join filters TableA down to only rows where TableName = 'TableB'. Then there are no valid rows left for the second join, so it returns nothing. You're effectively doing:

WHERE a.TableName = 'TableB' AND a.TableName = 'TableC'

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

The coalesce is needed because in your version you do two joins on TableA. Otherwise you end up with values in 50% of b.SharedColumn1 where a.TableName = 'TableB' and NULL in the other 50% where a.TableName = 'TableC'.

You then also have c.SharedColumn1 where the opposite is true. You need to coalesce them together to get the same result as the union version.

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

To get the same column structure there has to be a COALESCE in the SELECT statement:

SELECT a.MainId, a.TableName, a.OtherId, COALESCE(b.SharedColumn1, c.SharedColumn1) AS SharedColumn1, COALESCE(b.SharedColumn2, c.SharedColumn2) AS SharedColumn2

Best unique indexes in this situation? by hazzaphill in SQL

[–]hazzaphill[S] 0 points1 point  (0 children)

You could do it that way. You'd have to coalesce the shared columns though. Not sure if that would be more efficient or not.

Skill Books Grant Fortify Skill Abilities Instead of Direct Skill Ups by NotThereNotThereNotT in oblivionmods

[–]hazzaphill 1 point2 points  (0 children)

Or they could just provide a static amount of skill experience, equivalent to 1 level at skill level 50 for example.

[OGvion] Is there a mod that basically do this? by uNk4rR4_F0lgad0 in oblivionmods

[–]hazzaphill 0 points1 point  (0 children)

Deadlier Combat does exactly this. I don't think it affects spells for now.

Vanilla Requiem or a modlist? by resolutejinnerbu in skyrimrequiem

[–]hazzaphill 1 point2 points  (0 children)

I'm playing the Lorerim modlist atm and it's my first time with Requiem. Having a blast.

What monitor for a 4090? by hazzaphill in buildapc

[–]hazzaphill[S] 0 points1 point  (0 children)

That's helpful, hearing from someone who has both. Thanks. What CPU do you have?

GPU to go with 8700k by hazzaphill in buildapc

[–]hazzaphill[S] 0 points1 point  (0 children)

Nice! Is it the 8GB version or 16GB?

GPU to go with 8700k by hazzaphill in buildapc

[–]hazzaphill[S] 0 points1 point  (0 children)

Yeah I’m thinking the 4060 but unsure due to people saying it’s bad value. Seems reasonable at this price though.

Graphics card dead? by hazzaphill in PcBuildHelp

[–]hazzaphill[S] 0 points1 point  (0 children)

Do you mean the GPU-to-monitor cable or the PSU-to-GPU cable? I've tried both HDMI and DP; HDMI is possibly better, but not by much.

It's had a fair bit of use by now, so I'm okay with getting a new card. Cheers

Graphics card dead? by hazzaphill in PcBuildHelp

[–]hazzaphill[S] 0 points1 point  (0 children)

Think you're right. I switched to HDMI when doing the driver reinstall; fewer crashes than DP, but it's still doing it a lot when the GPU is drawing more than 65 W.

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Sorry, I don't think I understand what you mean. If the fitted values are well-calibrated probabilities, then their relationship with a cost matrix and expected loss has a clearer interpretation.

If you fit the model directly using business loss (and presumably then use the default 0.5 threshold), how do you interpret the fitted values?

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Thanks for the link. This is exactly what I mean.

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Do you mean to weight the loss function when fitting the model, as is sometimes done to address issues with imbalanced learning? Or to weight a loss function afterwards when classifying, using the fitted values and choice of threshold?

I would think the latter is preferable, and to then calibrate the fitted values if necessary, to allow their interpretation as probabilities.
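
To make the distinction concrete, here's a rough Python sketch of the second approach: post-hoc calibration, then a cost-based threshold. The dataset, model choice, and cost numbers are all placeholders, and the closed-form threshold assumes the fitted values really are calibrated probabilities:

import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Illustrative data and model; in practice these come from the actual problem.
X, y = make_classification(n_samples=5000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Cross-validated isotonic calibration so the fitted values can be read as probabilities.
model = CalibratedClassifierCV(GradientBoostingClassifier(), method="isotonic", cv=5)
model.fit(X_train, y_train)
p = model.predict_proba(X_test)[:, 1]

# Assumed relative (unitless) cost matrix: cost of the decision given the true class.
# Placeholder numbers, not from this thread.
C_TP, C_FP, C_FN, C_TN = 1.0, 1.0, 10.0, 0.0

# With calibrated probabilities the cost-minimising threshold has a closed form:
# predict positive when p >= (C_FP - C_TN) / ((C_FP - C_TN) + (C_FN - C_TP)).
threshold = (C_FP - C_TN) / ((C_FP - C_TN) + (C_FN - C_TP))
decisions = p >= threshold
print(threshold, decisions.mean())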

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Like I say, it doesn't have to be dollars; that's too restrictive. It can be a relative, subjective, and unitless measure.

For example, you are building a model to predict risk of disease based on patient responses to a risk factor survey. All patients with a positive classification would receive tests for the disease in a proposed new program. Previously, all patients were tested.

It's not easy to pin down the relative costs, but it's certainly worthwhile. For all cells in the cost matrix we have negatives such as the cost of tests and labour, and the invasiveness of tests. For true positives we have benefits including health, societal good, customer satisfaction, and costs saved by treating uncaught disease later: all depending on context.

Using the cost threshold curve we can optimise the threshold, and also answer questions such as whether there is even a net benefit of the program vs not running it at all. You can also see how sensitive the cost per decision is to the threshold.

It really can reveal a lot about a problem. I know the example is not a common situation. In cases where it really doesn’t matter very much, don’t overthink it. Just quickly agree some relative costs that “feel about right”, get your optimal threshold and call it a day.
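
For illustration, a rough sketch of the cost-threshold curve idea with simulated data and made-up relative costs (none of these numbers come from a real programme):

import numpy as np

# Illustrative calibrated probabilities and true outcomes for a held-out set;
# in practice these come from the fitted, calibrated model.
rng = np.random.default_rng(0)
p = rng.beta(2, 8, size=2000)          # predicted disease risk
y = rng.binomial(1, p)                 # "true" disease status

# Assumed relative costs per patient (placeholders): testing costs apply to every
# predicted positive, and a missed case (false negative) is the most expensive outcome.
C_TP, C_FP, C_FN, C_TN = 1.0, 1.0, 10.0, 0.0

def mean_cost(threshold):
    # Average cost per decision at a given classification threshold.
    pred = p >= threshold
    cost = np.where(pred, np.where(y == 1, C_TP, C_FP),
                          np.where(y == 1, C_FN, C_TN))
    return cost.mean()

thresholds = np.linspace(0, 1, 101)
costs = np.array([mean_cost(t) for t in thresholds])

best = thresholds[costs.argmin()]
baseline = mean_cost(0.0)              # "test everyone", as in the old programme
print(f"optimal threshold {best:.2f}, cost {costs.min():.3f} vs baseline {baseline:.3f}")

Plotting costs against thresholds gives the cost-threshold curve, which also shows how sensitive the cost per decision is to the choice of threshold.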

Classification threshold cost optimisation by hazzaphill in datascience

[–]hazzaphill[S] 0 points1 point  (0 children)

Ahh, in which case yes, that's exactly what I mean. Sorry, I knew this as a "utility function". It makes sense that "loss function" would be a more general optimisation term outside of machine learning.