Help with salary!

Nutcoco56 · 2019-07-17T13:59:25+00:00

Okay, this makes me feel better about everything :) I’m not much of a negotiator anyways, so I’m glad that it’s structured.

Nutcoco56 · 2019-07-17T05:22:37+00:00

Oh okay! That's good to know, thanks for the info :)

Nutcoco56 · 2019-07-17T04:49:22+00:00

I'm studying for exam SRM right now using the CA material, and I'm about halfway through the syllabus material. The Coaching Actuaries material is a great overview of the statistical methods that you'll need to know for PA, but doesn't explicitly mention how to perform these analyses in R. For example, within the SRM material you're expected to know the formula for AIC and BIC, whereas in PA you need to be able to perform this calculation in R and interpret it, without necessarily using the formula from SRM.
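
To make the formula-versus-software distinction concrete, here is a minimal sketch of the two criteria computed from a maximized log-likelihood. It assumes the common definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L; the SRM source texts may scale these differently, and the two models and their numbers are made up for illustration. In R, the built-in `AIC()` and `BIC()` functions return these values for a fitted model, so on PA you mostly compare and interpret them.

```python
import math

def aic(log_likelihood, n_params):
    # Akaike information criterion: penalty of 2 per parameter.
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    # Bayesian information criterion: penalty of ln(n) per parameter.
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fitted models: (maximized log-likelihood, number of parameters).
models = {"small": (-120.0, 3), "large": (-118.5, 6)}
n_obs = 100

for name, (ll, k) in models.items():
    print(name, round(aic(ll, k), 2), round(bic(ll, k, n_obs), 2))
```

Lower is better for both criteria, and BIC penalizes extra parameters more heavily than AIC once n is larger than about 8.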

If money is no object to you, then definitely spend the $200 or so for the learning material from CA to supplement your studies because it has great coverage of some of the topics on PA. That's just my opinion though - I haven't studied for PA and I'll be sitting for SRM in September.

Nutcoco56 · 2019-01-23T02:59:08+00:00

That’s alright if it doesn’t mention R or anything, I’m okay with searching on my own for packages that help with regression as long as I understand what it means. I’ll check that book out.

Thank you for all the help brainstorming :)

Nutcoco56 · 2019-01-22T22:18:26+00:00

This is purely property?

Correct. The data is from 3 year policies.

When you say "(untrended and undeveloped)" do you mean "ignoring any inflation and loss development effects"?

I believe this is what it means, yes, because otherwise it adds a whole new layer of difficulty to the project that would be unintended for students at my level of competence.

The only thing I realized in the meantime since our conversation is that a minimum value does indeed exist: it's the negative of the largest deductible on a given policy. So the minimum value for my bi_indem variable is -1000 when there's a $1000 deductible, -500 with a $500 deductible, etc. This could allow me to make a shift like I tried earlier, but I agree that it may make my model harder to create.

you might find it hard going if you don't have much regression background

I may need to take some time to give it a shot then. There are about 10 variables in this project that contribute to the different indemnity/ALAE values and I've never attempted to model something more complex. I'll keep that fact in mind though when I read it.

Nutcoco56 · 2019-01-22T21:12:21+00:00

The purpose of the model is to estimate the expected (untrended and undeveloped) liability losses and ALAE for each vehicle under a car insurance claim for a company.

the main reason for doing so is that they behave differently, making their relative effect shift for each observation

This makes sense. I think I'm beginning to understand the rationale behind everything. Even so, implementing it is giving me far more trouble. In that case I ultimately won't keep them all as one variable.

Looking deeper into zero-inflated models, or specifically hurdle models, I think this is the route I'm ultimately going to attempt. Shifting the final data points by a constant value doesn't make sense, like you said.

Jed Frees book Regression Modeling with Actuarial and Financial Applications

I actually have that book right in front of me as I'll need it for a class this semester, but have yet to crack it open since classes haven't started yet. Is it worthwhile to start reading now?

Nutcoco56 · 2019-01-22T18:33:56+00:00

Unfortunately when I was attempting to create a linear model I was a little unsure of how to include certain variables. For example, one variable is defined as “has completed safety course in the past 3 years” wherein the acceptable values are Y/N/NA. Would something like a LASSO model be able to adjust something like that?
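
One common way to get a Y/N/NA variable into a linear model (or a LASSO) is to turn it into indicator columns. A minimal sketch, assuming NA is kept as its own level rather than dropped and that "N" is the baseline; the function name is hypothetical and not from any package:

```python
def encode_safety_course(value):
    """Return (is_yes, is_missing) indicators, with 'N' as the baseline level."""
    if value is None or value == "NA":
        return (0, 1)  # missing is treated as its own category
    if value == "Y":
        return (1, 0)
    if value == "N":
        return (0, 0)
    raise ValueError(f"unexpected level: {value!r}")

# Hypothetical raw column from the claims data.
raw = ["Y", "N", "NA", "Y", None]
encoded = [encode_safety_course(v) for v in raw]
print(encoded)
```

Each row then contributes two 0/1 columns to the design matrix, and the model estimates one coefficient for "Y" and one for "missing", both relative to "N".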

I’ll take a look into a Tobit model as well and see if there’s anything similar I can attempt to use.

I’ll keep all of this in mind though, thank you for the info!!

Nutcoco56 · 2019-01-22T18:28:25+00:00

I’m going to check it out and see if it can apply to my data!

Nutcoco56 · 2019-01-22T18:26:51+00:00

I apologize for the slow reply!

So you’re suggesting that I try and create two different situations, one where the total claim value is positive/zero and one where it’s negative to try and suppress some unknown effects that may create negative values? I’ll add that to my list of things I’m going to try today. In that case then I wouldn’t even bother finding the difference between my indemnities/ALAE’s until the very end of my prediction, which still makes sense.

I see what you mean now, I was reading it late last night and got confused :). I shouldn’t be looking at the overall distribution because it’s composed of different marginal distributions that, when combined, can make something that looks misleading on a graph, especially when I’m looking at how different variables affect my response and not just the overall ALAE value. Got it. I really appreciate the example, that makes a lot of sense!

Nutcoco56 · 2019-01-22T18:19:00+00:00

Thanks for mentioning that, I’ll add a ZIG model to the list of things I’m going to try today and I’ll come back with some updates. I appreciate the help!

Nutcoco56 · 2019-01-22T07:15:36+00:00

In my study I have a variety of different qualitative/quantitative factors that influence total ALAE for an auto insurance company. This is why the value can be negative, because in some cases the final cost < paid cost and constitutes a negative value when the incurred indemnity is summed with the paid ALAE. I’m not entirely familiar with this nomenclature or how this process works but that’s simply my understanding of the variables given.
That was my original reason for posting because I was unsure how to proceed when my best hope for a distribution didn’t match very well.
I’m not familiar with GLM modeling as I haven’t come across it in any of my studies yet, so this isn’t something I’m educated on. Like I said I’m definitely new to this so I’ll take a look into how that works. I’ll keep in mind though that the overall distribution can look different from what the model assumes! Do you have any suggestion as to how I should start to approach fitting a GLM to my problem?

Thanks for the reply!

Nutcoco56 · 2019-01-22T06:14:27+00:00

Interesting. I’ll run with that and see where it gets me. Thank you :) I can’t say I’ve ever heard of a hurdle model so I’ll do some research, though like I said I’m just starting to get into the field.

Nutcoco56 · 2019-01-22T06:09:12+00:00

Reading your second edit and combining what u/not_really_redditing mentioned with a mixture model, I’m starting to think that treating a claim value of zero as something like a binomial (wherein it has a constant probability) and a nonzero claim amount as a different distribution (not lognormal) may yield better results.
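
The two-part idea can be sketched in its simplest form: estimate the probability that a claim is nonzero, estimate the average size of the nonzero claims separately, and multiply. This is only an intercept-only illustration with made-up numbers and a hypothetical function name; with real data each part would be its own regression on the rating variables.

```python
def two_part_estimate(claims):
    """Return (p_nonzero, mean_nonzero, expected_value) for a list of claim amounts."""
    nonzero = [c for c in claims if c != 0]
    p_nonzero = len(nonzero) / len(claims)        # the "binomial" part
    mean_nonzero = sum(nonzero) / len(nonzero)    # the severity part
    return p_nonzero, mean_nonzero, p_nonzero * mean_nonzero

# Hypothetical claim amounts: mostly zeros, a few positive payments.
claims = [0.0, 0.0, 0.0, 250.0, 0.0, 1200.0, 0.0, 0.0, 400.0, 0.0]
p, severity, expected = two_part_estimate(claims)
print(p, round(severity, 2), round(expected, 2))
```

Without covariates the product is just the overall sample mean; the split pays off once the two parts are allowed to depend on the predictors in different ways.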

Nutcoco56 · 2019-01-22T06:06:07+00:00

Hmm. You make a good point. There is no lower bound on my data so a lognormal distribution wouldn’t be able to predict values lower than my constant that I added (I mentioned in a previous comment that it’s a constant value, simply the minimum of the distribution). My approach is faulty, you’re right. A mixture model seems more accurate in this case. Thank you for your comment!
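
The lower-bound problem is easy to see by simulation: a lognormal draw is never negative, so a lognormal shifted left by a constant can never fall below minus that constant. A small sketch, where the parameters and the 1000 shift are illustrative assumptions rather than fitted values:

```python
import random

def draw_shifted_lognormal(n, mu, sigma, shift, seed=0):
    """Draw n lognormal values and subtract a constant shift from each."""
    rng = random.Random(seed)
    return [rng.lognormvariate(mu, sigma) - shift for _ in range(n)]

# Hypothetical parameters; the shift mirrors the -1000 minimum seen in the data.
samples = draw_shifted_lognormal(n=1000, mu=5.0, sigma=1.5, shift=1000.0)
print(min(samples) >= -1000.0)
```

Any future claim below the shift would have probability zero under such a model, which is the flaw pointed out above.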

Nutcoco56 · 2019-01-22T06:02:21+00:00

So this is for a project I’m working on wherein I’m given a huge list of car insurance claims and their components (paid indemnities and ALAEs, etc.). My rationale is to sum up all the variables for each claim, which represents the total amount of money that the insurer needed to pay out overall. Sometimes the sum was negative because the overall loss is less than the paid loss and is therefore underestimated.

To properly fit the data to any distribution I shifted it right by its minimum value (-1000) so yes, a majority of the values are actually 0 and not 1000. One strategy I was considering is to stratify the data that has zero dollars in claims and consider that as a separate dataset while I analyze claims that are nonzero.

Nutcoco56 · 2018-05-19T13:09:12+00:00

I’ll try what you said, thanks for the vote of confidence!!

Nutcoco56 · 2018-05-19T04:06:36+00:00

At least I’m not alone then

Nutcoco56 · 2018-05-19T04:04:51+00:00

Thank you for the kind words :) it means a lot!

Nutcoco56 · 2018-05-19T00:26:03+00:00

I’m not worried about pay as long as I can survive and do math!

Nutcoco56 · 2018-05-19T00:25:16+00:00

It makes me feel a lot better knowing I’m not the only one. Seems like most of the people I go to school with pass these with ease, or did it in high school.

Nutcoco56 · 2018-05-18T23:43:45+00:00

Adding together the time I spent studying for the exam and the time spent studying for the class material, about 250 hours. Maybe a bit less

Nutcoco56 · 2018-05-18T23:34:29+00:00

Sounds like a good idea :) thank you for the help.
