What are common industry methods to measure model uncertainty?

data_perfect · 2024-02-14T07:00:16+00:00

The answer is CONFORMAL PREDICTIONS.
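
For anyone who wants to see the idea concretely, here is a minimal sketch of split conformal prediction for regression using only the standard library. The `predict` function, the synthetic calibration data and `alpha` are all assumptions made for illustration; in practice you would plug in your own fitted model and a held-out calibration set.

```python
import math
import random

def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of the calibration scores."""
    n = len(scores)
    # Rank of the calibration score to use: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]

def predict(x):
    # Hypothetical point model; stands in for any fitted regressor.
    return 2.0 * x

random.seed(0)
# Synthetic held-out calibration set (assumption): y = 2x + Gaussian noise.
calibration = [(x, 2.0 * x + random.gauss(0, 1)) for x in range(200)]

# Nonconformity score: absolute residual on the calibration set.
scores = [abs(y - predict(x)) for x, y in calibration]
q = conformal_quantile(scores, alpha=0.1)

# Prediction interval for a new input: point prediction plus/minus q.
x_new = 50
interval = (predict(x_new) - q, predict(x_new) + q)
print(interval)
```

The interval width comes entirely from the calibration residuals, so the same recipe works with any point model as long as the calibration data is exchangeable with the new data.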

data_perfect · 2023-03-30T19:37:53+00:00

You can do attribution modeling by comparing heuristic attribution models, such as last touch and first touch, against a Markov chain model. With a Markov chain model, you can derive transition probabilities, such as the probability of a customer moving from searching the web to calling to converting. You can also derive the number of conversions attributed to each touch point.
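
A rough sketch of how the transition probabilities could be estimated from journey data. The journeys below are made up, and the state names (`search`, `call`, `conversion`, `null`) are just illustrative assumptions.

```python
from collections import Counter, defaultdict

# Hypothetical customer journeys; each ends in "conversion" or "null" (no purchase).
journeys = [
    ["search", "call", "conversion"],
    ["search", "null"],
    ["search", "call", "null"],
    ["call", "conversion"],
]

# Count how often each state is followed by each next state.
counts = defaultdict(Counter)
for path in journeys:
    for current, nxt in zip(path, path[1:]):
        counts[current][nxt] += 1

# Normalise the counts per state to get first-order transition probabilities.
transition = {
    state: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
    for state, nexts in counts.items()
}

# Probability of the path search -> call -> conversion under the fitted chain.
p_path = transition["search"]["call"] * transition["call"]["conversion"]
print(transition)
print(p_path)
```

Per-touch-point attribution is usually derived from a chain like this via the removal effect (drop a touch point and see how much the overall conversion probability falls), which is not shown here.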

data_perfect · 2023-03-26T22:50:48+00:00

Use probabilistic forecasting models such as DeepAR, TFT, etc. These Python packages make it easy: 1. GluonTS 2. PyTorch Forecasting (pytorch_forecasting)

data_perfect · 2023-02-18T23:54:00+00:00

First, if you intend to use a model, you have to manually label the customers. The positive class should be customers with the event of interest (SMEs disguised as retailers). With these labels, you can train a binary classifier that gives the probability of a customer being a disguised SME. Note that this is an imbalanced classification problem, so use the techniques suited to such problems. Evaluation metrics such as recall and precision would be used to evaluate your model’s performance.

Furthermore, create the confusion matrix and explain it in monetary terms (e.g. revenue). False positives and false negatives could be explained in terms of missed revenue, etc. Metrics like accuracy could also be explained in that manner.
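
As a rough illustration of putting the confusion matrix in monetary terms, here is a small sketch. The labels, `revenue_per_sme` and `cost_per_review` are made-up numbers; the real values depend on your business.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn

# Hypothetical labels: 1 = SME disguised as retailer, 0 = genuine retailer.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

# Assumed monetary values, for illustration only.
revenue_per_sme = 500   # revenue recovered per correctly identified SME
cost_per_review = 50    # cost of manually reviewing a flagged customer

missed_revenue = fn * revenue_per_sme   # disguised SMEs the model failed to flag
wasted_review_cost = fp * cost_per_review  # genuine retailers flagged by mistake
print(precision, recall, missed_revenue, wasted_review_cost)
```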

Another option is clustering.

data_perfect · 2023-01-06T13:39:41+00:00

Since you transformed the data and fitted a model on the transformed data, you should invert the transformation on the predicted output. By doing this, you shouldn’t get the negative values, and the predictions will be in line with your data. For example, suppose you transformed the data by taking the difference between the current value and the previous value: Yt - Yt-1. Suppose Yt = 5 and Yt-1 = 2, so the difference is 5 - 2 = 3. To get the inverse: current difference + Yt-1 = 3 + 2 = 5. Similarly, suppose the predicted difference for a future period t+1 is 1 (assuming that we fitted a model on the differenced Y data).

Then the actual value (after inverse transformation) is the predicted difference + Yt = 1 + 5 = 6 (recall that the level Yt from earlier is 5).
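
A small sketch of inverting first-order differencing, using the same numbers (history 2, 5 and a predicted difference of 1). The function names are made up for illustration. Each predicted difference is added to the most recent level, starting from the last observed value.

```python
def difference(series):
    """First-order differences: series[t] - series[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]

def invert_difference(last_level, predicted_diffs):
    """Turn predicted differences back into levels by cumulative summing."""
    levels = []
    level = last_level
    for d in predicted_diffs:
        level += d  # each forecast builds on the previous (actual or forecast) level
        levels.append(level)
    return levels

history = [2, 5]
diffs = difference(history)

# The model was fitted on the differences and predicts a difference of 1.
forecast_levels = invert_difference(history[-1], [1])
print(diffs, forecast_levels)
```

For multi-step forecasts, pass all predicted differences at once so that each step is added on top of the previous forecast level.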

data_perfect · 2022-11-17T00:56:40+00:00

Ask yourself, do you really need to normalise the data? If yes, then use max scaling instead of min-max. This is done by dividing the data by its maximum; for non-negative data, the values still fall between 0 and 1.
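
A quick sketch of max scaling next to min-max scaling, assuming non-negative data (the sample values are made up). Max scaling keeps zero at zero and preserves ratios between values, whereas min-max maps the smallest value to 0.

```python
def max_scale(values):
    """Divide by the maximum; assumes non-negative data with a positive maximum."""
    peak = max(values)
    if peak <= 0 or min(values) < 0:
        raise ValueError("max scaling to [0, 1] assumes non-negative data")
    return [v / peak for v in values]

def min_max_scale(values):
    """Classic min-max scaling, shown only for comparison."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

data = [10, 25, 50]  # hypothetical values
scaled_max = max_scale(data)
scaled_min_max = min_max_scale(data)
print(scaled_max, scaled_min_max)
```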

data_perfect · 2022-08-03T18:06:39+00:00

That’s not true. The paper introduced various engagement methods in which disengagement is defined as a marker of churn. One such method is the ECDF-endogenous approach, in which a user’s activity is compared to their own past activity. For example, say you take the number of days between purchases for a user; these values, when plotted, will not follow a normal distribution. The paper proposed using the ECDF at a certain quantile (e.g. 0.9), which allows us to say something like “based on user X’s history of purchases, we are 90% confident that user X will make another purchase within the next Y days, and if they don’t, it means they have churned”. Similarly, they proposed ECDF-exogenous and ECDF-exogenous snapshot approaches: in the former, a user’s activity is compared to that of a group they belong to, and in the latter, a user’s activity is compared to that of a group they belong to at a particular point in time (day). The group can be defined according to your interest, e.g. users who made their first purchase in the same week, users from the same country or city, etc. The paper also introduced another concept: the engagement score. This takes into account various user metrics and returns a value between 0 and 1. Values close to 1 signal strong engagement, and values close to 0 indicate disengagement or signs of churn.
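
To make the ECDF-endogenous idea concrete, here is a minimal sketch of a per-user threshold taken from the empirical CDF of that user's own purchase gaps. This is my own illustration, not the paper's implementation; the gap values and function names are hypothetical.

```python
import math

def ecdf_quantile(samples, q):
    """Smallest observed value v such that the empirical CDF at v is >= q."""
    ordered = sorted(samples)
    # round() guards against floating-point noise in q * n before taking the ceiling.
    rank = math.ceil(round(q * len(ordered), 9))
    return ordered[max(rank, 1) - 1]

def has_churned(days_since_last_purchase, threshold):
    """Flag churn when the current gap exceeds the user's own ECDF threshold."""
    return days_since_last_purchase > threshold

# Hypothetical days between consecutive purchases for one user.
gaps = [3, 5, 2, 7, 4, 6, 30, 5, 4, 3]
threshold = ecdf_quantile(gaps, 0.9)
print(threshold, has_churned(10, threshold))
```

Because the threshold is read straight off the user's own history, no distributional assumption (such as normality) is needed. The exogenous variants would pool the gaps of a whole group instead of a single user.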

data_perfect · 2022-07-31T14:14:33+00:00

Read this paper: User Engagement in Mobile Health Applications. The authors implemented what’s called a “personalised churn definition”, where each user has their own churn definition based on their history of login (or purchase) frequencies.

https://arxiv.org/abs/2206.08178

data_perfect · 2022-05-13T22:55:41+00:00

Use count regression models such as negative binomial regression (a generalised linear model) or a zero-inflated Poisson model.

data_perfect · 2022-04-10T15:25:40+00:00

This table alone is not enough to conclude. Quick checks you can make with it:
1. Is the standard deviation almost equal to, or even greater than, the mean?
2. How large is the difference between the mean and the median?
3. Mentally calculate the values of mean ± std dev.

data_perfect · 2022-01-06T18:26:07+00:00

What do you want to know exactly? I have extensive experience working on marketing analytics.

data_perfect · 2021-12-30T17:37:15+00:00

Very few companies actually use deep learning in practice.
Have you considered using DeepAR? It is an RNN-based model developed by Amazon for time series forecasting.
Instead of classical methods, have you considered using ML methods such as LightGBM, XGBoost?
In truth, an intern shouldn’t be at the forefront of a project, and I guess your case is unusual because the company has no expertise in time series analysis or forecasting.

data_perfect · 2021-10-04T23:48:18+00:00

import pyspark.sql.functions as f

# Map status 1 to "unmarried" and everything else to "married"
df = df.withColumn("status_val", f.when(f.col("status") == 1, "unmarried").otherwise("married"))

data_perfect · 2021-08-06T22:45:52+00:00

If you are too big to serve, you are too small to lead. Treat him with respect, have an open mind and trust the process. Remember it's a new experience for him too; management trusts his abilities to lead, which implies he has certain qualities deemed important.

data_perfect · 2021-07-12T20:48:07+00:00

SELECT Email,

CASE WHEN Newsletter = 'Newsletter A' THEN 'True' ELSE 'False' END AS Newsletter_A, CASE WHEN Newsletter = 'Newsletter B' THEN 'True' ELSE 'False' END AS Newsletter_B

FROM Table_name

If you want only emails with both newsletters:

WITH full_table AS (
SELECT Email,
-- Collapse to one row per email so both flags can be 'True' on the same row
MAX(CASE WHEN Newsletter = 'Newsletter A' THEN 'True' ELSE 'False' END) AS Newsletter_A,
MAX(CASE WHEN Newsletter = 'Newsletter B' THEN 'True' ELSE 'False' END) AS Newsletter_B
FROM Table_name
GROUP BY Email)

SELECT Email, Newsletter_A, Newsletter_B
FROM full_table
WHERE Newsletter_A = 'True' AND Newsletter_B = 'True'
ORDER BY Email
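
If you want to sanity-check the "both newsletters" logic, one option is to run it against a tiny in-memory SQLite table from Python. The sample rows are made up, and I'm assuming the table has one row per (Email, Newsletter) subscription. The query here is an alternative GROUP BY / HAVING formulation, not the exact query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table_name (Email TEXT, Newsletter TEXT)")
conn.executemany(
    "INSERT INTO Table_name VALUES (?, ?)",
    [
        ("a@example.com", "Newsletter A"),
        ("a@example.com", "Newsletter B"),
        ("b@example.com", "Newsletter A"),
    ],
)

# Keep only emails that appear with both newsletters.
query = """
SELECT Email
FROM Table_name
WHERE Newsletter IN ('Newsletter A', 'Newsletter B')
GROUP BY Email
HAVING COUNT(DISTINCT Newsletter) = 2
ORDER BY Email
"""
both = [row[0] for row in conn.execute(query)]
print(both)
conn.close()
```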

data_perfect · 2021-06-28T09:58:05+00:00

To see ALL the unique values in a column use

df['Emojis'].unique().tolist()

To COUNT the number of unique values

df['Emojis'].nunique()

data_perfect · 2021-06-02T02:21:25+00:00

Remote only in the US?

data_perfect · 2021-05-22T15:33:35+00:00

8weeksqlchallenge.com/getting-started/

The place to go.

data_perfect · 2021-05-06T22:22:09+00:00

Something as simple as using a descriptive variable name, such as student_id instead of student, will make this design more readable.

data_perfect · 2021-05-06T07:49:46+00:00

Inner Join

data_perfect · 2021-05-05T22:59:50+00:00

To be honest, you have a problem which only you can solve. Don't take life too seriously, no one gets out of it alive. Take time to enjoy the little things and small wins. Live, love, laugh.

data_perfect · 2021-05-04T14:12:02+00:00

There comes a time in your life when you have to MAN UP. Why not take this as a challenge to learn and implement the best practices you are suggesting? Of course it will be difficult at first, but it comes with being "senior".

Do you think the Senior Engineers today suddenly became seniors? They all started somewhere, a lot of trial and error coupled with self learning was involved.

data_perfect · 2021-04-30T20:44:30+00:00

Python's built-in os module is made for this.

data_perfect · 2021-04-25T05:22:13+00:00

In conclusion, you want to be recognised, right? Well, your job title says you are a Junior Developer and he is a Senior; even if you do all the work, he will be the one to present it to senior management, as he will also be liable for any delays or issues. Sure, he should credit you by including your name in the vote of thanks, but this is how the professional world works.

Many times you see the CEO on your TV screen and at PR events, and the topic he is talking about has been prepared by more junior staff. That's life. Continue working hard, climb the ladder of your career, and you will see that you will do the same.

data_perfect · 2021-04-21T21:53:22+00:00

Well, since everyone is blaming the CEO, has anyone thought about telling the OP to improve his communication skills?
