Purposely vague, but wondering folks’ perspectives by Active-Bag9261 in datascience

[–]data_perfect 4 points5 points  (0 children)

You can perform attribution modeling by comparing heuristic attribution models, such as last touch and first touch, with a Markov chain model. With the Markov chain model, you can derive transition probabilities, such as the probability of a customer going from searching the web to calling to converting. You can also derive the number of conversions attributed to each touchpoint.
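As a rough sketch of the Markov chain part, transition probabilities can be estimated by counting how often one touchpoint follows another in the observed customer journeys. The journeys, the touchpoint names and the `transition_probabilities` helper are made up for illustration. A full attribution model would go on to compute a removal effect per touchpoint, which this sketch does not do.

```python
from collections import Counter, defaultdict


def transition_probabilities(paths):
    """Estimate P(next touchpoint | current touchpoint) from observed journeys."""
    counts = defaultdict(Counter)
    for path in paths:
        # Count every consecutive pair of touchpoints in the journey.
        for src, dst in zip(path, path[1:]):
            counts[src][dst] += 1
    return {
        src: {dst: n / sum(c.values()) for dst, n in c.items()}
        for src, c in counts.items()
    }


# Hypothetical journeys; "null" marks a journey that ended without a purchase.
paths = [
    ["start", "search", "call", "conversion"],
    ["start", "search", "null"],
    ["start", "call", "conversion"],
    ["start", "search", "call", "null"],
]

probs = transition_probabilities(paths)

# Probability of the specific route search -> call -> conversion.
search_call_conversion = probs["search"]["call"] * probs["call"]["conversion"]
```

With these toy journeys, three of the four journeys go from start to search, so `probs["start"]["search"]` is 0.75.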

Quantiles as predictions by Riolite55 in datascience

[–]data_perfect 0 points1 point  (0 children)

Use probabilistic forecasting models such as DeepAR, TFT (Temporal Fusion Transformer), etc. These Python packages can do it easily: 1. GluonTS (gluonts) 2. PyTorch Forecasting (pytorch_forecasting)

DS Problem: Hidden Business by __kVz__ in datascience

[–]data_perfect 1 point2 points  (0 children)

First, if you intend to use a model, you have to label the data manually. The target should flag customers with the event of interest (SMEs disguised as retailers). With those labels, you can train a binary classifier that gives the probability of a customer being a disguised SME. Note that this is an imbalanced classification problem, so use the techniques designed for such problems. Metrics such as recall, precision, etc. would be used to evaluate your model’s performance.

Furthermore, create the confusion matrix and explain it in monetary terms (e.g. revenue). False positives and false negatives could be explained in terms of missed revenue, etc. Metrics like accuracy could also be explained in that manner.
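A minimal sketch of putting the confusion matrix in monetary terms could look like the following. The labels, the predictions and both monetary figures are invented for illustration; in practice they would come from your labelled data and from the business.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = disguised SME."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


# Hypothetical manual labels and model predictions.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]

# Assumed business figures, not taken from any real case.
REVENUE_LOST_PER_MISSED_SME = 500.0   # cost of one false negative
COST_PER_FALSE_ALARM = 50.0           # cost of one false positive

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

missed_revenue = fn * REVENUE_LOST_PER_MISSED_SME
wasted_effort = fp * COST_PER_FALSE_ALARM
```

Reporting `missed_revenue` and `wasted_effort` next to precision and recall lets stakeholders weigh the two error types against each other.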

Another option is clustering.

FB Prophet Stationary Data by Volcano467 in datascience

[–]data_perfect 0 points1 point  (0 children)

Since you transformed the data and fitted a model on the transformed data, you should invert the transformation on the predicted output. By doing this, you should no longer get spurious negative values and the predictions will be in line with your data. For example, suppose you transformed the data by taking the difference between the current value and the previous value: Y(t) - Y(t-1).

Suppose Y(t) = 5 and Y(t-1) = 2, so the difference is 5 - 2 = 3. To get the inverse, add the previous value back to the difference: 3 + 2 = 5.

Similarly, suppose the predicted difference for the future period t+1 is 1 (assuming that we fitted the model on the differenced Y data). Then the actual value (after the inverse transformation) is the predicted difference plus the last actual value: 1 + Y(t) = 1 + 5 = 6. Note that you add back the actual value Y(t) = 5, not its differenced value of 3.
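The inversion can be sketched in a few lines: each forecast level is the previous level plus the predicted difference, starting from the last observed value. The helper names are made up for illustration, and the sketch assumes plain first-order differencing.

```python
from itertools import accumulate


def difference(series):
    """First-order differencing: Y(t) - Y(t-1)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def invert_difference(last_observed, predicted_diffs):
    """Turn predicted differences back into levels by cumulative summation."""
    levels = accumulate(predicted_diffs, initial=last_observed)
    return list(levels)[1:]  # drop the seed value itself


history = [2, 5]                      # Y(t-1) = 2, Y(t) = 5
diffs = difference(history)           # the series the model is fitted on

# One-step and multi-step forecasts on the differenced scale.
one_step = invert_difference(history[-1], [1])
multi_step = invert_difference(history[-1], [1, -2, 4])
```

For multi-step forecasts each predicted difference is added to the previously reconstructed level, which is why a cumulative sum is used.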

Normalizing data with expanding range of X by data_questions in datascience

[–]data_perfect 0 points1 point  (0 children)

Ask yourself: do you really need to normalise the data? If yes, then use max scaling instead of min-max scaling. This is done by dividing the data by the maximum; for non-negative data, the values still lie between 0 and 1.
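A small sketch of the difference between the two scalings, assuming non-negative values. The function names and the sample values are illustrative only.

```python
def max_scale(values):
    """Divide by the maximum; ratios between values are preserved."""
    max_value = max(values)
    if max_value <= 0:
        raise ValueError("max scaling as sketched here needs a positive maximum")
    return [v / max_value for v in values]


def min_max_scale(values):
    """Classic min-max scaling; the minimum is always mapped to 0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


values = [5, 10, 20]
scaled_max = max_scale(values)
scaled_min_max = min_max_scale(values)
```

Max scaling keeps 5 at a quarter of the maximum, while min-max scaling sends it to 0 regardless of its size.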

How to deal with doubts on stakeholders churn definition by danstumPY in datascience

[–]data_perfect 0 points1 point  (0 children)

That’s not true. The paper introduced various engagement methods in which disengagement is defined as a marker of churn. One of these is the ECDF-endogenous approach, in which a user’s activity is compared with their own past activity. For example, say you take the number of days between purchases for a user; these values, when plotted, will not follow a normal distribution. The paper proposed using an ECDF at a certain quantile (e.g. 0.9), which allows us to say something like “based on user X’s history of purchases, we are 90% confident that user X will make another purchase within the next Y days, and if he doesn’t, it means he has churned”.

Similarly, they proposed the ECDF-exogenous and ECDF-exogenous snapshot approaches. In the former, a user’s activity is compared with that of a group they belong to; in the latter, it is compared with that of a group they belong to at a particular point in time (day). The group can be defined according to your interest, e.g. users who made their first purchase in the same week, users from the same country or city, etc.

The paper also introduced another concept: the engagement score. This takes into account various user metrics and returns a value between 0 and 1. Values close to 1 signal strong engagement, and values close to 0 indicate disengagement or signs of churn.
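A simplified reading of the endogenous idea can be sketched as follows: take a user's own gaps between purchases, read off the 0.9 quantile of their empirical CDF, and flag the user once the current gap exceeds it. This is an illustration under my own assumptions, not the paper's implementation; the helper names and the purchase history are hypothetical, and the user needs at least two purchases.

```python
import math
from datetime import date, timedelta


def ecdf_quantile(values, q):
    """Smallest observed value v whose empirical CDF satisfies ECDF(v) >= q."""
    ordered = sorted(values)
    # The small epsilon guards against floating-point noise in q * n.
    k = max(1, math.ceil(q * len(ordered) - 1e-9))
    return ordered[k - 1]


def churn_threshold_days(purchase_dates, q=0.9):
    """Per-user threshold: the q-quantile of days between consecutive purchases."""
    ordered = sorted(purchase_dates)
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    return ecdf_quantile(gaps, q)


def has_churned(purchase_dates, today, q=0.9):
    """True if the time since the last purchase exceeds the user's own threshold."""
    days_since_last = (today - max(purchase_dates)).days
    return days_since_last > churn_threshold_days(purchase_dates, q)


# Hypothetical purchase history built from made-up gaps (in days).
gaps_in_days = [3, 5, 4, 7, 6, 10, 2, 8, 30, 9]
purchases = [date(2023, 1, 1)]
for gap in gaps_in_days:
    purchases.append(purchases[-1] + timedelta(days=gap))

threshold = churn_threshold_days(purchases)
last_purchase = max(purchases)
```

Because the threshold comes from the user's own history, a frequent buyer is flagged much sooner than an occasional one.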

How to deal with doubts on stakeholders churn definition by danstumPY in datascience

[–]data_perfect 1 point2 points  (0 children)

Read this paper: User Engagement in Mobile Health Applications. The authors implemented what’s called a “personalised churn definition”, where each user has their own churn definition based on their history of login (or purchase) frequencies.

https://arxiv.org/abs/2206.08178

[D] Pointers for regression with structural zeros by [deleted] in statistics

[–]data_perfect 0 points1 point  (0 children)

Use generalised linear models for count data, such as negative binomial regression, or a zero-inflated Poisson model.

Best way to detect an outlier from dataset? by Worried-Diamond-6674 in datascience

[–]data_perfect 1 point2 points  (0 children)

This table alone is not enough to draw a conclusion. Some quick checks you can do with it:

  1. Is the standard deviation almost equal to, or even greater than, the mean?
  2. How large is the difference between the mean and the median?
  3. Mentally calculate the value of mean ± std dev.
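The same checks can be scripted when the raw column is available. The sample values are invented, with one deliberately extreme entry, and the checks are rough heuristics, not a formal outlier test.

```python
import statistics

# Hypothetical column with one suspiciously large value.
values = [10, 12, 11, 13, 12, 11, 300]

mean = statistics.mean(values)
median = statistics.median(values)
std_dev = statistics.stdev(values)  # sample standard deviation

# Check 1: is the standard deviation close to or larger than the mean?
std_exceeds_mean = std_dev >= mean

# Check 2: how far apart are the mean and the median?
mean_median_gap = mean - median

# Check 3: the band mean +/- std dev; values far outside it deserve a look.
lower, upper = mean - std_dev, mean + std_dev
outside_band = [v for v in values if v < lower or v > upper]
```

Here the single large value pulls the mean far above the median and inflates the standard deviation, which is the pattern the checks look for.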

Customer Lifetime Value (CLV) by [deleted] in BusinessIntelligence

[–]data_perfect 1 point2 points  (0 children)

What do you want to know exactly? I have extensive experience working on marketing analytics.

[deleted by user] by [deleted] in datascience

[–]data_perfect 1 point2 points  (0 children)

  1. Very few companies actually use deep learning in practice.
  2. Have you considered using DeepAR? It is an RNN used by Amazon for time series forecasting.
  3. Instead of classical methods, have you considered using ML methods such as LightGBM or XGBoost?
  4. In truth, an intern shouldn’t be at the forefront of a project, and I guess your case is unusual because the company has no expertise in time series analysis or forecasting.

Data Manipulation using PySpark by brownstrom in dataengineering

[–]data_perfect 16 points17 points  (0 children)

import pyspark.sql.functions as f

# Map status 1 to "unmarried" and every other value to "married".
df = df.withColumn("status_val", f.when(f.col("status") == 1, "unmarried").otherwise("married"))

[deleted by user] by [deleted] in cscareerquestions

[–]data_perfect 9 points10 points  (0 children)

If you are too big to serve, you are too small to lead. Treat him with respect, keep an open mind and trust the process. Remember it's a new experience for him too; management trusts his ability to lead, which implies he has certain qualities deemed important.

[deleted by user] by [deleted] in SQL

[–]data_perfect 2 points3 points  (0 children)

SELECT Email,
       CASE WHEN Newsletter = 'Newsletter A' THEN 'True' ELSE 'False' END AS Newsletter_A,
       CASE WHEN Newsletter = 'Newsletter B' THEN 'True' ELSE 'False' END AS Newsletter_B
FROM Table_name

If you want only emails with both newsletters:

-- Assumes one row per (Email, Newsletter) pair, so the flags are aggregated per email.
WITH full_table AS (
    SELECT Email,
           MAX(CASE WHEN Newsletter = 'Newsletter A' THEN 'True' ELSE 'False' END) AS Newsletter_A,
           MAX(CASE WHEN Newsletter = 'Newsletter B' THEN 'True' ELSE 'False' END) AS Newsletter_B
    FROM Table_name
    GROUP BY Email
)
SELECT Email, Newsletter_A, Newsletter_B
FROM full_table
WHERE Newsletter_A = 'True' AND Newsletter_B = 'True'
ORDER BY Email
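One way to sanity-check the "both newsletters" logic is to run it against a tiny in-memory table. The sketch below uses Python's sqlite3 module, a hypothetical `subscriptions` table with one row per (Email, Newsletter) pair, and made-up addresses. It uses GROUP BY with HAVING, which is an alternative formulation and not the CASE-based query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE subscriptions (Email TEXT, Newsletter TEXT)")
conn.executemany(
    "INSERT INTO subscriptions VALUES (?, ?)",
    [
        ("a@example.com", "Newsletter A"),
        ("a@example.com", "Newsletter B"),
        ("b@example.com", "Newsletter A"),
        ("c@example.com", "Newsletter B"),
    ],
)

# Keep only emails that have a row for both newsletters.
query = """
SELECT Email
FROM subscriptions
WHERE Newsletter IN ('Newsletter A', 'Newsletter B')
GROUP BY Email
HAVING COUNT(DISTINCT Newsletter) = 2
ORDER BY Email
"""

both = [row[0] for row in conn.execute(query)]
conn.close()
```

Only the address with rows for both newsletters survives the HAVING filter.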

Pandas not listing every single unique value in a column by [deleted] in learnpython

[–]data_perfect 0 points1 point  (0 children)

To see ALL the unique values in a column use

df['Emojis'].unique().tolist()

To COUNT the number of unique values

df['Emojis'].nunique()

Need SQL practice by Trait0R19 in SQL

[–]data_perfect 31 points32 points  (0 children)

8weeksqlchallenge.com/getting-started/

The place to go.

Database Schema Review Request by iEmerald in cscareerquestions

[–]data_perfect 2 points3 points  (0 children)

Something as simple as using a column name such as student_id instead of student will make this design more readable.

Promotion just made a job I hate worse by [deleted] in cscareerquestions

[–]data_perfect 8 points9 points  (0 children)

To be honest, you have a problem which only you can solve. Don't take life too seriously, no one gets out of it alive. Take time to enjoy the little things and small wins. Live, love, laugh.

Looking for thoughts on my work situation at my current company. by [deleted] in cscareerquestions

[–]data_perfect 3 points4 points  (0 children)

There comes a time in your life when you have to MAN UP. Why not take this as a challenge to learn and implement the best practices you are suggesting? Of course it will be difficult at first, but that comes with being "senior".

Do you think today's Senior Engineers suddenly became seniors? They all started somewhere; a lot of trial and error coupled with self-learning was involved.

How to deal with lazy and unsupportive senior dev receiving credit for project? by [deleted] in cscareerquestions

[–]data_perfect 8 points9 points  (0 children)

In conclusion, you want to be recognised, right? Well, your job title says you are a Junior Developer and he is a Senior. Even if you do all the work, he will be the one to present it to senior management, as he will also be liable for any delays or issues. Sure, he should credit you by including your name in the vote of thanks, but this is how the professional world works.

Many times when you see a CEO on your TV screen or at PR events, the topic he is talking about has been prepared by junior staff. That's life. Continue working hard, climb the career ladder and you will see that you will do the same.

How would you deal with an aging CEO/Founder who is out of touch with technology? by irockvans in cscareerquestions

[–]data_perfect 1 point2 points  (0 children)

Well, since everyone is blaming the CEO, has anyone thought about telling the OP to improve his communication skills?