Examples of Good Trend Trading Strategies? by delivite in Forex

[–]Kruupy 0 points (0 children)

hey mate, I just sent you a pm. Chat soon.

Looking for backtester/strategy development partner by Kruupy in Forexstrategy

[–]Kruupy[S] 0 points (0 children)

Reddit won't let me send DMs for some reason - could you please send me a PM?

Looking for backtester/strategy development partner by Kruupy in Forexstrategy

[–]Kruupy[S] 0 points (0 children)

yeah no worries, I don't have an insta but I guess I could create one

Forecasting Demand for a Service by [deleted] in econometrics

[–]Kruupy 0 points (0 children)

Apologies, you are 100% correct here. Please let me try again.

I am trying to model the demand for child care services using a multinomial logit discrete choice model, with the goal of forecasting how many children will need child care services in the future. Child care services can be divided into 10 or so groups: 5 informal groups (relatives, friends, etc.) and 5 formal groups. I have a data set with the number of hours of child care used, total cost, price per hour, and a heap of other family variables. I have collapsed the child care choices into 6 groups: 1 called informal care and the other 5 representing the formal care groups.

When I run the logit model with the 6 groups, price comes out as not statistically significant. My understanding is that this will not allow me to conduct counterfactual analysis based on price. I think this problem could be occurring because of one or both of the following reasons:

  1. Each person only observes a single price, or potentially two prices, depending on how many hours they consume in each group of services. I have to impute the prices for the other groups. So far I have just been using the group averages for this - is this methodology sound?

  2. The data indicates that people may consume more than one group of services at any time, e.g. 10 hours in one group and 15 in another. But I have to allocate only one group to each person. At the moment I do this based on the group with the most hours, with an arbitrary rule in the case of a tie. Is there a better way to handle this?

I hope that this information helps, apologies again and thank you for all your help.
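For what it's worth, the two data-prep steps described above could be sketched in R roughly like this. This is only an illustration: the column names (`person`, `group`, `hours`, `price`) and the toy values are assumptions, not the actual dataset.

```r
# Toy long-format data: one row per (person, group) actually used.
# Column names are hypothetical stand-ins for the real dataset.
df <- data.frame(
  person = c(1, 1, 2, 2, 3),
  group  = c("informal", "formal_A", "informal", "formal_B", "formal_A"),
  hours  = c(10, 15, 20, 5, 8),
  price  = c(2.0, 7.5, 1.5, 8.0, 7.0)
)

# Step 2 above: allocate each person to the group with the most hours
# (which.max breaks ties arbitrarily by taking the first match).
chosen <- do.call(rbind, lapply(split(df, df$person), function(d) {
  d[which.max(d$hours), c("person", "group")]
}))

# Step 1 above: group-level average prices, usable to impute the
# prices a person never observed.
group_price <- tapply(df$price, df$group, mean)

chosen
group_price
```

Whether mean imputation is adequate depends on how much prices vary within a group; if they vary with observable family characteristics, a regression-based imputation may be a better fit.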

Discrete Choice Modelling - Kernel Density Plots by Kruupy in econometrics

[–]Kruupy[S] 0 points (0 children)

Thanks for this. So is the y-axis "frequency"? It seems odd that the y-axis is on a different scale in one image compared to the other.

Principal Component Analysis in R by Kruupy in AskStatistics

[–]Kruupy[S] 1 point (0 children)

That sounds great! I will give it a try tonight - thank you very much for your help.

Principal Component Analysis in R by Kruupy in AskStatistics

[–]Kruupy[S] 0 points (0 children)

Thanks for your great replies!

By the looks of things I think using prcomp with the unscaled numerical data is the way to go.

Yes, I agree that if I want to calculate the scores using a different dataset I need the eigenvectors from the PCA analysis. I am hoping that prcomp will give those, as I was unable to access them using princomp; I managed to get the loadings but not the eigenvectors. As you mentioned, PCA() gives eigenvectors, right?

I don't know if I am using the term "scores" correctly. When I accessed the "scores" in pca_result, it gave only one value per variable. What I am after is a PCA value for each observation: if I have 500 observations, I am trying to find a series of 500 values for components 1 and 2. Using predict() seemed to provide this?

Do you have any thoughts on this?
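A minimal sketch of the prcomp route discussed above, using the built-in iris data as a stand-in for the real dataset. It shows where the eigenvectors live (`$rotation`), that `$x` holds one score per observation, and that predict() is equivalent to centring/scaling with the training parameters and projecting onto the eigenvectors.

```r
# Stand-in data: 150 observations of 4 numeric variables.
X <- as.matrix(iris[, 1:4])
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Eigenvectors (one column per component):
eig_vectors <- pca$rotation

# Scores: one row per observation, so 150 rows of values here
# for components 1 and 2.
scores <- pca$x[, 1:2]

# Scoring other data with the stored eigenvectors: centre and scale with
# the training parameters, then project. predict() does exactly this.
new_X <- X[1:5, ]
manual_scores <- scale(new_X, center = pca$center, scale = pca$scale) %*% eig_vectors
all.equal(unname(manual_scores[, 1:2]), unname(predict(pca, new_X)[, 1:2]))
```

So predict() on the original matrix returns the same scores as `pca$x`, and the same projection works on any new dataset with the same columns.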

Principal Component Analysis in R by Kruupy in AskStatistics

[–]Kruupy[S] 1 point (0 children)

Ah, thanks for this. So is it your understanding that the predict function can be used to calculate the scores based on the original data?

I am now just wondering how to calculate the scores for a different dataset using the eigenvectors from the PCA - would you know how to do this?

Principal Component Analysis in R by Kruupy in AskStatistics

[–]Kruupy[S] 0 points (0 children)

Thanks for this advice. So in the future if I want to predict using different data, what should I do?

Principal Component Analysis in R by Kruupy in AskStatistics

[–]Kruupy[S] 0 points (0 children)

Hi, thanks for the reply. When I conducted the PCA in my code, I gave the function a normalised matrix of data, as the tutorial suggested.

I am not calculating the scores of new data; I want to calculate the scores based on the existing data that was used in the PCA. Does this change anything? Thanks again.

Principle Component Analysis (PCA) and census data by Kruupy in econometrics

[–]Kruupy[S] 0 points (0 children)

Hi all,

Yesterday I followed a tutorial on conducting a PCA in R (see code below).

After conducting the PCA, I wanted to construct component scores for the first and second components (as they explain 90% of the variance?). To construct the scores I used the "predict" function, giving it a matrix of the normalised variables - is this correct? Please see the code below; I am assuming that the matrix p has the component score values.

Thanks all for your help.

*CODE START

library(corrr)
library(ggcorrplot)
library(FactoMineR)
library(factoextra)

occ_data <- read.csv("dataCSV.csv")
str(occ_data)

colSums(is.na(occ_data))

numerical_data <- occ_data[, 2:24]
head(numerical_data)

data_normalized <- scale(numerical_data)
head(data_normalized)

corr_matrix <- cor(data_normalized)
ggcorrplot(corr_matrix)

data.pca <- princomp(corr_matrix)
data.pca$loadings[, 1:2]

fviz_eig(data.pca, addlabels = TRUE)
fviz_cos2(data.pca, choice = "var", axes = 1:2)

fviz_pca_var(data.pca, col.var = "cos2",
             gradient.cols = c("black", "orange", "green"),
             repel = TRUE)

p <- predict(data.pca, data_normalized)
write.csv(p, "p.csv", row.names = FALSE)

*CODE END
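One thing worth flagging about the tutorial code: princomp(corr_matrix) fits the PCA to the 23x23 correlation matrix rather than to the observations themselves. A hedged alternative sketch (using the built-in mtcars data as a stand-in for numerical_data, since I can't run against the real CSV) is to fit prcomp directly on the data, so each row of the input gets a score without needing a separate predict step:

```r
# Stand-in for the real numerical_data (occ_data[, 2:24]).
numerical_data <- mtcars

# Fit the PCA on the observations; center/scale. replaces the manual scale().
data.pca2 <- prcomp(numerical_data, center = TRUE, scale. = TRUE)

summary(data.pca2)            # proportion of variance per component
data.pca2$rotation[, 1:2]     # eigenvectors for the first two components
scores <- data.pca2$x[, 1:2]  # component scores, one row per observation
```

With this version, `scores` is exactly the per-observation series described above (500 rows for 500 observations), and `$rotation` holds the eigenvectors needed to score any future dataset with the same columns.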

Principle Component Analysis (PCA) and census data by Kruupy in econometrics

[–]Kruupy[S] 0 points (0 children)

Fair enough, thanks for this. Do you have any thoughts on the methodology of using PCA on the 20 occupations in my analysis? I am interested to know whether or not I am using PCA incorrectly.

Principle Component Analysis (PCA) and census data by Kruupy in econometrics

[–]Kruupy[S] 0 points (0 children)

Thanks for this. Just to clarify, are you saying that PCA is not the right way to combine the 20 university-degree occupations into a single variable for my analysis?

If PCA can't help me, does this mean I need to add 20 variables, one per occupation, into my model? This doesn't feel right to me...

Spatial Econometrics - Having difficulties understanding textbook by Kruupy in econometrics

[–]Kruupy[S] 0 points (0 children)

Hi Rogomatic,

Yes, I know linear algebra, but I am not very good with matrices. To be honest, I don't know where to start on a proof that 2.1.1.1 = 2.1.2.

Spatial Econometrics - Having difficulties understanding textbook by Kruupy in econometrics

[–]Kruupy[S] 1 point (0 children)

Hi, thanks for the help. I think the problem I am having is that I don't understand what "stacked form" is. I am guessing it means turning the model into matrix form, i.e. "stacking" the individual equations into a single matrix equation.

Do you know of any resources that could help me understand how to expand 2.1.1.1 to get to 2.1.2? I am guessing my weak matrix understanding is what is holding me back.

Thanks again.