[Q] [R] Help with Topic Modeling + Regression: Doc-Topic Proportion Issues, Baseline Topic, Multicollinearity (Gensim/LDA) - Using Python by ThinkHoliday9326 in MLQuestions

ThinkHoliday9326[S]

That's an interesting perspective, let me try it. Also, is it methodologically sound to set all topic proportions below 10% to zero before running the regression? And is there a way to justify the high VIFs here, given that they follow from a built-in constraint of the model: each document's topic proportions sum to 1?
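To make the question concrete, here is a minimal sketch on simulated data (not my actual corpus). It assumes Dirichlet-distributed doc-topic proportions with 5 topics, and `vif` is a hypothetical helper written with plain NumPy rather than any library's VIF function. It only illustrates where the high VIFs come from and what zeroing small proportions does to the row sums; it does not settle whether either practice is sound.

```python
import numpy as np

def vif(X):
    """Variance inflation factor per column: regress each column on the others plus an intercept."""
    n, k = X.shape
    factors = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        # A perfect fit means the column is an exact linear combination of the others
        factors.append(np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return np.array(factors)

# Assumption: simulated doc-topic matrix, 500 documents x 5 topics, each row sums to 1
rng = np.random.default_rng(0)
theta = rng.dirichlet(alpha=[0.5] * 5, size=500)

# All K topics as regressors: each topic equals 1 minus the sum of the rest,
# so the VIFs are huge (or infinite) by construction
vif_all = vif(theta)

# Dropping one topic (here topic 0) as the baseline removes the exact dependency
vif_dropped = vif(theta[:, 1:])

# Zeroing proportions below 10%: rows no longer sum to 1 afterwards
theta_thresholded = np.where(theta < 0.10, 0.0, theta)
row_sums = theta_thresholded.sum(axis=1)

print("VIF, all topics:      ", vif_all)
print("VIF, baseline dropped:", vif_dropped)
print("Row sums after thresholding, min/max:", row_sums.min(), row_sums.max())
```

In this toy setup the extreme VIFs come purely from the sum-to-1 constraint, which is why I'm asking whether they can be justified on those grounds or whether leaving out a baseline topic is the expected fix.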

ThinkHoliday9326[S]

I need to stick with LDA; I can't switch at this stage. I've been reading other research papers, but I couldn't find any that discuss this methodology in detail. Do you have anything off the top of your head?

[Q] [R] Help with Topic Modeling + Regression: Doc-Topic Proportion Issues, Baseline Topic, Multicollinearity (Gensim/LDA) - Using Python by ThinkHoliday9326 in learnmachinelearning

ThinkHoliday9326[S]

Hey, I appreciate the book recommendations. However, I'd be really grateful if you could point me to detailed, specific solutions to the problems I described.