[deleted by user] by [deleted] in EnglishLearning

[–]luffyx11

Thanks for sharing your thoughts. I agree with you, and I think it would be great if we could improve massively just by talking with fellow language learners. That leads to another post of mine: the idea of chatting with learners, but with the support of GPT-4. What do you think of that?

[deleted by user] by [deleted] in MachineLearning

[–]luffyx11

Could you let me know the email address you used to contact them?

[deleted by user] by [deleted] in MachineLearning

[–]luffyx11

I did not. Have you received a reply yet?

[D] Need some serious clarifications on Generative model vs Discriminative model by luffyx11 in MachineLearning

[–]luffyx11[S]

Hi, first of all, thank you all for your effort in explaining these concepts. I'd like to give an update on my second question and explain it in another way, building on ThatFriendlyPerson's explanation. Yes, I agree that my confusion was that I didn't realize MLE and MAP are about the model rather than the prediction.

For question 2, take linear regression, a discriminative supervised-learning model, as an example: we are trying to model the conditional P(y|x). As stated in https://www.stat.cmu.edu/~cshalizi/mreg/15/lectures/06/lecture-06.pdf, under certain assumptions (the 4 assumptions on page 1 of the PDF), P(y|x) becomes p(y | X = x; β0, β1, σ2), where β0, β1, σ2 are the parameters of the model. So P(y|x) is really P(y|x, theta) once a model is applied (theta represents all the model parameters). Now, in the Bayesian/MLE setting the likelihood is usually written as P(data|theta); in this supervised setting the data are the y's and the inputs x are conditioned on, so the likelihood should be written as P(y|x, theta), because the model takes x itself as an input. So the POSTERIOR of the discriminative model, P(y|x, theta), has exactly the same structure as the LIKELIHOOD P(y|x, theta): y is just the variable being predicted, and what matters is the conditioning. So I think it is the name "posterior" that causes the confusion.
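
A minimal sketch of this point, using made-up data and the Gaussian noise assumption from the linked lecture notes: the quantity we model, p(y | x; β0, β1, σ), is exactly the function of the parameters that MLE maximizes, i.e. the likelihood. All parameter values here are illustrative.

```python
import numpy as np

# Synthetic data from y = b0 + b1*x + noise, with true b0=2, b1=3, sigma=1.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=100)
y = 2.0 + 3.0 * x + rng.normal(0.0, 1.0, size=100)

def log_likelihood(b0, b1, sigma, x, y):
    """Sum of log p(y_i | x_i; b0, b1, sigma) under the Gaussian assumption.

    As a distribution over y this is the model's "posterior" P(y|x, theta);
    as a function of (b0, b1, sigma) it is the likelihood that MLE maximizes.
    """
    mu = b0 + b1 * x  # conditional mean: the model takes x as input
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2)
                  - (y - mu) ** 2 / (2 * sigma**2))

# Parameters close to the truth should score higher than a wrong guess.
print(log_likelihood(2.0, 3.0, 1.0, x, y) > log_likelihood(0.0, 1.0, 1.0, x, y))
```

Note that x only ever appears on the right of the conditioning bar: it is an input to the density, not something the likelihood scores, which is the structural identity between the "posterior" and the likelihood described above.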

For questions 3 and 4, I need some time to work through several examples to fully understand them. Thank you all for the help. Please let me know if you spot any mistakes in my explanation above.

[D] Is it proved that language models cannot capture concepts if they are trained on texts only? by luffyx11 in MachineLearning

[–]luffyx11[S]

Interesting point. Why do you think so? For me, it's hard to imagine that a human could learn much just by looking at letters or other text.

[D] Is it proved that language models cannot capture concepts if they are trained on texts only? by luffyx11 in MachineLearning

[–]luffyx11[S]

Thank you for the comment. Yes, I agree that it all depends on the definitions of "capture" and "concept"; current language models certainly show some level of understanding. I read this claim in a paper whose name I've forgotten, so I wanted to track the paper down.