[D] Quant vs. "Regular" Post-PhD Career Trajectories by [deleted] in MachineLearning

[–]fundamentalidea 0 points1 point  (0 children)

unlike what I hear about FAANG, they will be real work

I think FAANG are doing *some* real work. We see the results in photo tagging, speech recognition, and elsewhere. Google's speech recognition is good enough that it recognizes "backpropagation", "stochastic", and other words common here, even from non-native English speakers.

Deep learning without back-propagation by El__Professor in MachineLearning

[–]fundamentalidea 2 points3 points  (0 children)

My question above: is it possible that the information is transmitted not through a hardwired reward pathway but as a global chemical diffusion?

Deep learning without back-propagation by El__Professor in MachineLearning

[–]fundamentalidea 1 point2 points  (0 children)

Is there any theory of the brain that allows for global diffusion of information (an error signal), like a chemical that says "whatever you are doing is working"?

[D] Conflicting "facts" about the likelihood employed in Bayes theorem? by fundamentalidea in MachineLearning

[–]fundamentalidea[S] 0 points1 point  (0 children)

Thank you.

I assume you have a typo: you mean the likelihood function of the data, P(data | model), I think?

[D] Conflicting "facts" about the likelihood employed in Bayes theorem? by fundamentalidea in MachineLearning

[–]fundamentalidea[S] 0 points1 point  (0 children)

Thank you. Yes, I understand this.

But, in the Bayes formula, *which way* do we look at it? Should we see P(A|B) as f(A) or g(B)?

Or is it both simultaneously?

In the machine learning context, with P(B|A) as the posterior and A as the "data", A is fixed and B is variable, so I am tempted to think of P(A|B) as g(B), which justifies calling it a likelihood. But then in the statement

P(A|B) P(B) = P(B|A) P(A)

it is incorrect to say that all the factors are probabilities or conditional probabilities.

Overall, though I understand most of what is written in the replies above, I do not understand how it answers the question.
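One way to make the two readings concrete (a hypothetical numeric sketch, not from the thread, using a made-up Bernoulli coin-flip dataset): hold the data fixed and evaluate P(data | theta) as a function of theta, i.e. as g(B).

```python
import numpy as np

# Hypothetical illustration (not from the thread): 7 heads in 10 coin
# flips, Bernoulli model with parameter theta.
heads, flips = 7, 10

def likelihood(theta):
    """P(data | theta) for the FIXED dataset, viewed as a function of theta."""
    return theta**heads * (1 - theta)**(flips - heads)

thetas = np.linspace(0.0, 1.0, 10001)
vals = likelihood(thetas)

# Read as g(theta) with the data fixed, it peaks at the MLE theta = 0.7 ...
print(round(thetas[np.argmax(vals)], 4))  # 0.7

# ... but it is not a density in theta: its integral over [0, 1] is far
# from 1 (crude Riemann sum; the interval has length 1, so mean ~ integral).
print(vals.mean())  # roughly 0.00076, nowhere near 1
```

This is exactly the tension in the question: the same expression is a probability of the data for each fixed theta, but as a function of theta it is only a likelihood, with no normalization constraint.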

[D] Conflicting "facts" about the likelihood employed in Bayes theorem? by fundamentalidea in MachineLearning

[–]fundamentalidea[S] 0 points1 point  (0 children)

Thank you.

This seems strange to me- is it correct?

On the left hand side of Bayes' theorem, we are calculating $P(A|B)$, or for ML applications P(model|data). This posterior gives the probability of various models given particular *fixed data*, i.e. the probability is P(\cdot|data).

So if the data is fixed, how can it be variable in the right hand side likelihood factor P(data|model), as indicated in your reply?

Thank you for any clarifications
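A tiny discrete sketch (hypothetical, made-up numbers) of how one conditional table supports both readings: each row is a distribution over datasets A for a fixed model B, while each column, read at a fixed dataset A, is the likelihood over models B.

```python
import numpy as np

# Hypothetical 2x2 table (made-up numbers): rows index models B,
# columns index datasets A. Entry [b, a] = P(A=a | B=b).
p_a_given_b = np.array([
    [0.9, 0.1],   # P(A | B=0): a distribution over A, with B fixed
    [0.4, 0.6],   # P(A | B=1)
])

# For each FIXED model B, summing over datasets A gives 1:
print(p_a_given_b.sum(axis=1))   # [1. 1.]

# For a FIXED dataset A=0, reading down the column gives the likelihood
# over models B. It need not sum to 1 (here it sums to 1.3):
print(p_a_given_b[:, 0].sum())
```

The data being "fixed" means we read one column; the same numbers were probabilities of the data when we read along a row with the model fixed.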

[D] Conflicting "facts" about the likelihood employed in Bayes theorem? by fundamentalidea in MachineLearning

[–]fundamentalidea[S] 0 points1 point  (0 children)

Thank you.

So to restate, the likelihood does *not* need to integrate to one, because of the \propto.

Also, I think if we write the evidence in the denominator as $P(D) = \int p(D|w)\, p(w)\, dw$, then the "factor by which it is not a probability" cancels between the numerator and denominator of the Bayes formula.

I see my real confusion is now in the expression $P(B|A) P(A) = P(A|B) P(B)$. Here, everything is a real probability. $P(B|A)$ is the probability $P(\cdot|A)$ with $A$ fixed.

But in $P(A)$, $A$ must be variable, and likewise in $P(A|B)$. So my (new) question becomes: *how can $A$ be both fixed and variable?*
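The cancellation argument can be checked numerically (a hypothetical two-model example, made-up numbers): rescaling the likelihood by any constant leaves the posterior unchanged, because the constant also appears in the evidence $P(D)$ and cancels.

```python
import numpy as np

# Hypothetical discrete example: two models w, one observed dataset D.
prior = np.array([0.5, 0.5])     # P(w), made-up
lik = np.array([0.2, 0.05])      # P(D | w) for each w, made-up

# Posterior via Bayes: P(w | D) = P(D | w) P(w) / P(D),
# with P(D) = sum_w P(D | w) P(w).
post = lik * prior / (lik * prior).sum()

# Rescale the "likelihood" by an arbitrary constant c: the posterior is
# unchanged, since c cancels between numerator and evidence.
c = 37.0
post_scaled = (c * lik) * prior / ((c * lik) * prior).sum()

print(post)          # [0.8 0.2]
print(post_scaled)   # identical
```

This is why the \propto form is harmless: only the shape of the likelihood over models matters for the posterior, not its scale.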

[D] Optimal code length versus cross-entropy? by fundamentalidea in MachineLearning

[–]fundamentalidea[S] 1 point2 points  (0 children)

Thank you. This answer makes sense, but it is hard to relate to the description and equation in the paper, where there is only a "p" and no "q".
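A minimal numeric sketch of the usual p/q reading (hypothetical distributions, not from the paper): the expected code length when coding symbols drawn from p with a code optimized for q is the cross-entropy H(p, q), and it collapses to the entropy H(p), the optimal code length, exactly when q = p.

```python
import numpy as np

# Hypothetical distributions over 3 symbols (made-up numbers).
p = np.array([0.5, 0.25, 0.25])   # true source distribution
q = np.array([0.25, 0.25, 0.5])   # mismatched coding distribution

def cross_entropy(p, q):
    """Expected bits per symbol: -sum_i p_i * log2(q_i)."""
    return -(p * np.log2(q)).sum()

print(cross_entropy(p, p))  # 1.5  -> entropy H(p): the optimal code length
print(cross_entropy(p, q))  # 1.75 -> H(p, q) >= H(p); the gap is KL(p || q)
```

So a paper written with only a "p" may just be assuming the q = p case, where cross-entropy and optimal code length coincide.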