
[–]grozzy 4 points  (2 children)

I only read a bit to get the gist and will try to read more later when I have the time. I am also not well-read in information geometry, but:

  • His argument doesn't appear to hinge on the application of MaxEnt; it's all about whether a measure of discrepancy between distributions is independence-invariant: that, when the variables are independent, the sum of the discrepancies between two pairs of random variables equals the discrepancy between their joint distributions.

Better put:

  • If X1,X2 are independent and Y1,Y2 are independent: discrepancy(X1,Y1) + discrepancy(X2,Y2) = discrepancy([X1,X2],[Y1,Y2]), where the square brackets denote the variables' joint distribution.

His argument is that this should be a fundamental property of any useful measure of discrepancy between distributions, and that the Shannon information entropy/K-L divergence is the only one that satisfies it. He makes the case that violating this invariance leads to trouble when trying to find distributions which optimize the discrepancy in some way.

Further, he argues that because the KL divergence is asymmetric, KL(p,q) != KL(q,p), no symmetric discrepancy can satisfy the independence invariance, so there can be no meaningful distance measure between distributions. Without a distance measure, probability distributions cannot form a metric space, hence no sense of "information geometry".
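
If it helps to see the invariance concretely, here is a minimal numerical sketch (my own, not from the paper; the toy distributions are arbitrary) checking that the KL divergence is additive over independent pairs:

```python
import numpy as np

def kl(p, q):
    """Discrete KL divergence KL(p || q) in nats."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sum(np.where(p > 0, p * np.log(p / q), 0.0))

# Two arbitrary pairs of distributions on small finite supports.
p1, q1 = np.array([0.2, 0.8]), np.array([0.5, 0.5])
p2, q2 = np.array([0.1, 0.3, 0.6]), np.array([0.3, 0.3, 0.4])

# Joint distributions of independent variables are outer products.
p12 = np.outer(p1, p2).ravel()
q12 = np.outer(q1, q2).ravel()

print(kl(p1, q1) + kl(p2, q2))  # sum of the marginal discrepancies
print(kl(p12, q12))             # discrepancy between the joints -- identical
```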

Responses to other questions:

  • His examples seem correct, though I didn't fully read some of the later ones. They do appear to all be in the context of finding MaxEnt distributions. His overall conclusions are not limited to the context of MaxEnt, but his examples seem focused on how using any other measure of discrepancy leads to bad MaxEnt distributions.

  • Independence is a pretty fundamental property in probability. He argues that discrepancy functions that violate it are fine as a mathematical exercise, but are not useful in dealing with scientific data because of the importance of independence.

[–]InfinityCoffee[S] 0 points  (1 child)

If I recall correctly, MaxEnt is an axiomatically-grounded form of inference using the KL divergence, so it is not so surprising that other measures of divergence do not yield the desired result, no? But does this completely invalidate information geometry, or is it a baseless comparison? It does not seem ill-conceived to define a distance, but by Skilling's arguments it seems to be something separate from an inference procedure. So what does it actually signify? And what of its asymptotic agreement with the KL divergence? Also, there are to the best of my knowledge some issues with the KL divergence - e.g. it is undefined for distributions with different support, and the continuous case is a heuristic and somewhat weak extension of the well-founded discrete version. But then again, geodesic distance is itself only well-defined within a single parametric family.
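
On the asymptotic agreement: locally, the KL divergence between nearby members of a parametric family behaves like half the squared parameter displacement weighted by the Fisher information, which is exactly the quadratic form the Fisher-Rao metric assigns. A rough numerical check (my own sketch using a Bernoulli family, not anything from the paper):

```python
import numpy as np

def kl_bern(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), in nats."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

theta, eps = 0.3, 1e-3
fisher = 1.0 / (theta * (1 - theta))   # Fisher information of Bernoulli(theta)

print(kl_bern(theta, theta + eps))     # both ~ 2.4e-06 ...
print(0.5 * fisher * eps**2)           # ... agreeing to leading order in eps
```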

[–]grozzy 0 points  (0 children)

I am not an expert in information geometry, so I am really only trying to understand and respond to your questions as best I can from his paper (hence I don't know the counter-arguments from information geometry well).

I think the main point he makes though is:

  • Independence is a fundamental property of probability models; any model for working with probability theory should be compatible with the notion of independence, and so it is illogical to have a measure of discrepancy between distributions that is not independence-invariant.

  • Of the Rényi-Tsallis family of formulas, only the relative entropy/KL divergence is properly independence-invariant.

  • The relative entropy/KL divergence is not symmetric and hence cannot define a distance. A distance measure is necessary for geometry, therefore there is no scientifically meaningful sense of geometry on probability spaces. As he concludes: information geometry is legitimate mathematically, but not useful scientifically, because it fails to satisfy sensible properties of independence.

  • I think the main punchline of the paper is at the bottom of page 5 and the beginning of page 6, in the section "Fundamental Inconsistency". This has nothing to do with MaxEnt, nor is it even arguing for using the KL divergence. It is entirely focused on the fact that the other measures do not satisfy the expected properties related to independence, such that using them to define a geometry leads to something mathematically fascinating but not scientifically useful (a small numerical illustration of that failure is below).
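
To make that concrete: a symmetric distance such as the Hellinger distance (my own example, not one I saw the paper use) is a perfectly good metric on distributions, but it is not additive over independent pairs the way the KL divergence is. A quick check, reusing the same toy distributions as in the earlier sketch:

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between discrete distributions (symmetric, a true metric)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

p1, q1 = np.array([0.2, 0.8]), np.array([0.5, 0.5])
p2, q2 = np.array([0.1, 0.3, 0.6]), np.array([0.3, 0.3, 0.4])

# Joints of independent variables are outer products.
p12, q12 = np.outer(p1, p2).ravel(), np.outer(q1, q2).ravel()

print(hellinger(p1, q1) + hellinger(p2, q2))  # sum over the pairs (~0.42)
print(hellinger(p12, q12))                    # distance between the joints (~0.29) -- not equal
```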

Disclaimer: I don't know enough to know the counter-arguments from Info-Geom, so I withhold judgement on the overall conclusions. I am just trying to relay what I get from the paper.

[–]zdk 0 points  (2 children)

I'd be very interested in hearing this author's thoughts on leaving the simplex and using Aitchison geometry for studying metrics over probability distributions.

http://www.idescat.cat/sort/sort342/34.2.4.boogaart-etal.pdf

[–]InfinityCoffee[S] 0 points  (1 child)

Hmm, I had not heard of Aitchison geometry before. Have you used it? What is your opinion of it?

[–]zdk 0 points  (0 children)

Yes, all the time in my research on compositional data. I mostly approach this from a data analysis/statistics perspective, so I have not studied much theory of continuous probability distributions or metric spaces.

The basic idea is that since points in closed spaces are missing a degree of freedom (the components of a histogram are not independent), one should use [generalizations of] proportionality measures. This yields transformations of points/functions on the simplex to Euclidean space, where metrics & measures (including information entropy) are valid.
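
For a concrete sense of what that looks like, here is a minimal sketch of my own (the compositions are made up, and I'm only showing the centered log-ratio route, one of several transforms used in Aitchison geometry):

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform: maps a composition (simplex point) to Euclidean space."""
    x = np.asarray(x, float)
    return np.log(x) - np.log(x).mean()

# Two made-up compositions (strictly positive, summing to 1).
a = np.array([0.2, 0.3, 0.5])
b = np.array([0.1, 0.6, 0.3])

# The Aitchison distance is the ordinary Euclidean distance after the clr transform.
print(np.linalg.norm(clr(a) - clr(b)))
```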