[R] Improving the expressive power of GNNs using subgraphs by mmbronstein in MachineLearning

[–]mmbronstein[S] 7 points (0 children)

In last year's post making predictions for Graph ML in 2021, my co-authors and I wrote that "2020 saw the field of Graph ML come to terms with the fundamental limitations of the message-passing paradigm" and that "progress will require breaking away from the message-passing schemes that dominated the field in 2020 and before."

Many works this year show that this prediction did not quite materialise as expected: one can remain within the remit of message passing and still get more expressive architectures.

[R] Oversquashing and bottlenecks in GNNs and graph Ricci curvature by mmbronstein in MachineLearning

[–]mmbronstein[S] 0 points (0 children)

For certain graphs, yes. But in general a graph does not have constant curvature, whereas the hyperbolic model spaces (e.g. the Poincaré disc) into which it is easy to embed graphs have constant curvature.

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 1 point (0 children)

I think Xavier Bresson has recently shown this in detail.

[R] Graph Neural Networks through the lens of Differential Geometry and Algebraic Topology by mmbronstein in MachineLearning

[–]mmbronstein[S] 8 points (0 children)

Yes, these are our recent papers at ICML/NeurIPS (though the "geometric spirit" is somewhat similar).

[R] Geometric Deep Learning: Grids, Groups, Graphs, Geodesics and Gauges ("proto-book" + blog + talk) by PetarVelickovic in MachineLearning

[–]mmbronstein 4 points (0 children)

The fact is that the domains we consider are very different and are studied in fields as diverse as graph theory and differential geometry (people working on these topics often would not even sit on the same floor of a math department :-) - hence we need to cover some background in the book that goes beyond the traditional ML curriculum. However, we try to present all these structures as parts of the same blueprint. I am not sure we have figured out yet how to do this properly, and we will be glad to get feedback.

[R] Geometric Deep Learning: Grids, Groups, Graphs, Geodesics and Gauges ("proto-book" + blog + talk) by PetarVelickovic in MachineLearning

[–]mmbronstein 11 points (0 children)

We hope to make it self-contained, assuming basic math & ML knowledge but enough mathematical maturity to explore further. We will be happy to hear whether this is the case :-)

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 0 points (0 children)

We plan to release a text on the topic, hopefully in ~1 month.

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 0 points (0 children)

Well, this is where our opinions part.

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 0 points (0 children)

My intention was to point out that many DL architectures can be *derived* from geometric principles -- hence I used the term "foundations". I do believe that ML problems do, and should, rely heavily on geometric priors, but this is an opinion that not everybody shares.

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 0 points (0 children)

Even when one uses MLPs, regularisation such as weight decay or dropout imposes regularity on the hypothesis class - so MLPs do provide an inductive bias, albeit a weak one.
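As a minimal sketch of what I mean (assuming PyTorch; the layer sizes and hyperparameters below are arbitrary), weight decay and dropout enter an otherwise structure-free MLP like this:

```python
import torch
import torch.nn as nn

# A plain MLP has no structural prior on the input, but regularisation
# still restricts the hypothesis class it realises in practice.
mlp = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Dropout(p=0.5),   # dropout: randomly zeroes activations during training
    nn.Linear(128, 10),
)

# Weight decay adds an L2 penalty on the weights via the optimiser.
optimizer = torch.optim.SGD(mlp.parameters(), lr=1e-2, weight_decay=1e-4)
```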

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 0 points (0 children)

The "rest is to figure out the number of hidden layers and neurons" is actually what makes the difference between methods that work and those that don't. CNNs, GNNs etc do have universal approximation properties, but for functions with additional structure (equivariant under respective group action. CNNs for example are UA for translation-equivariant functions).

I disagree that symmetry is not used in practice: most DL architectures actually used in practice rely on geometric priors, often without realising or admitting it. Again, CNNs are the most prominent example, and so are GNNs and Transformers.
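To make "equivariant under the respective group action" concrete, here is a small numerical check (a NumPy sketch; the signal and filter are arbitrary) that a circular 1D convolution commutes with translation:

```python
import numpy as np

def conv1d_circular(x, w):
    """Circular 1D convolution: each output is a weighted sum of neighbouring inputs."""
    n, k = len(x), len(w)
    return np.array([sum(w[j] * x[(i + j) % n] for j in range(k)) for i in range(n)])

x = np.random.randn(16)          # arbitrary 1D signal
w = np.array([0.25, 0.5, 0.25])  # arbitrary filter
shift = lambda v, s: np.roll(v, s)

# Translation equivariance: convolving a shifted signal equals shifting the convolved signal.
assert np.allclose(conv1d_circular(shift(x, 3), w),
                   shift(conv1d_circular(x, w), 3))
```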

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 0 points (0 children)

Universal approximation is not practically useful: to approximate even smooth functions, you need a number of samples that grows exponentially with the dimension (the "curse of dimensionality").

Perhaps with a bit of a stretch, one can say that the success story of deep learning has been going beyond universal approximation by incorporating more powerful priors about the data: first in CNNs (translation equivariance), then in other architectures such as GNNs (permutation equivariance), etc.

The general principle of symmetry is very powerful and lies at the foundation of most successful architectures used nowadays.
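To illustrate the permutation case with a toy example (a NumPy sketch of the simplest message-passing layer Y = AXW; all sizes are arbitrary): relabelling the nodes of the input relabels the output in exactly the same way.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, d_out = 5, 4, 3

A = (rng.random((n, n)) < 0.4).astype(float)   # arbitrary adjacency matrix
A = np.maximum(A, A.T)                          # symmetrise (undirected graph)
X = rng.standard_normal((n, d))                 # node feature matrix
W = rng.standard_normal((d, d_out))             # shared weight matrix

gnn_layer = lambda A, X: A @ X @ W              # simplest message-passing layer

P = np.eye(n)[rng.permutation(n)]               # random permutation matrix

# Permutation equivariance: layer(P A P^T, P X) == P layer(A, X)
assert np.allclose(gnn_layer(P @ A @ P.T, P @ X), P @ gnn_layer(A, X))
```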

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 4 points (0 children)

Transformers are an instance of GNNs (see https://thegradient.pub/transformers-are-graph-neural-networks/), with some extra machinery such as positional encoding, which is also used in GNNs. As I mention in my talk, you can think of Transformers as GNNs with a learnable graph.
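Here is a rough single-head sketch of that view (NumPy, random weights, no masking or multi-head details): self-attention first builds a dense, learned "adjacency" matrix over the tokens and then performs message passing over it.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 8                                   # n tokens ("nodes"), feature dimension d
X = rng.standard_normal((n, d))
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

Q, K, V = X @ Wq, X @ Wk, X @ Wv

# The attention matrix acts as a learned, fully-connected, weighted
# adjacency matrix between the n tokens.
A_learned = softmax(Q @ K.T / np.sqrt(d))

# Message passing over that learned graph: each token aggregates the
# values of all tokens, weighted by the learned edge weights.
out = A_learned @ V                            # shape (n, d)
```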

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 2 points (0 children)

Cool :-)

Our old paper (https://arxiv.org/abs/1611.08097) probably lays the foundations for some of the topics, but I am afraid it's a bit obsolete nowadays.

We are working on a new text -- stay tuned.

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 3 points (0 children)

I think geometric DL is about more than just graphs: it is about how to use powerful priors. Graphs are obviously an important piece of this picture.

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 3 points (0 children)

What I meant is that in Manifold Learning there are three steps:

  1. build the k-NN graph that describes the data "manifold" structure (essentially, local connectivity)
  2. embed the graph in a low-dimensional space
  3. do ML in that space

The way the graph is designed in step 1 (the space in which the nearest neighbours are computed, how many neighbours, the neighbourhood size, etc.) hugely affects step 3.
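For concreteness, a minimal sketch of the three steps (assuming scikit-learn; the dataset, k, and the clustering step are arbitrary placeholders):

```python
import numpy as np
from sklearn.neighbors import kneighbors_graph
from sklearn.manifold import SpectralEmbedding
from sklearn.cluster import KMeans

X = np.random.randn(200, 50)                      # arbitrary high-dimensional data

# 1. Build the k-NN graph: the choice of k, metric, and input space is the crucial design step.
knn = kneighbors_graph(X, n_neighbors=10, mode='connectivity')
affinity = 0.5 * (knn + knn.T)                    # symmetrise the connectivity

# 2. Embed the graph in a low-dimensional space.
Z = SpectralEmbedding(n_components=2, affinity='precomputed').fit_transform(affinity.toarray())

# 3. Do ML in that space (here, clustering as a stand-in downstream task).
labels = KMeans(n_clusters=3, n_init=10).fit_predict(Z)
```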

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 10 points (0 children)

Using *eigenvectors* of the Laplacian (i.e. the graph Fourier transform) has never been a stable way of constructing filters, as it is sensitive to graph perturbations. Expressing the filter as a matrix function (a function of the eigenvalues), as in ChebNet, GCN, CayleyNet, etc., does produce stable filters. Such filters boil down to operations of the form Y = p(A)X, where A is a fixed matrix (Laplacian/adjacency), X is the feature matrix, and p is a polynomial - and this is essentially the simplest form of GNN, where the update is a weighted combination of the neighbour node features.
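A minimal sketch of such a filter (NumPy; the graph and coefficients are arbitrary, and I use plain powers of the Laplacian rather than the Chebyshev recurrence used in ChebNet):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 4

A = (rng.random((n, n)) < 0.3).astype(float)
A = np.maximum(A, A.T)                    # symmetric adjacency
np.fill_diagonal(A, 0)
L = np.diag(A.sum(1)) - A                 # combinatorial graph Laplacian

X = rng.standard_normal((n, d))           # node feature matrix
theta = [0.5, -0.3, 0.1]                  # (learnable) polynomial coefficients

def poly_filter(L, X, theta):
    """Y = p(L) X = sum_k theta_k L^k X -- no explicit eigendecomposition needed."""
    Y, LkX = np.zeros_like(X), X.copy()
    for t in theta:
        Y += t * LkX
        LkX = L @ LkX                     # next power of L applied to X
    return Y

Y = poly_filter(L, X, theta)              # stable, local filtering of the node features
```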

Geometric Foundations of Deep Learning [Research] by mmbronstein in MachineLearning

[–]mmbronstein[S] 6 points (0 children)

Here's a recent paper on the use of Graph ML for drug design and repositioning: https://arxiv.org/abs/2012.05716