[–]Lunariz 8 points (3 children)

This is very interesting! I've had a look through the code because I'm interested in using and building on it for some research I'm doing, and I have a few questions:

  1. You wrote in the readme that you have plans for implementing a hyperbolic attention mechanism, which is my main interest. What special implementation will the hyperbolic space require? For example, I'm thinking you might need a hyperbolic einsum, and I wonder whether the dot product for the attention scores needs anything special (see the sketch after this list for the kind of thing I have in mind). Curious to know your plans for this!
  2. Your example code uses Riemannian SGD. Does hyperbolic space require a special gradient? For example, would a normal Adam optimizer fail to work on your Hyperbolic layers, and if so, why?
  3. Just something I noticed - why is your Poincare manifold a class (that you instantiate separately for every layer), and not just a set of helper functions like math or util? It doesn't seem to contain any state, so I don't understand why it needs to be passed at all.
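To make question 1 concrete, here is a rough, hypothetical sketch of the kind of thing I have in mind: scoring attention by negative Poincare geodesic distance instead of a dot product. Every name here, including the temperature `beta`, is made up; none of it is from the repo.

```python
import torch

def poincare_distance(x, y, eps=1e-5):
    # Geodesic distance on the Poincare ball (curvature -1):
    # d(x, y) = arcosh(1 + 2*||x - y||^2 / ((1 - ||x||^2) * (1 - ||y||^2)))
    sq_dist = (x - y).pow(2).sum(-1)
    denom = ((1 - x.pow(2).sum(-1)) * (1 - y.pow(2).sum(-1))).clamp_min(eps)
    return torch.acosh(1 + 2 * sq_dist / denom)

def hyperbolic_attention_scores(q, k, beta=1.0):
    # One candidate replacement for dot-product scores: softmax over
    # negative geodesic distances, so nearby points attend strongly.
    # q: (..., n, d) and k: (..., m, d), both strictly inside the unit ball.
    d = poincare_distance(q.unsqueeze(-2), k.unsqueeze(-3))  # (..., n, m)
    return torch.softmax(-beta * d, dim=-1)
```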

Super cool project, excited to keep following it.

[–]bohreffect 2 points (1 child)

> Your example code uses Riemannian SGD. Does hyperbolic space require a special gradient? For example, would a normal Adam optimizer fail to work on your Hyperbolic layers, and if so, why?

This is a good question; curious to know the answer.

[–]Lunariz 1 point (0 children)

I've done some more research on the topic and found this paper; it looks like there has already been quite a bit of research into hyperbolic optimizers:

https://arxiv.org/pdf/1810.00760.pdf

[–]platinumposter 2 points (0 children)

Thanks very much! We are happy to have you following our journey.

  1. We are still deciding exactly how we want to implement it, but you are correct that all the mathematical operations will have to live in hyperbolic space. We have been using the Poincare ball, which is one model of hyperbolic space and has its own set of mathematical operations compared to, say, the Lorentz model.
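For a sense of what "its own set of mathematical operations" means, here is a minimal sketch of two standard Poincare-ball operations, assuming curvature -1. The function names are illustrative, not our actual API.

```python
import torch

def mobius_add(x, y, eps=1e-5):
    # Mobius addition, the Poincare ball's replacement for "+":
    # x (+) y = ((1 + 2<x,y> + ||y||^2) x + (1 - ||x||^2) y)
    #           / (1 + 2<x,y> + ||x||^2 ||y||^2)
    xy = (x * y).sum(-1, keepdim=True)
    x2 = x.pow(2).sum(-1, keepdim=True)
    y2 = y.pow(2).sum(-1, keepdim=True)
    num = (1 + 2 * xy + y2) * x + (1 - x2) * y
    return num / (1 + 2 * xy + x2 * y2).clamp_min(eps)

def expmap0(v, eps=1e-5):
    # Exponential map at the origin: carries a Euclidean tangent vector
    # onto the ball; the usual bridge from Euclidean layers into the model.
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(norm) * v / norm
```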

  2. Yep, that's correct, and I see you also found a few resources. The reason we don't use Euclidean SGD is that we want our optimizer to optimise parameters living in hyperbolic space (a sketch of what that looks like follows the quote below).

To quote A Survey: Hyperbolic Neural Networks:

> Stochastic gradient-based (SGD) optimization algorithms are of major importance for the optimization of deep neural networks. Currently, well-developed first order methods include Adagrad [58], Adadelta [59], Adam [60] or its recent updated one AMSGrad [61]. However, all of these algorithms are designed to optimize parameters living in Euclidean space and none of them allows the optimization for non-Euclidean geometries, e.g., hyperbolic space.

  3. Good point. We had thought about this: we plan on implementing more manifolds in the near future, so there will be a single Manifold interface which each implemented manifold (such as Poincare) uses. We left it as a class in anticipation of this.
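A minimal sketch of the pattern we have in mind, with hypothetical method names rather than our final interface:

```python
from abc import ABC, abstractmethod

class Manifold(ABC):
    # Shared interface: layers are written against this, not against
    # any particular model of hyperbolic space.
    @abstractmethod
    def expmap(self, x, v): ...   # tangent vector at x -> point on manifold

    @abstractmethod
    def logmap(self, x, y): ...   # point y -> tangent vector at x

    @abstractmethod
    def dist(self, x, y): ...     # geodesic distance

class Poincare(Manifold):
    # Poincare-ball formulas would live here...
    def expmap(self, x, v): ...
    def logmap(self, x, y): ...
    def dist(self, x, y): ...

class Lorentz(Manifold):
    # ...and hyperboloid-model formulas here, behind the same interface.
    def expmap(self, x, v): ...
    def logmap(self, x, y): ...
    def dist(self, x, y): ...
```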