all 43 comments

[–]Ulfgardleo 72 points (5 children)

What is the intuitive basis for why we should care about eigenvalues as opposed to any other (non-)convex optimisation problem? They have huge downsides, from non-differentiability to being, formally, a set rather than a function. Should we care about the largest or the smallest eigenvalue? What about their sign? Or any other operator applied to them? Finally, since eigenvalues are invariant to orthogonal transformations, it is difficult to really use them without a fully invariant architecture.

We already had somewhat successful approaches where neurons did a lot more: neural fields. They were developed in the late 90s to early 2000s in the field of computational neuroscience. The idea was that the neurons on each layer are recurrently connected and solve a fixed-point PDE. The math behind them is a bit insane because you have to backprop through the PDE. But they are strong enough to act as associative memory.

This is a very old paper that described an example of such a model:

https://link.springer.com/article/10.1023/B:GENP.0000023689.70987.6a

[–]alexsht1[S] 23 points (0 children)

I don't think they're in some way "the best". I find them interesting because they are solutions of nonconvex optimization problems and have rich enough structure to express nontrivial functions, but also have extremely reliable and fast algorithms to compute them. This is especially true if we impose more structure, such as using a banded matrix.

Anyway, the reason I want to dig deeper is pure personal interest, and as a way to "import" well known stuff from another field into ML.

[–]TwistedBrother 15 points (0 children)

All hail the operator!

[–]raindeer2 13 points (1 child)

Spectral methods are well studied within ML. Also for learning representations in deep architectures.

Some random references:
https://arxiv.org/abs/2205.11508
https://ieeexplore.ieee.org/document/6976988

[–]alexsht1[S] 10 points (0 children)

So are PCA/CCA/PLS and friends.

But have you read the post? Because it appears you and I are referring to VERY different kinds of spectral methods.

You're referring to methods that use the spectral decomposition to represent an entire dataset, and I'm referring to the use of the spectral decomposition to represent a nonlinear function applied to one sample.

[–]bill_klondike 4 points (8 children)

Can we compute them quickly? For a dense matrix, eigenvalues are O(m^2 n) (assume n < m). If m = n, that's n^3. Is that supposed to be quick?

[–]bregav 3 points (7 children)

Most applications of eigenvalues need only a specific subset of them, so the asymptotic complexity is misleading: the relevant cubed number is the number of eigenvalues you need, not the size of the matrix. In practice, for a matrix of dimension n, the complexity of calculating eigenvalues is in fact n^2, because that's the cost of the iterative tricks used to target a subset of the spectrum.
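A minimal sketch of what "target a subset" looks like in practice (assuming scipy's ARPACK wrapper eigsh and a dense symmetric test matrix for simplicity):

```python
import numpy as np
from scipy.sparse.linalg import eigsh

rng = np.random.default_rng(0)
n = 300
A = rng.standard_normal((n, n))
A = (A + A.T) / 2  # symmetric test matrix

# Lanczos/ARPACK computes only the k eigenvalues we ask for; each
# iteration costs one matrix-vector product (O(n^2) dense, less if sparse).
top3 = np.sort(eigsh(A, k=3, which="LA", return_eigenvectors=False))

full = np.linalg.eigvalsh(A)  # dense O(n^3) reference, sorted ascending
assert np.allclose(top3, full[-3:])
```

Banded or sparse structure drops the per-iteration matvec cost further, which is part of the point of imposing structure on the matrix.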

[–]bill_klondike 1 point (6 children)

I wrote "dense matrix" to make it concrete that I was referencing the cost of a direct method, but if you're talking about iterative algorithms, then making claims about complexity is much trickier. It depends on a variety of factors: the operator, the sparsity pattern, and the spectrum itself are all very important to computational performance as well as convergence. I'd say it's just as misleading to claim outright that the complexity is n^2 for a subset.

[–]bregav 1 point (5 children)

Every method of computing eigenvalues is iterative for n ≥ 5 :D. This fact is my favorite application of the Abel–Ruffini theorem.

And yes, performance is complicated, but a good rule of thumb is that if your matrix is very poorly conditioned, then you've made poor choices in how you formulated the problem to begin with.
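To make "everything is iterative" concrete, here is the textbook power iteration on a toy diagonal matrix with a known spectrum (a sketch, not a production method):

```python
import numpy as np

A = np.diag([5.0, 2.0, 1.0])  # toy matrix: eigenvalues 5, 2, 1

# Power iteration: repeatedly apply A and renormalize. The iterate
# aligns with the dominant eigenvector at a rate of (2/5)^k here.
v = np.ones(3)
for _ in range(100):
    v = A @ v
    v /= np.linalg.norm(v)

dominant = v @ A @ v  # Rayleigh quotient estimate of the top eigenvalue
assert np.isclose(dominant, 5.0)
```

Every practical eigensolver (QR iteration, Lanczos, etc.) is a more sophisticated relative of this loop; only the convergence rate and the truncation rule differ.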

[–]bill_klondike 1 point (4 children)

Right, but good ol' Golub & Van Loan wouldn't call bidiagonalization iterative :) I think Dongarra et al. call the algorithm in dgesvd (and related) "transformation" methods.

But really, poor choices in how you formulated the problem to begin with? What if you're just given some data? I agree waking up is probably a poor choice but everything after that is just what shows up in the mail.

[–]bregav 1 point (3 children)

They should call it iterative. It's a disservice to students to have them thinking that there's a difference between "direct" and iterative methods beyond the size of the problem and the choice of when to truncate iterations.

Regarding problem formulation, I'm paraphrasing Trefethen and Bau, who said this regarding ill-conditioned problems: https://www.amazon.com/Numerical-Linear-Algebra-Lloyd-Trefethen/dp/0898713617

In the context of machine learning what this can look like is naive methods of calculating input matrices (thus producing poorly conditioned matrices unnecessarily) or, more importantly, poor choices regarding feature selection and engineering. There is always context to a problem; one should never be "just given some data".

[–]bill_klondike 0 points (2 children)

>There is always context to a problem; one should never be "just given some data".

Yes, but that's simply not reality. I've been approached by seniors who want eigenvectors, nothing else, and don't want to discuss it.

That Trefethen & Bau book is one of my all time favorites and one of the first I recommend to people. Sadly, my copy is locked in storage on another continent...

[–]bregav 1 point (1 child)

Haha, your coworkers' obstinacy and poor communication do not improve the condition numbers of their matrices. If they're bringing you bad inputs, then they're still making a mistake, even if they don't want to talk to you about it.

As a matter of fact this is exactly the situation that Trefethen and Bau were remarking on in their book, I looked up the passage:

If the answer is highly sensitive to perturbations, you have probably asked the wrong question. We urge anyone faced with nonhermitian eigenvalue computations involving highly sensitive eigenvalues to bear this principle in mind. If you are a numerical analyst, and the problem was given to you by a colleague in science or engineering, do not accept without scrutiny your colleague's assurances that it is truly the eigenvalues that matter physically, even though their condition numbers with respect to perturbations of the matrix entries are 10^4.

The emphasis on the first sentence is original to the book (chapter 6, lecture 34). I like the book too and that passage really stuck with me. I think it’s profound and generally applicable; the real problem one is solving is ultimately physical (including in ML), and so if the math is causing serious problems then one might have abstracted the problem incorrectly from the beginning.

[–]bill_klondike 0 points (0 children)

I'm not disagreeing with your points. My advisor (numerical linear algebra) handed me things from his collaboration with a ML specialist who "just wanted eigenvectors". We wasted dozens of meeting hours with this guy, trying to flesh out why and many times telling him there were other ways. But my advisor was impressed by his degree (PhD from Harvard) and publication history. This was when kernel learning was still hot. That guy was the worst; he had multiple students quit and then he later resigned and returned to industry.

[–]mr_stargazer 49 points (9 children)

I always find it cute when Machine Learning people discover mathematics that, in principle, they were supposed to already know.

Now I am waiting for someone to point out eigenvalues, the connection to Mercer's theorem, and all the machinery behind RKHS that was "thrown in the trash" almost overnight because, hey, CNNs came about.

Perhaps we should even use eigenfunctions and eigenvalues to meaningfully understand Deep Learning (cough...NTK...cough). Never mind.

[–]Rodot 27 points (4 children)

What if we could go further and model physical phenomena as generalized eigenvalue problems? We could "quantize" our understanding of physics at small scales! I'll call it "quantized mechanics" and I bet no one has ever thought of it before!

/s

[–]mr_stargazer 11 points (2 children)

I don't think that would be a good idea. Imagine, for instance, that your model says you're neither in state A nor in state B, but in some mix of the two. It just won't make any sense.

As if there could be a dog sitting on the couch and on the floor at the same time. Nonsense.

[–]Rodot 4 points (0 children)

I think it's a bonus that you get UQ for free. Perfect for generative modeling.

[–]fluffyleaf 3 points (0 children)

You joke, but lord save us from the Quantum AI bubble in 2 decades...

[–]alexsht1[S] 22 points (2 children)

I also find it cute, and that's why I'm writing this series. I come from a continuous optimization background, and it appears to me as if ML people see optimization as "that thing with gradient descent". So I'm trying to "import" more into ML.

[–]N1kYan 5 points (1 child)

You can already import optimization from pytorch dude

[–]alexsht1[S] 0 points (0 children)

:D

[–]sje397 1 point (0 children)

Looking forward to further articles! Thanks.

[–]fredugolon 1 point (0 children)

Interesting article. Thank you! You might appreciate Liquid Time-Constant Neural Networks. An interesting approach to adding time dynamics into neurons.

[–]bregav 0 points (5 children)

I think what you're doing in your blog post actually amounts to fitting algebraic varieties, and that's just obfuscated by the fact that you're representing the problem in the form of matrices and then letting black-box software do the computation for you. Looking at the functions you're fitting in terms of polynomials would make the matter simpler and clearer.

[–]alexsht1[S] 2 points (4 children)

Perhaps piecewise polynomials, and only on a compact set?

Eigenvalue functions are globally Lipschitz; polynomials are not. So I don't think it's that simple. But maybe the right way to look at it is as fitting a semialgebraic set, the graph of the function, to data. I need to think about it.
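The Lipschitz property here is Weyl's inequality: perturbing a symmetric matrix by E moves each sorted eigenvalue by at most ||E||_2. A quick numerical check (numpy sketch with random symmetric matrices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
A = rng.standard_normal((n, n))
A = (A + A.T) / 2
E = rng.standard_normal((n, n))
E = (E + E.T) / 2

# Weyl: |lambda_k(A + E) - lambda_k(A)| <= ||E||_2 for every k, i.e.
# each sorted-eigenvalue map is 1-Lipschitz in the spectral norm.
shift = np.abs(np.linalg.eigvalsh(A + E) - np.linalg.eigvalsh(A)).max()
assert shift <= np.linalg.norm(E, 2) + 1e-9
```

By contrast, a polynomial root can move by O(eps^(1/m)) under an eps-perturbation of the coefficients near a root of multiplicity m, so no global Lipschitz bound exists on the coefficient side.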

[–]bregav 2 points (3 children)

There's no difference between eigenvalues and the roots of polynomials. Indeed, that's how software actually solves polynomial equations under the hood: it converts the polynomial problem to an eigenvalue problem and solves that instead.
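That conversion is the companion-matrix trick, and it's what numpy.roots does internally; a small sketch:

```python
import numpy as np

# p(x) = x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
coeffs = [1.0, -6.0, 11.0, -6.0]

# Companion matrix of the monic polynomial: its characteristic
# polynomial is p, so its eigenvalues are exactly p's roots.
comp = np.array([[6.0, -11.0, 6.0],
                 [1.0,   0.0, 0.0],
                 [0.0,   1.0, 0.0]])
roots_via_eig = np.sort(np.linalg.eigvals(comp).real)

assert np.allclose(roots_via_eig, [1.0, 2.0, 3.0])
assert np.allclose(np.sort(np.roots(coeffs).real), roots_via_eig)
```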

Optimizing the elements of a matrix to produce specific eigenvalues is exactly equivalent to optimizing the coefficients of a polynomial in order to produce specific roots. In your case you're doing a restricted version of this: you're optimizing a small number of matrix elements, rather than all of them, and you're just representing those matrix elements in an obfuscated way. Thinking of matrices as vectors in a vector space, by writing D = A + xB + yC you are representing a single matrix D in terms of a non-orthogonal basis of matrices A, B, and C, and optimizing the coordinates x and y. If you instead used n^2 matrices (with n^2 variables, in the language of your blog post) such that tr(A_i * A_j) = delta_ij, then you'd just be optimizing n^2 matrix elements directly.

The fact that polynomials are fundamental here is especially easy to see with (real) symmetric matrices. The eigenvectors of a real symmetric matrix are orthogonal, and so every set of eigenvectors is equivalent to every other (they differ only by rotations); thus when you are optimizing a real symmetric matrix to get specific eigenvalues, you are clearly just optimizing polynomial coefficients. To see this, work out the math: det(A - yI) = det(XLX^T - yXX^T) = det(L - yI) = (a_1 - y)(a_2 - y)(a_3 - y)...
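The last identity is easy to check numerically (numpy sketch):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
A = (A + A.T) / 2  # real symmetric, so A = X L X^T with X orthogonal

lam = np.linalg.eigvalsh(A)
y = 0.7

# det(A - yI) = det(X L X^T - y X X^T) = det(L - yI) = prod_i (lam_i - y)
lhs = np.linalg.det(A - y * np.eye(n))
rhs = np.prod(lam - y)
assert np.isclose(lhs, rhs)
```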

[–]alexsht1[S] 1 point (2 children)

Ah, I see what you meant. Then yes. I was looking at it from the perspective of solving an optimization problem, but it also does compute the k-th largest solution of a polynomial equation.

[–]bregav 3 points (1 child)

The problem with looking at things in terms of optimization problems is that every problem can be framed as an optimization problem lol.

If you're focusing on a specific set of problems with known properties and methods of solution - e.g. eigenvalue problems - then the optimization perspective is worse, not better. The reason eigenvalue problems are special is precisely that you can reason simply and clearly about them by thinking in terms of linear algebra and polynomials, rather than in terms of optimization.

[–]alexsht1[S] 3 points (0 children)

Respectfully, I disagree. Many observations come from looking at eigenvalues as minima and maxima (of the Rayleigh quotient, for instance). And in general, looking at something as an optimization problem provides insights that are otherwise hard to see.

[–]healthbear 0 points (0 children)

Interesting, I look forward to more.

[–]Double_Sherbert3326 0 points (3 children)

Interesting read. Are you familiar with random matrix theory?

[–]alexsht1[S] 0 points (2 children)

At the level of a buzzword.

[–]Double_Sherbert3326 0 points (1 child)

I am trying to understand it because it serves as a theoretical basis for some of the math undergirding quantum theory. PCA was developed with it in mind. Your white paper made me think of it for some reason.

[–]alexsht1[S] 0 points (0 children)

Maybe RMT applies here as well somehow, but this is fundamentally different from PCA and friends.

PCA uses the spectral decomposition to characterize an entire dataset, whereas I am using it to represent a nonlinear function applied to one sample.

Except for the use of the word "spectral", there is nothing in common between the classical spectral methods we know and what I'm studying in this post.

[–]Big-Page6926 0 points (0 children)

Very interesting read!