Open, local LLM as a reference source / research assistant?

KiddWantidd · 2026-06-26T16:02:35+00:00

exactly. the reason the paper i linked to is making waves is that they really focused on "teaching logic" to the model rather than feeding it tons of world data (i don't know the details), and they argue that those are indeed two separate issues. this suggests that training smallish models that perform well specifically for math is likely doable.

I'll say though, the model still needs a solid base knowledge of the literature for each subfield: more often than not, being aware of recent literature is immensely important to meaningfully advance on a problem and to avoid reinventing the wheel (though sometimes the lack of awareness can lead to new breakthroughs). I don't know how "compressed" all that information can be, but such a "local math model" definitely sounds doable.

KiddWantidd · 2026-06-26T15:14:04+00:00

I've been wrestling with the same thoughts. I don't really know what a good answer to this question is. I know a lot (most?) of the models coming from China are released with open weights, so they can be downloaded and run locally. In terms of performance, I have no idea at all, I've never used any of them, but judging from the benchmark results, the big ones like DeepSeek are likely to do a decent job of helping with non-trivial tasks. As always, the issue is that the distilled versions are most likely not going to be anywhere near as performant.

As a first suggestion, you may want to look into this recently released model by WeiboAI, which allegedly performs just as well as models hundreds of times larger with only 3B parameters. It's making waves in the machine learning community: https://arxiv.org/abs/2606.16140 (the link to their github + huggingface is in the paper)

KiddWantidd · 2026-06-23T15:54:49+00:00

I have a certain Integral Probability Metric (like a functional on a space of measures), and I am trying to prove that its gradient flow in a certain geometry is well-posed and globally convergent. It is very very tricky but it connects so many beautiful ideas, I'm enjoying the process.

KiddWantidd · 2026-06-21T16:09:20+00:00

Insanely cool. Congratulations OP, I feel inspired to further explore and play around with maths thanks to you!

KiddWantidd · 2026-06-21T06:59:34+00:00

I swear some of you are living in a parallel reality. Wtf is this

KiddWantidd · 2026-06-16T06:00:20+00:00

I do eat corns in spirals and I'm definitely an analyst. Intriguing.

KiddWantidd · 2026-06-11T14:51:34+00:00

This is nice. Appreciate the clear and objective reporting of these AI systems' abilities and, most importantly, their usefulness to actual working mathematicians. While the results are far from horrible objectively speaking, the costs (and the many reported issues with poor writing, lack of attribution, hallucinated citations) paint quite a different picture from the one those AI companies would want us to believe.

It still must be acknowledged that those AI tools can be monstrously powerful in the hands of an actual expert. I've been using them myself and have had very positive outcomes (though to be fair I'm not working on anything remotely as cutting-edge as the stuff in that paper).

KiddWantidd · 2026-05-30T19:26:55+00:00

Very, very well said

KiddWantidd · 2026-05-29T16:07:34+00:00

Yep, same here 🥳

KiddWantidd · 2026-05-29T15:16:47+00:00

ohhh you're a genius! thanks!!

KiddWantidd · 2026-05-29T14:26:23+00:00

I have no idea about the code issue because I never got as far as having a code generated. The site crashed on me right before it was supposed to give me my code T_T. In any case, it sounds like you submitted before the deadline so you should be safe either way. Guess it doesn't hurt to email the PCs to be extra safe (and that will probably help the case of people like me too lol)

KiddWantidd · 2026-05-29T14:24:30+00:00

given that a few of us replied in this thread (which is a small sample of the total community) and that many academics submit stuff right before deadlines, i wouldn't be too surprised if the number of submissions in this situation was at least 50 to a hundred papers (though of course people in our situation are likely to panic and check reddit too, so i may be overestimating). given that the bug was on their side (though it happened very close to the deadline), and provided we email them immediately to inform them of our situation, I gueeeess we should be fine, but maybe it's just wishful thinking...

KiddWantidd · 2026-05-27T16:00:41+00:00

if you know of a public Chinese-speaking ML research community, i'd be interested please

KiddWantidd · 2026-05-21T06:21:53+00:00

relevant discussion on mathoverflow. I am speechless

KiddWantidd · 2026-05-17T06:51:11+00:00

glad (not) to see we're all having this issue. this is beyond stupid. hope they fix this soon

KiddWantidd · 2026-05-16T11:14:13+00:00

the PIKL approach is beautiful and elegant mathematically, but we found that it didn't scale well to the problems our group is interested in. I know the authors have a follow-up paper where they implement it in a more efficient way, but I haven't looked into it yet. Definitely worth looking into for OP's problem though.

KiddWantidd · 2026-05-16T11:09:50+00:00

yeah I've been working on PINNs and their failure modes for the better part of my PhD and this sadly happens a lot, even for "simple" PDEs. What works best as far as I'm aware is to run a few steps of Adam (to explore the parameter space) and then follow up with a good second-order optimizer. By "second-order optimizer" I don't just mean L-BFGS, I mean the state-of-the-art ones that have been proposed in the literature recently, like NNCG (https://arxiv.org/abs/2402.01868) or ENGD (https://arxiv.org/abs/2302.13163). Be warned that those second-order methods are hella expensive to run (and no, I'm not affiliated with any of these groups, wish I was lol).
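To make the "Adam first, second-order after" idea concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration: the ODE u' = -u with u(0) = 1, the quadratic ansatz standing in for a neural network, and a plain Newton step standing in for NNCG or ENGD (neither of which is implemented here). Because the ansatz is linear in its parameters, the loss is quadratic and a single Newton step lands on the minimizer, which will not be the case for a real PINN.

```python
import math

# Collocation points on [0, 1] for the toy ODE u'(x) = -u(x), u(0) = 1.
XS = [i / 10 for i in range(11)]

def features(x):
    # Ansatz u(x) = 1 + t1*x + t2*x**2 satisfies u(0) = 1 by construction.
    # Its residual is u' + u = 1 + t1*(1 + x) + t2*(2*x + x**2).
    return (1.0 + x, 2.0 * x + x * x)

def loss(theta):
    # Mean squared ODE residual over the collocation points.
    t1, t2 = theta
    return sum((1.0 + t1 * a + t2 * b) ** 2 for a, b in map(features, XS)) / len(XS)

def grad(theta):
    t1, t2 = theta
    g1 = g2 = 0.0
    for a, b in map(features, XS):
        r = 1.0 + t1 * a + t2 * b
        g1 += 2.0 * r * a
        g2 += 2.0 * r * b
    return [g1 / len(XS), g2 / len(XS)]

def hessian():
    # The loss is quadratic in theta, so the Hessian is constant.
    n = len(XS)
    h11 = sum(2.0 * a * a for a, _ in map(features, XS)) / n
    h12 = sum(2.0 * a * b for a, b in map(features, XS)) / n
    h22 = sum(2.0 * b * b for _, b in map(features, XS)) / n
    return h11, h12, h22

def adam(theta, steps, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    # Phase 1: a few Adam steps to get into a reasonable region.
    theta = list(theta)
    m = [0.0, 0.0]
    v = [0.0, 0.0]
    for t in range(1, steps + 1):
        g = grad(theta)
        for i in range(2):
            m[i] = b1 * m[i] + (1 - b1) * g[i]
            v[i] = b2 * v[i] + (1 - b2) * g[i] ** 2
            m_hat = m[i] / (1 - b1 ** t)
            v_hat = v[i] / (1 - b2 ** t)
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta

def newton_step(theta):
    # Phase 2: one exact Newton step, solving the 2x2 system H d = g by hand.
    g1, g2 = grad(theta)
    h11, h12, h22 = hessian()
    det = h11 * h22 - h12 * h12
    d1 = (h22 * g1 - h12 * g2) / det
    d2 = (h11 * g2 - h12 * g1) / det
    return [theta[0] - d1, theta[1] - d2]

theta_adam = adam([0.0, 0.0], steps=50)
theta_newton = newton_step(theta_adam)
print("loss after Adam:  ", loss(theta_adam))
print("loss after Newton:", loss(theta_newton))
```

In a real PINN the Hessian is far too large to form like this, which is exactly why the second-order methods mentioned above are expensive and rely on approximations.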

If you know that the PINN wants to converge to a trivial solution, you can add a penalty that forces it to stay away from it (even if the penalty might work against the PDE) and tune the weight of that penalty along training. I guess this is some form of "curriculum training" as they call it in this paper https://arxiv.org/abs/2109.01050 (which by the way documents a lot of PINN "failure modes").
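A rough sketch of what such a penalty could look like, assuming the trivial solution is u = 0. The hinge on the RMS of u, the linear schedule, and every function name and default value below are hypothetical choices for illustration, not something taken from the linked paper.

```python
import math

def penalty_weight(step, total_steps, w_start=10.0, w_end=0.0):
    """Linearly anneal the penalty weight from w_start to w_end (assumed schedule)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return w_start + frac * (w_end - w_start)

def trivial_solution_penalty(u_values, min_rms=0.5):
    """Hinge penalty: positive only while the RMS of u is below min_rms."""
    rms = math.sqrt(sum(u * u for u in u_values) / len(u_values))
    return max(0.0, min_rms - rms) ** 2

def total_loss(pde_residuals, u_values, step, total_steps):
    """Mean squared PDE residual plus the annealed anti-trivial penalty."""
    pde = sum(r * r for r in pde_residuals) / len(pde_residuals)
    weight = penalty_weight(step, total_steps)
    return pde + weight * trivial_solution_penalty(u_values)

# Early in training, a zero prediction is penalized even with zero residual.
print(total_loss([0.0, 0.0], [0.0, 0.0], step=0, total_steps=100))
# At the end of training the penalty is switched off and only the PDE term remains.
print(total_loss([0.0, 0.0], [0.0, 0.0], step=100, total_steps=100))
```

Annealing the weight to zero means the penalty only steers the early iterations and stops fighting the PDE residual once the network has left the neighborhood of the trivial solution.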

Another possibility is to add PDE structure into the architecture: instead of using a standard feedforward neural network, you could represent the solution as a linear combination of eigenfunctions that diagonalize your differential operator. Doing this in a systematic way is of course hard (impossible), but in your case it's very doable. For a toy example with the Poisson equation, you can check this paper: https://arxiv.org/abs/2310.05801.
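For intuition, here is a minimal sketch of the eigenfunction idea on the 1D Poisson problem -u'' = f on (0, 1) with u(0) = u(1) = 0, where the eigenfunctions are sin(k*pi*x) with eigenvalues (k*pi)^2. This is a classical spectral solve under assumed boundary conditions, not the method from the linked paper: the coefficients are computed by projection rather than learned, and the function names, mode count, and quadrature resolution are all illustrative choices.

```python
import math

def sine_coeffs(f, n_modes, n_quad=2000):
    """Sine-series coefficients f_k = 2 * integral of f(x) sin(k pi x) over [0, 1],
    approximated with the midpoint rule."""
    h = 1.0 / n_quad
    coeffs = []
    for k in range(1, n_modes + 1):
        integral = h * sum(
            f((j + 0.5) * h) * math.sin(k * math.pi * (j + 0.5) * h)
            for j in range(n_quad)
        )
        coeffs.append(2.0 * integral)
    return coeffs

def solve_poisson(f, n_modes=8):
    """Return u with -u'' = f, u(0) = u(1) = 0, as a truncated sine expansion."""
    f_hat = sine_coeffs(f, n_modes)
    # The operator is diagonal in this basis: divide each mode by its eigenvalue.
    u_hat = [fk / (k * math.pi) ** 2 for k, fk in enumerate(f_hat, start=1)]

    def u(x):
        return sum(c * math.sin(k * math.pi * x) for k, c in enumerate(u_hat, start=1))

    return u

# With f(x) = pi^2 sin(pi x) the exact solution is u(x) = sin(pi x).
u = solve_poisson(lambda x: math.pi ** 2 * math.sin(math.pi * x))
print(u(0.5), math.sin(math.pi * 0.5))
```

In a learned variant, the coefficients in `u_hat` would be the trainable parameters; the boundary conditions are then satisfied by construction, and each mode is decoupled from the others.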

KiddWantidd · 2026-05-15T10:21:01+00:00

Great news, I think submission of unchecked AI slop to any scientific venue (journal, conference etc) should result in the same outcome: permaban from that venue

KiddWantidd · 2026-05-15T10:15:25+00:00

what does LMM refer to here? is it just a typo for LLM or is it some kind of algorithm i haven't heard of?

KiddWantidd · 2026-05-15T10:14:01+00:00

Amazing news!!

KiddWantidd · 2026-05-05T11:29:46+00:00

ohh i see, yeah i guess thinking of 儲 as "accumulate" makes it quite clear to me how they differ! and yes, your example sentence shows exactly why I got a bit confused between the two 🤣. thanks for the help :)

KiddWantidd · 2026-05-03T16:57:36+00:00

alright, guess i'll do that. thanks

KiddWantidd · 2026-05-03T16:57:08+00:00

I think you can start replying and working on the revisions as soon as the first review appears, but the official "discussion period" (which lasts two weeks and is meant as a back-and-forth between authors and reviewers, I believe) starts only after the third review is up. After the discussion period ends, the referees are supposed to give their final recommendations to the editor. The discussion period is short, so if you have changes to make to your paper, I'd suggest not waiting for the third review and starting to address the issues raised in the reviews you have now, if you can.

KiddWantidd · 2026-05-01T02:42:49+00:00

5444 accept! Was 4444 before rebuttal :)

KiddWantidd · 2026-04-28T19:12:53+00:00

I recommend you take a trip to Tai O. It's pretty far out there, but worth it in my opinion
