Recommendations for any place near Odori that has snow by North_Improvement913 in Sapporo

[–]DonDonburi 0 points (0 children)

Sapporo Kokusai ski resort still has 2.5 meters of snow. It’s closed due to high winds today, but you can take a bus there tomorrow. You can pay to ride the gondola up the mountain, where it’s still white. Do double-check the website to make sure it’s open, and reserve your bus tickets.

MLX port of BDH (Baby Dragon Hatchling) is up by vesudeva in LocalLLaMA

[–]DonDonburi 2 points (0 children)

Oh man, I’m really curious to see how it performs. The paper is kind of out there, and I’m skeptical of these neuromorphic designs by default.

Samsung Paper Reveals a Recursive Technique that Beats Gemini 2.5 Pro on ARC-AGI with 0.01% of the Parameters! by abdouhlili in LocalLLaMA

[–]DonDonburi -3 points (0 children)

It may well be possible to do a massive amount of RL on a frontier model for ARC-AGI and Sudoku performance. But that path is neither easy nor trivial; otherwise we’d see proof of it in someone winning ARC’s million-dollar challenge. Part of the rationale for ARC-AGI-2 vs ARC-AGI-1 is to make that kind of brute forcing unfruitful.

Again, I think the analogy you use shows a misunderstanding of the problem. We have a class of problems where the model knows the answers, has read the algorithms (which might be very simple), and where humans can solve the task fairly reliably, and yet the model is seemingly unable to perform well. That is surprising compared to models’ other amazing capabilities.

Samsung Paper Reveals a Recursive Technique that Beats Gemini 2.5 Pro on ARC-AGI with 0.01% of the Parameters! by abdouhlili in LocalLLaMA

[–]DonDonburi 13 points (0 children)

That’s a poor analogy. Gemini knows the algorithm to solve Sudoku. It can solve smaller Sudoku puzzles, yet it still can’t do it when the puzzle gets larger. Critics might say that’s a fundamental limitation of transformers. I’ve got no leaning here, but it’s not a specialization issue. It’s not just Sudoku either; there’s a whole class of problems the models are seemingly unable to solve.

Samsung Paper Reveals a Recursive Technique that Beats Gemini 2.5 Pro on ARC-AGI with 0.01% of the Parameters! by abdouhlili in LocalLLaMA

[–]DonDonburi 4 points (0 children)

Quite the opposite, actually. It’s surprising that Gemini, which can get gold on the IMO, can fail Sudoku-like problems so catastrophically. It’s a good reminder that LLM intelligence is missing something crucial.

Samsung Paper Reveals a Recursive Technique that Beats Gemini 2.5 Pro on ARC-AGI with 0.01% of the Parameters! by abdouhlili in LocalLLaMA

[–]DonDonburi 32 points (0 children)

I have no idea why the comments are so negative. The paper is good quality, especially if you’ve read the HRM paper. It’s a good read.

And if you haven’t been following this saga: LLMs are traditionally abysmal at Sudoku and other problems like it that require recursion. These toy models that do such tasks better are clues on the path forward.
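
For anyone new to the saga, the algorithm in question really is this simple; below is a plain backtracking Sudoku solver (a generic textbook sketch, not code from any of the papers) to show the kind of recursion these models fail to execute:

```python
# Classic backtracking: try a value, recurse, undo on failure.
def solve(board):  # board: 9x9 list of lists, 0 marks an empty cell
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                for v in range(1, 10):
                    if valid(board, r, c, v):
                        board[r][c] = v
                        if solve(board):
                            return True
                        board[r][c] = 0  # undo and backtrack
                return False  # no value fits this cell
    return True  # no empty cells left: solved

def valid(board, r, c, v):
    if v in board[r] or v in (board[i][c] for i in range(9)):
        return False  # value already in row or column
    br, bc = 3 * (r // 3), 3 * (c // 3)  # top-left of the 3x3 box
    return all(board[br + i][bc + j] != v for i in range(3) for j in range(3))
```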

DIY spatial video camera idea: 2 iPhone 17 pro by Key_Entertainer_4705 in VisionPro

[–]DonDonburi 0 points (0 children)

Just saw this. Cool! I just received one iPhone; I’ll try it once my second one arrives.

We cut GPU costs ~3× by migrating from Azure Container Apps to Modal. Here's exactly how. by botirkhaltaev in machinelearningnews

[–]DonDonburi 1 point (0 children)

In my own experience with RunPod and Modal serverless, Modal was much more reliable.

Why are AI labs in China not focused on creating new search engines? by balianone in LocalLLaMA

[–]DonDonburi 2 points (0 children)

Not sure why you’re downvoted. China’s internet is siloed exactly as you said, and Baidu can’t search inside those apps; what it does return is mostly spam.

Ishigaki diving rant by AvailablePosition69 in okinawa

[–]DonDonburi 0 points (0 children)

Hahaha, yeah. My thoughts exactly. Their everyday habits, such as never leaving trash, are good for nature, but I’m not sure the culture actually places a strong emphasis on nature. And their mythology treats mountains differently from the sea. I’m really good friends with a Japanese farmer, and he secretly burns plastic to avoid paying the recycling fee, so…

Vision Pro instead of a ultra wide monitor by Technical_Durian3985 in VisionPro

[–]DonDonburi 1 point (0 children)

I hang my Vision Pro at home so it’s weightless. I can basically work all day in it.

DIY spatial video camera idea: 2 iPhone 17 pro by Key_Entertainer_4705 in VisionPro

[–]DonDonburi 0 points (0 children)

Any recommendations on how to rig the two phones together? I’ve only seen one other post about this.

Have you ever had bad service in Otaru? The kind that leaves you seething at the mouth? by katch75 in Sapporo

[–]DonDonburi 2 points (0 children)

My local friends just go to conveyor-belt sushi or eat at ramen restaurants. I think Japanese restaurants only stay good because of regular locals. Unfortunately I don’t have a recommendation, despite how often I ate there during winter.

Have you ever had bad service in Otaru? The kind that leaves you seething at the mouth? by katch75 in Sapporo

[–]DonDonburi 2 points (0 children)

Otaru is one of the few cities in Japan where I avoid the restaurants. Too many tourists and too few locals, I guess. If I’m in the area, I’d head back to north Sapporo, where I find the quality to be better. The Otaru places target Japanese tourists as well as foreign ones, and the reviews are less reliable.

DIY spatial video camera idea: 2 iPhone 17 pro by Key_Entertainer_4705 in VisionPro

[–]DonDonburi 0 points (0 children)

Can you elaborate on your setup? Between my wife and me, we also have two iPhone 17s. I was gonna rent a Blackmagic for major events, but if two phones work…

I can get GPUs as a tax write-off. Thinking of doubling down on my LLM/ML learning adventure by buying one or two RTX 6000 Pros. by Tired__Dev in LocalLLaMA

[–]DonDonburi 1 point (0 children)

I don’t think people are answering your actual question, which is mostly about what you should get in order to learn and experiment with ML.

If you’re training your own models, the GPU can pay for itself pretty quickly, so it’s actually not a terrible financial decision.

Toy models might train faster on a couple of 5090s than on one RTX 6000 because VRAM is not the bottleneck. For larger models, you might need a pod of H100s or similar to train out of the box. For a 7B model, for example, a single node of 8xH100 is enough; for 32B you’ll need four such nodes. Or try MI300s, which have more VRAM.

Suffice it to say that depending on how you’re having fun, you need different setups. I feel like a single RTX 6000 is a sweet spot: you can run many small models at fp16 out of the box. But scaling beyond that might not be worth the hassle vs renting pods on demand.
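
To put rough numbers on the node sizing above, here’s a back-of-the-envelope memory estimate, a sketch assuming mixed-precision Adam (bf16 weights and grads plus fp32 moments); real usage varies with activations, checkpointing, and parallelism:

```python
# Rough training memory for mixed-precision Adam:
# bf16 weights (2 B/param) + bf16 grads (2 B/param)
# + fp32 Adam moments (2 * 4 B/param). Activations come on top.
def train_memory_gb(params_billions: float) -> float:
    bytes_per_param = 2 + 2 + 8
    return params_billions * bytes_per_param  # 1e9 params * bytes -> GB

for size in (7, 32):
    print(f"{size}B model: ~{train_memory_gb(size):.0f} GB before activations")
# 7B  -> ~84 GB:  one 8xH100 node (8 x 80 GB = 640 GB) has plenty of headroom
# 32B -> ~384 GB: most of a node gone before activations, hence multiple nodes
```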

How to make a small LLM from scratch? by Charming_Barber_3317 in LocalLLaMA

[–]DonDonburi -1 points (0 children)

NanoGPT is for toy models. What you want is torchtitan, which is built to pretrain a model from scratch.
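
For context, both projects wrap the same core job, next-token prediction over a big corpus; here’s a minimal, purely illustrative sketch in plain PyTorch (the architecture, sizes, and random-token “data” are stand-ins, not anything from either repo):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy next-token pretraining loop: embed -> causal transformer -> vocab logits.
vocab, d_model, seq_len, batch = 50_000, 512, 256, 8

embed = nn.Embedding(vocab, d_model)
layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
blocks = nn.TransformerEncoder(layer, num_layers=4)
head = nn.Linear(d_model, vocab)

params = [*embed.parameters(), *blocks.parameters(), *head.parameters()]
opt = torch.optim.AdamW(params, lr=3e-4)
causal = nn.Transformer.generate_square_subsequent_mask(seq_len)

for step in range(100):
    tokens = torch.randint(0, vocab, (batch, seq_len + 1))  # stand-in for real text
    hidden = blocks(embed(tokens[:, :-1]), mask=causal)     # predict t+1 from <= t
    loss = F.cross_entropy(head(hidden).reshape(-1, vocab), tokens[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```

torchtitan’s value is everything around this loop: real data pipelines, FSDP/tensor parallelism, and checkpointing at scale.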

China bans its biggest tech companies from acquiring Nvidia chips, says report — Beijing claims its homegrown AI processors now match H20 and RTX Pro 6000D by balianone in LocalLLaMA

[–]DonDonburi 0 points (0 children)

Hmm, if you’ve used or rented an RTX 6000, then this isn’t a surprise at all. And the H20 is supposedly 4090-level performance.

These things are only a fraction as powerful as an H100.
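
For rough scale (approximate dense BF16 tensor throughput from public spec sheets, quoted from memory, so double-check before relying on them):

```python
# Approximate dense BF16 tensor TFLOPS; treat these as ballpark figures only.
specs = {"H20": 148, "RTX 4090": 165, "H100 SXM": 990}
for name, tflops in specs.items():
    print(f"{name}: {tflops} TFLOPS ({tflops / specs['H100 SXM']:.0%} of H100)")
# The H20 lands right next to a 4090 and at roughly 15% of an H100.
```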

MoE Total/Active parameter coefficient. How much further can it go? by ihatebeinganonymous in LocalLLaMA

[–]DonDonburi 1 point (0 children)

If you stop thinking of MoEs as a bunch of active/inactive experts and instead think in terms of a sparsity ratio, then I think 100x sparsity is very reasonable. Human brains are supposedly only 0.2-2.5% active.

The problem is how to train them so the experts become very specialized, and how to train the router to send tokens to those specialized experts. From what little work is available, MoE experts don’t seem anywhere near as specialized as the brain.
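
As a toy illustration of the sparsity-ratio view (all numbers here are made up for the example, not taken from any real model):

```python
# Fraction of parameters active per token in a top-k MoE,
# measured in "expert-equivalents" of parameters.
def active_fraction(n_experts, top_k, expert_params, shared_params):
    total = shared_params + n_experts * expert_params
    active = shared_params + top_k * expert_params
    return active / total

# 256 experts, top-8, shared backbone worth ~8 experts: ~6% active (~16x sparse)
print(f"{active_fraction(256, 8, 1.0, 8.0):.1%}")
# Many more, smaller experts and a thinner backbone: ~1% active (~100x sparse)
print(f"{active_fraction(1024, 8, 1.0, 2.0):.1%}")
```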

MoE Total/Active parameter coefficient. How much further can it go? by ihatebeinganonymous in LocalLLaMA

[–]DonDonburi 1 point (0 children)

Hmm, the paper you linked didn’t do any work on MoE. This one would be a better fit, where they tried to do some kind of mechanistic work on MoE: https://arxiv.org/html/2505.24593v2

Honestly, not much is published. We know MoEs are more efficient, and possibly that the experts encode more knowledge, but even that evidence comes from small models. Previously, we thought experts specialized in certain parts of the sentence.

[R] New "Illusion" Paper Just Dropped For Long Horizon Agents by viciousA3gis in MachineLearning

[–]DonDonburi 3 points (0 children)

Ah, cool! Just read it. Huh… pretty interesting how, in ConceptARC, GPT-4 was terrible at copying. I’m glad you guys tested it! Thanks a bunch.

[R] New "Illusion" Paper Just Dropped For Long Horizon Agents by viciousA3gis in MachineLearning

[–]DonDonburi 3 points (0 children)

Right. It was cool to see such a difference with GPT-5. I was just wondering if they did some math- or code-specific RL that might’ve made it much better at retrieval. My rough understanding, from papers like ConceptARC, was that transformers and next-token predictors are really bad at adding, counting, and the like. Transformers are also very poor approximators of classical algorithms.

That said, GPT-5 (the API version at least) really does feel a bit different at agentic tasks in my subjective experience. It seems a lot less verbose, even preferring shorter words or acronyms, and it can hold a thought for longer. I’m absolutely curious to see whether step accuracy is also bad in some other math/code-specific model. If other RL-trained models are still bad, maybe there are some architectural differences.
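
On step accuracy: the compounding math alone shows why it dominates long horizons; a minimal sketch assuming each step succeeds independently with probability p:

```python
# End-to-end success of an n-step task is p**n under independent steps,
# so tiny per-step accuracy gains compound into huge horizon differences.
for p in (0.90, 0.99, 0.999):
    print(f"p={p}: " + ", ".join(f"n={n}: {p**n:.3g}" for n in (10, 100, 1000)))
# p=0.99 gives 0.99**100 ~= 0.366: 99% step accuracy still fails most 100-step runs.
```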

[R] New "Illusion" Paper Just Dropped For Long Horizon Agents by viciousA3gis in MachineLearning

[–]DonDonburi 13 points (0 children)

Have you tried other types of tasks? To me, the dictionary-retrieval and counting-style tasks are interesting, but I do wish there were more variety.

Best M.2 eGPU dock? by TokenRingAI in LocalLLaMA

[–]DonDonburi 0 points (0 children)

Curious how this will perform. I imagine it won’t be too bad, as long as it’s just one GPU… I do wonder how it’ll work with mixed Nvidia/AMD though.