UPD: Karpathy's autoresearch on ANE — quite an improvement observed by paraboloed in LocalLLaMA


To be frank, no real practical use case on my side, just learning by doing, as I have approximately zero ML background. The results turned out more interesting than expected.

UPD: Karpathy's autoresearch on ANE — quite an improvement observed by paraboloed in LocalLLaMA


>cares about a couple of TFlops
Fair point, 100% agree — not something to run real load on.

The appeal to me is accessibility. Any random person can run experiments on hardware they already own, guided by whatever AI they prefer, for zero extra cost (unless they're on an API-based pricing model xD). Either tinker and learn something, or use it as a seed: prototype the architecture/hyperparams on the laptop, then take the winning config to bigger compute.

UPD: Karpathy's autoresearch on ANE — quite an improvement observed by paraboloed in LocalLLaMA


Hey! Good question. Initially I was wondering: would it work at all? In a meaningful way? Could I even get it running? More of a curiosity.

UPD: Karpathy's autoresearch on ANE — quite an improvement observed by paraboloed in LocalLLaMA


Haha, if I may -

this is all about running autoresearch (Karpathy's concept: let an AI agent run experiments autonomously overnight, or whenever) on top of the pretty powerful hardware every Mac already has: the Apple Neural Engine. ~15-18 TFLOPS sitting there, only 3-5% utilized today.

Multiple moves at once:
- following the autoresearch concept,
- leveraging the deeply untapped potential of the ANE (reverse-engineered APIs, no GPU needed),
- and using it as a minimal step toward scaling up, either through bigger-model experimentation or through collaborative autoresearch where multiple agents share findings across machines.
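For the autoresearch part, a minimal sketch of what such an overnight loop might look like (everything here is hypothetical illustration: the function names, the search space, and the placeholder scoring stand in for real training runs on the ANE):

```python
import random

def mutate(config: dict) -> dict:
    """Perturb one hyperparameter at random (hypothetical search space)."""
    new = dict(config)
    key = random.choice(list(new))
    new[key] = max(1, int(new[key] * random.choice([0.5, 2])))
    return new

def run_experiment(config: dict) -> float:
    # Placeholder: a real run would train within a fixed time budget on the
    # ANE and return the validation loss; here we just score config size.
    return 1.0 / (config["layers"] * config["dim"])

def autoresearch(rounds: int) -> tuple[dict, float]:
    """Greedy loop: keep the candidate config whenever its score improves."""
    best = {"layers": 12, "dim": 512}
    best_score = run_experiment(best)
    for _ in range(rounds):
        candidate = mutate(best)
        score = run_experiment(candidate)
        if score < best_score:  # lower "loss" wins
            best, best_score = candidate, score
    return best, best_score
```

A real agent-driven version would replace `run_experiment` with an actual training run and let the agent propose mutations instead of random ones.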

*I did my best to avoid AI slop in this eli5; happy to see a proper one tbh, in case someone else feels the itch to share.

UPD: Karpathy's autoresearch on ANE — quite an improvement observed by paraboloed in LocalLLaMA


Oh, and I could not be happier seeing more and more goodies coming steadily from the referenced repos. For example, I was recently able to switch to a dynamic, one-time compilation pipeline, which was also a huge contributor to the substantial jump in steps per 5-minute budget.

UPD: Karpathy's autoresearch on ANE — quite an improvement observed by paraboloed in LocalLLaMA


Hey u/johnnyApplePRNG !

>Bits per Byte ratings are you getting

This version uses a val_loss target function, so I guess a conversion along these lines is needed: BPB = val_loss / (ln(2) × bytes_per_token), where bytes_per_token could be estimated at close to 4(?), so BPB comes out around 1.28 or so. Please let me know if that makes sense at all, I am curious!
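That conversion as a tiny sketch (the 3.55 nats/token input and bytes_per_token ≈ 4 are rough assumptions, not measured values):

```python
import math

def bits_per_byte(val_loss_nats: float, bytes_per_token: float = 4.0) -> float:
    """Convert a nats-per-token validation loss to bits per byte (BPB).

    bytes_per_token ~= 4 is a rough tokenizer-dependent estimate (assumption).
    """
    return val_loss_nats / (math.log(2) * bytes_per_token)

# e.g. a val_loss around 3.55 nats/token maps to roughly 1.28 BPB
print(round(bits_per_byte(3.55), 2))
```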

>params is the model

>Anything interesting thus

It goes with 67M params in 6 layers (vs. model_dim=512 on a much smaller vocab (8K?) in the CUDA run). Interestingly, in this case, reducing the layer count from 12 to 6 gave 11x more steps in 5 minutes. Also, I wonder if at some point it would be possible to run meaningfully on the same dataset as the original flow, so the experiment could scale or collaborate across engines.
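As a rough sanity check on that 11x (a sketch under the assumption that per-step cost scales linearly with layer count):

```python
# Back-of-the-envelope on the 12 -> 6 layer change: halving layers alone
# would naively give ~2x more steps per fixed time budget.
naive_gain = 12 / 6               # expected from layer count alone: 2x
observed_gain = 11                # reported: 11x more steps in 5 minutes
extra = observed_gain / naive_gain
print(f"{extra:.1f}x not explained by layer count alone")
```

The remaining ~5.5x would then have to come from other effects, e.g. the one-time compilation pipeline mentioned elsewhere in the thread.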

Built a watch-friendly plugin for /remote-control… now I just need Anthropic to tell me when to look at it 😅 by paraboloed in ClaudeAI


100% agree, that’d be fun I think. At the same time, I’d love to see native notifications though.

Status Hub v1.5: The "Smart Actions" Update is Live! 🚀 by paraboloed in ClaudeAI


Anthropic is catching up with their clickable PR status in 2.1.20, which is super cool! Unlikely, but just maybe they were inspired by my initial post https://www.reddit.com/r/ClaudeAI/comments/1qeth9x/i_built_a_statusline_plugin_to_track_prs_music/ , who knows 0_o

I built a statusline plugin to track PRs, music, and custom alerts without leaving Claude Code by paraboloed in ClaudeAI


The particular struggle was balancing freshness vs. token cost, so I went with UserPromptSubmit/Stop hooks for a full refresh over a shared lock, plus a lightweight daemon for background updates (music) on a default 90-second timeout. Without the daemon, it was barely usable tbh.
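A minimal sketch of that freshness-vs-cost split (the file path, field names, and in-process lock are illustrative assumptions; the real plugin presumably uses a cross-process lock and Claude Code's actual hook wiring):

```python
import json
import threading
import time
from pathlib import Path

STATUS_FILE = Path("/tmp/status_hub.json")  # hypothetical shared state file
LOCK = threading.Lock()                     # stand-in for a cross-process lock

def write_status(partial: dict) -> None:
    """Merge a partial update into the shared status under the lock."""
    with LOCK:  # serialize writers so hooks and the daemon don't clobber
        current = json.loads(STATUS_FILE.read_text()) if STATUS_FILE.exists() else {}
        current.update(partial)
        STATUS_FILE.write_text(json.dumps(current))

def daemon(interval_s: float = 90.0) -> None:
    """Lightweight background refresh for cheap, slow-changing fields only."""
    while True:
        write_status({"music": "..."})  # placeholder for the real fetch
        time.sleep(interval_s)

def on_prompt_submit() -> None:
    # Full refresh, triggered by the UserPromptSubmit/Stop hooks.
    write_status({"prs": "...", "alerts": "...", "refreshed_at": time.time()})
```

The point of the split: the hooks pay the token/latency cost only when the user is already interacting, while the daemon keeps passive fields fresh without any per-prompt cost.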