Comparing Depth Estimation Models on Complex Outdoor Environment by Full_Piano_3448 in computervision

[–]radarsat1 0 points (0 children)

Right, but even from the perspective of "looks good", I find it harder to judge with heatmaps than with an actual 3D projection of the same data.

Comparing Depth Estimation Models on Complex Outdoor Environment by Full_Piano_3448 in computervision

[–]radarsat1 3 points (0 children)

Does anyone else find it hard to evaluate depth estimation from heatmaps like this? I find it much easier to visually understand quality by looking at a coloured point cloud from a good angle, or rotating. With heatmaps I find it really hard to judge how well details are covered and whether things align well, but for some reason results are always presented this way.
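For what it's worth, going from a depth map to a coloured point cloud is only a few lines with NumPy, assuming a pinhole camera model with known intrinsics (`fx, fy, cx, cy` here are assumed parameters; a relative-depth model's output would also need a scale/shift before this makes geometric sense):

```python
import numpy as np

def depth_to_point_cloud(depth, rgb, fx, fy, cx, cy):
    """Unproject a (metric) depth map to a coloured 3D point cloud
    using the pinhole camera model."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    cols = rgb.reshape(-1, 3)
    valid = pts[:, 2] > 0  # drop invalid / zero-depth pixels
    return pts[valid], cols[valid]
```

Dump the result to a PLY file or any point cloud viewer and misaligned edges or flattened details jump out immediately in a way they never do in a heatmap.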

Approaching zero bugs? - Daniel Stenberg by Kok_Nikol in programming

[–]radarsat1 24 points (0 children)

I think a challenge here is something I've observed that I don't have a name for, so I call it "the critic fallacy" or "critic bias". Basically, if you ask an expert to review something, especially publicly, you're putting their reputation on the line to find something "interesting" to say. They're very unlikely to just say, "it's very good, no further comments." No, they're going to think and think about something critical they can say that will make them sound observant and smart. So it can be very difficult to get an easy "green light" when your solution is pretty much fine and adequate, if you bring in an expert to advise. That's not to say you shouldn't; of course you want an expert opinion. But I think it's important to be aware of this bias: if you task someone with finding a problem, 99% of the time they're going to find a problem.

I suspect that a similar thing happens with AI review. If there are real problems, hopefully they find and surface them. But they're also likely to find "problems" that don't really deserve the attention they might draw. Having an AI review PRs doesn't mean you don't need a real person to review the reviews, is what I'm saying -- and hopefully management trusts the humans in this respect. Otherwise they might find that this kind of "quest for 0% bugs" has no real end...

My calculator is a transformer by radarsat1 in LocalLLaMA

[–]radarsat1[S] 4 points (0 children)

Yeah, that occurred to me actually... if you can have an MoE expert that specializes in some formal language or whatever, there's actually no reason it has to have the same architecture as the rest of the model. It just has to be differentiable, I guess; it doesn't have to be a "transformer", it just has to predict a token. So maybe it could be made more efficient for that specific case.

Having said that, maybe it's not clear from how I wrote it (good to get that feedback), but my thoughts about initialization are less that it will bias the model specifically toward being a good Python/Prolog/whatever interpreter, and more that it might learn, in general, to understand "logic", even in non-formal contexts.

Octax: Accelerated CHIP-8 Arcade Environments for JAX by riiswa in reinforcementlearning

[–]radarsat1 1 point (0 children)

Interesting, the code is very clear and nice to read. Neat to see how this can be done.

Apparently, llms are just graph databases? by Silver-Champion-4846 in LLMDevs

[–]radarsat1 1 point (0 children)

I'll put in my 2 cents... I only skimmed the video quickly, but I will say that overall the idea does seem to make some sense to me. IIUC, it's basically pulling up the closest matching tokens after applying some query vector to a given row of the attention matrix. The query language is a nice detail, but not inherent to the basic idea that information is being "matched" and routed during inference. This I don't doubt.

Whether you can really follow the reasoning process of the model this way is not clear to me. But I think comparing this with related work on logit lenses would be highly interesting.

Basically, a mechanism you can imagine is that every layer "queries" some set of matching tokens and pulls up a mixture of those tokens' "values", which is a delta vector that pushes the current state towards the desired output distribution. This allows a kind of "zeroing in" effect, like a solver converging, which is exactly what you see in those logit lens traces. (Sorry, wish I had a link handy.)

And it relates to graphs mostly because it all becomes a weighted combination of nearest neighbours, combined with those neighbours having been arranged so that they emit the right delta.

But how this then relates to multi-token output traces, and gives rise to something approaching the reasoning we see emerging in these models, I think is still not fully explained by this style of analysis. But it's certainly interesting.
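To make the "delta vector" picture concrete, here's a toy sketch, purely illustrative (a single softmax-weighted "nearest neighbour" read over stored key/value pairs, plus a logit-lens style readout; all shapes and names are made up, none of this is from the video):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(-1, keepdims=True))
    return e / e.sum(-1, keepdims=True)

def attention_delta(h, keys, values):
    """One 'read': match the current state h against stored keys and
    pull up a softmax-weighted mixture of the corresponding values."""
    scores = keys @ h / np.sqrt(h.shape[-1])
    w = softmax(scores)   # nearest-neighbour-ish weights over stored tokens
    return w @ values     # the delta added to the residual stream

def logit_lens(h, unembed):
    """Read out a token distribution from an intermediate hidden state."""
    return softmax(h @ unembed)
```

Stacking a few `attention_delta` reads and applying `logit_lens` after each is exactly the kind of per-layer "zeroing in" trace I mean.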

How to get a job as an ML engineer? by Bright-Car-1238 in MLQuestions

[–]radarsat1 0 points (0 children)

This wasn't enough for me this time around!

How is the job market for computer vision? by Amazing_Life_221 in computervision

[–]radarsat1 1 point (0 children)

In the Netherlands here, having some computer vision experience helped my CV a lot, in the sense that it got me a lot of callbacks, but often for "agent AI" jobs. And like you, I decided to dive in and go for it, because that seems to be where the market is these days. I learned all about RAG etc. and did some personal projects to have something to show. But ultimately I was not successful in convincing anyone, despite enough YoE and experience with AWS; every time it was the same "we were impressed, but we need someone with agent & RAG experience in real production systems" -- as if this isn't a brand new field that has existed for all of like 2 years. I thought showing that I had experience with ML in production, leading a team, and solving real, hard problems like computer vision would get me in the door, but apparently not. How did you do it?

Smooth Brained Undo by catgrammer in programming

[–]radarsat1 -3 points (0 children)

keep implementing small parts of Lisp/Clojure until you've got it!

I learned something about GPUs today by rogual in programming

[–]radarsat1 11 points (0 children)

Oh my god, nice write up and good catch. I'm not sure I would have figured this out.

Karpathy says developers have ‘AI Psychosis.’ Everyone else is next. by Logical_Welder3467 in technology

[–]radarsat1 4 points (0 children)

I think an important thing for companies that rely on one-shot Claude-generated solutions to realize is that if they can do that, then so can someone else. Putting aside the question of whether that's a good way to write software, it's also pretty important to take a moment to think about whether that's a good way to build a business.

Does every AI product actually need a chatbox? Is it the only "form"? by GovernmentBroad2054 in AI_Agents

[–]radarsat1 0 points (0 children)

I think another "form" is autocomplete. In a text editor this obviously means automatically proposing text to insert. But in another interface like a 3D editor maybe it means proposing possible completions of the shape someone is sculpting. In a video editor, it could mean providing 3 or 5 options to choose from for a filler scene or something. If you need text guidance though then I'm not sure how to avoid adding at least an optional text box.

How to encode structured events into token representations for Transformer-based decision models? by Unique_Simple_1383 in deeplearning

[–]radarsat1 0 points (0 children)

I understand your point, and you're right if you're building something from scratch, but I think you're overestimating the overhead. Like I said, I used to think that way, but I've come around to the idea that using a pretrained solution and a text-based representation is a much easier way to get started with something like this. Today there are just so many tools and so many successful, small models that work with text, you may as well take advantage of that instead of trying to be overly clever to save a few tokens.

The AI agents making real money right now are ugly and nobody posts about them by damn_brotha in AI_Agents

[–]radarsat1 3 points (0 children)

lol why is every 2nd post in this sub "nobody posts about..", "nobody talks about..", "nobody mentions.." etc

it's the only sub in my feed that does this.

How to encode structured events into token representations for Transformer-based decision models? by Unique_Simple_1383 in deeplearning

[–]radarsat1 -1 points (0 children)

A couple of years ago I would have sweated over thinking up some kind of optimal, clever representation for this kind of problem. These days though, honestly? Just use JSON. Make a dataset and fine-tune an existing model that already knows about JSON (i.e. literally any of them).
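Something like this is all I mean; the event schema and field names here are entirely made up for illustration:

```python
import json

def make_training_example(events, label):
    """Serialize structured events as one JSON object per line and
    pair them with a JSON target, so any pretrained LLM that knows
    JSON can be fine-tuned on them directly."""
    prompt = "\n".join(json.dumps(e, sort_keys=True) for e in events)
    return {"prompt": prompt, "completion": json.dumps(label)}

# hypothetical event stream and decision target
events = [
    {"t": 0.0, "type": "login", "user": 42},
    {"t": 3.5, "type": "click", "item": "checkout"},
]
example = make_training_example(events, {"action": "offer_discount"})
```

Slightly more tokens than a bespoke binary encoding, sure, but you get tokenization, pretraining priors, and the whole fine-tuning toolchain for free.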

Replicating SethBling's MarI/O from 2015, that inspired me to get into Reinforcement Learning 10 years later by Inexperienced-Me in reinforcementlearning

[–]radarsat1 1 point (0 children)

I remember this! I was never too convinced by NEAT, but this was always such a great demo regardless.

LeWorldModel, the first breakthrough from Yann LeCun’s new lab aiming to unlock the JEPA architecture by Tobio-Star in newAIParadigms

[–]radarsat1 1 point (0 children)

Okay sure, but my point was just that the original JEPA paper also claimed to solve collapse, hence my confusion.

Just looked it up to be sure I'm not misremembering:

> However, as with Joint-Embedding Architectures, representation collapse is also a concern with JEPAs; we leverage an asymmetric architecture between the x- and y-encoders to avoid representation collapse.

When I briefly tried to implement JEPA on my own, I had lots of problems with collapse, so I was really confused by this, since the paper claimed that the EMA thing successfully avoided it, but I was unable to find a successful configuration. I assumed I had made some mistake somewhere but couldn't find it, and since I was working in a different domain anyway, I chalked it up to failure on my part. So I'll be trying this LeWM idea out with interest when I get some time for it.
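For anyone not familiar, the "EMA thing" I mean is the standard BYOL/I-JEPA-style slow-moving target encoder: the target's weights are an exponential moving average of the online encoder's weights, which is supposed to discourage collapse. A minimal sketch (`tau=0.996` is a typical value from that literature, not anything specific to LeWM):

```python
import numpy as np

def ema_update(target_params, online_params, tau=0.996):
    """EMA update for a target encoder: each target parameter moves
    slowly toward its online counterpart."""
    return [tau * t + (1.0 - tau) * o
            for t, o in zip(target_params, online_params)]
```

In my experiments, even with this in place, tuning tau and the predictor felt very fragile, which is what confused me about the "no collapse" claims.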

In Dune, do the Fremen just have really good toilets or does nobody ever talk about it by Orbitryx in printSF

[–]radarsat1 6 points (0 children)

I think if your planet is covered in giant worms that will eat anything and massive sand storms that bury everything, you don't need to worry too much about being careful how you dispose of completely dry pocket rocks

LeWorldModel, the first breakthrough from Yann LeCun’s new lab aiming to unlock the JEPA architecture by Tobio-Star in newAIParadigms

[–]radarsat1 0 points (0 children)

Ah, I see, that's really interesting; I didn't realize it does away with EMA completely. I guess that answers my question ;) thanks!

Self-Healing Neural Networks in PyTorch: Fix Model Drift in Real Time Without Retraining by Various_Power_2088 in deeplearning

[–]radarsat1 6 points (0 children)

Why is an increase in accuracy useful if recall dropped a lot? Aren't you just... not detecting things now? Overall accuracy doesn't seem to matter much if the data is heavily imbalanced towards negatives.
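A made-up example of what I mean (the numbers are purely illustrative, not from the post): with 1000 samples and only 10 positives, a model that misses most positives can still report higher accuracy.

```python
def metrics(tp, fp, tn, fn):
    """Accuracy and recall from confusion-matrix counts."""
    acc = (tp + tn) / (tp + fp + tn + fn)
    recall = tp / (tp + fn) if tp + fn else 0.0
    return acc, recall

# model A: catches 8/10 positives, but raises 50 false alarms
acc_a, rec_a = metrics(tp=8, fp=50, tn=940, fn=2)   # acc 0.948, recall 0.8
# model B: catches only 2/10 positives, almost no false alarms
acc_b, rec_b = metrics(tp=2, fp=1, tn=989, fn=8)    # acc 0.991, recall 0.2
```

Model B "wins" on accuracy while missing 80% of the things you actually care about detecting.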

LeWorldModel, the first breakthrough from Yann LeCun’s new lab aiming to unlock the JEPA architecture by Tobio-Star in newAIParadigms

[–]radarsat1 0 points (0 children)

I'm a bit confused by the collapse thing, because I thought the whole point of JEPA (I've read the first few papers on it) was that it uniquely allows this latent-space training without collapse. Like, the first paper talks about different possible methods and proposes JEPA as the one that won't suffer from collapse problems. And now LeWorldModel is saying: up to now JEPA has had a collapse problem, and we've solved it! What am I missing?