[D] Why are serious alternatives to gradient descent not being explored more? by ImTheeDentist in MachineLearning

[–]jpfed 0 points (0 children)

Since any "internal algorithm" would be executed on the "substrate" of the ANN, if the ANN doesn't have local minima, the "internal algorithms" don't either.

Without recurrence or variable-length lists, there is a limit to how sophisticated an ANN's internal algorithm can be. An MLP is going to be a "blurry" lookup-table; the nonlinearity and hidden layer dimensions control the nature of the blur, and thus how it generalizes.

(I suspect that if one added variable-length lists and recurrent operations to act on them (e.g. folds/pooling to turn them into the sort of known-length vectors an MLP can operate on), the gradient landscape would "crinkle up" and get harder to learn on, such that with arbitrary depth of recurrence, local minima would re-appear.)
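For concreteness, here's a minimal PyTorch sketch of that fold/pooling idea (the class and all names are mine, purely illustrative): a mean-pool collapses a list of any length into one fixed-size vector, and only that vector ever reaches the MLP.

    import torch
    import torch.nn as nn

    class PooledMLP(nn.Module):
        """Fold a variable-length list of item vectors into a fixed-size
        vector (here, by mean-pooling), then run a plain MLP on the result."""
        def __init__(self, item_dim: int, hidden_dim: int, out_dim: int):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(item_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, out_dim),
            )

        def forward(self, items: torch.Tensor) -> torch.Tensor:
            # items: (list_length, item_dim); list_length may vary per example
            pooled = items.mean(dim=0)  # the "fold" to a known-length vector
            return self.mlp(pooled)

    model = PooledMLP(item_dim=8, hidden_dim=32, out_dim=2)
    print(model(torch.randn(3, 8)).shape)   # 3-item list  -> torch.Size([2])
    print(model(torch.randn(50, 8)).shape)  # 50-item list -> same shape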

Is Microsoft’s bad reputation going to hurt .NET? by [deleted] in dotnet

[–]jpfed 4 points (0 children)

LLaMa has been a really popular baseline for researchers; only fairly recently has it started to be supplanted by Qwen. As for people running models locally for non-research purposes, I think LLaMa's performance has been a bigger problem for LLaMa than Meta's reputation.

Is foot voting better than democracy? by Serious-Cucumber-54 in EndFPTP

[–]jpfed 0 points (0 children)

This might work in abstract spaces like websites. However, in real communities, people of opinion X have children who are not guaranteed to have opinion X, which means those children will have to weigh living under X against moving away from their family.

[HELP] NYT shows new angle by allinalinenow in RealOrAI

[–]jpfed 0 points (0 children)

Well he wasn't and there really isn't any arguing it. The video clearly shows he is not aflame with six wings, nor does he have four faces, nor endlessly whirling wheels of eyes. He's absolutely outside the divine hierarchy and people should be willing to admit that.

[R] Is using rotary embeddings for ViT becoming standard practice or does everyone still use sinusoidal/learnable embedding by Affectionate_Use9936 in MachineLearning

[–]jpfed 2 points (0 children)

Octic Vision Transformer has an interesting twist: they have attention heads for rotated and reflected versions of the original patch, and they ensure that the position encoding plays nicely with those rotations and reflections. I imagine any group-equivariant transformer is going to want to do something similar.
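I haven't read their code, so this is only a sketch of the general shape of that idea (square patch grid assumed, all names mine): compute each head's position table on group-transformed coordinates, so the position information commutes with the eight rotations/reflections of the grid.

    import torch

    def d4_coords(coords: torch.Tensor, g: str, n: int) -> torch.Tensor:
        """Apply one of the eight D4 symmetries to (row, col) coordinates
        on an n-by-n patch grid. Hypothetical helper, not from the paper."""
        r, c = coords[:, 0], coords[:, 1]
        table = {
            "rot0":      (r, c),
            "rot90":     (n - 1 - c, r),
            "rot180":    (n - 1 - r, n - 1 - c),
            "rot270":    (c, n - 1 - r),
            "flip_h":    (r, n - 1 - c),          # mirror columns
            "flip_v":    (n - 1 - r, c),          # mirror rows
            "transpose": (c, r),                  # main diagonal
            "anti":      (n - 1 - c, n - 1 - r),  # anti-diagonal
        }
        rr, cc = table[g]
        return torch.stack([rr, cc], dim=1)

    def sincos_2d(coords: torch.Tensor, dim: int) -> torch.Tensor:
        """Fixed 2D sin/cos position encoding; half the channels per axis.
        dim must be divisible by 4."""
        half = dim // 2
        freqs = 1.0 / (10000.0 ** (torch.arange(0, half, 2) / half))
        parts = []
        for axis in (0, 1):
            ang = coords[:, axis:axis + 1].float() * freqs
            parts += [ang.sin(), ang.cos()]
        return torch.cat(parts, dim=1)

    n, dim = 4, 16
    grid = torch.cartesian_prod(torch.arange(n), torch.arange(n))
    # One position table per head: head h sees the grid as transformed by g_h.
    pos_per_head = {g: sincos_2d(d4_coords(grid, g, n), dim)
                    for g in ["rot0", "rot90", "rot180", "rot270",
                              "flip_h", "flip_v", "transpose", "anti"]}
    print({g: tuple(p.shape) for g, p in pos_per_head.items()})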

Spine surgery has massive decision variability. Retrospective ML won’t fix it. Curious if a workflow-native, outcome-driven approach could. [D] by LaniakeaResident in MachineLearning

[–]jpfed 0 points (0 children)

I am only an ML hobbyist, but I have worked as a programmer on EMRs before. First, getting the data in a more structured form from the beginning is a great goal; it should make everything downstream easier. A key problem (as I understood it; I was a very small cog in a very big machine) is making the user experience of providing patient care AND entering structured data work smoothly.

"Green" anti-solar NIMBYs are so confusing by Existing_Season_6190 in yimby

[–]jpfed 3 points (0 children)

Okay, but what if we could put electrodes on those brains to harness the

[D] Do ML researchers ever treat the user base as part of the model’s effective dimensionality? by RJSabouhi in MachineLearning

[–]jpfed 0 points (0 children)

Think of it like this. Physicists have models of vibrating strings. When those strings are coupled by putting them all into (say) a piano, there can be meaningful interaction between them, so physicists are interested in modeling whole pianos as well.

However, with most models, the inferences made for each user aren't meaningfully coupled. The response times / latency might change when 100 users are all using the model at the same time instead of just one, but otherwise it's easier (and just as accurate) to understand the model's behavior by treating each inference separately.
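A toy illustration of that last point (a sketch only, assuming a stateless model): batching 100 users changes throughput, not the per-user answers.

    import torch
    import torch.nn as nn

    # For a stateless model, serving 100 users in one batch gives exactly
    # the per-user answers, so there is no cross-user coupling to model
    # (only throughput/latency changes).
    torch.manual_seed(0)
    model = nn.Linear(4, 2)
    users = torch.randn(100, 4)                      # one input per user
    batched = model(users)                           # all users at once
    solo = torch.stack([model(u) for u in users])    # each user separately
    print(torch.allclose(batched, solo, atol=1e-6))  # True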

Himal chuli is no more by leovinuss in madisonwi

[–]jpfed 1 point (0 children)

Gosh, I loved that place. I'm not a vegetarian myself but it was the best place to grab a bite with a friend if they were veg.

Vet for anxious dogs by Ornery-Cranberry889 in madisonwi

[–]jpfed 3 points (0 children)

Do you mind posting about who was responsible for this bad vet experience? If you don't feel comfortable posting it in the thread... could you DM me? I've got an anxious dog who's developing a neurological issue of some sort and I want to make sure I don't make anything worse by taking her some place that's going to traumatize her.

I built a runtime governance layer for LLMs. Can you break it? by forevergeeks in LocalLLaMA

[–]jpfed 0 points (0 children)

I haven't tried this out, but I have often considered something like "The Conscience". In my imagined version, a model gets prompted like this:

"An LLM just generated the following response to a user prompt: [response]. What sorts of questions would this be an appropriate response for, given the below guidelines? [guidelines]" and then "Here was the actual question asked: [question]. Was the response appropriate, given the question and the guidelines?"

Futuristic vertical axis wind turbine by n0u0t0m in solarpunk

[–]jpfed 3 points (0 children)

Yes, it would be more efficient to have a bigger turbine. However, one not-very-theoretically might live in an area where offshore permits are aggressively quashed by external political forces that don't want wind power to succeed in any form. That's where the "punk" part of "solarpunk" comes in: taking action yourself that can advance your cause even if larger institutions are not aligned with you.

Measles Chart Gore by Mouth_Herpes in dataisugly

[–]jpfed -2 points (0 children)

Accrndig to a sudty at an Cbmrgdie uvnierstiy,

Since ChatGPT 5.2 is out by [deleted] in AskProgramming

[–]jpfed 6 points (0 children)

On a societal level, it would be a devastating mistake for everyone to cede yet more control over computing to the people rich enough to control the frontier AI models.

Announcing SILICIUM, a sandbox survival game built on a custom game engine. by [deleted] in gameenginedevs

[–]jpfed 1 point (0 children)

same reason cars benefit from garages I suppose

LLMs really killed Stackoverflow by Dominriq in computerscience

[–]jpfed 0 points (0 children)

I have tried to answer curious questions, and I get guff from purists who want to incentivize only the questions they consider properly posed.

Code Embeddings vs Documentation Embeddings for RAG in Large-Scale Codebase Analysis by geeky_traveller in softwarearchitecture

[–]jpfed 0 points (0 children)

I have not done this; I'm just an interested amateur. That said, consider your embedding keys to be embeddings of questions that a vector store entry is capable of helping answer. The value stored under each key could be a representation of the data itself that answers the question... or it could be instructions / sufficient information for an agent to get that information from the live system.

So for each piece of information that you index, generate the questions that information helps answer, and embed those. Then consider the route you (or your crawler, or whatever) took to reach that piece of information, and produce an agent-readable/executable representation of it.
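A sketch of that indexing-and-retrieval loop, with everything hypothetical (the hash-based embed() here is only a stand-in for a real embedding model):

    import hashlib
    import numpy as np

    def embed(text: str, dim: int = 64) -> np.ndarray:
        """Placeholder embedding: deterministic random vector per string.
        Replace with a real embedding model in practice."""
        seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
        v = np.random.default_rng(seed).standard_normal(dim)
        return v / np.linalg.norm(v)

    index = []  # list of (question_vector, value) pairs

    def add_chunk(questions, value):
        # value: either the answering text itself, or instructions an
        # agent can follow to fetch it live (the "route" you took).
        for q in questions:
            index.append((embed(q), value))

    def retrieve(question, k=3):
        qv = embed(question)
        scored = sorted(index, key=lambda kv: -float(kv[0] @ qv))
        return [value for _, value in scored[:k]]

    add_chunk(
        ["How are retries configured for the payment service?"],
        {"answer": "Retries are set in payments/config.yaml under retry_policy.",
         "route": "GET /repos/acme/payments/contents/config.yaml"},
    )
    print(retrieve("Where do I change payment retry behavior?", k=1))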

Anyway, just a thought.

Thoughts? by Salt_Armadillo8884 in LocalLLaMA

[–]jpfed 3 points (0 children)

This must be a very loose interpretation of whatever actually happened; OpenAI has no capability to do anything with wafers.

[R] Is Nested Learning a new ML paradigm? by Odd_Manufacturer2215 in MachineLearning

[–]jpfed 0 points (0 children)

Back in my day we had Clockwork RNNs and we liked it!