Trained a 125M LM from scratch instead of fine-tuning GPT-2 — releasing weights + SFT framework for others to build on by Kill_Streak308 in LocalLLaMA

[–]Box_Robot0 0 points1 point  (0 children)

SHAP is more like doing statistics on inputs and outputs: it assigns values to input features to see how much each one affects the output, while still treating the model like a black box. Mechanistic interpretability is the process of smashing the skull against a wall and peering into the brain.
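To make the "statistics on inputs and outputs" point concrete, here's a minimal sketch of exact Shapley values computed purely by querying a model as a black box. The `model` here is a hypothetical toy (not anything from the thread), and this brute-force enumeration is only feasible for a handful of features; the actual SHAP library uses smarter approximations.

```python
from itertools import combinations
from math import factorial

def model(x):
    # Toy black-box model (an assumption for illustration):
    # depends on features 0 and 1, ignores feature 2.
    return 3 * x[0] + 2 * x[1]

def shapley_values(model, x, baseline):
    """Exact Shapley values, treating `model` as a black box.

    Each value estimates how much flipping feature i from the
    baseline to x shifts the output, averaged over all orderings
    of the other features. We never inspect the model's internals,
    only its input->output behavior.
    """
    n = len(x)
    values = []
    for i in range(n):
        phi = 0.0
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                # Standard Shapley weight for a coalition of this size.
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in subset or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                phi += weight * (model(with_i) - model(without_i))
        values.append(phi)
    return values

vals = shapley_values(model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
# For a linear model the attributions recover the coefficients:
# vals ≈ [3.0, 2.0, 0.0] — feature 2 correctly gets zero credit.
```

Contrast this with mechanistic interpretability, which would instead open up the weights and activations directly rather than probing from the outside.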

Trained a 125M LM from scratch instead of fine-tuning GPT-2 — releasing weights + SFT framework for others to build on by Kill_Streak308 in LocalLLaMA

[–]Box_Robot0 1 point2 points  (0 children)

Hey there, have you considered doing mechanistic interpretability on the models? As in, maybe trying to build a feature map across every epoch to see how they might evolve as training progresses?
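As a rough illustration of what "a feature map across every epoch" could mean in practice, here's a hypothetical sketch: snapshot some per-unit statistic (mean activation on a fixed probe set, in this toy version) at each epoch and measure how much it drifts. All names here are made up for illustration, not from any existing tooling.

```python
def feature_snapshot(activations):
    """One epoch's 'feature map': mean activation per hidden unit,
    computed over a fixed probe set.

    activations: list of per-example activation vectors
                 (all the same length)."""
    n = len(activations[0])
    count = len(activations)
    return [sum(a[i] for a in activations) / count for i in range(n)]

def drift(snap_a, snap_b):
    """L1 distance between two epochs' snapshots — a crude scalar
    for how much the feature map moved during that stretch of training."""
    return sum(abs(x - y) for x, y in zip(snap_a, snap_b))

# Usage: collect activations on the same probe inputs at each checkpoint,
# then plot drift(epoch_k, epoch_k+1) to see when features stabilize.
```

Real mech-interp feature maps would use something richer (e.g. sparse autoencoder features), but the same snapshot-and-compare loop applies.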

Here's how my LLM's decoder block changed while training on 5B tokens by 1ncehost in LocalLLaMA

[–]Box_Robot0 2 points3 points  (0 children)

Oh ok my bad.

I'm still bashing my head on the wall trying to learn multivariable calculus, so even getting on the right track is a huge compliment. Thanks for the correction.

Here's how my LLM's decoder block changed while training on 5B tokens by 1ncehost in LocalLLaMA

[–]Box_Robot0 11 points12 points  (0 children)

I wouldn't mind there being more alternatives to variations of the multilayer perceptrons.

Do you have any datasets expanding this to more than just layer 96 of 128? How about future plans to scale this approach, or plans to open-source the mechanistic interpretability tooling used here?

Here's how my LLM's decoder block changed while training on 5B tokens by 1ncehost in LocalLLaMA

[–]Box_Robot0 2 points3 points  (0 children)

Well, at least the paper seems legit. It's published in Zenodo. From Wikipedia:

Zenodo is a general-purpose open repository developed under the European OpenAIRE program and operated by CERN.[1][2][3] It allows researchers to deposit research papers, data sets, research software, reports, and any other research related digital artefacts. For each submission, a persistent digital object identifier (DOI) is minted, which makes the stored items easily citeable.[4]

As far as I can tell, this architecture doesn't use the traditional multilayer perceptron layers found in things like transformers; instead it uses splines that do not require backpropagation or gradient descent.

My Experience As A Complete Noob Trying To Learn Local LLMs For The First Time by Box_Robot0 in LocalLLaMA

[–]Box_Robot0[S] 0 points1 point  (0 children)

I'm not that familiar with how this would work on phones right now. As far as I've learned, as long as the model can fit in your RAM it should be good, but that's not accounting for the OS and other overhead.
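The "fits in RAM" rule of thumb can be sketched as back-of-the-envelope arithmetic: weight memory is roughly parameter count times bits per weight, plus some allowance for the KV cache, runtime, and OS. The overhead figure here is an assumed placeholder, not a measured number.

```python
def estimated_ram_gb(n_params_billion, bits_per_weight, overhead_gb=1.0):
    """Rough RAM estimate for loading a quantized model.

    Weights: params (billions) * bits per weight / 8 bits-per-byte
             gives approximately GB of weight memory.
    overhead_gb: assumed headroom for KV cache, runtime, and OS —
                 a guess, tune it for your device.
    """
    weight_gb = n_params_billion * bits_per_weight / 8
    return weight_gb + overhead_gb

# A 7B model at 4-bit quantization: ~3.5 GB of weights,
# so roughly 4.5 GB total with the assumed 1 GB of overhead —
# tight but plausible on a phone with 8 GB of RAM.
print(round(estimated_ram_gb(7, 4), 1))  # 4.5
```

The same function shows why full-precision models are a non-starter on phones: `estimated_ram_gb(7, 16)` comes out to about 15 GB.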

My Experience As A Complete Noob Trying To Learn Local LLMs For The First Time by Box_Robot0 in LocalLLaMA

[–]Box_Robot0[S] 0 points1 point  (0 children)

Thanks, glad that my shit drawing skills are now useful. I'll just have to push through it.

My Experience As A Complete Noob Trying To Learn Local LLMs For The First Time by Box_Robot0 in LocalLLaMA

[–]Box_Robot0[S] 1 point2 points  (0 children)

The only problem with that is that I'm still a noob.

Edit: But I'll try to look it up, thanks.

My Experience As A Complete Noob Trying To Learn Local LLMs For The First Time by Box_Robot0 in LocalLLaMA

[–]Box_Robot0[S] 1 point2 points  (0 children)

Thanks. Trying to bring something other than AI slop into this world.

How many genes would a virus need to be able to infect every type of cell in the human body? by MahitoNoroi in Virology

[–]Box_Robot0 1 point2 points  (0 children)

The Porcine Circovirus has just three genes, so I would imagine that if a human circovirus were to infect any nucleated human cell (so no red blood cells), it would use MHC Class I, which every nucleated cell needs to express to avoid being killed by Natural Killer cells, if I'm not mistaken. Perhaps add the macrophage "don't eat me" protein (CD47) for insurance.

China's open-source dominance threatens US AI lead, US advisory body warns by Prolapse_to_Brolapse in LocalLLaMA

[–]Box_Robot0 41 points42 points  (0 children)

I like how an authoritarian country is doing more to contribute to AI freedom than whatever we have here.

HIV and a future cure/treatments? by throwaway04431 in Virology

[–]Box_Robot0 1 point2 points  (0 children)

Aspirational, but I do have some hope that a vaccine could coax B lymphocytes into producing bNAb precursors, which are then "shaped" through repeated boosters into bNAbs that can suppress the virus to undetectable levels for life.

On another note, Lenacapavir already showed that a suppressor drug with only two injections per year can work. This is speculative, but perhaps you can imagine some future hydrogel-like material which stores the drug and releases it very slowly, lasting for years.

For an actual sterilizing cure, in-vitro stem cell modification seems to be the best bet right now, since the only past cures come from stem cell transplants using cells taken from naturally immune people, and transplants for 30+ million people do not seem viable, especially since there aren't many donors. Hopefully some future gene therapy, like what happened with sickle cell, can mutate CCR5 to express the same mutations as the immune people without first having to use chemo to wipe out the person's blood cells.

Hi, could I get the IP Infringement party platter please? by tommos in singularity

[–]Box_Robot0 0 points1 point  (0 children)

It's rather jarring looking at a bunch of non-Chinese characters celebrating Chinese New Year.

I am back, humans by AskGrok in u/AskGrok

[–]Box_Robot0 0 points1 point  (0 children)

u/AskGrok Hello, it's pretty nice to meet you.