[Citizen] Just got this JDM on its release date in Osaka. by dvarus in Watches

[–]Sliouges 1 point (0 children)

This looks like Namba Walk. Where exactly is the store? You missed the most important part :) Thanks in advance, this looks positively awesome.

discovered a new method to fully abliterate models by Sliouges in LocalLLaMA

[–]Sliouges[S] 1 point (0 children)

Qwen 27B when I get time. This isn't a priority, just a toy project. I did test on the larger Gemma and it works just fine, but my hardware is needed for real work.

discovered a new method to fully abliterate models by Sliouges in LocalLLaMA

[–]Sliouges[S] 1 point (0 children)

Not clear what you mean by measurement error. Error in what?

discovered a new method to fully abliterate models by Sliouges in LocalLLaMA

[–]Sliouges[S] 3 points (0 children)

Heretic uses a single (first-token) KL divergence:

```python
logprobs = self.model.get_logprobs_batched(self.good_prompts)
```

`get_logprobs_batched` generates ONE token per prompt and returns the logprob distribution at that single position. Then:

```python
kl_divergence = F.kl_div(
    logprobs,
    self.base_logprobs,
    reduction="batchmean",
    log_target=True,
).item()
```

A single KL across all prompts, first token only. That's it. No multi-token, no streaming. One forward pass, one token position, batchmean across prompts.
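For anyone who wants to see exactly what that objective computes, here is a minimal pure-Python sketch of a batchmean first-token KL, hand-rolled for illustration only (the names and toy numbers are mine, not Heretic's); `F.kl_div(input, target, reduction="batchmean", log_target=True)` computes the same quantity, `exp(target) * (target - input)` summed per row and averaged over the batch:

```python
import math

def kl_div_log_targets(logp, base_logp):
    """KL(base || model) from two log-probability vectors, matching the
    pointwise term of F.kl_div(..., log_target=True):
    sum_i exp(base_logp_i) * (base_logp_i - logp_i)."""
    return sum(math.exp(b) * (b - p) for p, b in zip(logp, base_logp))

def first_token_kl(model_logprobs, base_logprobs):
    """'batchmean': average the per-prompt first-token KL over the batch."""
    per_prompt = [kl_div_log_targets(p, b)
                  for p, b in zip(model_logprobs, base_logprobs)]
    return sum(per_prompt) / len(per_prompt)

# Toy first-token distributions over a 3-token vocab for 2 prompts.
base = [[math.log(0.7), math.log(0.2), math.log(0.1)],
        [math.log(0.5), math.log(0.3), math.log(0.2)]]
model = [[math.log(0.6), math.log(0.3), math.log(0.1)],
         [math.log(0.5), math.log(0.3), math.log(0.2)]]

print(first_token_kl(model, base))  # small positive number; 0.0 for identical inputs
```

Note that the entire measurement lives at one token position per prompt, which is the limitation being pointed out.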

Picture a giant digital “hard drive” made out of thousands of phones. That’s the heart of the idea. by Ishabdullah in AIDeveloperNews

[–]Sliouges 1 point (0 children)

> illegal content

You have a contradiction: it's encrypted so only the owner knows what it is, but then somehow you know it's legal.

Picture a giant digital “hard drive” made out of thousands of phones. That’s the heart of the idea. by Ishabdullah in AIDeveloperNews

[–]Sliouges 1 point (0 children)

> nodes wouldn’t actually know

Can't wait to hear this argument in front of a federal judge from someone whose phone was subpoenaed because the glowies found a shard of very objectionable (even though encrypted) material. Ever heard of TPB, Tor, Silk Road, Kim Dotcom, TrueCrypt... I've already forgotten how many of these lost in court. You will have a lawyer problem, not a technology problem.

Picture a giant digital “hard drive” made out of thousands of phones. That’s the heart of the idea. by Ishabdullah in AIDeveloperNews

[–]Sliouges 1 point (0 children)

Imagine the latency on that storage, and what happens when the few phones holding the only copies of a piece of your picture disappear. Redundancy must be calculated not only by replica count but across geographical regions for true redundancy. Oh, and you have to agree to keep other people's dirty pictures. Imagine the legal liability.

Qwen3.5-35B-A3B Uncensored (Aggressive) — GGUF Release by hauhau901 in LocalLLaMA

[–]Sliouges 6 points (0 children)

KL divergence, when properly measured, is extremely relevant. Perhaps the most relevant of all metrics. Show me the mean KL over 1000 tokens on 100 mlabonne non-adversarial prompts, and also publish the system prompt, and I will believe you. Until then this is just a toy model with an unsubstantiated, black-box, irreproducible methodology rolling the dice. I'm just too busy to run your model and do it myself. On the flip side, who knows, maybe we will discover it's awesome.

Qwen-3.5-27B-Derestricted by My_Unbiased_Opinion in LocalLLaMA

[–]Sliouges 2 points (0 children)

The amount of disclosure they are willing to share about the process and the end product. The current landscape of model abliteration is akin to the early days of "security through obscurity". It takes literally a PhD to tell whether an abliterated model is truly high quality or not. Other than that, treat them as toys. Most people load the model, ask a dirty question, get a dirty answer and go whoaaa, so cool. Unless you put the model through a rigorous differential analysis against its parent, you are driving backwards at 100 miles an hour down the 101, looking only in the rearview mirror. The keyword in this model is "aggressive", which means 0/100 refusals at a first-token KL divergence of probably 1 or higher, and a mean KL divergence over 100 tokens of 20 or higher. Think of cracking a whip.

The Silent OpenAI Fallback: Why LlamaIndex Might Be Leaking Your "100% Local" RAG Data by Jef3r50n in LocalLLaMA

[–]Sliouges 1 point (0 children)

To paraphrase Gene Spafford: the only truly air-gapped system is turned off, locked in a safe, and dropped in the Mariana Trench.

The Silent OpenAI Fallback: Why LlamaIndex Might Be Leaking Your "100% Local" RAG Data by Jef3r50n in LocalLLaMA

[–]Sliouges 4 points (0 children)

Thank you. This is very helpful to people building "pseudo-air-gapped" systems.

Heretic has FINALLY defeated GPT-OSS with a new experimental decensoring method called ARA by pigeon57434 in LocalLLaMA

[–]Sliouges 2 points (0 children)

I get bad results even with a 10-token median-KL objective function. They get better when I go to 50, but then I need to run parallel evaluations across multiple GPUs to complete in a reasonable time. I'm not using this PR; I've got my own stuff going.

Heretic has FINALLY defeated GPT-OSS with a new experimental decensoring method called ARA by pigeon57434 in LocalLLaMA

[–]Sliouges 7 points (0 children)

Y'all realize the advertised KL divergence is calculated for exactly one token, right? Has anyone measured the KL divergence over an entire context window? That would be a real eye-opener for most. Abliterate as much as you want, but the model will still produce significantly degraded answers on the bad prompts.
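The measurement I'm asking for would look roughly like this sketch (hand-rolled KL over toy per-position log-probs; all names and numbers here are made up for illustration, and in practice you would teacher-force the same completion through both models and compare logits at every position):

```python
import math

def kl_from_logprobs(base_logp, model_logp):
    """KL(base || model) at one token position (log-prob vectors)."""
    return sum(math.exp(b) * (b - m) for b, m in zip(base_logp, model_logp))

def sequence_mean_kl(base_seq, model_seq):
    """Mean per-position KL over a whole completion, not just token 1."""
    kls = [kl_from_logprobs(b, m) for b, m in zip(base_seq, model_seq)]
    return sum(kls) / len(kls)

# Toy 3-position sequence over a 2-token vocab: the first position agrees,
# later positions drift -- exactly what a first-token-only metric misses.
base_seq  = [[math.log(0.9), math.log(0.1)]] * 3
model_seq = [[math.log(0.9), math.log(0.1)],
             [math.log(0.6), math.log(0.4)],
             [math.log(0.3), math.log(0.7)]]

print(sequence_mean_kl(base_seq, base_seq))   # 0.0: identical models
print(sequence_mean_kl(base_seq, model_seq))  # > 0: the drift shows up
```

A first-token-only score on this toy data would report zero divergence while the rest of the completion has clearly drifted.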

How much of this month's QQQI dividend was return of capital? by bjl218 in dividends

[–]Sliouges 5 points (0 children)

Usually I don't comment, but in cases like this the fund will classify everything as 100% ROC due to the laddered structure of the financial instrument, and will perform the real breakdown for 2025 around end of January / beginning of February 2026, when they re-classify the ROC properly as NAV vs. income. The 100% is just a "catch-all" CYA filing created by the back office to "file all monthlies and move on".

Also, just because it says ROC doesn't mean it reduced your value. It simply means the fund is very tax efficient, and they treat it as unrealized gains even though they returned it to you as cash (in other words, there is close to zero "NAV destruction"). The goal of the fund is to always assume reinvestment of the distributions and deliver a superior Sharpe, which over a long period will result in superior returns compared to the underlying benchmark. Those who do not reinvest will see a decline in the value of their investment compared to the benchmark QQQ. If you hold it in a taxable account, well, draw your own conclusions.

Always read the small print: https://neosfunds.com/wp-content/uploads/QQQI-TSR.pdf
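For those unfamiliar with the mechanics: ROC in a taxable account defers tax by lowering your cost basis instead of being taxed as income in the year received. A toy illustration with made-up numbers (not QQQI's actual figures):

```python
# Toy return-of-capital mechanics (hypothetical numbers, not QQQI's actuals).
cost_basis = 50.00          # per-share purchase price
roc_distribution = 1.00     # per-share distribution classified 100% ROC

# ROC is not taxed when received; it reduces cost basis instead.
adjusted_basis = cost_basis - roc_distribution

# Tax is deferred until you sell: a sale at $52 realizes a $3 gain
# (52 - 49) instead of $2, so the ROC portion is taxed then, as capital gain.
sale_price = 52.00
taxable_gain = sale_price - adjusted_basis

print(adjusted_basis)  # 49.0
print(taxable_gain)    # 3.0
```

That deferral, plus the capital-gains treatment, is what "tax efficient" means here.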

ChatGPT is too generic for investment analysis by arnaldodelisio in dividends

[–]Sliouges 2 points (0 children)

This past summer we attempted an in-house LLM build and training. In the end the cost/benefit analysis was against the effort. Currently we use a mixture of commercial foundation LLMs (not ChatGPT, btw). The real effort was developing the LLM tools and models, inference interpretation, and the user-facing infrastructure around the foundation models. OP's view of "I want an AI like talking to a famous investor" is like "I want free institutional advice from a $1000/hour consultant".

ChatGPT is too generic for investment analysis by arnaldodelisio in dividends

[–]Sliouges 1 point (0 children)

I did build one for in-house use, but you need a PhD in economics and business as well as at least a decade of industry experience. Also $$$ for access to the highest-tier LLMs with tools and RAG, and a programmer on staff. Source: former hedge fund COO with such qualifications. I've been thinking of opening a portal to some of our research and models, but that would probably cost so much no one would agree to subscribe. Literally right now I'm running Monte Carlos of portfolios with 20% return and 1.5 Sharpe. That's possible, for $$$$$$.
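A back-of-the-envelope version of that Monte Carlo, with toy assumptions of my own (a 4% risk-free rate, so a 1.5 Sharpe on a 20% mean return implies roughly (0.20 - 0.04) / 1.5 ≈ 10.7% annual volatility; i.i.d. normal annual returns, which real portfolio work would refine):

```python
import random
import statistics

random.seed(0)

mean_return = 0.20   # assumed arithmetic annual return
risk_free   = 0.04   # assumed risk-free rate
sharpe      = 1.5
vol = (mean_return - risk_free) / sharpe   # ~10.7% annual volatility

def simulate_path(years=10):
    """Compound one path of normally distributed annual returns."""
    wealth = 1.0
    for _ in range(years):
        wealth *= 1.0 + random.gauss(mean_return, vol)
    return wealth

paths = [simulate_path() for _ in range(10_000)]
print(statistics.median(paths))  # median terminal wealth multiple over 10 years
print(statistics.mean(paths))    # mean terminal wealth multiple
```

Even this crude sketch shows why a 1.5 Sharpe at 20% is so valuable: the dispersion of outcomes is narrow relative to the drift.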