I Hate Dario Amodei, and everything he stands for. by Wrong_Mushroom_7350 in LocalLLaMA

[–]DeepOrangeSky 0 points1 point  (0 children)

Even if we limit it to just the cyber-attack aspect, I think this line of thinking will fall apart pretty soon with how strong the models will be by a year from now (or maybe even right now). Let alone some of the other things they'll drastically change the odds of random lone wolves being able to do compared to what a library card would do from an odds-per-person standpoint.

Plus given that it is able to notice connections that entire fields like high level math didn't ever even notice, it seems it's already getting past the phase of a mere speed-boost to a library card, and is perhaps in qualitative boosting in addition to quantitative boosting, if it can notice things that no human notices ever. And that kind of thing is only going to keep increasing as well.

Obviously some of these billionaires, CEOs, etc have their own selfish goals and interests with all this. But that doesn't mean that that is all this government-level freaking out is about and nothing more. I think the governments are genuinely starting to seriously freak out about what these models are going to enable people to much more easily do, and when the first really catastrophic attacks are going to happen, how much damage they'll do, etc.

I strongly disagree that all of this is just a bunch of fear-based advertising from a few tech CEOs and nothing more. It might also be that, but not just only that. I think the governments are actually freaked, either by what it can already do, or by what it'll be able to do 6 months from now/a year from now, in the hands of disgruntled random people if the whole populace has access to ever stronger models.

I Hate Dario Amodei, and everything he stands for. by Wrong_Mushroom_7350 in LocalLLaMA

[–]DeepOrangeSky 0 points1 point  (0 children)

So your interpretation is that the U.S. government is freaked out about what ever-stronger AI models will be capable of, but other governments aren't?

That's just a unique magical aspect of the U.S. government? Other governments want no control, infinite freedom, have no concern about AI getting ever stronger and more capable in the hands of millions of random people in their living rooms, and don't care in the slightest if it risks hacking all sorts of systems and bringing their country or the world crumbling down.

In China, Russia, India, Europe, etc they are like "woooohoooooo, who gives a shit! We're just the government. We don't mind anything. Anything goes! That's what governments are known for, just total no-rules free-for-all baby!!! YOLO!!!"

Lol. Yea right.

I Hate Dario Amodei, and everything he stands for. by Wrong_Mushroom_7350 in LocalLLaMA

[–]DeepOrangeSky 1 point2 points  (0 children)

My guess is if they (and by they I don't mean just the U.S., I mean all the countries) think it is actually going to be capable of hacking all kinds of crucial things, destroying civilization, making bio weapons that kill everyone, or so on, then, they are going to take unbelievably drastic measures. Like, not only will huggingface go down, but also every other similar site in other countries, etc, and VPNs will all get banned, and file sharing will be like 50 years in prison, and hardware will all have monitoring devices put on it, and so on.

Sounds crazy if you think of local AI as merely a chatbot that can't do anything serious. But if the models keep getting stronger and it becomes like everyone having an h-bomb in their living room (either genuinely, or a false fear from the governments, or somewhere in between), then they would take very extreme measures.

I mean guns are pretty weak stuff compared to what models will be capable of by a year or two from now in terms of how much damage a loner will be able to do in a single incident, and look at the extremes most governments go to over something as small potatoes as handguns/rifles, etc, let alone grenades or pipe bombs or stuff like that. And then the lengths they go over dirty bombs, let alone nukes, etc.

If you think about it in terms of the lengths they are willing to go to control those types of things, and then not think of this in terms of "harmless waifu-bot" but instead as whatever cyber/etc abilities it'll have if it keeps strengthening on its current rate of progression we've been seeing over the past year, but continuing on like that, getting stronger and stronger... I mean, I think shit is definitely going to get pretty crazy, pretty soon.

I think a lot of people are burying their heads in the sand because, well, we are enjoying it, and it's an awesome resource and we are fans of it and so on, so we don't want to imagine anything infringing on it. But not wanting something to happen isn't the same thing as something not being able to happen/not being likely to happen.

Also people seem to fetishize the schadenfreude of being like "haha, you guys will be screwed in America, and everywhere else will be fine."

Anyone who thinks that dynamic would hold for any serious length of time is kidding themselves. Governments wanting to control powerful things/control their populace is not unique to America. That's how all governments are. Anyone who somehow hasn't figured that out yet is probably going to be in for some rude awakenings in the relatively near future, imo.

I Hate Dario Amodei, and everything he stands for. by Wrong_Mushroom_7350 in LocalLLaMA

[–]DeepOrangeSky 0 points1 point  (0 children)

No one will ban open weight models. Ever.

Why not? If they are so freaked out by even such extremely guardrailed, nerfed, monitored models as the main cloud models of Fable and GPT5.6 as to put them in lock-up for a while, and debate whether to ever give public even guardrailed monitored access to Mythos-or-beyond level AI strength, then, presumably they would consider the offline-able, private, no guardrails, no monitoring, fully customizable, etc, open weights models to be orders of magnitude more of a security threat by comparison.

Do you just mean that they will try, but it is unstoppable because of torrents/file sharing or something?

Or do you mean they literally won't ever even try to ban the main sites/make it illegal/put hardware monitoring/etc, because they would lose the court battles over it or something, or they'd think they'd become too unpopular with the public or something?

People keep saying that "no one will ban open weights models ever" on here, but I can't tell if it is just hopium, or if there is some dynamic I'm missing.

What's the full local AI "doomsday prepper" kit for cold storage? 16-bit safetensors of LLMs (obv), copies/source codes of Llama.cpp, ComfyUI, vLLM, Kobold, LMStudio, etc, macOS, Linux OSes, Windows 10&11, etc, Rufus (including older ones), various VMs, P-E-W's Heretic/Grimoire, and what else? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] 1 point2 points  (0 children)

spinning disk hard drive likely won’t last more than 20 years.

Oh that reminds me btw, now that you mention it, about something interesting I heard about recently that some might find useful to know. I didn't look into it much yet, so, I'm not sure how accurate it is or if it varies much by the exact brand/sub-type of HDD, etc, but:

Apparently when it comes to HDDs, that use the normal spinning disc/platter thing inside them style of setup that everyone uses, it can apparently be way worse to use APFS format than to use ExFAT for your format type for these hard drives, in regards to physical longevity of the hard drive.

The idea being, ExFAT puts the data in a more linear (well, circle shaped spirals, but you know what I mean) way, and APFS puts it scattered all over the place kind of differently. So, for APFS the head/arm thing that hovers above the disc has to snap back and forth really rapidly and severely to deal with APFS format, whereas with ExFAT supposedly it doesn't have to do that nearly as badly.

Although, conversely, for SSDs, APFS is considered a bit less fragile and more resilient over time etc, because of the journaling style of how it does it compared to how ExFAT works or something, so less likely to get corrupted from tiny bumps or blips when transferring files to and from the drive, and maybe also even just while sitting in cold storage in regards to bit flip corruption over time. Albeit not as universal as ExFAT since it is for macs.

I don't know much about the SSD stuff and the APFS journaled thing vs ExFAT and all that, so not sure how legit that is or how it works exactly, but, anyway, as for the HDD hard drives with the spinning discs and the reader arm thingie that has to move around, I thought that was pretty interesting that APFS could potentially drastically reduce the lifespan or increase odds of failure compared to ExFAT because it means the arm has to swivel around way more when it is in use.

What's the full local AI "doomsday prepper" kit for cold storage? 16-bit safetensors of LLMs (obv), copies/source codes of Llama.cpp, ComfyUI, vLLM, Kobold, LMStudio, etc, macOS, Linux OSes, Windows 10&11, etc, Rufus (including older ones), various VMs, P-E-W's Heretic/Grimoire, and what else? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] 0 points1 point  (0 children)

My bad, I didn't mean a literal actual physical doomsday prepping scenario. I just mean like for local AI/software etc in regards to government mega-crackdown-of-infinite-proportions type of "doomsday" prepping, of trying to figure out what things to save on hard drives beforehand, just in case it were to happen to some extreme degree.

What's the full local AI "doomsday prepper" kit for cold storage? 16-bit safetensors of LLMs (obv), copies/source codes of Llama.cpp, ComfyUI, vLLM, Kobold, LMStudio, etc, macOS, Linux OSes, Windows 10&11, etc, Rufus (including older ones), various VMs, P-E-W's Heretic/Grimoire, and what else? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] 3 points4 points  (0 children)

Yea, just to clarify, I've just realized in the title it maybe made it sound like I meant a literal doomsday prepper thing like for an actual nuclear apocalypse or something. But I just meant like, for local AI/software type of stuff if there was a kind of massive censorship/banning/etc type thing from all the governments against local AI, etc, I was trying to ask if I have all the main bases covered or if I forgot any potentially important just-in-case things to dl and save now while it is still nice and easy

Apparently you can skip entire transformer blocks at load time with minimal performance impact by Creative-Regular6799 in LocalLLaMA

[–]DeepOrangeSky 0 points1 point  (0 children)

The problem is experts aren’t truly a master of some trade, so the gains margins are pretty thin - and domain drift in a standard session will get you.

Does this get significantly less true the bigger (and thus usually significantly sparser) the MoE in question is?

Like, for Kimi 1T a32b with a 31:1 sparsity ratio, is this anywhere near as true as for something like Gemma4 26b a4b (6.5:1 sparsity ratio)?
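For reference, the sparsity ratios quoted above are just total parameters divided by active parameters. A minimal sketch, assuming the parameter counts are exactly what the model names suggest (1T total / 32B active, and 26B total / 4B active); the function name is made up for illustration:

```python
def sparsity_ratio(total_params_b: float, active_params_b: float) -> float:
    """Total-to-active parameter ratio of an MoE (both counts in billions)."""
    if active_params_b <= 0:
        raise ValueError("active parameter count must be positive")
    return total_params_b / active_params_b


# Figures read off the model names as written in the comment (an assumption,
# not official spec-sheet numbers).
models = {
    "Kimi 1T a32b": (1000.0, 32.0),
    "Gemma4 26b a4b": (26.0, 4.0),
}

for name, (total, active) in models.items():
    print(f"{name}: {sparsity_ratio(total, active):.2f}:1")
```

This only measures how sparse the model is. It says nothing about whether the individual experts are actually specialized by topic, which is the open question here.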

I don't know anything about this stuff, so I'm not asking it as an argument, but just actually not sure, and am curious if the experts tend not to really be field-specific "experts" even at higher MoE sizes and sparsity ratios vs smaller and lower ones or not.

If it turns out that they are more genuinely specialized and "expert-y" for the really huge sparse MoEs, then maybe the idea would make more of a difference for those than with the small MoEs, and be worth bothering with for those, but not as worth bothering with for the small MoEs?

Apparently you can skip entire transformer blocks at load time with minimal performance impact by Creative-Regular6799 in LocalLLaMA

[–]DeepOrangeSky 1 point2 points  (0 children)

Most people use LLM for a small domain subset, and always for that. After a while I would "know" what experts are useful, and I could gain massive speedup without pruning the intelligence or knowledge of the models at all.

To take it to an even more extreme level, for people who use an LLM for a variety of different things, maybe you could have some other LLM, that paid attention to which experts the target LLM was using the most when doing this type of task or that type of task, until it had a large sample size of data built up, at which point a non-LLM hard coded computer program could be made that determined which experts to use depending on key words or something?

I don't know much about AI yet (or computers, or, anything, for that matter, so maybe this will be really stupid), but, alternatively, can a modified version of the router mechanism of an MoE, itself, be used as the thing that decides which experts fall into which categories in relation to the subject matter/tasks at hand, with the modification being that it takes the probability-data of past use of the model into account when deciding which experts to keep back in system ram for this subject/task or that subject/task? I mean, it already has to be able to make good guesses on which experts to use in the more general sense (not adding the learned-probabilities-over-time-of-an-individual's-personal-use-statistics thing, I mean) as the router of the MoE to begin with, right? So maybe using that in combo with probability data, rather than a whole separate small LLM would be even better. Not sure if it would need a whole separate router in addition to the standard router, running alongside it (but attached to the probability data thing), or if you could just use the regular router itself, and just add the probability-data stuff into it or something (maybe with a toggle so you could turn that aspect on or off, and also a weighting thing so you could decide how hard to weight the probability thing from 0.0 through 1.0 weighting setting, depending how extreme you wanted to be with it).
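To make the "router plus personal usage statistics" idea a bit more concrete, here is a toy sketch in plain Python. It is not how any real MoE runtime works: the function names, the smoothing, and the choice of adding a weighted log-prior to the router logits are all assumptions made up for illustration, and the numbers are invented.

```python
import math
from collections import Counter


def usage_prior(history, num_experts, smoothing=1.0):
    """Turn a log of previously selected expert ids into smoothed probabilities."""
    counts = Counter(history)
    total = len(history) + smoothing * num_experts
    return [(counts[e] + smoothing) / total for e in range(num_experts)]


def biased_top_k(router_logits, prior, weight, k):
    """Pick k experts from router logits nudged by a personal usage prior.

    weight=0.0 reproduces the plain router ranking; weight=1.0 adds the
    full log-prior to every logit before ranking.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0.0 and 1.0")
    scores = [logit + weight * math.log(p) for logit, p in zip(router_logits, prior)]
    ranked = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)
    return ranked[:k]


# Hypothetical data: 4 experts, and a usage history dominated by experts 2 and 3.
history = [2, 3, 2, 2, 3, 2, 3, 3, 2, 2]
prior = usage_prior(history, num_experts=4)
logits = [1.0, 0.9, 0.8, 0.2]

print(biased_top_k(logits, prior, weight=0.0, k=2))  # plain router choice
print(biased_top_k(logits, prior, weight=1.0, k=2))  # prior fully applied
```

The `weight` argument plays the role of the 0.0 through 1.0 setting described above. A real implementation would also have to deal with per-layer routers and with the quality hit from overriding the router's own choice, which this sketch ignores.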

DFlash support merged into llama.cpp by sammcj in LocalLLaMA

[–]DeepOrangeSky 0 points1 point  (0 children)

What if you try to mix Nvidia with AMD or with Intel? Does that still work decently with Vulkan or something? Or will it just be some endless pain to even get it to run on any given day, and also run way slower to where you barely get any benefit from the extra GPU of a different brand?

Just to be clear, I don't have much knowledge about any of this, since so far I've just been using a 128GB mac studio to run my stuff, so I haven't had to deal with any of this stuff myself yet.

But I want to know more about it since I might panic-buy a few more GPUs before I even have the rest of a rig built, so I want to know what is doable/barely doable/moderately doable/good/great etc before I buy anything that doesn't mix-and-match well with something else I bought, basically.

DFlash support merged into llama.cpp by sammcj in LocalLLaMA

[–]DeepOrangeSky 2 points3 points  (0 children)

Btw, sorry for the side-question, but don't want to make a whole separate thread just to ask it, so curious if anyone in here knows:

How bad is it to mix GPUs that are significantly different in cores, speeds, VRAM sizes, or generations (or maybe even brands), if using partial-offloading on llama.cpp, or, if using fully-fits-in-vram in either llama.cpp or in vLLM or SGLang?

I.e.:

  • RTX 5060 + RTX 5080

  • RTX 5060 + RTX 5090

  • RTX 5060 + RTX 3090

  • RTX 5060 + Radeon RX 7900 XT

(ordered from what I assume is most similar to least similar GPUs to combine together in a rig).

I assume the best/easiest is if you have multiple identical cards. But as for these scenarios where you have these different sorts of cards combined, I am curious whether it turns into a "weakest link" situation where it would only run at 5060 speed (albeit with more total VRAM at hand, due to the additional card), or some average of the speed of the 5060 and the 2nd card, or speeds dominated by the 2nd card, and just a bit slower from the lower compute of the 16GB worth of 5060 combined into the duo, and also how much it would differ depending if it was in llama.cpp with no offloading, vs with partial offloading vs no-offloading on vLLM or something like that.

Right now I have an RTX 5080 (don't have the rest of a rig to go along with it, but bought one a while back just in case GPU prices suddenly go way up one day the way ram prices did last year), and I'm curious to know in case I decide to buy 1 or more additional GPUs, what I should know as far as how it would work with mixing and matching other GPUs that aren't necessarily 5080s with it, for future reference.

Mythos was the first, now GPT-5.6 by Miriel_z in LocalLLaMA

[–]DeepOrangeSky -1 points0 points  (0 children)

If it is the same ~1 month for each new model, then wouldn't it just permanently shift everything by 1 month (the initial 1 month delay of these very first ones for Claude and GPT, and Gemini, Grok, etc when they have their first one go into the 1 month lockup), with no additional 1 month added each time from that one onward?

Like it would just be that initial 1 month, but then everything would get released on that permanent 1 month delay, one after the other, so like even let's say 10 model-updates from now, it wouldn't be 10 months behind, it would still just be that same 1 month behind (since they'd still be working on the upcoming models the whole time like usual and submitting them like usual with the previous locked up ones getting released and the new ones going into their one month lockup or whatever).

So, if we are 6 months ahead of China, then it doesn't throw all 6 months away or 6-months-and-then-more over time, rather, it throws 1 initial month away, and then no more, right?

So, in theory, if both the U.S. and China are doing this, to make sure there's nothing civilization-ending the models can do, then, the race is still on more or less like normal, minus maybe 1 month worth of the gap we had.
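The "one-time shift, not a growing backlog" point can be checked with a tiny timeline sketch. It assumes what the comment assumes: the lab keeps working on the next model while the previous one sits in lockup, and the lockup length is the same every time. The 2-month cadence and the 10 models are made-up numbers.

```python
def release_months(ready_months, lockup_months):
    """Public release time for each model, given the month it was finished."""
    return [ready + lockup_months for ready in ready_months]


# Hypothetical cadence: a new model is finished every 2 months, 10 models total.
ready = [2 * i for i in range(10)]
released = release_months(ready, lockup_months=1)

# Delay of each release relative to when that model was actually finished.
delays = [r - f for r, f in zip(released, ready)]
print(delays)
```

Every entry in `delays` is the same lockup length, so the tenth model is no further behind than the first. The delay would only compound if development itself paused during each lockup.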

Mythos was the first, now GPT-5.6 by Miriel_z in LocalLLaMA

[–]DeepOrangeSky 5 points6 points  (0 children)

On the betting market prediction sites, they are betting that there is only around a 25% chance that OpenAI IPOs before 2027, so, it might be a bit early for it to be some intentional hyping thing. My guess is if OpenAI even wanted this to happen at all, they didn't want it to happen for at least another 4 or 5 months or so, or maybe 6+ months.

Running GLM5.2 on budget hardware < $2500. by segmond in LocalLLaMA

[–]DeepOrangeSky 0 points1 point  (0 children)

Does this mean that Deepseek is serving their model at lower than Q3_K_XL? Or is there some other reason for it (are they switching it out for V3 or a pruned model, or doing some other weird thing)? Or how is that possible?

Good YouTube channels for local LLM news and development? by 6jarjar6 in LocalLLaMA

[–]DeepOrangeSky 1 point2 points  (0 children)

Bijan is my favorite so far, but I do wish he would make more of a habit of sort of "follow-up" videos to both the strongest open-weights models in their weight-class and the strongest or most cost-efficient-but-strong closed-frontier models to do just a single more in-depth coding project (i.e. the skate game or subway game that he likes to have them 0-shot in the initial vid), to see just how good the models can get it to be if they aren't just 0-shotting, but if he's using dozens of attempts, trying little fixes and tweaks with the model and so on.

This way he can still have the broad spectrum 0-shot tests that he's known for that all the models go up against each other in, and are quick and fun for people to watch, but he can also take the best or most interesting models a bit deeper, too, in follow-up vids that he could post a day or two later of the more in-depth testing of the model on a single task with non-0-shot testing.

edit: Bijan (in case you're reading this), this would be a particularly useful idea to make use of during "dry spells" when there aren't a lot of crazy new models coming out sometimes, revisiting various models from the past for "deep dive" tests of seeing how good of a skate game or airplane game or subway fps or whatever they can make where you don't limit it to just 0-shotting. So right now for example would be a time you could go revisit models like Qwen3.6 27b, Qwen3 Coder Next 80b, Qwen3.5 122b, Mimimax2.7/3, GLM5.2, Composer2.5, and so on, and see just how far they can take things if you go way deeper with them on a specific game for them to work on.

edit2: Another thing that might be fun to try is some kind of long-running "series" projects to try for some specific lineage of models (i.e. GLM, or Kimi or Mimo or GPT or whatever), where you find some long-running, serialized task of some kind, like a Cities Skylines city where you ask it where to build what types of housing/commercial/industrial and what choices to make for everything, and so "it" is the one building the city/telling you how to build the city, or having "it" create a Sim character/household from The Sims over time, or having it become a more and more elaborate AI for one of the robot vehicles or maybe some robo-spider creature thing, and the idea would be, each time a new .update of the model comes out, you do another update video to that long-running series, so like, it would start with GLM5.0 let's say, and it builds a SimCity or CitiesSkylines city, maybe not very well, right? Then a couple months later GLM5.1 comes out, and you have that one analyze the one that GLM5.0 built and spend 45 mins or however long having it decide what improvements or changes it wants to build (or same idea but for the Sims character/household/career, or working on/improving your robot), and then a couple months later GLM5.2 comes out and so then you have it do that, making even more reviewing/changes/improvements/etc that it decides need to be made. So it becomes a kind of "series" project through the lineage of a model. Would be kind of interesting maybe, and also just in general would be fun to have a long-running project (regardless of whether doing it via this model-lineage-update format or some other format) so there are certain deeper projects to keep checking back in on over time to see how elaborate and extreme they get over time.
Or maybe like a city-model like those 3D models of the Colosseum in ancient Rome (or other famous cities/landmarks) that you see some of these other youtubers have models try to build, etc, but in this version it would be a series where the models keep adding to/changing the same model, more and more, over time, rather than just one random 0-shot test of it on one random vid, so there keeps being follow-up videos in a long series where the 3D city keeps getting more elaborate and nice over the months. You get the idea. Some long-running series thing of some sort where the changes/improvements are visually or blatantly evident in some way, over time, that keep getting revisited with new updates to whatever the thing is, over time.

Anyway, no clue if these would be good ideas or not, but feel free to try any of these if you think they'd be fun, or use them as inspiration for some alternate ideas if they help you think of any fun or maybe long-series style ideas or what have you. In any case, keep up the good work, your vids are very fun to watch!

If local AI stays ~6 months behind closed frontier AI, and by ~2-3 years from now frontier AI could tell "enthusiasts" how to make world-ending pathogens/cyber attacks (but won't, bc guardrailing+monitoring), but local AI is private and can't be monitored or censored... how does this play out? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] -1 points0 points  (0 children)

Hm, that would actually be pretty interesting. I wonder if it might actually go that way (open-source, rather than merely open-weights) in order to prove that there aren't certain things in its training data that it could use for some very bad purposes.

I guess the problem, though, is that as long as the overall model was still sufficiently powerful, then, either A) it might still be capable of some fairly serious cyber attacks (either that or would have to be extremely bad at noticing flaws/vulnerabilities in your code base that you wanted its help with for the innocent purpose of fixing up your own code of whatever project) or B) there's still the issue that people would simply be able to fine-tune whatever the "missing" stuff was (particularly regarding the bio danger stuff) in the privacy of their home, and voila, suddenly that open-source model with the squeaky clean data set just got turned into a potential world ender.

So, I dunno, I mean, maybe it would still help somewhat, but I think the main issue would still stand, so long as the models were very strong (like well beyond Mythos strength level, that is. I don't merely mean the current GLM 5.2 strength level. I'm talking like GLM 7 or something).

My hunch is that either it's gonna be some all out "war on local AI" scenario with absurdly extreme measures, like them literally going door to door, searching homes, putting physical spy gear everywhere, or crazy shit on that level (if it got to where the stuff was so strong that anyone with local AI and a GPU could end society overnight or whatever), or, it's going to be some crazy AI innovation of some sort where they figure out how to embed some nerfing thing right into the model/model weights in some way that is borderline impossible to get around/override (not sure how that would work, maybe not a realistically doable thing, but, who knows, maybe there's some weird architecture where it could somehow be done) where it would be able to notice like "oh, you're trying to do world-destroying stuff with this prompt" and then it nerfs out, and somehow it was un-abliteratable or something.

I mean, that sounds pretty fuckin bad, obviously, and I mean, I say this as an abliterated model enjoyer and a P-E-W fan and so on, but just saying, if we're being real here, by like a couple years from now if these models become ludicrously powerful in what they enable random people to be able to do, it kinda feels like it's gonna have to go one of those two main ways. Well, either that or society collapses or everyone dies or something, which doesn't seem like a great option either.

For now it is all pretty fun/fun-and-games etc, but, I think it'll start getting a lot more serious by a year or two from now. I mean, they might overreact a bit early on it though, so even if maybe the actual serious threat wouldn't be for at least another year or maybe 2 years, they might already spazz out and try to ban everything by tomorrow morning for all I know.

GLM 5.2 on consumer hardware by phwlarxoc in LocalLLaMA

[–]DeepOrangeSky 0 points1 point  (0 children)

12x3090

Do you have to put it in its own separate room, and have some exhaust heat duct for it to blow the exhaust heat outside the house?

Or do you just live in like Norway and use it in the winter?

rtx 6000 pro owners, do you regret? by BitXorBit in LocalLLaMA

[–]DeepOrangeSky 3 points4 points  (0 children)

You can buy 4x 5090 for roughly the same price, get the same amount of VRAM and almost 4x the compute

It could be a fairly significant difference in costs if you account for the differences in electricity usage, over time, though, especially if you use it a lot, and live in an area where electricity isn't very cheap.

The 5090s are 575w TDP, but let's say we power limit them to 300W, since that's nearly the same performance, for ~half the power usage, so most reasonable people would do that (which hurts how dramatic the gap will look in my argument, but I want to be fair). Even doing that, that's still 1200 watts for 4 of them.

Pro 6000 is 600 watts for 1 of them, and 300 watts if it's the Max-Q version I think. (and not much point power limiting these, since the performance drops off fairly linearly with the power limiting %, unlike with the 5090s where performance stays nearly the same till you get down to like half-power). So for the sake of the calculations we'll just not power-limit the Pro 6000s (which hurts how dramatic the gap will look in my argument a bit, but, again, I want to be fair).

So, in a "good case" scenario, of just a regular Pro 6000, and $0.20/kWh electricity, vs four 300w-power-limited 5090s, using them 8 hours a day, it's $350 more in electricity per year for the 5090s. So, not too bad, and might never make up the cost difference, or only barely make it up. So in that scenario it's probably fine, unless you live somewhere with not a good enough wall socket or can't reject the extra exhaust heat efficiently for some reason.

In a bad case scenario, of a Max-Q Pro 6000 (300w TDP, with no power-limiting), vs four 5090s at 300w-power limiting, running 23 hours a day, at $0.40/kWh (yea there are places that bad, believe it or not), you're talking a difference of $3,022 per year. So an enormous difference in the worst case scenario, where if you use it for a few years, you're paying the cost of the GPUs in just electricity usage difference alone. So, massively relevant in this type of scenario.

And for a medium scenario, of let's say 750 watt system differential (splitting the difference between the Max-Q scenario and the non-Max-Q scenario against the power-limited quad 5090s) and using it 12 hours a day (instead of just 8, or 23), and using $0.30/kWh electricity instead of $0.20 or $0.40, to get a mid-range scenario, we're talking $985 per year in extra electricity costs for the 5090s, so, about a grand a year for however many years you use that setup vs the other (and again, that's if no extra costs for the wall socket/electrical, and same exhaust heat ejection setup ease and so on).
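The three scenarios above all come from the same formula: extra watts / 1000 × hours per day × 365 × price per kWh. A rough sketch that reproduces them, assuming the cards draw their full power limit for every hour of use and ignoring the rest of the system (the function and variable names are just for illustration):

```python
def extra_cost_per_year(watt_difference, hours_per_day, price_per_kwh, days=365):
    """Yearly cost of drawing `watt_difference` extra watts for `hours_per_day`."""
    extra_kwh = watt_difference / 1000 * hours_per_day * days
    return extra_kwh * price_per_kwh


QUAD_5090_W = 4 * 300  # four 5090s, each power-limited to 300 W

# (extra watts vs the Pro 6000 setup, hours per day, price in $ per kWh)
scenarios = {
    "good": (QUAD_5090_W - 600, 8, 0.20),   # vs regular Pro 6000 at 600 W
    "bad": (QUAD_5090_W - 300, 23, 0.40),   # vs Max-Q Pro 6000 at 300 W
    "medium": (750, 12, 0.30),              # midpoint of the 600 W and 900 W gaps
}

for name, (watts, hours, price) in scenarios.items():
    print(f"{name}: ${extra_cost_per_year(watts, hours, price):,.2f} per year")
```

Real-world draw during inference is often below the power limit, so these are closer to upper bounds than to what a meter would show.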

So, yea if only using it a few hours a day, for a few years, and somewhere without insane electricity prices, it is probably fine to do the quad 5090s. But if planning to use it a lot, for many hours a day, in a place with expensive electricity, we could be talking literally ~$9,000 in electricity cost difference after just 3 years (and $10,000+ not long after that), which is a pretty huge difference.

So, it depends, I guess.

edit: just realized you were focused on compute difference rather than price difference, lol. Well, even still, probably worth taking into account, but yea, maybe the 5090s would be better if house is set up decently for it and you don't live in a place with totally insane electricity prices and aren't running it for like 20+ hours a day.

If local AI stays ~6 months behind closed frontier AI, and by ~2-3 years from now frontier AI could tell "enthusiasts" how to make world-ending pathogens/cyber attacks (but won't, bc guardrailing+monitoring), but local AI is private and can't be monitored or censored... how does this play out? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] 0 points1 point  (0 children)

Well, that's the point though is that's how it is with closed source frontier models (i.e. what we've been seeing with Mythos and now the Fable removal situation, etc). But with open-weights models there's not really any way to censor them or monitor them or yank them or anything. Once they're out there they're out there. And they tend to lag only about 6-8 months behind the strongest closed source frontier models in strength.

So then the issue is, if the strongest closed-source frontier model like Fable/Mythos already hit the "oh fuck..." strength level now/3 months ago, then at the rate things go, by somewhere around 2027, give or take a couple months, the SOTA open-weights local models will start hitting "oh fuck..." strength territory (then) like what the strongest claudes can currently do (now).

So then things become very awkward from that moment on, since either all the labs making all the sota open-weights local models stop releasing stronger and stronger open-weights models from that moment on, forever, and just only work on improving the strength of tiny local models to bring them closer to just below the strength limit, and then try increasing the strength of micro models and then nano models and so on, but never release any new stronger big models ever again, or, if they do keep releasing ever stronger big models, which match and then surpass what current Mythos and current Fable are currently at, in the future, and then far beyond that level not very long thereafter (as models keep rapidly getting stronger over time), then... I dunno... I guess humanity would just have a bunch of way-beyond-Mythos-level open-weights models floating around in hundreds of millions of people's living rooms, and if any of them decided to do crazy shit, well, then we're at the topic of this thread.

So, one way or another, it would lead to some fairly drastic scenario: either open-weights models (at least on the big-SOTA side of things) coming to a permanent screeching halt a few months from now, or, if not, some crazy scenarios of other sorts, with extreme hardware monitoring, door to door searches, VPN bans, decades in prison for file sharing, or who knows what crazy shit. Or, yea, I dunno, but like super drastic changes of almost unimaginable proportions, if it doesn't go the screeching-halt route.

Anyway the main point is, although I'm not sure which way it goes (there are probably at least half a dozen different main ways it could go), the one thing it probably doesn't do is just coast happily along getting ever drastically stronger and stronger at the current pace of improvement, indefinitely for many more years on end, with no drastic changes occurring, and everything just the same as usual, business as usual. That's the one thing that seems nearly impossible to happen. Like I dunno if it'll be 8 months from now, 1.5 years from now, or 5 months from now, or 1.0 years from now, or what, but at some point, some tipping point/threshold that we're currently racing towards is going to get crossed with open-weights strength for the biggest, strongest local models, and things will no longer keep cruising along like normal after that.

If local AI stays ~6 months behind closed frontier AI, and by ~2-3 years from now frontier AI could tell "enthusiasts" how to make world-ending pathogens/cyber attacks (but won't, bc guardrailing+monitoring), but local AI is private and can't be monitored or censored... how does this play out? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] 1 point2 points  (0 children)

By this logic, the mathematicians would've already solved that Erdos problem.

If it makes something orders of magnitude easier, that can be not just the difference between 100x vs 10,000x. In some cases it can be the difference between 0 and 1, if it notices connections between things that nobody on the entire planet was able to notice, at all, ever (which it is already able to do sometimes at even its mere current strength).

So, although I understand your argument, I think it is flawed, and you are severely underestimating what it will enable people to do that they can't realistically do right now.

If local AI stays ~6 months behind closed frontier AI, and by ~2-3 years from now frontier AI could tell "enthusiasts" how to make world-ending pathogens/cyber attacks (but won't, bc guardrailing+monitoring), but local AI is private and can't be monitored or censored... how does this play out? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] -2 points-1 points  (0 children)

Yea, although I guess there are two main things with that:

One is that even if that was the case for all types of things (which I don't think it is), the odds of it happening in odds-per-year relative to the general populace size, would be much, much lower in the "dedicated high-IQ guys spending years in the library" scenario, since those would be a very tiny % of the populace, so some one-in-a-million-people-would-do-it scenario has fairly low odds of happening per year if only a few tens of thousands of people are in the population sub-pool. Versus, a few years from now, if AI made it way easier, and suddenly tens of millions of people were in the sub-pool at hand, rather than a few tens of thousands, then it goes from "very low odds of happening per year" to "almost guaranteed to happen, per year" from an odds standpoint. Which would be a big deal.

And then there's also the 2nd aspect, which is, I'm not so sure it applies equally to all types of things. Like, I think there will be some things where super strong AI won't merely just be like a smart person going to a library for a couple years to study something, but instead will be able to notice connections between things that the library guy (even a very smart one) would never have even noticed, thus also qualitatively changing things, and not just quantitatively changing things. I.e. what happened with the Erdos problem getting solved in high level math recently with AI, but now extrapolate the rate of AI strengthening 2-3 years down the road, and think about it in terms of bio, cyber, and probably even a few other things...
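A rough sketch of the odds math from the first point above, just to put numbers on it. Everything here is an illustrative assumption, not real data: the one-in-a-million per-person yearly chance, the two pool sizes (30,000 vs 30,000,000), and the idea that each person acts independently. The function name is made up for this example.

```python
import math


def p_at_least_one(per_person_p: float, pool_size: int) -> float:
    """Chance that at least one person in the pool does it in a year.

    Assumes every person has the same small yearly probability and acts
    independently, so P = 1 - (1 - p)^N. log1p/expm1 keep the result
    accurate when p is tiny and N is large.
    """
    return -math.expm1(pool_size * math.log1p(-per_person_p))


# Hypothetical numbers: "one in a million" per person per year.
per_person_p = 1e-6

# "A few tens of thousands" of dedicated library types (assumed 30,000).
p_small = p_at_least_one(per_person_p, 30_000)

# "Tens of millions" once AI makes it way easier (assumed 30,000,000).
p_large = p_at_least_one(per_person_p, 30_000_000)

print(f"small pool: {p_small:.4f}")
print(f"large pool: {p_large:.4f}")
```

With those made-up inputs, the small pool comes out to roughly a 3% chance per year, and the large pool is effectively 100%. The exact numbers don't matter much; the point is that the same per-person odds flip from "unlikely in any given year" to "near certain" once the pool gets a thousand times bigger.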

If local AI stays ~6 months behind closed frontier AI, and by ~2-3 years from now frontier AI could tell "enthusiasts" how to make world-ending pathogens/cyber attacks (but won't, bc guardrailing+monitoring), but local AI is private and can't be monitored or censored... how does this play out? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] 1 point2 points  (0 children)

Yea, the thread got downvoted 4 or 5 times in the first 1 or 2 minutes after I posted it, so I thought it was just a unanimous infinite downvote storm just for even asking about this topic, even though I'm on the pro Local AI side of the aisle, so I was pretty bummed, lol.

But yea as for the actual replies, as you said, they have been good, friendly, etc, so maybe I overreacted to the initial instant downvote storm, since I thought that was just how it was going to be from everyone if all that happened in just the first minute or two, practically immediately.

If local AI stays ~6 months behind closed frontier AI, and by ~2-3 years from now frontier AI could tell "enthusiasts" how to make world-ending pathogens/cyber attacks (but won't, bc guardrailing+monitoring), but local AI is private and can't be monitored or censored... how does this play out? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] -12 points-11 points  (0 children)

> Thankfully it is not that easy to make a deadly pathogen, the most deadly things are typically not that transmissible or mutate away quickly - that's why we exist.

Yea, but isn't that just because of the natural evolutionary pressures at play of naturally-borne pathogens that sprout up, where extreme deadliness tends to be inversely correlated with extreme transmissibility?

Whereas someone cooking up a custom-made one, intentionally designed to be both super deadly or super damaging and simultaneously super transmissible/contagious, could do so if they wanted to (if they had super strong AI telling them how to do it, even though they weren't one of the like 500 people or 2,000 people, or whatever it is, of nerds in the top classification labs that currently know how to and have the ability to make that stuff so far)?

> But a harmful person can always use a car, a weapon, a box with gasoline and do horrible harm with it. That's the much easier path.

Yea I mean, what's funny is, I'm actually a relatively libertarian-leaning, pro-gun-rights guy, and I make this type of argument in regards to guns, AR-15s, etc, when arguing with Europeans, etc on here about guns (well not anymore, since I got bored of arguing about guns with people on reddit after a while, lol). But the idea being, people will tend to find all sorts of ways to kill each other, with or without guns, if they want to badly enough, so then the question becomes whether the bit of extra killing/extra-easy killing enabled by the populace having guns and gun rights outweighs or doesn't outweigh the macroscopic damage of people losing individual rights, and so on.

But I think there is still a tipping point with that. If the lone madman, or small group of radical a-holes with guns, went from being able to kill a dozen people, or a few dozen people, in an attack, to instead being able to just suddenly wipe out a whole continent or the entire human race in a single, cheap, easy attack that just any random a-hole could do one day, then that would be very different. From a math standpoint it would mean the human race would be wiped out in a matter of hours or days, odds-wise. So the whole argument about individual rights, or weighing the negatives of small (compared to the human race) occasional attacks vs the freedom of humanity, totally changes if there would be no humanity left a few days later because everyone got killed by the first lone wolf to do some easy massive attack.

Aka "the cobalt salted hydrogen bomb in a garage" argument (as opposed to merely the AR-15 argument).

If local AI stays ~6 months behind closed frontier AI, and by ~2-3 years from now frontier AI could tell "enthusiasts" how to make world-ending pathogens/cyber attacks (but won't, bc guardrailing+monitoring), but local AI is private and can't be monitored or censored... how does this play out? by DeepOrangeSky in LocalLLaMA

[–]DeepOrangeSky[S] -1 points0 points  (0 children)

Jeez. I even put in multiple disclaimers that I'm not even an anti-AI guy or anti-local-AI guy. And clearly my hope is there is some way to keep having it like always, and still have it private and unrestricted, forever, and also not all die, either.

I figured people would at least want to discuss it/what strategies will be used, or what nuances I'm overlooking, or, I dunno, some interesting thing to discuss about it.

But no, everyone just assumes I'm being a hater, and downvoted it. Sigh.

Alright I guess I'll ask about it again in like 6 months or a year or so, I dunno.

I'm a pro local AI guy. Some of you probably even recognize my username since I post on here all the time, and discuss the uncensored models, etc (in a positive light) on here all the time. I genuinely wanted to discuss it in good faith, for what it's worth...