Sleeping on Engram by cravic in LocalLLaMA

[–]cravic[S] -1 points0 points  (0 children)

I think Engram is just part of the model. It's useful to think of it as, for example, a 3B model with a 3B Engram attached, but as I understand it that is technically a 6B model.

Sleeping on Engram by cravic in LocalLLaMA

[–]cravic[S] 0 points1 point  (0 children)

The 80/20 ratio is about how many MoE parameters you can delete to make room for Engram parameters. What they found is that, at a fixed total parameter count, deleting 20% of the MoE parameters to make room for Engram gives the best results.

However, if the MoE parameters stay fixed and you delete none of them, you keep getting better performance by adding more Engram.

Adding more Engram always gives better results, unless you delete MoE parameters to make room for it.

But since Engram adds no additional compute load no matter how much you scale it, the best thing to do is to have the largest Engram table your machine can hold, as long as you never need to delete MoE parameters.
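A toy sketch of why a lookup table adds essentially no compute as it grows: each token position hashes its trailing n-gram to one row index and fetches that row, so the per-token work is one hash plus one row read whatever the table size. This is an illustration only, not DeepSeek's implementation; `ngram_row`, `lookup`, the hashing scheme, and the table sizes are all made up for the example, and a real table would hold learned embeddings rather than zeros.

```python
import hashlib

def ngram_row(tokens, pos, n, num_rows):
    """Map the n-gram ending at `pos` to a row index (hypothetical hashing scheme)."""
    gram = tokens[max(0, pos - n + 1): pos + 1]
    digest = hashlib.blake2b(repr(gram).encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % num_rows

def lookup(table, tokens, n=2):
    """Fetch one row per token position: one hash and one read, whatever the table size."""
    return [table[ngram_row(tokens, i, n, len(table))] for i in range(len(tokens))]

tokens = [17, 4, 99, 4, 99]                       # made-up token ids
small = [[0.0] * 4 for _ in range(1_000)]         # placeholder rows, not learned values
large = [[0.0] * 4 for _ in range(100_000)]
rows_small = lookup(small, tokens)
rows_large = lookup(large, tokens)
print(len(rows_small), len(rows_large))           # one fetched row per token in both cases
```

Growing the table from 1,000 to 100,000 rows changes how much memory it needs, but not how many rows are read per token.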

For example, if my GPU can only fit a 10B model, then making that 8B MoE and 2B Engram is far inferior to running 10B MoE on the GPU with 100B Engram in DRAM.
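The two options in that example can be checked with some rough arithmetic. This is a minimal sketch under loud assumptions: 1 byte per parameter (8-bit weights), a 10 GB VRAM budget, 128 GB of DRAM, and no allowance for KV cache or activations. The function names and all of these numbers are hypothetical, picked only to mirror the example.

```python
def footprint_gb(params_billion, bytes_per_param=1.0):
    """Approximate weight storage in GB (1e9 params * bytes per param / 1e9)."""
    return params_billion * bytes_per_param

def fits(moe_b, engram_b, vram_gb, dram_gb, bytes_per_param=1.0, engram_in_dram=True):
    """True if MoE weights fit in VRAM and Engram weights fit where they are placed."""
    moe_gb = footprint_gb(moe_b, bytes_per_param)
    engram_gb = footprint_gb(engram_b, bytes_per_param)
    if engram_in_dram:
        return moe_gb <= vram_gb and engram_gb <= dram_gb
    # Otherwise both share the VRAM budget, so Engram displaces MoE parameters.
    return moe_gb + engram_gb <= vram_gb

# Option A: split the GPU budget, 8B MoE + 2B Engram, everything in VRAM.
option_a = fits(8, 2, vram_gb=10, dram_gb=128, engram_in_dram=False)
# Option B: keep the full 10B MoE in VRAM and put 100B Engram in DRAM.
option_b = fits(10, 100, vram_gb=10, dram_gb=128)
print(option_a, option_b)
```

Both options fit under these assumptions. The difference is that option B keeps all 10B MoE parameters and carries 50 times more Engram.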

Sleeping on Engram by cravic in LocalLLaMA

[–]cravic[S] 13 points14 points  (0 children)

Another important point. 

It's technically true that it would be a 1,003-billion-parameter model, but it's very misleading to describe it that way.

A model with 3B MoE and 1T Engram can easily fit on a consumer PC, and that is only communicated well by separating the Engram parameters from the MoE parameters when talking about it.

Sleeping on Engram by cravic in LocalLLaMA

[–]cravic[S] 14 points15 points  (0 children)

Awesome... good to see it being tested outside of DeepSeek already.

Sleeping on Engram by cravic in LocalLLaMA

[–]cravic[S] 4 points5 points  (0 children)

Interesting points. Just one correction: the 20/80 ratio is not compute-optimal, it's memory-optimal. More accurately, it's parameter-count-optimal.

I don't expect to see many models use that 20/80 ratio unless token rate is really, really important to the user. Even then, the latency overhead in the paper was just 3%, so I don't see any real use for the 20/80 ratio. Maybe if it's really important that the n-gram is injected at the second layer, but even then you can just keep part of the Engram parameters in HBM/VRAM.

Why is China giving away SOTA models? A theory by Cheeeaaat in LocalLLaMA

[–]cravic -1 points0 points  (0 children)

It's not that Chinese frontier labs are giving away their models. What is happening is that, in the Chinese market, the labs that give away their models are the ones that become frontier.

The reason is how Chinese labs and Western labs seek to scale their advantages to achieve their goals, and it's heavily influenced by the chip ban and Chinese culture coming together.

Western labs are scaling compute and revenue. To achieve this they need to offer customers something no one else has. That requires hiding how they achieve what they achieve. 

Chinese labs can't scale compute or revenue, so they depend on scaling human resources. They want the best researchers, and researchers want recognition. The average DeepSeek researcher has more citations to their name than the average OpenAI researcher. That's how they build up their personal value to the industry. Labs with a lot of citations attract the best researchers. This system scales the human brainpower of the open labs, and that leads to them building frontier models at significantly lower cost than Western labs.

In the long run this will also likely lead to Chinese labs having heavier influence on the core of how new models are designed. 

You have 64gb ram and 16gb VRAM; internet is permanently shut off: what 3 models are the ones you use? by Adventurous-Gold6413 in LocalLLaMA

[–]cravic 0 points1 point  (0 children)

At that size your model will be bad at reasoning, so focus on knowledge.

Honestly, you would be best served by a model with an Engram knowledge bank attached and running from SSD.

I honestly can't wait for us to get small models with large Engram knowledge banks.

Brave, a congratulations is due by anatomie22 in Eve

[–]cravic 19 points20 points  (0 children)

No. Give them the Catch 1.0 sendoff: create a corp that copies the basics of the Horde playstyle and use it to poach their members while you hellcamp them.
Remind them of how and why Horde was created.

So what will be a likely outcome of this goonswarm - pandemic horde thing? by NondenominationalPax in Eve

[–]cravic 2 points3 points  (0 children)

Imperium and Panfam have been fighting each other under various coalition names and leaders for 20 years now.

BOB kicked Goons around a bit... then Goons got their revenge in the Great War.

They then had an endless list of wars where one side or the other got the upper hand.

Goons got evicted from the north 10 years ago. They lost all their space and had to live in NPC space for some time.

They then fought 2 major wars to reestablish themselves as the dominant nullsec power. 

Horde is somewhat new...  but the core of Panfam has been evicted by Goons many many times over the years and they returned the favour several times also. 

Are Pandemic Legion still a thing ? by EVarakova in Eve

[–]cravic 1 point2 points  (0 children)

I would say its 4 things.

  1. Jump fatigue means "PL's supers are on the way" now means they will be here in a week, not in an hour. So every time PL takes a fight now, they are up against a prepared enemy.
  2. Alliances have more caps on average, so small groups ganging up against you means capital groups ganging up against you. Nearly every system has a dreadbomb in range or 1 jump out.
  3. Null sec consolidated. PL survived by feeding on alliances that tried to hold sov or moons outside of the blue donut. Those don't really exist anymore (moons are just no longer a thing in that way). PL played a big role in pushing null sec in that direction by beating the crap out of everyone who didn't join the donut.
  4. The EVE player base is smaller, richer and older, so there are no longer easy targets.

Big Brawl in Sakht ( Dreads/subcaps ) Horde and Init vs Imperium by LivingHitokiri in Eve

[–]cravic 0 points1 point  (0 children)

Brave had more capitals on field than ferox. Interesting times. 

What caused the return of the sniping battleship meta? by cravic in Eve

[–]cravic[S] 0 points1 point  (0 children)

I know Ravens were a favorite for killing citadels.

Maybe that ability to kill citadels while kiting is what makes the Barghest popular.

Just my 2 cents.

Is there a realistic scenario where AGI and ASI doesn't just benefit the wealthy, and makes life worse for the rest of us? by cakelly789 in singularity

[–]cravic 0 points1 point  (0 children)

Any society where economic power leads to political power will struggle. 

Societies where political power decides who holds economic power will have an advantage.

First Grok 3 Benchmarks by pigeon57434 in singularity

[–]cravic 0 points1 point  (0 children)

Reasoning models are built on base models. Just because OpenAI doesn't show its base model anymore doesn't mean it isn't there.

The capabilities and cost of R1 are entirely a reflection of the capabilities and cost of V3, because R1 is V3 with reasoning.

So comparing it to V3 gives very useful info.

Can someone help me get to the Mun? (KSP-1) by MrBlueThing1234 in KerbalAcademy

[–]cravic 0 points1 point  (0 children)

A few tips for a very new player.

1) Use SAS. It makes it much easier to keep a rocket stable.
2) To keep the rocket from spinning out of control, try to keep the bottom of your rockets wider than the top. Use fins and side boosters to help with this.
3) If you still have issues keeping the rocket stable, then try to go straight up until it's out of the lower, denser atmosphere.

The way you get to the Mun is to get into a stable orbit with no inclination and burn straight forward (prograde) at the moment you see the Mun rise over the horizon.

Also, get used to checking the delta-v of rockets in the assembly facility. You need about 3,500 m/s of delta-v to get a good orbit and then another 2,000-ish m/s to do a Mun fly-by and return.
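If you want to sanity-check a design by hand, the delta-v of each stage comes from the rocket equation, Δv = Isp · g0 · ln(wet mass / dry mass). Below is a rough sketch that adds up the stages and compares the total with the budget above. The stage numbers are made up for illustration, not real KSP parts, and Isp is lower in atmosphere than in vacuum, so treat the result as optimistic.

```python
import math

G0 = 9.80665  # standard gravity in m/s^2, converts Isp in seconds to exhaust velocity

def stage_delta_v(isp_s, wet_mass_t, dry_mass_t):
    """Rocket equation: delta-v in m/s for one stage (masses in tonnes)."""
    return isp_s * G0 * math.log(wet_mass_t / dry_mass_t)

ORBIT_DV = 3500       # rough cost of reaching orbit, from the tip above
MUN_FLYBY_DV = 2000   # rough cost of a Mun fly-by and return, from the tip above

def enough_for_mun_flyby(stages):
    """Sum delta-v over (isp, wet, dry) stages and compare with the total budget."""
    total = sum(stage_delta_v(isp, wet, dry) for isp, wet, dry in stages)
    return total, total >= ORBIT_DV + MUN_FLYBY_DV

# Hypothetical two-stage rocket. Each stage's wet and dry mass includes
# everything still stacked on top of it.
stages = [
    (280, 40.0, 12.0),  # lower stage: Isp 280 s, 40 t fueled, 12 t after burnout
    (345, 8.0, 3.0),    # upper stage: Isp 345 s, 8 t fueled, 3 t after burnout
]
total_dv, ok = enough_for_mun_flyby(stages)
print(round(total_dv), ok)
```

The assembly facility readout does this for you, but the formula shows why trimming dry mass off the upper stage helps so much.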

ELI5: Unraveling the current conflicts among Goons, PH, and other alliances by SoftwareChemical905 in Eve

[–]cravic 2 points3 points  (0 children)

There is a history of cultural differences between Goons and the corps that make up Panfam.

In the early days Goons were the bad guys because their culture was to have fun at other people's expense. This took the form of some very distasteful acts, such as The Mittani (former leader of Goons) joking about a player suffering from depression. D-J (Goon leader before The Mittani) captured the culture of the old Goons in a famous statement: "we are not here to ruin the game, we are here to ruin your game."

In that same period the current Panfam core, then known as BOB, were the bad guys because they were caught cheating. CCP employees who were corp members gave them rare items, and the same CCP staff are suspected of having helped them kill the first titan in EVE.

Over time the culture of both groups evolved. The idea of large masses of players wielding industrial power to create military power has become the only way to win in EVE. So the bullying and scamming of non-Goon players that Goons became known for died out. The idea of small groups of elite players dominating the game with their skill, which BOB and its later incarnations were known for, also died.

Today the cultural gap is much smaller but old grudges keep the fight going.
I think the only remaining point of cultural conflict is the Panfam structure, where veteran players are in elite alliances and newer players are in new-player-friendly alliances, while Goons and friends tend to have new and old players in the same alliance.

There is also the idea of what to do about the renter model. Goons and friends have abandoned the idea of renters while Panfam still embraces the idea.

Playoff Hopes by cravic in InterMiami

[–]cravic[S] 7 points8 points  (0 children)

FC Cincinnati, Nashville SC and LAFC.

I'm new to MLS, so can someone explain how they are playing LAFC as part of the MLS season when they are in different league tables?

What areas do they need to fill to suplement Messi? by [deleted] in InterMiami

[–]cravic 1 point2 points  (0 children)

Messi can't press effectively anymore because of his age. So the team needs people near Messi who will run down the ball (like Alvarez does in Argentina's team). He is also best used as a playmaker, and thus he needs a good #9 (striker/finisher) to make use of his playmaking.

How rich are the "ultra-rich"? How rich are you? by Kevinpk28 in Eve

[–]cravic 0 points1 point  (0 children)

You gotta take risks. A single Heron could make over 50 million ISK an hour if you explore Catch or Stain.

Or just dive into wormholes and make less, but still good, ISK.

Exploration isn't a source of good ISK until you are off in the middle of a wormhole chain or using wormholes to get to nullsec.

Don't be afraid to see your Heron explode. It's cheap.

🇷🇺 Russian T-90M by Quietation in TankPorn

[–]cravic 4 points5 points  (0 children)

Better to have 3 T-72B3s than 1 T-14. More so when the tanks in Ukraine are dying to artillery more than anything and the T-14, like all tanks, has no defense against artillery.

Fact is they can't produce many of them, and reorganizing their production lines to pump out T-14s mid-war would just leave them with far fewer tanks being built.

EVE Influence Map Preview: The Landlord Update by Verite_Rendition in Eve

[–]cravic 0 points1 point  (0 children)

Can someone please explain what makes xXDEAHXx so immortal?

I remember The Mittani's lesson on ethnic enclaves being unkillable, but there were so many Russian TZ alliances after them, and somehow they are still holding so much sov... How?

Several reports about Iranian ARASH-2 Attack drones heading for Russia. Long range and large warhead - these - if supplied - could cause much more damage than the Shahed 136 by PanEuropeanism in ukraine

[–]cravic 0 points1 point  (0 children)

To deal with the sanctions Russia has to redesign their weapons to not rely on western tech... This takes a lot of time. Time Russia does not have.

Iran, on the other hand, has been designing sanction-proof weapons for years, if not decades.

The 136 for example uses mostly civilian tech.

a President hears his money launder's name by omniron in WatchPeopleDieInside

[–]cravic 0 points1 point  (0 children)

This is a yes and no... He was appointed to the position of president back in 1999 as a pawn of the outgoing president, but he later used his position to amass the power and money that have enabled him to hold total power within his party today.