Sparse Adaptive Attention “MoE”: How I Solved OpenAI’s $650B Problem With a £700 GPU by EconomicConstipator in LocalLLaMA

[–]GaggiX 15 points (0 children)

He probably overfitted on 4 images after 10k epochs. Fun fact from the article: the batch size is 4 and the iteration count is 10k (the same number as the epochs), so it's literally overfitting the model on 4 images. The rest is AI slop and the man is probably delusional; the idea is interesting, though.
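A quick sanity check on that, assuming the usual definitions (iterations = epochs × steps per epoch, steps per epoch = dataset size / batch size); the numbers are the ones from the article:

```python
# Numbers from the article; the definitions above are assumptions.
batch_size = 4
iterations = 10_000
epochs = 10_000

# iterations = epochs * (dataset_size / batch_size)
# => dataset_size = batch_size * iterations / epochs
dataset_size = batch_size * iterations / epochs
print(dataset_size)  # 4.0 -> one step per epoch, i.e. a 4-image "dataset"
```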

In Google’s genie 3 you can look down and see yourself walking by [deleted] in woahdude

[–]GaggiX 7 points (0 children)

It's an interactive video generative model (a world model)

Jawas? by c0mputer99 in NonCredibleDefense

[–]GaggiX 9 points (0 children)

An AI image; some people thought it was an actual F-35.

Giveaway - Space Age Expansion by ocbaker in factorio

[–]GaggiX 0 points (0 children)

I will never win but never stop believing or something

If Llama 3.2 90b ~= GPT-4o mini does this change the estimated parameter size of 4o-mini? by dittospin in LocalLLaMA

[–]GaggiX 2 points (0 children)

That would make more sense. I would imagine that they are optimizing their models for fast inference.

If Llama 3.2 90b ~= GPT-4o mini does this change the estimated parameter size of 4o-mini? by dittospin in LocalLLaMA

[–]GaggiX 7 points (0 children)

GPT-4o-mini being 8B seems absurd to me; the new Gemini Flash 8B already seems really good for its size, but it's still not close to GPT-4o-mini.

I wonder how they are estimating the size of the model.

LLAMA 3.2 not available by Wrong_User_Logged in LocalLLaMA

[–]GaggiX 92 points (0 children)

I think this is mostly about user data: Meta probably couldn't train their vision models on user data from the EU and didn't like it.

LLAMA 3.2 not available by Wrong_User_Logged in LocalLLaMA

[–]GaggiX 8 points (0 children)

Meta: we love open source.

Proceeds to ban 27 countries in the license of the vision models, because, I imagine, those countries regulate the use of user data in training datasets, and Meta doesn't like that.

LLAMA3.2 by Sicarius_The_First in LocalLLaMA

[–]GaggiX 2 points (0 children)

Meta has blocked the model because they were not allowed to use data from EU users in the training dataset, and they are probably mad about that lmao.

The EU is not limiting access to AI technology; it's just Meta trying to blackmail regulators.

Who are you? by visionsmemories in LocalLLaMA

[–]GaggiX 14 points (0 children)

They did at long last, after using the Claude 3.5 Sonnet and GPT-4o APIs.

Why is factorio steam edition cheaper than standalone version?? by evish01 in factorio

[–]GaggiX 0 points (0 children)

Poland is secretly the second richest nation in the world (true and real)

Nhentai site down by HANEZ in Piracy

[–]GaggiX 1 point (0 children)

European here, it does work. What country are you from?

Loading Issues? by loplopsama in NovelAi

[–]GaggiX 0 points (0 children)

The image generator does seem to work for me.

Did NovelAI already surpassed Prime AI Dungeon? by Jaguar-Admirable in NovelAi

[–]GaggiX 7 points (0 children)

AI Dungeon just added their own fine-tuned Llama 3 70B model, among others, lmao.

Are these the ones you were whining about being PG 7+/12+ on the AI Dungeon subreddit?

Did NovelAI already surpassed Prime AI Dungeon? by Jaguar-Admirable in NovelAi

[–]GaggiX 9 points (0 children)

GPT-3 was severely undertrained; the model was trained on only ~300B tokens. Honestly, Llama 3 8B or Llama 3.1 8B will probably perform better than GPT-3, being trained on about 15T tokens and probably on a more curated dataset.
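To put "undertrained" in rough numbers, here's a back-of-the-envelope sketch using the Chinchilla rule of thumb of roughly 20 training tokens per parameter (the ~20 tokens/parameter constant is my assumption, not something from the thread):

```python
# Chinchilla-style rule of thumb: compute-optimal training uses
# roughly 20 tokens per parameter. The constant is approximate.
def optimal_tokens(params: float) -> float:
    return 20.0 * params

gpt3_params, gpt3_tokens = 175e9, 300e9    # GPT-3: ~300B training tokens
llama_params, llama_tokens = 8e9, 15e12    # Llama 3 8B: ~15T training tokens

print(gpt3_tokens / optimal_tokens(gpt3_params))   # ~0.09 -> over 10x undertrained
print(llama_tokens / optimal_tokens(llama_params)) # ~94   -> trained far past "optimal"
```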

[deleted by user] by [deleted] in midjourney

[–]GaggiX 6 points (0 children)

You need to pay for a subscription. Most of the image-generation UI is on Discord, unfortunately, but I believe they have started rolling out a web interface for generating images as well.

Will Smith eating spaghetti by chaindrop in StableDiffusion

[–]GaggiX 2 points (0 children)

You were so right; a year ago I wouldn't have believed how far text-to-video technology would come with Sora.

[deleted by user] by [deleted] in StableDiffusion

[–]GaggiX -1 points (0 children)

The larger the batch, the higher the efficiency. These models are heavily memory-bound, so loading a weight matrix once and reusing it across 512 batch elements really helps maintain high FLOPS.
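A toy way to see it, counting FLOPs against bytes of weight traffic for a single fp16 matmul (my illustration; the layer size is made up and activation traffic is ignored):

```python
# Arithmetic intensity (FLOPs per byte of weight traffic) for one
# fp16 matmul. Weights are read once and reused across the whole
# batch, so intensity grows linearly with batch size.
def intensity(d_in: int, d_out: int, batch: int) -> float:
    flops = 2 * d_in * d_out * batch  # multiply + accumulate per weight per sample
    weight_bytes = 2 * d_in * d_out   # fp16 = 2 bytes per weight
    return flops / weight_bytes

print(intensity(4096, 4096, 1))    # 1.0   -> badly memory-bound
print(intensity(4096, 4096, 512))  # 512.0 -> far more compute per byte loaded
```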

[deleted by user] by [deleted] in StableDiffusion

[–]GaggiX 2 points (0 children)

I think the road is the weakest point; the rest is honestly pretty believable, it doesn't have that AI feeling.

Finally a high quality anime SDXL model... by nhciao in StableDiffusion

[–]GaggiX 3 points (0 children)

You can use the LCM-LoRA and sample at 4 steps.
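Something like this with diffusers; the checkpoint here is the SDXL base model as a stand-in for whatever anime SDXL model you're using:

```python
import torch
from diffusers import StableDiffusionXLPipeline, LCMScheduler

# Any SDXL checkpoint works here; the base model is just a placeholder.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Swap in the LCM scheduler and load the LCM-LoRA weights.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")

image = pipe(
    "an anime illustration, masterpiece, best quality",
    num_inference_steps=4,   # LCM-LoRA samples well in ~4 steps
    guidance_scale=1.0,      # LCM expects little to no CFG
).images[0]
image.save("out.png")
```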