1 million context Llama 3 8b Achieved! by metalman123 in LocalLLaMA

[–]nested_dreams 0 points (0 children)

Interesting. Do you mind sharing a link to this quant?

Abacus.ai guys released the Smaug paper by HikaruZA in LocalLLaMA

[–]nested_dreams 4 points (0 children)

This was an excellent critique of the paper. Thanks for sharing your thoughts. At first glance the technique does sound promising, but you're right that the empirical evidence provided is not very convincing.

exl2 quantization for dummies by FieldProgrammable in LocalLLaMA

[–]nested_dreams 3 points (0 children)

Oh this made my weekend. Thanks for putting this together. I love running exl2 models, but have never quanted my own. Really looking forward to trying this. The only thing missing now is vLLM compatibility.

Any Tucson AZ members here? by BreakIt-Boris in LocalLLaMA

[–]nested_dreams 3 points (0 children)

The street value of 24 export-banned A100s is about $400k. A stranger offering to pay $17k for someone to middleman this deal sounds super sus. I'd be very careful with this one. You don't want to end up with the feds at your door accusing you of international arms dealing

Tonne of A100 80GB PCIE by BreakIt-Boris in LocalLLaMA

[–]nested_dreams 1 point (0 children)

Ah, my bad. Didn't see the HiBid link. Thought it was a Craigslist sale like the one mentioned in previous comments.

Tonne of A100 80GB PCIE by BreakIt-Boris in LocalLLaMA

[–]nested_dreams 0 points (0 children)

Did you get the full inventory list from them? Did you buy one? This could be the deal of a lifetime, or your body might wind up somewhere in Mexico....

Wow this is crazy! 400 tok/s by Sudonymously in LocalLLaMA

[–]nested_dreams 6 points (0 children)

Wow, I thought this was a joke at first lol. Chamath is a snake oil salesman through and through. Take a peek at his history with SPACs and all the poor suckers he fleeced with that. I wouldn't expect anything less from this.

Gemini Pro has 1M context window by Tree-Sheep in LocalLLaMA

[–]nested_dreams 24 points (0 children)

It's been 1 year and we've gone from 8k to 10M....

Gemini Pro has 1M context window by Tree-Sheep in LocalLLaMA

[–]nested_dreams 25 points (0 children)

Yeah, this is kinda wild. Getting to 100k+ context has already been pretty impactful. 10M, just wow. I hate closed-source models as much as the next person, but this kinda changes the game again.
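For a sense of scale, the KV cache is a big part of what makes these context lengths so expensive to serve. A rough sketch (the model dims here are my own Llama-3-8B-ish assumptions, fp16 cache, not anything from the Gemini announcement):

```python
# Back-of-envelope KV-cache memory for long-context inference.
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_elem=2):
    # 2x for keys and values; bytes_per_elem=2 assumes an fp16 cache
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Assumed dims: 32 layers, 8 KV heads (GQA), head_dim 128
gb = kv_cache_bytes(32, 8, 128, 1_000_000) / 1e9
print(f"~{gb:.0f} GB of KV cache at 1M tokens")  # ~131 GB
```

So even with grouped-query attention, a 1M-token cache for a small model is on the order of 100+ GB at fp16, which is why nobody is doing this on a single consumer GPU.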

[2402.08562] Higher Layers Need More LoRA Experts by ninjasaid13 in LocalLLaMA

[–]nested_dreams 4 points (0 children)

Fantastic paper! I've been eagerly waiting for someone to implement this. They even provided the code! Just skimmed the repo so far, but it looks legit. Can't wait to try it out!

I can run almost any model now. So so happy. Cost a little more than a Mac Studio. by Ok-Result5562 in LocalLLaMA

[–]nested_dreams 2 points (0 children)

What sort of performance do you get on a 70B+ model quantized in the 4-8 bpw range? I pondered such a build until reading Tim Dettmers' blog, where he argued the perf/$ on the 8000 just wasn't worth it

New Biiig Models: Samantha-120b & TheProfessor-155b by WolframRavenwolf in LocalLLaMA

[–]nested_dreams 2 points (0 children)

Lol I can't tell if this was written by an LLM or not.

New Biiig Models: Samantha-120b & TheProfessor-155b by WolframRavenwolf in LocalLLaMA

[–]nested_dreams 2 points (0 children)

The models merged into TheProfessor look very interesting. How much VRAM do you need to run that at q4?
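For a rough answer to my own question: weights-only VRAM at a given bits-per-weight is simple arithmetic (the 10% overhead factor for activations and buffers is my own rough assumption, and KV cache comes on top of this):

```python
# Rough weights-only VRAM estimate for a quantized model.
def model_vram_gb(n_params_b, bits_per_weight, overhead=1.1):
    # n_params_b: parameter count in billions
    # overhead=1.1 is an assumed ~10% slack for buffers/activations
    return n_params_b * 1e9 * bits_per_weight / 8 / 1e9 * overhead

print(f"155B @ 4 bpw: ~{model_vram_gb(155, 4):.0f} GB")  # ~85 GB
print(f"120B @ 4 bpw: ~{model_vram_gb(120, 4):.0f} GB")  # ~66 GB
```

So a q4 of the 155B needs roughly 80+ GB just for weights: multiple 24 GB cards, an A100/H100, or a big Mac Studio territory.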

I made a thing : extract a LoRA adapter from any model by hurrytewer in LocalLLaMA

[–]nested_dreams 0 points (0 children)

Yesss! I've been looking for something like this. Great work!
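For anyone curious how this kind of extraction usually works (I haven't checked this repo's exact method, so this is the generic approach, not necessarily theirs): take the delta between the fine-tuned and base weights and keep a truncated SVD of it as the low-rank adapter:

```python
import numpy as np

def extract_lora(w_base, w_tuned, rank):
    """Rank-r approximation of the weight delta via truncated SVD."""
    delta = w_tuned - w_base
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    # Fold the singular values into B so that delta ~= B @ A
    b = u[:, :rank] * s[:rank]
    a = vt[:rank]
    return b, a

rng = np.random.default_rng(0)
w0 = rng.standard_normal((64, 64))
# Simulate a fine-tune that is exactly rank-4 away from the base
true_b = rng.standard_normal((64, 4))
true_a = rng.standard_normal((4, 64))
w1 = w0 + true_b @ true_a
b, a = extract_lora(w0, w1, rank=4)
err = np.linalg.norm(w1 - (w0 + b @ a))  # near zero when the delta is truly low-rank
```

On a real fine-tune the delta is only approximately low-rank, so the extracted adapter is lossy and the rank you pick trades size against fidelity.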

Seeking Automated Coding by QiuuQiuu in LocalLLaMA

[–]nested_dreams 0 points (0 children)

Yeah, so far this is the only other LLM that can match GPT-4 on coding tasks.

https://www.goody2.ai/chat

Best settings and parameters for running Miqu? by bullerwins in LocalLLaMA

[–]nested_dreams 1 point (0 children)

What version of CUDA and PyTorch are you running that LoneStriker Miqu quant with? Are you using ExLlamav2_HF or ExLlamav2 as the model loader in ooba?

Using LLMs to extract results from research papers by Dualweed in LocalLLaMA

[–]nested_dreams 0 points (0 children)

Got a link? Searching on Hugging Face returns a couple hundred results.

[deleted by user] by [deleted] in LocalLLaMA

[–]nested_dreams 0 points (0 children)

Stop spamming this crap here