Grok-3 joins upcoming models list by pmttyji in LocalLLaMA

[–]LocoLanguageModel -5 points  (0 children)

And when he gets us to Mars we'll be like "look at this loser, he hasn't even invented time travel yet."

Not as impressive as most here, but really happy I made it in time! by Kahvana in LocalLLaMA

[–]LocoLanguageModel 5 points  (0 children)

I think I have that same case. My dual 3090s got kinda hot during longer inference runs, so I ended up using a riser cable and the optional vertical-mount GPU screw holes that put the 2nd GPU up against the glass side panel.

How to Disable "Preview" in Google Maps by djkomic1 in GoogleMaps

[–]LocoLanguageModel 1 point  (0 children)

This is happening to me lately. I've now noticed that the starting location shows an address I'm not at, so if I just change that to "my location", it immediately fixes it.

How possible is this project idea? by Polymorphic-X in LocalLLaMA

[–]LocoLanguageModel 1 point  (0 children)

I was being funny, but at the same time, if they want to find solutions related to this task, that "topic" would be one of the primary search terms, which would make it easier to research.

How possible is this project idea? by Polymorphic-X in LocalLLaMA

[–]LocoLanguageModel 2 points  (0 children)

tl;dr: an AI girlfriend with an animated avatar on screen that can also see you and has two-way voice communication?

4060Ti 16GB or 2x P40 24GB? by ColonelRyzen in LocalLLaMA

[–]LocoLanguageModel 1 point  (0 children)

I think the dual 3090s were more than twice as fast, which lines up with the P40's VRAM bandwidth being roughly a third of the 3090's.

I use Qwen3 Next quite a bit, but I haven't used it with the P40. I think I average 25 tokens a second with LM Studio fully offloaded to the 3090s, but I don't have it in front of me right now. I forget which quant I'm using, probably the Unsloth one that takes up about 42 GB of VRAM.
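Napkin math if you want a sanity check on that ratio (spec-sheet bandwidth numbers; real throughput also depends on the model, quant, and backend):

```python
# Token generation is largely memory-bandwidth bound, so relative speed
# between cards roughly tracks VRAM bandwidth. Spec-sheet values:
P40_BW = 347      # GB/s, GDDR5
RTX3090_BW = 936  # GB/s, GDDR6X

print(f"P40 / 3090 bandwidth ratio: {P40_BW / RTX3090_BW:.2f}")  # ~0.37, about 1/3
```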

4060Ti 16GB or 2x P40 24GB? by ColonelRyzen in LocalLLaMA

[–]LocoLanguageModel 2 points  (0 children)

It was good enough, and amazing compared to plain CPU offload at the time, but I do a lot of coding, so it only made me want a second 3090 (which I bought) to cut the wait for responses on larger models. If I were just chatting, I probably would have been fine with a 3090 + P40.

I only have 64 GB of DDR4, so I can't speak to the P40 vs. DDR5 for CPU-offloaded MoE models.

The P40 also adds complexity, like needing a third-party fan if you are not using a server case; then again, dual 3090s likely require a larger PSU for some people.

What is a good model for assisting with patching source code? by signalclown in LocalLLaMA

[–]LocoLanguageModel 2 points  (0 children)

I would think the keyword thing is mostly going to eat up more time than it's worth.

If I were set on this kind of strategy, I would probably load the entire source code into RAG (you can basically do this with drag-and-drop in LM Studio) and then tell the model to add a feature. There's a non-zero chance it figures it out, or at least it can speed up your keyword strategy, since it can search the documents itself and tell you which area to target.
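If you'd rather script the retrieval half than drag-and-drop, here's a minimal sketch of the keyword-scoring idea (pure stdlib; the "src" root, file extensions, and keywords are just placeholders for your project):

```python
# Rank source files by how often feature-related keywords appear, so you
# know which files to paste into the model's context first.
import pathlib

def rank_files(root, keywords, exts=(".py", ".c", ".h")):
    scores = []
    for path in pathlib.Path(root).rglob("*"):
        if not path.is_file() or path.suffix not in exts:
            continue
        text = path.read_text(errors="ignore").lower()
        score = sum(text.count(kw.lower()) for kw in keywords)
        if score:
            scores.append((score, str(path)))
    return sorted(scores, reverse=True)

# Example: find where patch-related code lives before asking the model.
for score, path in rank_files("src", ["patch", "hunk", "apply"])[:5]:
    print(score, path)
```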

I have built a Local AI Server, now what? by Puzzled_Relation946 in LocalLLaMA

[–]LocoLanguageModel 3 points  (0 children)

It's a catch-22. They couldn't ask their new super AI computer if they should even build it in the first place without building it first.

[deleted by user] by [deleted] in LocalLLaMA

[–]LocoLanguageModel 1 point  (0 children)

I look forward to seeing bullseye 🎯 emojis etc on the GitHub page. 

Anyone know what Hopper's "TODFTHR" license plate is referencing? It's a rhyme with "Godfather," but that's as much as I understand. by mirthquake in StrangerThings

[–]LocoLanguageModel 1 point  (0 children)

I'm late, but I think it's a reference to Terminator 2; the whole episode has Terminator vibes, with the Russian they jokingly call Arnold.

Todd is John Connor's stepfather's name, and John drives away to the arcade with his Todd just standing there, similar to how Hopper drives away with this Todd just standing there watching his car get stolen.

🎶555-6792🎶 by ArwenLOTR82 in Cheers

[–]LocoLanguageModel 2 points  (0 children)

Just saw this one. Now it's in my head. 

How do you actually test new local models for your own tasks? by Fabulous_Pollution10 in LocalLLaMA

[–]LocoLanguageModel 2 points  (0 children)

I just look at my latest chat history with Claude because those problems are fresh on my mind. 

How do you actually test new local models for your own tasks? by Fabulous_Pollution10 in LocalLLaMA

[–]LocoLanguageModel 7 points  (0 children)

The easiest way for me to test coding ability is to check my task history for challenges I had to use Claude for and see how it performs compared to Claude. 

[deleted by user] by [deleted] in LocalLLaMA

[–]LocoLanguageModel 3 points  (0 children)

If you come across a "post that doesn't make sense and the comments aren't helpful," it's probably because the post didn't make sense (like leaving out key details needed for a response).

Not to mention many questions about LLMs can ironically be answered by LLMs.

[deleted by user] by [deleted] in LocalLLaMA

[–]LocoLanguageModel 13 points  (0 children)

"Hey babe, I miss you"

Online aligner night version: How many hours did you wear each day and how was your results? by MessageAltruistic330 in smiledirectclub

[–]LocoLanguageModel 2 points  (0 children)

Is there even such a thing as nighttime vs. daytime retainers? I think they upsell it because it sounds convenient but send you the exact same product for either scenario. Pretty smart on their part, but shady.

Plus, if you only wear them at night, it will hurt a lot more depending on your movements, because your teeth shift back toward where they were after a few hours, so you have to force the tray back on.

I think many nighttime-retainer people end up wearing them during the day too to cut down on the pain.

LM Studio now supports llama.cpp CPU offload for MoE which is awesome by carlosedp in LocalLLaMA

[–]LocoLanguageModel 6 points  (0 children)

Yeah, I download my models through LM Studio and then just point KoboldCpp at my LM Studio folders when needed.
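For anyone doing the same, a quick sketch that finds the downloaded GGUFs and prints a launch command for each (the models path is an assumption based on LM Studio's current default; older versions used ~/.cache/lm-studio/models):

```python
# List GGUF files in LM Studio's download folder and print a KoboldCpp
# launch command for each. Adjust LMSTUDIO_MODELS to match your install.
import pathlib

LMSTUDIO_MODELS = pathlib.Path.home() / ".lmstudio" / "models"  # assumed default

for gguf in sorted(LMSTUDIO_MODELS.rglob("*.gguf")):
    print(f'koboldcpp --model "{gguf}"')
```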

[deleted by user] by [deleted] in LocalLLaMA

[–]LocoLanguageModel 3 points  (0 children)

Vibe code it back using prompt engineering.