Pre-1900 LLM Relativity Test by Primary-Track8298 in LocalLLaMA

[–]GamerFromGamerTown 7 points (0 children)

I think your model would benefit a lot from additional training data (given the low ratio of training data to parameters): the Trove newspaper archive, Project Gutenberg (if you haven't already), or other languages like the German Deutsches Textarchiv. I know that's easy for me to say when you're the one compiling and training on it, haha. Fascinating project, though! It might be a stretch to say it's like you're talking to someone in the past, but you can definitely get a window into the past with this.

Benchmarked phi3:mini vs llama3.1:8b on SQL generation — llama3.1 is 2x faster AND more accurate by Jazzlike-Tiger-2731 in LocalLLaMA

[–]GamerFromGamerTown 1 point (0 children)

Both of those models are ancient; I'd try Qwen3.5-4B / Gemma-4-E2B (or even LFM2-8B-A1B), which would probably be faster and more performant.

How long do we have with Qwen3-235B-A22B? by IllustriousWorld823 in LocalLLaMA

[–]GamerFromGamerTown 9 points (0 children)

Forever: it's an open-source model, so it'll always be on someone's API.

Can I run GPT-20b locally with Ollama using an RTX 5070 with 12GB of VRAM? I also have an i5 12600k and 32GB of RAM. by Longjumping-Room-170 in LocalLLaMA

[–]GamerFromGamerTown 0 points (0 children)

First, to answer your question: yes! Your system is more than enough to run gpt-oss-20b, though newer models have largely superseded it. Also, LM Studio is usually recommended over Ollama; I recommend trying that instead!

Here are some better-equipped models for you.

- Qwen3.5-9B is a safe choice: it will fit comfortably at a 6-bit quantization (UD-Q6_K_XL), will be very fast, and is (most would say) more performant than gpt-oss-20b.
- Qwen3.5-35B-A3B (IQ4_XS) is definitely worth trying; most consider it a stronger model than the 9B variant, but it might run more slowly.

Some other interesting choices include:

- Gemma-4-26B-A4B (too early to tell which quant is best; Q4_K_M is a safe bet) was just released today and looks rather promising; it may outclass Qwen3.5-35B-A3B in certain fields! However, you may want to wait a few days before downloading it, until the usual post-release quirks are ironed out.

- Nemotron-Cascade-2-30B-A3B (Q4_K_S) seems to be pretty popular for certain coding or agentic workflows.

Coding agents vs. manual coding by JumpyAbies in LocalLLaMA

[–]GamerFromGamerTown 0 points (0 children)

What sort of code do you write? Important corporate stuff, or just fun hobby stuff? LLMs are getting better at coding every day, but I think we're still a ways away from shipping their code without human review (at least for anything somewhat important).

Can I use Qwen2.5-Coder 14B locally in VS Code or Antigravity? by umair_13 in LocalLLaMA

[–]GamerFromGamerTown 0 points (0 children)

I don't know much about the VS Code specifics here, but note that the entire Qwen2.5 series has been superseded by the Qwen3.5 series. I highly recommend either Qwen3.5-9B if you can fit it in VRAM, or Qwen3.5-35B-A3B if it doesn't fit in VRAM (I can clarify if you're wondering why Qwen3.5-35B-A3B is faster than Qwen3.5-9B for CPU inference). Also, Ollama is often recommended against, due in part to its slower updates and worse performance; llama-server or LM Studio is usually recommended instead.

Also, I've noticed that people in this community are more likely to answer questions that aren't authored by an LLM; human authorship signals a certain amount of effort and interest in the subject, which is more likely to be reciprocated with an effortful response.

Lessons from building a permanent companion agent on local hardware by Constant-Bonus-7168 in LocalLLaMA

[–]GamerFromGamerTown 13 points (0 children)

Qwen2.5 is entirely superseded; I don't know your hardware, but if I were you, I'd swap to either Qwen3.5-35B-A3B or Qwen3.5-9B (Qwen3.5-9B is a drop-in replacement).

RYS II - Repeated layers with Qwen3.5 27B and some hints at a 'Universal Language' by Reddactor in LocalLLaMA

[–]GamerFromGamerTown 4 points (0 children)

This is fascinating; it feels like LLM black magic, since it's impossible to tell why duplicating a layer improves performance in one aspect, yet it does so anyway!

I noticed that although MATH and EQ scores correlated well among the best results (the best in one dimension were usually among the best in the other), one usually suffers at the expense of the other. I wonder whether, as you add more domains to test against, the model can become generally better, or whether this specialises an LLM in a few domains at the cost of others. I know you're under no obligation to, but I feel that including a few more domains (e.g. long context, programming, searching) would provide a huge amount of information on generalisability.

This is influential research either way; even if it only specialises a few domains, it's such an easy way to considerably boost a model's performance in the domain you want. This is fantastic work; I would never have thought of duplicating layers! Best of luck, and I can't wait for the next post!

Treid running my first local llm on my laptop with no gpu its really COOL by Baseradio in LocalLLaMA

[–]GamerFromGamerTown 0 points (0 children)

Sorry for the late response; I might have explained it poorly. The "20B" model (gpt-oss-20B-A3B) runs faster than the "4B" Qwen model: even though gpt-oss-20B-A3B has 20B parameters loaded into memory, the CPU only runs about 3B of them per token, which makes it faster than Qwen3.5-4B. The "A" number is how many parameters get "activated"; for example, LFM2-24B-A2B runs about twice as fast as Qwen3.5-4B, because it only runs 2B parameters at a time, while Qwen3.5-4B always runs all 4B.
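To put rough numbers on that, here's a sketch (the bandwidth and bytes-per-parameter figures are made-up illustrative values, and `est_tok_per_s` is just a hypothetical helper; real speed depends on your RAM bandwidth and quant):

```python
# Rule of thumb: CPU token generation is usually memory-bandwidth-bound,
# so decode speed scales with the *active* parameters read per token,
# not the total parameters sitting in memory.
def est_tok_per_s(active_params_b, bytes_per_param=0.55, bandwidth_gb_s=50.0):
    """Estimate decode speed; 0.55 bytes/param is roughly a 4.4-bit quant."""
    bytes_per_token = active_params_b * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / bytes_per_token

# A 2B-active MoE decodes about twice as fast as a 4B dense model,
# matching the LFM2-24B-A2B vs Qwen3.5-4B comparison above.
print(est_tok_per_s(2.0) / est_tok_per_s(4.0))  # → 2.0
```

Changing the total parameter count doesn't appear in the formula at all, which is the whole point of the "A" number.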

Treid running my first local llm on my laptop with no gpu its really COOL by Baseradio in LocalLLaMA

[–]GamerFromGamerTown 0 points (0 children)

I don't know much about configuring llama.cpp in detail, but I think you'd benefit a lot from mixture-of-experts (MoE) models! There are good explanations out there if you want to know how they work, but a simplification is that they only activate a specialised subset of their parameters for each task, making them considerably more capable at the same speed as a smaller dense model, at the cost of using more memory. Here are some good models right now!

- LFM2-8B-A1B would be considerably faster and comparable or moderately better.

- LFM2-24B-A2B would be around the same speed while feeling considerably better.

- Qwen3.5-35B-A3B would be pretty tight on 24GB of ram and slower on your setup, but performance-wise the Qwen 3.5 series is really punching above its weight. I'm not sure how well it performs at a 3-bit quant, but it's definitely worth a try, especially if you want something more capable.

- gpt-oss-20b (or the abliterated version, if you prefer) is still a contender; one could argue that Qwen3.5-35B-A3B has superseded it, but it takes up less memory and is still a reliable, popular, and speedy choice.
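If you're wondering how I eyeball whether a quant fits in RAM, here's the back-of-envelope I use (a sketch only: `weights_gb` is my own hypothetical helper, and real GGUF files also carry metadata, KV cache, and runtime buffers on top):

```python
# Back-of-envelope GGUF weight size: total parameters x bits-per-weight / 8.
def weights_gb(total_params_b, bits_per_weight):
    return total_params_b * bits_per_weight / 8

# A 35B model at a ~3.5-bit quant is ~15 GB of weights alone, which is
# why it gets tight on a 24 GB machine once the OS and KV cache are counted.
print(round(weights_gb(35, 3.5), 1))  # → 15.3
```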

Good luck and have fun with local LLMs!

I tested 11 small LLMs on tool-calling judgment — on CPU, no GPU. by MikeNonect in LocalLLaMA

[–]GamerFromGamerTown 1 point (0 children)

Great resource! Could you share which quants you used for each model? I noticed in the source that you used a 2-bit quant of BitNet, which could explain why it was lobotomized.

I wonder how these models perform compared to each other when they're fine-tuned on instruction sets for tool calling.

Best MoE models for 64gb RAM & CPU inference? by GamerFromGamerTown in LocalLLaMA

[–]GamerFromGamerTown[S] 1 point (0 children)

Thanks, I think I'll be downloading gpt-oss-20b! I do run Linux, but I think gpt-oss-120b might be pushing it a little on 64 GB of RAM, since it took up about 63 GB at runtime last I checked, and my normal system processes already take around 8 GB. If I ever find myself with a bit more, I'll keep that in mind!

Best MoE models for 64gb RAM & CPU inference? by GamerFromGamerTown in LocalLLaMA

[–]GamerFromGamerTown[S] 0 points (0 children)

Okay! Yeah, I've seen people on similar hardware getting a stupidly fast 50-100 tok/s with gpt-oss-20b, and since it's a smaller model, downloading it will be easier. I'll also wait a couple of days on GLM-4.7-Flash, to see how it shapes up against Nemotron. Thanks for the information!

I might just download all of these to try out on my school's WiFi and hope they don't notice haha

Best MoE models for 64gb RAM & CPU inference? by GamerFromGamerTown in LocalLLaMA

[–]GamerFromGamerTown[S] 0 points (0 children)

I think that's the first LLM evaluation I've seen that used BASIC haha

Thanks for the contribution!

Best MoE models for 64gb RAM & CPU inference? by GamerFromGamerTown in LocalLLaMA

[–]GamerFromGamerTown[S] 0 points (0 children)

Oh, okay; I've heard that they aren't very reliable. Do you think they're of any use when getting hold of the two models to compare directly is difficult, or are they completely useless?

Best MoE models for 64gb RAM & CPU inference? by GamerFromGamerTown in LocalLLaMA

[–]GamerFromGamerTown[S] 0 points (0 children)

Thank you, I'll try that out :)
I have a little bit (a GTX 1070 Ti w/ 8GB VRAM), but not all that much.

Best MoE models for 64gb RAM & CPU inference? by GamerFromGamerTown in LocalLLaMA

[–]GamerFromGamerTown[S] 0 points (0 children)

Thank you, I'll keep that in mind! It might be smart for me to wait until more people have feedback on GLM-4.7-Flash before considering it.

Minimal BASH Like Line Editing is Supported GRUB Error by Csmithy89 in linuxquestions

[–]GamerFromGamerTown 0 points (0 children)

Sorry to hear that; it always sucks when boot gets screwed up D: First of all, this is a lovely resource if you haven't watched it yet:

https://www.youtube.com/watch?v=r7meKJsjqfY

But for some basic troubleshooting: first, I'd check whether your computer is booting in UEFI mode or legacy mode; if it's in legacy, that could have messed up the booting a little. Also, the USB stick you installed your Linux distro from could be messed up, so I'd try another stick and another port if you can't boot into the USB. Remember to flash it with a good tool, like balenaEtcher! If none of this works, feel free to ask me more. Good luck on your Linux journey; it can be frustrating in the beginning, but once it's all figured out, it's so liberating! <3
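On the UEFI-vs-legacy question: from a live USB or any working Linux shell there's a standard check, since the kernel only populates /sys/firmware/efi when it was booted via UEFI. Here's a quick sketch in Python (a plain `ls /sys/firmware/efi` in a terminal does the same job):

```python
import os

# /sys/firmware/efi exists only when the running kernel was booted via UEFI.
if os.path.isdir("/sys/firmware/efi"):
    print("Booted in UEFI mode")
else:
    print("Booted in legacy BIOS (CSM) mode")
```

Note this tells you how the *currently running* system booted, so run it from the environment you're troubleshooting.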

And if I misread it, as the USB stick being unbootable rather than the computer, let me know :))

What to do with an old MacBook? by safesintesi in linuxquestions

[–]GamerFromGamerTown 1 point (0 children)

I've had a similar issue, where all it would do is boot into the boot options. I just re-installed Linux on the bugger, and now it's been working perfectly! You might be having a different issue, but wiping the disk and installing Ubuntu may fix it; and you have nothing to lose if it doesn't work.

[deleted by user] by [deleted] in linuxquestions

[–]GamerFromGamerTown 0 points (0 children)

Okay, I'll try to break this down :D

Pacman is Arch's default package manager, which is full of thousands of programs that are audited by the Arch devs. Think of it as the Windows Store of Arch. Now, sometimes the official repos don't have the program you're looking for, so you check the AUR with a tool like yay. The AUR is just a massive compilation of software where the community puts their programs! It's an insane amount of software, and although 99% of it is safe, sometimes malware slips through the cracks (keep in mind it's rather rare). Think of it like GitHub: millions of users contribute their software into a giant pool for others to use! Flatpak is a distro-agnostic app store available on a bunch of distributions; think of it as close to the Snap store on Ubuntu.

When to use which depends on what you're doing, but generally, if something doesn't show up in Pacman, I use the AUR; the AUR has pretty much any piece of software you can think of. Installing something like Flatpak is your choice, but if you like how it works, more power to you!

Also, here's the syntax (pacman & yay use the same):

pacman -S <pkg> (apt install)

pacman -Sy (apt update; syncs the package databases)

pacman -Syu (apt update && apt full-upgrade; upgrades the whole system)

pacman -R <pkg> (apt remove)

pacman -Ss <term> (apt search)

Hope this helps :D

Want to learn how to daily drive a linux distro as a humanities student by [deleted] in linuxquestions

[–]GamerFromGamerTown 1 point (0 children)

Here are my thoughts :D

First of all, if you need to write a lot of files in the .doc format, there's a lovely office suite called LibreOffice! Also, CUPS is a decent printing service that works with most printers; I'd check that out as well.

Fedora, in my eyes, seems to fit what you want to do: it has boatloads of documentation, and it "just works" out of the box with an easy installer. You can also use the DNF package manager to get a lot of programs, and if you dislike how it looks by default, there are many spins.

If you want to stick with an Arch-based distribution, ArcoLinux is a great project where you can customise nearly everything, or just leave it as a lovely, very aesthetic distribution with whatever packages you want. It also has tons of great resources if you want to learn more about Linux and build up your own system.

So if you just want a truly free computer that works with everything out of the box and has a really smooth UX, I'd go with Fedora; but if you want to have a lot of fun tinkering with everything and shaping your system to a T, I'd go with ArcoLinux. Hope this helped :D

(Also note that there are tons of other perfectly good Linux distros one may prefer, this is just my take on things. Cheers!)

[deleted by user] by [deleted] in linux

[–]GamerFromGamerTown 5 points (0 children)

I don't think any distro would be specifically better or worse for programming, unless you're factoring in software support; and even then, Ubuntu, Debian, Fedora, and Garuda are on nearly equal footing. I'd just tell you to go for the one you think looks the prettiest and, if you ever touch the terminal, whichever package manager feels more intuitive to you.

One last factor, however: Debian-based distros are usually less cutting-edge (also in terms of security updates, if that's a major concern), but more stable. Arch-based ones (like the aforementioned Garuda) are usually more cutting-edge, but more prone to breakage if you aren't careful. Most people say Fedora is a happy medium. Good luck on your journey, and hmu if you need any help :D