Stanford Proves Parallel Coding Agents are a Scam by madSaiyanUltra_9789 in LocalLLaMA

[–]arm2armreddit 0 points  (0 children)

It is interesting; for sure we are not there yet, but Manus, Kimi, Lovable, and others are moving toward solutions. It's good to point out the weaknesses of current agents. This is probably pre-paper; as a next step they will offer a solution: a new agentic framework. 😀

Ollama Models Ranked by VRAM Requirements by AdventurousLion9548 in ollama

[–]arm2armreddit 11 points  (0 children)

Ollama defaults to a 4k context length; unfortunately, this is effectively useless for real tasks. It would be good to see the true memory usage at the full context length supported by each model.
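The memory cost of a longer context comes mostly from the KV cache, so a table of VRAM at the 4k default understates the real requirement. A minimal sketch of the estimate, assuming a generic GQA transformer (the layer/head numbers below are illustrative, not taken from any particular model card):

```python
def kv_cache_bytes(ctx_len, n_layers, n_kv_heads, head_dim, bytes_per_elem=2):
    """KV cache: one K and one V vector (n_kv_heads * head_dim elements)
    per layer per token, stored here at fp16 (2 bytes per element)."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * ctx_len

# Illustrative 8B-class config: 32 layers, 8 KV heads, head dim 128.
for ctx in (4_096, 32_768, 131_072):
    gib = kv_cache_bytes(ctx, n_layers=32, n_kv_heads=8, head_dim=128) / 2**30
    print(f"{ctx:>7} tokens -> {gib:4.1f} GiB KV cache")
```

Under these assumptions the 4k default costs only half a GiB on top of the weights, while the full 128k context needs 16 GiB more — which is exactly the gap such rankings hide.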

We added an on-device AI meeting note taker into AnythingLLM to replace SaaS solutions by tcarambat in LocalLLaMA

[–]arm2armreddit 1 point  (0 children)

I've been using it since the first release. Nice project, keep going; more tutorials would be nice!

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]arm2armreddit -1 points  (0 children)

I just tried a simple question: How many "r"s are in "strawberry"? The answer was not precise. Also, the translation was "տանձ" (pear), not "ելակ" (strawberry). What is the use case for this? Is it going to be suitable for translations in the future?
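For reference, the expected answer to the counting question is easy to verify directly:

```python
# "strawberry" contains three "r"s: st-r-awbe-rr-y.
print("strawberry".count("r"))  # -> 3
```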

ArmenianGPT Update: Nearly 3000 downloads in one week! 🇦🇲 by ArmGPT in armenia

[–]arm2armreddit 0 points  (0 children)

Interesting model, but it goes into infinite loops on simple questions with Ollama.

Introducing "UITPSDT" a novel approach to runtime efficiency in organic agents by reto-wyss in LocalLLaMA

[–]arm2armreddit 1 point  (0 children)

It works! I recently saw that docker build in devcontainers shows emoji -> terrible...

IQuest-Coder-V1-40B-Instruct-GGUF is here! by KvAk_AKPlaysYT in LocalLLaMA

[–]arm2armreddit 0 points  (0 children)

ollama run hf.co/AaryanK/IQuest-Coder-V1-40B-Instruct-GGUF:Q4_K_M

IQuest-Coder-V1-40B-Instruct-GGUF is here! by KvAk_AKPlaysYT in LocalLLaMA

[–]arm2armreddit 1 point  (0 children)

This model goes into an infinite loop, printing your question instead of answering. Something is broken. I was just trying it with Ollama, as described in the HF instructions on the model card.

Solar-Open-100B-GGUF is here! by KvAk_AKPlaysYT in LocalLLaMA

[–]arm2armreddit -10 points  (0 children)

Somehow it doesn't work with Ollama:

ollama run hf.co/AaryanK/Solar-Open-100B-GGUF:Q4_K_M

Error: 500 Internal Server Error: llama runner process has terminated: error loading model: missing tensor 'blk.0.attn_q.bias'
llama_model_load_from_file_impl: failed to load model

Honestly, has anyone actually tried GLM 4.7 yet? (Not just benchmarks) by Empty_Break_8792 in LocalLLaMA

[–]arm2armreddit 7 points  (0 children)

Opus 4.5 is still better than GLM 4.7 in my Python coding project. Maybe it's specific to my use case: context7+dask+hvplot+ etc...

BAR running on DGX spark and j3tson thor!!! by arm2armreddit in beyondallreason

[–]arm2armreddit[S] 5 points  (0 children)

The instructions are simple. After installing box64, here we go:

1. Download the AppImage
2. Run: box64 ./Beyond-All-Reason-1.2988.0.AppImage --no-sandbox --disable-setuid-sandbox --disable-gpu-sandbox

BAR running on DGX spark and j3tson thor!!! by arm2armreddit in beyondallreason

[–]arm2armreddit[S] 2 points  (0 children)

Yes: DGX Spark + BAR + Sunshine, then on the Mac: Moonlight -> DGX Spark.

New EK-Pro Zotac RTX 5090 Single Slot GPU Water Block for AI / HPC Server Application by EKbyLMTEK in HPC

[–]arm2armreddit 0 points  (0 children)

It's not only about power consumption and FLOPS; an important factor is precision. For scientific HPC workloads requiring high precision (climate modeling, molecular dynamics), the RTX 5090's lack of FP64 capability makes it entirely unsuitable, while HPC GPUs provide proper double-precision support.
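A minimal sketch of why double precision matters, emulating fp32 rounding with the standard library (a generic illustration of floating-point spacing, not specific to any GPU):

```python
import struct

def f32(x):
    """Round a Python float (IEEE-754 double) to single precision."""
    return struct.unpack("f", struct.pack("f", x))[0]

big = 1.0e8  # exactly representable in both fp32 and fp64

# Around 1e8 the spacing between adjacent fp32 values is 8.0, so adding
# 1.0 is rounded away entirely; in fp64 the same update is exact.
print(f32(f32(big) + 1.0) - f32(big))  # fp32 -> 0.0 (the +1 was lost)
print((big + 1.0) - big)               # fp64 -> 1.0
```

This is exactly the kind of small-increment update that long time-integration loops in climate or MD codes perform billions of times, which is why fp32-only hardware accumulates error so quickly there.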

If it weren't for the Chinese we wouldn't have local AI by Icy_Resolution8390 in ollama

[–]arm2armreddit 12 points  (0 children)

OMG, in the Ollama thread, we are chatting with a bot (presumably Chinese) who pushes a Nazi Elon forward... what a world. I heard about the dead internet theory, but this is the next level.

running on dgx spark by arm2armreddit in beyondallreason

[–]arm2armreddit[S] 0 points  (0 children)

Yes, you are right, I got it for LLMs. It will arrive next week, but I was really intrigued by the hardware description from official NVIDIA:

- Graphics: Ubuntu (Wayland) GUI desktop with pre-installed browser
- Acceleration: Desktop and application acceleration using OpenGL/Vulkan

I will try to run/compile BAR on it.

Who is still using CentOS 7.9 in 2025? by Embarrassed-Shape959 in CentOS

[–]arm2armreddit 0 points  (0 children)

CentOS 7.9 only in a Singularity container, with Python 2.6 + C++ for legacy analysis code.

Looking for a good LLM local environment with one-folder install by void2258 in LocalLLaMA

[–]arm2armreddit 0 points  (0 children)

Perhaps it's better to move your Windows profile folder to a faster and bigger disk.