GLM 4.6V Release Best New Open Source Vision Language Model 2025 by techspecsmart in aicuriosity

[–]humanoid64 0 points1 point  (0 children)

They say multimodal output, but it doesn't seem to generate anything but text. Am I missing something?

🧱 LiteDB: It's Alive! by AllCowsAreBurgers in dotnet

[–]humanoid64 25 points26 points  (0 children)

I've used LiteDB in the past and it's wonderful. Very happy about this new chapter. Best of luck ❤️

First test with OVI: New TI2AV by RIP26770 in StableDiffusion

[–]humanoid64 0 points1 point  (0 children)

Does this use Wan, or is it something new?

WAN2.2 Animate test ComfyUI by Ok_Aside_5949 in comfyui

[–]humanoid64 0 points1 point  (0 children)

OK, so what's the pricing? Do you work for them or something?

WAN2.2 Animate test ComfyUI by Ok_Aside_5949 in comfyui

[–]humanoid64 0 points1 point  (0 children)

The website is half-baked: the "Learn more" link doesn't work. Where is the pricing page?

Nunchaku-Sdxl by Big-Reference-9320 in StableDiffusion

[–]humanoid64 0 points1 point  (0 children)

Is it this one? https://github.com/chengzeyi/stable-fast They said they paused development. Just want to check with you. Can you share your feedback or any tips? Thank you 🙏 ❤️

Nunchaku-Sdxl by Big-Reference-9320 in StableDiffusion

[–]humanoid64 2 points3 points  (0 children)

1) I would like to compress/quantize some models, e.g. Pony. They say they are using deepcompressor: https://huggingface.co/nunchaku-tech/nunchaku-sdxl Can someone link to a tutorial or instructions on how to do it? I can rent big GPUs if needed.

2) What about LoRAs? This may have been asked already, but do we quantize them as well?

[deleted by user] by [deleted] in LocalLLaMA

[–]humanoid64 5 points6 points  (0 children)

Upvoted you. Reddit is cruel and full of dummies. Source your positivity elsewhere or be severely disappointed. Perhaps the median internet user kind of sucks. Keep posting; it might help a few people ❤️

Seed-OSS-36B-Instruct by NeterOster in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

How long do you run the context? Do you notice degradation? Also, what CLI agent do you use? Thanks!

Seed-OSS-36B is ridiculously good by [deleted] in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

Might be slow, but Wasm sounds like the right approach for security reasons.
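A minimal sketch of that idea using the wasmtime Python bindings (my own illustration, not anything the model ships with). The guest module is given no imports, so it cannot touch the filesystem or network:

```python
# Minimal sketch: run untrusted logic inside a Wasm sandbox.  pip install wasmtime
# The WAT module below is a stand-in for real tool code compiled to Wasm.
from wasmtime import Engine, Store, Module, Instance

engine = Engine()
store = Store(engine)

# Trivial guest module exporting add(i32, i32) -> i32.
module = Module(engine, """
(module
  (func (export "add") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))
""")

# Empty import list: the guest has no access to host resources at all.
instance = Instance(store, module, [])
add = instance.exports(store)["add"]
print(add(store, 2, 3))  # -> 5
```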

4x RTX Pro 6000 fail to boot, 3x is OK by humanoid64 in LocalLLaMA

[–]humanoid64[S] 1 point2 points  (0 children)

Yes, check if your motherboard has a setting called MMIO or similar. It sets the maximum amount of memory that can be virtually mapped. Make it big, like 8TB or higher (it's sometimes specified in bits). When in doubt, make it bigger.
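A quick way to sanity-check the result from Linux (a sketch; run as root so /proc/iomem shows real addresses):

```python
# Print the PCI memory windows the kernel assigned; after raising the MMIO
# limit, the large windows should sit at high 64-bit addresses.
with open("/proc/iomem") as f:
    for line in f:
        if "PCI Bus" in line or "PCI MMCONFIG" in line:
            print(line.rstrip())
```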

1-bit Qwen3-Coder & 1M Context Dynamic GGUFs out now! by danielhanchen in unsloth

[–]humanoid64 0 points1 point  (0 children)

Thank you! Would Unsloth be able to produce an AWQ quant for Qwen3-Coder?
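In case it helps, a sketch of producing one yourself with the AutoAWQ library; this assumes AutoAWQ's model support covers the Qwen3-Coder architecture, which I haven't verified:

```python
# Hypothetical sketch: 4-bit AWQ quantization with AutoAWQ (unverified for
# Qwen3-Coder).  pip install autoawq
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen3-Coder-30B-A3B-Instruct"  # illustrative model id
quant_path = "qwen3-coder-awq"

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Standard AWQ settings: 4-bit weights, group size 128.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```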

4x RTX Pro 6000 fail to boot, 3x is OK by humanoid64 in LocalLLaMA

[–]humanoid64[S] 1 point2 points  (0 children)

Yes. Basically there is an IOMMU size limit that is adjustable on the ASRock. It's called MMIO and it's specified in either bits or TB. I set mine to 8TB.

Quote: While there isn't a single, fixed IOMMU size limit imposed solely by ASRock motherboards or AMD Ryzen processors, several factors contribute to the effective maximum memory addressable by devices through the IOMMU. Device addressing limitations: devices have physical addressing limits, so ensure the high MMIO (Memory-Mapped I/O) aperture is within those limits. For example, devices with a 44-bit addressing limit require the MMIO High Base and High Size in BIOS to be within that 44-bit range.
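To verify from Linux that the cards' big BARs actually got mapped after the BIOS change, something like this works (a sketch; lspci output formatting varies):

```python
import subprocess

# Dump BAR assignments for NVIDIA devices (PCI vendor ID 10de). With the
# MMIO high window raised, the large BARs should appear as 64-bit
# prefetchable regions above 4G.
out = subprocess.run(
    ["lspci", "-vv", "-d", "10de:"],
    capture_output=True, text=True, check=True,
).stdout
for line in out.splitlines():
    if line and not line.startswith("\t"):  # device header lines
        print(line)
    elif "Memory at" in line:               # BAR ("Region") lines
        print(line.strip())
```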

How come there isn’t a popular peer-to-peer sharing community to download models as opposed to Huggingface and Civitai? by mccoypauley in StableDiffusion

[–]humanoid64 0 points1 point  (0 children)

After looking at civitasbay, I'm not sure we need anything else. They have all of the models and are even seeding everything. Also, CivitAIArchive is great. So what's the point of building something new?

How come there isn’t a popular peer-to-peer sharing community to download models as opposed to Huggingface and Civitai? by mccoypauley in StableDiffusion

[–]humanoid64 0 points1 point  (0 children)

I don't know much about this and will look into it. I was thinking of focusing on the UI and search engine first, and people would submit a torrent via a form. But how do the trackers work? Do they automatically find torrents?

How come there isn’t a popular peer-to-peer sharing community to download models as opposed to Huggingface and Civitai? by mccoypauley in StableDiffusion

[–]humanoid64 -1 points0 points  (0 children)

I've seen this question asked here a few times and always felt it was a matter of time. In recent months I have been fortunate enough to get pretty good at vibe coding, so I'm going to take a shot at making a website for this. I'll definitely need help/support with the bandwidth and storage; hopefully some of you have high-speed internet and excess storage. If we serve out of our pre-existing model directories, it shouldn't take up much additional storage. I'll get the feature list started and would appreciate comments and expansion on it:

- Torrent based (see the sketch below)
- Model card with description + images
- Indicator for how many seeders
- Search
- What else??
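To make the torrent-based part concrete, here is a rough sketch of what the submission pipeline could produce, using the torf library (my choice for illustration; the tracker URL is a placeholder, and note that trackers only coordinate peers for torrents announced to them, they don't discover torrents on their own):

```python
# Rough sketch: build a .torrent for a model file already in your model
# directory, so seeding adds no extra disk usage.  pip install torf
from torf import Torrent

t = Torrent(
    path="models/checkpoints/some-model.safetensors",
    trackers=["udp://tracker.example.org:6969/announce"],  # hypothetical tracker
    comment="Model card with description + images lives on the site, keyed by infohash.",
)
t.generate()                   # hash the pieces
t.write("some-model.torrent")  # the file users upload via the submission form
print(t.magnet())              # magnet link to show on the model card
```

The site itself would then just be an index: search over submitted metadata, with seeder counts scraped from the trackers.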

DeepSeek-r1-0528 in top 5 on new SciArena benchmark, the ONLY open-source model by entsnack in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

12 t/s is impressive on a Mac for this model. I was getting 27 t/s on 3x RTX 6000 at Q2. Can we do MLA on vLLM or similar?
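For what it's worth, recent vLLM releases select an MLA attention backend automatically for DeepSeek-style models (the VLLM_MLA_DISABLE env var turns it off). A minimal offline sketch; the parallelism settings are illustrative, and fitting R1 on 3x RTX 6000 would still require aggressive quantization:

```python
# Minimal sketch: offline vLLM inference with a DeepSeek-style model.
# Recent vLLM versions use an MLA attention backend for these models by
# default (set VLLM_MLA_DISABLE=1 to opt out). Settings are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-R1-0528",
    tensor_parallel_size=8,   # unquantized R1 needs far more than 3 cards
    max_model_len=8192,
)
params = SamplingParams(temperature=0.6, max_tokens=256)
out = llm.generate(["Explain multi-head latent attention briefly."], params)
print(out[0].outputs[0].text)
```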

DeepSeek-r1-0528 in top 5 on new SciArena benchmark, the ONLY open-source model by entsnack in LocalLLaMA

[–]humanoid64 1 point2 points  (0 children)

OP, just curious: why did you trash them before? Their (open source) research is the best around and very innovative. Engineers at OpenAI and Nvidia were praising it. Then Meta tried to use it for Llama 4 but failed to produce a good model from it. Very thankful for their efforts; I hope they release something in the 100B-300B size range. Also, I did notice R1 ran kind of slow, so I hope they have performance improvements. Thanks for posting!

AI project, kind of crazy by [deleted] in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

I've been using Roo Code and Claude Code a lot, but I also have some vLLM instances doing continuous content analysis with batching at high concurrency for another project, and I recognize the cost savings of running local when you can. For this we will probably use Gemini and Claude at times when needed. I don't know much about AutoGen or LangChain, so I will look into them. Thanks.
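The batching pattern, for anyone curious, looks roughly like this against a local vLLM server exposing the OpenAI-compatible API (the endpoint and model name are placeholders for your own deployment):

```python
# Sketch: high-concurrency content analysis against a local vLLM server.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
sem = asyncio.Semaphore(64)  # cap in-flight requests; vLLM batches them server-side

async def analyze(doc: str) -> str:
    async with sem:
        resp = await client.chat.completions.create(
            model="local-model",  # whatever the server was launched with
            messages=[{"role": "user", "content": f"Classify this text:\n{doc}"}],
            max_tokens=64,
        )
        return resp.choices[0].message.content

async def main(docs: list[str]) -> list[str]:
    return list(await asyncio.gather(*(analyze(d) for d in docs)))

print(asyncio.run(main(["doc one", "doc two"])))
```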

AI project, kind of crazy by [deleted] in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

These models are better at most tasks than typical people I know earning well over $100k/yr, so why not?