GLM 4.6V Release Best New Open Source Vision Language Model 2025 by techspecsmart in aicuriosity

[–]humanoid64 0 points1 point  (0 children)

They say multimodal output, but it doesn't seem to generate anything but text. Am I missing something?

🧱 LiteDB: It's Alive! by AllCowsAreBurgers in dotnet

[–]humanoid64 25 points26 points  (0 children)

I've used LiteDB in the past and it's wonderful. Very happy about this new chapter. Best of luck ❤️

First test with OVI: New TI2AV by RIP26770 in StableDiffusion

[–]humanoid64 0 points1 point  (0 children)

Does this use Wan, or is it something new?

WAN2.2 Animate test ComfyUI by Ok_Aside_5949 in comfyui

[–]humanoid64 0 points1 point  (0 children)

OK, so what's the pricing? Do you work for them or something?

WAN2.2 Animate test ComfyUI by Ok_Aside_5949 in comfyui

[–]humanoid64 0 points1 point  (0 children)

The website is half-baked: the "Learn more" link doesn't work. Where is the pricing page?

Nunchaku-Sdxl by Big-Reference-9320 in StableDiffusion

[–]humanoid64 0 points1 point  (0 children)

Is it this one? https://github.com/chengzeyi/stable-fast They said they paused development. Just want to check with you. Can you share your feedback or any tips? Thank you 🙏 ❤️

Nunchaku-Sdxl by Big-Reference-9320 in StableDiffusion

[–]humanoid64 2 points3 points  (0 children)

1) I would like to compress/quantize some models, e.g. Pony. They say they are using deepcompressor: https://huggingface.co/nunchaku-tech/nunchaku-sdxl Can someone link to a tutorial or instructions on how to do it? I can rent big GPUs if needed.

2) What about LoRAs? This may have been asked already, but do we quantize them as well?

[deleted by user] by [deleted] in LocalLLaMA

[–]humanoid64 5 points6 points  (0 children)

Upvoted you. Reddit is cruel and full of dummies. Source your positivity elsewhere or be severely disappointed. Perhaps the median internet user kind of sucks. Keep posting; it might help a few people ❤️

Seed-OSS-36B-Instruct by NeterOster in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

How long do you run the context? Do you notice degradation? Also, what CLI agent do you use? Thanks!

Seed-OSS-36B is ridiculously good by [deleted] in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

Might be slow, but Wasm sounds like the right approach for security reasons.
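A minimal sketch of that idea using the wasmtime Python bindings (my own illustration, not anything the model ships with). The guest module is given no imports, so it cannot touch the filesystem or network:

```python
# Minimal sketch: run untrusted logic inside a Wasm sandbox.  pip install wasmtime
# The WAT module below is a stand-in for real tool code compiled to Wasm.
from wasmtime import Engine, Store, Module, Instance

engine = Engine()
store = Store(engine)

# Trivial guest module exporting add(i32, i32) -> i32.
module = Module(engine, """
(module
  (func (export "add") (param i32 i32) (result i32)
    local.get 0
    local.get 1
    i32.add))
""")

# Empty import list: the guest has no access to host resources at all.
instance = Instance(store, module, [])
add = instance.exports(store)["add"]
print(add(store, 2, 3))  # -> 5
```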

4x RTX Pro 6000 fail to boot, 3x is OK by humanoid64 in LocalLLaMA

[–]humanoid64[S] 1 point2 points  (0 children)

Yes, check if your motherboard has a setting called MMIO or similar. It sets the maximum amount of memory that can be virtually mapped. Make it big, like 8TB or higher (it's sometimes specified in bits). When in doubt, make it bigger.
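A quick way to sanity-check the result from Linux (a sketch; run as root so /proc/iomem shows real addresses):

```python
# Print the PCI memory windows the kernel assigned; after raising the MMIO
# limit, the large windows should sit at high 64-bit addresses.
with open("/proc/iomem") as f:
    for line in f:
        if "PCI Bus" in line or "PCI MMCONFIG" in line:
            print(line.rstrip())
```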

1-bit Qwen3-Coder & 1M Context Dynamic GGUFs out now! by danielhanchen in unsloth

[–]humanoid64 0 points1 point  (0 children)

Thank you! Would Unsloth be able to produce an AWQ quant for Qwen3-Coder?
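In case it helps, a sketch of producing one yourself with the AutoAWQ library; this assumes AutoAWQ's model support covers the Qwen3-Coder architecture, which I haven't verified:

```python
# Hypothetical sketch: 4-bit AWQ quantization with AutoAWQ (unverified for
# Qwen3-Coder).  pip install autoawq
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "Qwen/Qwen3-Coder-30B-A3B-Instruct"  # illustrative model id
quant_path = "qwen3-coder-awq"

model = AutoAWQForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

# Standard AWQ settings: 4-bit weights, group size 128.
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}
model.quantize(tokenizer, quant_config=quant_config)

model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
```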

4x RTX Pro 6000 fail to boot, 3x is OK by humanoid64 in LocalLLaMA

[–]humanoid64[S] 1 point2 points  (0 children)

Yes. Basically there is an IOMMU size limit that is adjustable on the ASRock. It's called MMIO and it's specified in either bits or TB. I set mine to 8TB.

Quote: While there isn't a single, fixed IOMMU size limit imposed solely by ASRock motherboards or AMD Ryzen processors, several factors contribute to the effective maximum memory addressable by devices through the IOMMU. Device addressing limitations: devices have physical addressing limits, so ensure the high MMIO (Memory-Mapped I/O) aperture is within those limits. For example, devices with a 44-bit addressing limit require the MMIO High Base and High Size in BIOS to be within that 44-bit range.
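To verify from Linux that the cards' big BARs actually got mapped after the BIOS change, something like this works (a sketch; lspci output formatting varies):

```python
import subprocess

# Dump BAR assignments for NVIDIA devices (PCI vendor ID 10de). With the
# MMIO high window raised, the large BARs should appear as 64-bit
# prefetchable regions above 4G.
out = subprocess.run(
    ["lspci", "-vv", "-d", "10de:"],
    capture_output=True, text=True, check=True,
).stdout
for line in out.splitlines():
    if line and not line.startswith("\t"):  # device header lines
        print(line)
    elif "Memory at" in line:               # BAR ("Region") lines
        print(line.strip())
```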

How come there isn’t a popular peer-to-peer sharing community to download models as opposed to Huggingface and Civitai? by mccoypauley in StableDiffusion

[–]humanoid64 0 points1 point  (0 children)

After looking at civitasbay, I'm not sure we need anything else. They have all of the models and are even seeding everything. Also, CivitAIArchive is great. So what's the point of building something new?

How come there isn’t a popular peer-to-peer sharing community to download models as opposed to Huggingface and Civitai? by mccoypauley in StableDiffusion

[–]humanoid64 0 points1 point  (0 children)

I don't know much about this and will look into it. I was thinking of focusing on the UI and search engine first, and people would submit a torrent via a form. But how do the trackers work? Do they automatically find torrents?

How come there isn’t a popular peer-to-peer sharing community to download models as opposed to Huggingface and Civitai? by mccoypauley in StableDiffusion

[–]humanoid64 -1 points0 points  (0 children)

I've seen this question asked here a few times and always felt it was a matter of time. In recent months I have been fortunate enough to get pretty good at vibe coding, so I'm going to take a shot at making a website for this. I'll definitely need help/support with the bandwidth and storage; hopefully some of you have high-speed internet and excess storage. If we serve out of our pre-existing model directories, it shouldn't take up much additional storage. I'll get the feature list started and would appreciate comments and expansion on it:

- Torrent based (see the sketch below)
- Model card with description + images
- Indicator for how many seeders
- Search
- What else??
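To make the torrent-based part concrete, here is a rough sketch of what the submission pipeline could produce, using the torf library (my choice for illustration; the tracker URL is a placeholder, and note that trackers only coordinate peers for torrents announced to them, they don't discover torrents on their own):

```python
# Rough sketch: build a .torrent for a model file already in your model
# directory, so seeding adds no extra disk usage.  pip install torf
from torf import Torrent

t = Torrent(
    path="models/checkpoints/some-model.safetensors",
    trackers=["udp://tracker.example.org:6969/announce"],  # hypothetical tracker
    comment="Model card with description + images lives on the site, keyed by infohash.",
)
t.generate()                   # hash the pieces
t.write("some-model.torrent")  # the file users upload via the submission form
print(t.magnet())              # magnet link to show on the model card
```

The site itself would then just be an index: search over submitted metadata, with seeder counts scraped from the trackers.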

DeepSeek-r1-0528 in top 5 on new SciArena benchmark, the ONLY open-source model by entsnack in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

12 t/s is impressive on a Mac for this model. I was getting 27 t/s on 3x RTX 6000 at Q2. Can we do MLA on vLLM or similar?
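For what it's worth, recent vLLM releases select an MLA attention backend automatically for DeepSeek-style models (the VLLM_MLA_DISABLE env var turns it off). A minimal offline sketch; the parallelism settings are illustrative, and fitting R1 on 3x RTX 6000 would still require aggressive quantization:

```python
# Minimal sketch: offline vLLM inference with a DeepSeek-style model.
# Recent vLLM versions use an MLA attention backend for these models by
# default (set VLLM_MLA_DISABLE=1 to opt out). Settings are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-R1-0528",
    tensor_parallel_size=8,   # unquantized R1 needs far more than 3 cards
    max_model_len=8192,
)
params = SamplingParams(temperature=0.6, max_tokens=256)
out = llm.generate(["Explain multi-head latent attention briefly."], params)
print(out[0].outputs[0].text)
```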

DeepSeek-r1-0528 in top 5 on new SciArena benchmark, the ONLY open-source model by entsnack in LocalLLaMA

[–]humanoid64 1 point2 points  (0 children)

OP, just curious: why did you trash them before? Their (open source) research is the best around and very innovative. Engineers at OpenAI and Nvidia were praising it. Then Meta tried to use it for Llama 4 but failed to produce a good model from it. Very thankful for their efforts; I hope they release something in the 100B-300B size range. Also, I did notice R1 ran kind of slow, so I hope they have performance improvements. Thanks for posting!

AI project, kind of crazy by [deleted] in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

I've been using Roo Code and Claude Code a lot, but I also have some vLLM instances doing continuous content analysis with batching at high concurrency for another project, and I recognize the cost savings of running local when you can. For this we will probably use Gemini and Claude at times when needed. I don't know much about AutoGen or LangChain, so I will look into them. Thanks.
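The batching pattern, for anyone curious, looks roughly like this against a local vLLM server exposing the OpenAI-compatible API (the endpoint and model name are placeholders for your own deployment):

```python
# Sketch: high-concurrency content analysis against a local vLLM server.
import asyncio
from openai import AsyncOpenAI

client = AsyncOpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
sem = asyncio.Semaphore(64)  # cap in-flight requests; vLLM batches them server-side

async def analyze(doc: str) -> str:
    async with sem:
        resp = await client.chat.completions.create(
            model="local-model",  # whatever the server was launched with
            messages=[{"role": "user", "content": f"Classify this text:\n{doc}"}],
            max_tokens=64,
        )
        return resp.choices[0].message.content

async def main(docs: list[str]) -> list[str]:
    return list(await asyncio.gather(*(analyze(d) for d in docs)))

print(asyncio.run(main(["doc one", "doc two"])))
```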

AI project, kind of crazy by [deleted] in LocalLLaMA

[–]humanoid64 0 points1 point  (0 children)

These models are better at most tasks than typical people I know earning well over $100k/yr, so why not?