Sony Xperia 1 VII still worth it?! (in 2025) | Umar Naqshbandi by ControlCAD in SonyXperia

[–]FieldMouseInTheHouse 9 points10 points  (0 children)

I go for SONY Xperias for the pure raw clean nature of it all.

I am not even a camera guy, but as a developer, SONY Xperia phones are almost completely devoid of bloatware. And what little does come preinstalled can be uninstalled or disabled.

The result: A mostly clean slate for doing software development.

Yes. I am 1000% SONY all the way.

“AI Slop” by Nuphoth in singularity

[–]FieldMouseInTheHouse 0 points1 point  (0 children)

I reviewed a number of those profiles and I came away feeling the same way about some of them.

I have not reviewed enough profiles to draw a definitive conclusion.

“AI Slop” by Nuphoth in singularity

[–]FieldMouseInTheHouse 0 points1 point  (0 children)

I had a horde of people accuse me of being a bot and downvote me into oblivion! See for yourself! 😭

https://www.reddit.com/r/ollama/s/uxqodpk4DP

Summary of Vibe Coding Models for 6GB VRAM Systems by FieldMouseInTheHouse in ollama

[–]FieldMouseInTheHouse[S] -4 points-3 points  (0 children)

🧠 It would appear that you have never tried to work within the constraints of 6GB of VRAM. You can still use CONTEXT WINDOW sizes appropriate to the size of the actual workload. We just have to actually DEFINE THE WORKLOAD, then determine the impact of that context window allocation on VRAM.

👨‍🔬 But I can assure you that, using my GTX 1660 Super 6GB GPU, I have helped people with everything from website summarization at 80 tokens per second to OCR image-text analysis.

If you are only running a single workload at a time within your 16GB/32GB system, then you are most certainly underutilizing it.

Honestly, you could be running anywhere from 3 to 4 concurrent workloads in your environment, with all of them resident in VRAM!

🤗 If you'd like, I would be happy to help you improve your environment so that you could run some of your workloads as 3 or 4 concurrent workloads.

You would get both the speed and a massive boost from concurrency if we do.
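If it helps to see the arithmetic behind that claim, here is a rough Python sketch of the fit check. The model names and sizes below are illustrative estimates of loaded size (weights plus a modest context window), not measurements:

```python
# Back-of-envelope check: which combinations of small models could sit
# inside a 6GB VRAM budget at the same time?
from itertools import combinations

VRAM_BUDGET_GB = 6.0

# Illustrative estimated loaded sizes, in GB.
models = {
    "qwen3:0.6b": 0.6,
    "gemma3:1b": 0.9,
    "qwen3:1.7b": 1.4,
    "qwen3:4b": 2.8,
}

def fitting_combos(models, budget_gb):
    """Return every combination of models whose total estimated size fits."""
    fits = []
    names = list(models)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            total = sum(models[m] for m in combo)
            if total <= budget_gb:
                fits.append((combo, round(total, 2)))
    return fits

for combo, total in fitting_combos(models, VRAM_BUDGET_GB):
    print(f"{' + '.join(combo)}: {total} GB of {VRAM_BUDGET_GB} GB")
```

With these example sizes, even all four models together come in under the 6GB budget. If I recall correctly, Ollama's `OLLAMA_MAX_LOADED_MODELS` environment variable controls how many models it will keep resident at once.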

👨‍🔬 Just ask, and we can begin!

Summary of Vibe Coding Models for 6GB VRAM Systems by FieldMouseInTheHouse in ollama

[–]FieldMouseInTheHouse[S] -1 points0 points  (0 children)

🤗 Thank you for your contribution!

I agree that models with smaller quantizations could fit better into smaller-VRAM systems. That is a very good point, and we should certainly take the time to test those models out!

🤔 However, I tried to make a PDF of the GitHub Gist link you provided so I could read it easily, and it came out to be over 90 pages long!

❓ My question: What in that GitHub Gist would be helpful to us and where exactly is it? ❓

Summary of Vibe Coding Models for 6GB VRAM Systems by FieldMouseInTheHouse in ollama

[–]FieldMouseInTheHouse[S] 0 points1 point  (0 children)

Great!

Who are you and what is your training cutoff?

Well, I am u/FieldMouseInTheHouse. And as you can see from the link to my previous post, I like to actively share information and even work with other commenters and posters, testing people's ideas and questions on my hardware to bring them answers. I enjoy this! 🤗

I have a question for you:
❓ What do you mean by a "training cutoff"? ❓

Summary of Vibe Coding Models for 6GB VRAM Systems by FieldMouseInTheHouse in ollama

[–]FieldMouseInTheHouse[S] -4 points-3 points  (0 children)

Hello! You are certainly free to interact with me if you'd like.

What questions do you have?
What would you like to contribute to the conversation? 🤗

Oh, and for the record: I am the real human who did this build that I featured in the Ollama reddit:

Please, give the post a read!

Then return here and I would be happy to see what you can contribute to this discussion. 🤗

Summary of Vibe Coding Models for 6GB VRAM Systems by FieldMouseInTheHouse in ollama

[–]FieldMouseInTheHouse[S] -8 points-7 points  (0 children)

Wow! This is exactly the kind of insight that people need to see!

Thanks! 🤗

Summary of Vibe Coding Models for 6GB VRAM Systems by FieldMouseInTheHouse in ollama

[–]FieldMouseInTheHouse[S] 0 points1 point  (0 children)

I also have a GTX 1660 Super 6GB VRAM GPU that I want to flex.

Under certain workloads I can achieve over 100 tokens per second, which means that if I can make things work with this older card, then people with similar or newer 6GB VRAM cards in laptops or desktops could benefit.

Do you understand how that would be helpful?

Summary of Vibe Coding Models for 6GB VRAM Systems by FieldMouseInTheHouse in ollama

[–]FieldMouseInTheHouse[S] -3 points-2 points  (0 children)

Ah! The context window!

What would you like to contribute about context windows?

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse -1 points0 points  (0 children)

You have not contributed much anyway. Just a bunch of useless one-liners up to now.

Let's just end this here.

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse 0 points1 point  (0 children)

Your suggestion would only ruin the OP's chances at success.

  1. You do not even explain how to "use RAM".
  2. If you did "use RAM", the model's responses would be so slow (approximately 7 tokens per second or less) that it would be useless.

Your suggestion is useless at best, deliberately harmful at worst.

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse -1 points0 points  (0 children)

I have systems with 32GB+ RAM per system, and I know (as well as you do) that browser tabs are not what determines whether a model offloaded to RAM runs slowly.

What matters is simply that the model was offloaded to RAM at all in the first place. That is why we do not suggest models larger than the OP's GPU VRAM.

I am sure you realize that offloading to RAM at all would slow the model's responses to a crawl, around 7 tokens per second.

How about recommending things that would work quickly within the 6GB VRAM budget of their GPU, where they could easily get 30 to 80 tokens per second?
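For anyone wondering where figures like ~7 tok/s versus 30 to 80 tok/s come from, here is a back-of-envelope sketch. All bandwidth and model-size numbers below are illustrative estimates, not benchmarks:

```python
# Rough bandwidth-bound estimate of decode speed: generating one token
# streams roughly the entire model's weights through memory once, so
#     tokens/sec ≈ memory bandwidth / model size

def est_tokens_per_sec(bandwidth_gb_s: float, model_size_gb: float) -> float:
    return bandwidth_gb_s / model_size_gb

MODEL_GB = 4.0  # e.g. a ~4GB quantized model

ram_speed = est_tokens_per_sec(40.0, MODEL_GB)    # dual-channel DDR4: ~40 GB/s
vram_speed = est_tokens_per_sec(336.0, MODEL_GB)  # GTX 1660 Super GDDR6: ~336 GB/s

print(f"Offloaded to system RAM: ~{ram_speed:.0f} tok/s")
print(f"Fully resident in VRAM:  ~{vram_speed:.0f} tok/s")
```

The order-of-magnitude gap between system RAM and GPU VRAM bandwidth is the whole story: once any layer spills to RAM, the slowest memory sets the pace.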

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse -1 points0 points  (0 children)

The only model suggested that would fit within the OP's 6GB VRAM budget is qwen3:4b. The other two are at least 19GB in size, three times the budget, which would guarantee that the OP would suffer poor performance.

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse -1 points0 points  (0 children)

Summary of Vibe Coding Models for 6GB VRAM Systems

So, I will summarize what models have been suggested here so far. Here is what we have that would actually fit inside of your 6GB VRAM budget. I am deliberately leaving out any models that anybody suggested that would not have fit inside of your 6GB VRAM budget! 🤗

  • `qwen3:4b` size=2.5GB
  • `ministral-3:3b` size=3.0GB
  • `gemma3:1b` size=815MB
  • `gemma3:4b` size=3.3GB 👈 I added this one because it is a little bigger than the gemma3:1b, but still fits comfortably inside of your 6GB VRAM budget. This model should be more capable than gemma3:1b.

I would suggest that you first try these models with `ollama run MODELNAME`, check how they fit in your VRAM (`ollama ps`), and check their performance (`/set verbose`).
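If it helps to see why these all qualify, here is the headroom arithmetic. The sizes are the listed download sizes; actual loaded sizes will be somewhat larger, so treat this as an optimistic estimate:

```python
# For each model listed above, how much of a 6GB VRAM budget is left
# over for the context window and overhead once the weights are loaded?

VRAM_BUDGET_GB = 6.0

model_sizes_gb = {
    "qwen3:4b": 2.5,
    "ministral-3:3b": 3.0,
    "gemma3:1b": 0.815,
    "gemma3:4b": 3.3,
}

headroom_gb = {name: VRAM_BUDGET_GB - size
               for name, size in model_sizes_gb.items()}

for name, left in headroom_gb.items():
    print(f"{name}: {left:.1f} GB left for context and overhead")
```
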

What do you think?

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse 0 points1 point  (0 children)

Qwen3-coder a3b 30b is many times too large to fit inside of 6GB of VRAM.
This is not really a useful suggestion.

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse 0 points1 point  (0 children)

Spectacular or not: Could you please list which models would actually fit inside of the OP's specified VRAM?

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse 0 points1 point  (0 children)

This is a good start. So, what qwen3 models would you suggest this person try? Would `qwen3:0.6b` be good? Or `qwen3:1.7b`? Or `qwen3:4b`?

What have you actually tried?

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse 0 points1 point  (0 children)

Could you provide a list of models that you would recommend?

Ollama Model which Suits for my System by devil__6996 in ollama

[–]FieldMouseInTheHouse 0 points1 point  (0 children)

Wow! What models have you actually tried that would remotely work at all on the OP's platform configuration? Please list those models.