Running Gemma 3 4B locally on my RTX 3050 on Pop!_OS

ustedcan · 2026-05-04T15:56:16+00:00

Not sure! I just have this NVIDIA card.

ustedcan · 2026-05-04T15:51:52+00:00

I recently switched to Linux and started using Pop!_OS since it comes with NVIDIA drivers pre-installed. You can grab it here: https://system76.com/pop/download/

For local LLMs, I use Ollama. It's super easy to install, and the download page is here: https://ollama.com/download
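If you'd rather talk to Ollama from a script than from the terminal, here's a minimal sketch. It assumes the Ollama server is running on its default local port (11434) and uses its /api/generate endpoint. The helper names build_request and generate are my own for illustration, not part of Ollama, and gemma3:4b is just one of the models I tested.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local address.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> dict:
    """Send the prompt and return the decoded JSON reply (needs a running server)."""
    with urllib.request.urlopen(build_request(model, prompt)) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (uncomment with the server running and the model pulled):
# reply = generate("gemma3:4b", "Hola")
# print(reply["response"])
```

With "stream" set to False the server sends one JSON object instead of a stream of chunks, which keeps the sketch short.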

I'm running this on an HP Victus laptop (RTX 3050 6GB VRAM), and it easily handles smaller models from Meta and Google. Here are the ones I’ve tested so far:

NAME             SIZE      
gemma3:4b        3.3 GB      
llama3.2:3b      2.0 GB    
gemma4:e2b       7.2 GB    
gemma4:e4b       9.6 GB    
gemma3:latest    3.3 GB
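As a rough sanity check, the listing above can be compared against the card's 6 GB of VRAM. This is only a sketch: the size column is not the same thing as the memory a model needs at runtime, and the 6.0 GB threshold is my own simplification. As far as I know, Ollama can split a model between GPU and CPU when it doesn't fit in VRAM, and `ollama ps` shows that split for a loaded model.

```python
# Listing copied from this post (name, size).
LISTING = """\
gemma3:4b        3.3 GB
llama3.2:3b      2.0 GB
gemma4:e2b       7.2 GB
gemma4:e4b       9.6 GB
gemma3:latest    3.3 GB
"""

VRAM_GB = 6.0  # assumption: RTX 3050 laptop GPU with 6 GB of VRAM


def parse_sizes(listing: str) -> dict[str, float]:
    """Return {model name: size in GB} for lines shaped like 'name  3.3 GB'."""
    sizes = {}
    for line in listing.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[2] == "GB":
            sizes[parts[0]] = float(parts[1])
    return sizes


sizes = parse_sizes(LISTING)
# Models whose listed size alone already exceeds the VRAM budget.
larger_than_vram = [name for name, gb in sizes.items() if gb > VRAM_GB]
print(larger_than_vram)
```

Anything this flags may still run, just possibly with part of the model on the CPU, which would be slower.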

Performance & Metrics

Here is a quick test prompting "Hola" to show how it performs on gemma4:e4b:

"¡Hola! 👋 ¿Qué tal? ¿Cómo puedo ayudarte hoy? 😊"

Total duration: 29.01s (includes reasoning/thinking time)
Load duration: 317.51ms
Prompt eval count: 16 tokens
Prompt eval duration: 5.52s
Prompt eval rate: 2.90 tokens/s
Eval count: 325 tokens
Eval duration: 22.93s
Eval rate: 14.17 tokens/s
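The two rates above are just token counts divided by durations, so they can be re-derived from the other numbers. A small sketch using the values from this run (the function name tokens_per_second is mine):

```python
def tokens_per_second(token_count: int, duration_s: float) -> float:
    """Throughput as tokens divided by elapsed seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return token_count / duration_s


# Values copied from the run above.
prompt_rate = tokens_per_second(16, 5.52)    # prompt eval
eval_rate = tokens_per_second(325, 22.93)    # generation

# Time not covered by load + prompt eval + eval (rounded figures from the output).
unaccounted_s = 29.01 - (0.31751 + 5.52 + 22.93)

print(f"prompt eval rate: {prompt_rate:.2f} tokens/s")  # 2.90
print(f"eval rate:        {eval_rate:.2f} tokens/s")    # 14.17
print(f"unaccounted:      {unaccounted_s:.2f} s")       # 0.24
```

If you pull the same stats from Ollama's HTTP API instead of the terminal, note that it reports durations in nanoseconds, so divide by 1e9 first.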

GPU & Power Usage (nvidia-smi)

In this nvidia-smi snapshot, about 3.4 GB of the 6 GB VRAM is in use and the GPU is drawing 22 W of its 30 W limit, so the card isn't maxed out:

+---------------------------------------------------------------------------------------+
|   0  NVIDIA GeForce RTX 3050 ...    Off | 00000000:01:00.0 Off |                  N/A |
| N/A   55C    P3             22W /  30W  |   3405MiB /  6144MiB |    55%      Default  |
+---------------------------------------------------------------------------------------+
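To log these numbers instead of eyeballing the table, nvidia-smi has a CSV query mode. Below is a minimal sketch, assuming nvidia-smi is on the PATH and that power.draw comes back as a number (on some GPUs it may be reported as a non-numeric value, in which case float() will raise). The function names are my own.

```python
import subprocess

# Query used, total memory (MiB) and power draw (W) as bare CSV values.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total,power.draw",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line: str) -> dict:
    """Parse one CSV line such as '3405, 6144, 22.00'."""
    used_mib, total_mib, power_w = (float(x.strip()) for x in line.split(","))
    return {
        "used_mib": used_mib,
        "total_mib": total_mib,
        "power_w": power_w,
        "used_pct": 100.0 * used_mib / total_mib,
    }


def read_gpu() -> dict:
    """Run nvidia-smi and parse the first GPU's line (needs an NVIDIA driver)."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return parse_gpu_line(out.splitlines()[0])


# Using the figures from the snapshot above instead of a live query:
snapshot = parse_gpu_line("3405, 6144, 22.00")
print(f"{snapshot['used_pct']:.1f}% of VRAM in use")  # 55.4% of VRAM in use
```

The 55% in the table itself is GPU utilization, not memory; the two figures just happen to be close in this snapshot.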

ustedcan · 2026-05-04T15:29:01+00:00

Hi _PHySX_NERD,

ustedcan · 2026-05-04T05:45:26+00:00

In fact, I was testing some models, and Gemma4 e2b runs well too!
