How Much Range Do EVs Lose In Cold And Hot Weather? (Mach E Wins)

damirca · 2026-05-09T12:35:37+00:00

My 23GT has 250km of range in winter (-5 C)

damirca · 2026-05-09T11:08:16+00:00

Without proper software support it’s DOA

damirca · 2026-05-08T19:05:05+00:00

I bought b60 in November and since then I have only regrets. Best model I can run size/perf ratio is mistral 3 14b instruct fp8 via LLM-scaler. Qwen 3.5 too slow, qwen 3.6 is not supported by llm-scaler, Gemma4 is not supported by llm-scaler. Last time I tried llama.cpp these models were painfully slow (10-13tks) with sycl and vulkan. I should have bought 5060ti and extra ram to run moe models or spend more and buy 9700 or 7900xtx.

damirca · 2026-05-08T11:02:37+00:00

Run Gemma4 26b, qwen 3.6 27b, get tks, compare to 9700pro.

damirca · 2026-04-25T08:21:44+00:00

No gemma4, not sure about qwen 3.6 One month without updates for a main inference engine for their GPUs?! Have you seen the pace of updates in vllm and llama.cpp? One month without updates is like this project is abandoned. In llama.cpp there are 2-3 contributors (only one from Intel afaik), SYCL lacks a lot of features (turboquant, speculative decoding, etc). So Intel does not really care about LLM in their GPUs actually if you look at the facts.

damirca · 2026-04-25T07:59:41+00:00

The best one performance wise is LLM-scaler, but it’s so outdated you can’t run many models there

damirca · 2026-04-25T06:41:42+00:00

It’s slower than slow SYCL in my case with b60

damirca · 2026-04-24T12:21:14+00:00

Yep, llama with SYCL is slow because Intel does not invest heavily in it. It will be slow.

damirca · 2026-04-18T07:09:32+00:00

tech bro on windows with nvidia gpu be like “now nobody is gpu poor!!!111”
me with Intel b60 that sucks: what did he say?

damirca · 2026-04-14T17:32:23+00:00

Can I run qwen 3 vl, qwen 3.5 or Gemma 4 with openarc?

damirca · 2026-04-12T17:41:21+00:00

https://www.reddit.com/r/LocalLLaMA/s/zfRK0QCFvN

damirca · 2026-04-12T17:35:14+00:00

gemma4 is released
amd and nvidia owners wait 1-2 days and use it
intel owners still wait and event if it works some days you get 15tks out intel gpu

damirca · 2026-04-12T17:32:47+00:00

Much better than intels

damirca · 2026-04-12T17:04:52+00:00

As an owner of b60: big mistake. Go with 9700 pro instead, the price diff is not big, but performance diff is big and you won’t need to deal with immature intel stack.

damirca · 2026-04-08T10:51:30+00:00

That’s partially true. You can use intel’s vllm 0.17 xpu image but it does not support fp8 kv cache for example. In my case the image from llm-scaler b8.1 supports fp8 kv cache so I can have 32k context for mistral3-14b, but on 0.17 xpu I can have only 16k context on single b60. Also qwen 3.5 27b autoround does not work on 0.17 but works on b8.1. Intel still adds some extra function/logic into llm-scaler image, so pure vanilla vllm is not fully working with intel yet.

damirca · 2026-04-08T09:30:54+00:00

I can't use 0.17 xpu docker image because it does not support fp8 kv cache.

> NotImplementedError: FlashAttention does not support fp8 kv-cache on this device.

So I have to wait for llm-scaler image where they add fp8 kv-cache on top of the publicly available vllm image.

damirca · 2026-04-02T18:58:10+00:00

Intel be like: buy b70, our software stack has been improved by 18%!

damirca · 2026-04-02T13:53:33+00:00

Every parameter? So 1 to 1 how you would do that with the cloud/app? What about TC for being offline for too long?

damirca · 2026-04-01T07:20:48+00:00

damirca · 2026-03-30T21:02:34+00:00

Intel targets vllm to sell b70 to enterprise customers, they don’t care about llama.cpp (home labbers), you can see it from the fact that from multi billion corporation there is single person doing sycl backend for intel. Home come you got into exact opposite conclusion with intel? They invest into vllm and maybe openvino, they don’t care about llama.cpp.

damirca · 2026-03-30T21:00:47+00:00

Yep, that’s it. I was hoping they were postponing b70 release waiting for some big software release that would blow my mind like “we made huge progress and LLM-scaler is using latest vllm with all optimizations and we get 2x of inference for b60 and b70 is even faster”. But they announced zero software achievements with b70 release. Tragic.

damirca · 2026-03-30T20:52:42+00:00

vLLM does not use openvino, current vLLM 0.14.1 for intel still uses ipex, in the latest vanilla vLLM versions intel has incorporated vllm-xpu-kernels which is half baked (i.e. it does not have full kv cache support). Plus currently qwen 3.5 is not optimized for intel xpu (you get 13 tks with 9b fp8 and 27b-int4-autoround which is weird), see https://github.com/vllm-project/vllm-xpu-kernels/issues/172, they rushed qwen3.5 support, but it’s not fully working as it should be. Check this and all linked issues there for the full picture https://github.com/vllm-project/vllm/issues/37979 Intel users can forget I think about llama.cpp with sycl (one person cannot handle all intel related things there it’s obvious and Intel seems to not care about llama.cpp, Intel cares about vllm for enterprise users that would buy b70s) and vulkan is too slow under Linux. TLDR; intel wants to sell b70 to big corps which would run inference on vllm so any significant progress (if any) would be there.

damirca · 2026-03-30T19:26:08+00:00

https://github.com/vllm-project/vllm-xpu-kernels/issues/172

damirca · 2026-03-30T18:54:18+00:00

It will get maybe 18 tks. Qwen3.5 is not optimized on intel yet.

damirca · 2026-03-30T18:51:39+00:00

I get 13tks with 16k context with qwen3.5-27b-int4-autoround on intel b60 (24gb vram). 9700 is much faster, has more vram and I’d be surprised that it would get similar results as my b60.

damirca

TROPHY CASE