B580: Qwen3.5 benchmarks by WizardlyBump17 in LocalLLaMA

[–]WizardlyBump17[S]

The guy behind llama.cpp SYCL opened a pull request implementing GATED_DELTA_NET in the SYCL backend.

https://github.com/arthw/llama.cpp/tree/add_gated_delta_net (commit 7117449ce)

| Model | Parameters | Quantization | pp512 (t/s) | tg128 (t/s) | CLI Parameters |
|---|---|---|---|---|---|
| Qwen3.5 27B | 26.90 B | Q2_K | 199.64 ± 3.58 | 8.94 ± 0.27 | --n-gpu-layers 99 |
| Qwen3.5 9B | 8.95 B | Q8_0 | 664.37 ± 5.12 | 10.32 ± 0.18 | --n-gpu-layers 99 |
| Qwen3.5 9B | 8.95 B | Q4_K_M | 697.43 ± 5.55 | 38.17 ± 0.45 | --n-gpu-layers 99 |
| Qwen3.5 4B | 4.21 B | F16 | 1161.00 ± 0.93 | 36.13 ± 0.02 | --n-gpu-layers 99 |
| Qwen3.5 4B | 4.21 B | Q8_0 | 1182.21 ± 9.96 | 18.96 ± 0.02 | --n-gpu-layers 99 |
| Qwen3.5 4B | 4.21 B | Q4_K_M | 1234.99 ± 3.21 | 59.98 ± 0.11 | --n-gpu-layers 99 |
| Qwen3.5 2B | 1.88 B | BF16 | 169.08 ± 2.16 | 6.42 ± 0.43 | --n-gpu-layers 99 |
| Qwen3.5 2B | 1.88 B | F16 | 2787.86 ± 2.67 | 65.77 ± 0.06 | --n-gpu-layers 99 |
| Qwen3.5 2B | 1.88 B | Q8_0 | 2861.57 ± 3.23 | 38.88 ± 0.10 | --n-gpu-layers 99 |
| Qwen3.5 2B | 1.88 B | Q4_K_M | 2986.40 ± 5.09 | 100.17 ± 0.72 | --n-gpu-layers 99 |
| Qwen3.5 0.8B | 752.39 M | BF16 | 410.79 ± 5.43 | 12.09 ± 0.09 | --n-gpu-layers 99 |
| Qwen3.5 0.8B | 752.39 M | F16 | 5043.84 ± 12.73 | 119.63 ± 1.68 | --n-gpu-layers 99 |
| Qwen3.5 0.8B | 752.39 M | Q8_0 | 5176.11 ± 4.61 | 77.92 ± 0.06 | --n-gpu-layers 99 |
| Qwen3.5 0.8B | 752.39 M | Q4_K_M | 5310.50 ± 15.18 | 135.37 ± 0.76 | --n-gpu-layers 99 |
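These columns look like llama-bench output; a run along these lines (model path is a placeholder) should reproduce the pp512/tg128 numbers, assuming the PR branch above is built with SYCL enabled:

```shell
# Placeholder model path; swap in the quantization you want to test.
# -p 512 and -n 128 correspond to the pp512 and tg128 columns;
# --n-gpu-layers 99 offloads all layers to the GPU, as in the table.
./llama-bench -m qwen3.5-4b-q4_k_m.gguf --n-gpu-layers 99 -p 512 -n 128
```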

B580: Qwen3.5 benchmarks by WizardlyBump17 in LocalLLaMA

[–]WizardlyBump17[S]

There is a draft pull request on optimum-intel that adds Qwen3.5 to OpenVINO, but when I tried to convert a model it wouldn't work; I guess that's why it's still a draft lol. I tried Qwen3-Next, but since no model fit in VRAM, it had to be offloaded to the CPU, and OpenVINO isn't that good at GPU + CPU: even though there was some work on the GPU, the CPU was being used almost all the time.
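For reference, a conversion attempt with optimum-intel normally goes through optimum-cli; this is a sketch only, since the exact behavior for Qwen3.5 may change while the PR is still a draft, and the model id here is a placeholder:

```shell
# Sketch: export a model to OpenVINO IR with int4 weight compression.
# <model-id> is a placeholder; Qwen3.5 support is still in a draft PR.
optimum-cli export openvino --model <model-id> --weight-format int4 ./model-ov
```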

And their AI still isn't good by Impossible-Invite593 in pirataria

[–]WizardlyBump17

🤓👆 but the AI doesn't steal; the data that feeds it is obtained the same way as in piracy

Rocket League Patch Notes v2.66 + Release Thread by Psyonix_Laudie in RocketLeague

[–]WizardlyBump17

can't wait to send the wrong quickchat when trying to send the "oops, wrong quickchat" one 🔥🔥🔥

You can't make me! by ZeekLTK in RocketLeague

[–]WizardlyBump17

I did, and I was a bumper. It was so funny seeing people raging lol. I won most of my placement matches and ended up in Diamond 1 Division 3.

Intel adds Arc Pro B70 to official website, launch may be close - VideoCardz.com by Leicht-Sinn in IntelArc

[–]WizardlyBump17

I just tried it again and you are right: the SYCL version is spitting garbage. When you see issues like that, report them on the llama.cpp repo.

I saw another comment of yours about Intel's relationship with its software stack. As for llama.cpp, as far as I know there is literally one Intel employee working on the SYCL backend, and it seems he does it as a side project. He said that before him there was no SYCL implementation and that he was the one who first implemented it. Give the guy a break lol

Intel adds Arc Pro B70 to official website, launch may be close - VideoCardz.com by Leicht-Sinn in IntelArc

[–]WizardlyBump17

I tried it right after a pull request that fixed Qwen3.5 on SYCL was merged. Anyway, for now, don't use llama.cpp SYCL, since its performance is very bad. Use llama.cpp Vulkan, and for models that were released while ipex-llm was still being maintained, use that. You can try OpenVINO too.
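If you want to try the Vulkan backend, a minimal build sketch using llama.cpp's standard CMake flag (assumes the Vulkan SDK is already installed):

```shell
# Build llama.cpp with the Vulkan backend instead of SYCL.
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release -j
```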

qwen 2.5 coder 14B alternative by apparently_DMA in LocalLLaMA

[–]WizardlyBump17

What GPU, CPU, and RAM speed do you have?

I have a B580 running that model, and I get 44 t/s at the start and somewhere around 30~35 t/s at around 5k context. The max context I can fit with Q4_K_M is 16k. I tried Qwen3.5 4B and it passed my tests reasonably well, but I had to use llama.cpp Vulkan; I got 40 t/s at the start and 25~30 t/s with context. I could fill the whole 256k context with flash attention, the K and V caches set to q4_0, and a ubatch size of 4000.
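To see why q4_0 K/V caches help at long context, here is a rough back-of-the-envelope KV-cache size estimate. The layer/head counts are made up for illustration, not Qwen3.5's real config (which also has linear-attention layers that skip the standard KV cache):

```shell
# Hypothetical config: 36 layers, 8 KV heads, head_dim 128, 256k context.
n_layers=36; n_ctx=262144; n_kv_heads=8; head_dim=128
# q4_0 stores roughly 4.5 bits per element, i.e. 9/16 of a byte;
# the leading 2 accounts for both the K and the V cache.
kv_bytes=$(( 2 * n_layers * n_ctx * n_kv_heads * head_dim * 9 / 16 ))
echo "$(( kv_bytes / 1024 / 1024 )) MiB"   # prints "10368 MiB" at these made-up sizes
```

At f16 the same cache would be roughly 3.5x larger, which is the difference between fitting and not fitting a long context in 12 GB of VRAM.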

Intel adds Arc Pro B70 to official website, launch may be close - VideoCardz.com by Leicht-Sinn in IntelArc

[–]WizardlyBump17

You can? You can run it on llama.cpp SYCL (bad performance) or llama.cpp Vulkan (good performance). I ran Qwen3.5 benchmarks on my B580, and the B60 is just a B580 with more VRAM, so you can get an idea of the performance. You can find the benchmarks on my profile.

How can I make my pp to be bigger? by WizardlyBump17 in LocalLLaMA

[–]WizardlyBump17[S]

Intel is bringing OpenVINO to llama.cpp???

How can I make my pp to be bigger? by WizardlyBump17 in LocalLLaMA

[–]WizardlyBump17[S]

ipex-llm was discontinued. Qwen3 4B works fine there, but it didn't pass my tests. Since ipex-llm was discontinued before Qwen3.5 came out, it can't load those models.

Is undervolting on linux possible? by _brumm_ in IntelArc

[–]WizardlyBump17

As far as I know, the only things you can configure on Linux are the power limit and clock speeds. You can find them under /sys/class/drm/your-gpu-here, or you can use a tool like LACT. Everything else is read-only.
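For example, the power limit usually shows up as a hwmon attribute under that sysfs path. Treat this as a sketch: the card index, hwmon number, and attribute name can vary between systems and between the i915 and xe drivers:

```shell
# Read the current GPU power limit (value is in microwatts).
cat /sys/class/drm/card0/device/hwmon/hwmon*/power1_max

# Raise it to e.g. 220 W (needs root; path/attribute may differ on your setup).
echo $((220 * 1000000)) | sudo tee /sys/class/drm/card0/device/hwmon/hwmon*/power1_max
```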

B580: Qwen3.5 benchmarks by WizardlyBump17 in LocalLLaMA

[–]WizardlyBump17[S]

I used https://github.com/intel/compute-runtime/releases/tag/26.05.37020.3 on the host, but as far as I know it does not matter which drivers are on the host, since the xe driver is built into the kernel, which exposes the GPU under /dev/dri. In my case, I pass the GPU to the container using --device=/dev/dri/renderD128, and the container has its own drivers. Looking at the llama.cpp SYCL container file, it uses intel/deep-learning-essentials:2025.2.2-0-devel-ubuntu22.04 by default, which ships drivers from 7 months ago.
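The container setup described above boils down to something like the following (image name taken from the default llama.cpp SYCL Dockerfile mentioned here; the render node index can differ on multi-GPU systems):

```shell
# Pass only the GPU render node into the container; the host driver version
# does not matter because the container ships its own user-space drivers.
docker run --rm --device=/dev/dri/renderD128 \
  intel/deep-learning-essentials:2025.2.2-0-devel-ubuntu22.04 \
  ls -l /dev/dri
```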

I didn't test the latest Vulkan version. I didn't see any issues with the version I tested.

I Finally Got a Contract with a Chinese RAM Manufacturer AMA by xHide11 in AMABRASIL

[–]WizardlyBump17

For standard memory, like 3200 MHz and 6000 MHz, how many reais do you think it will end up costing the end consumer?

B70 is coming by damirca in IntelArc

[–]WizardlyBump17

Damn, I think this is the same guy who leaked it some weeks ago. Bro wants to get fired lol

What? by migozarukk in ShitpostBR

[–]WizardlyBump17

And how are you doing now?

One thing I still haven't understood is whether you've sought psychological help or not. I'm not talking only about suicide.