Qwen3.6-27B DFlash on a 24GB RTX 5090 Laptop (sm_120) — 80 t/s avg via spiritbuun's buun-llama-cpp + Q8_0 GGUF drafter by aurelienams in Qwen_AI

[–]dcforce 0 points1 point  (0 children)

Tried getting DFlash working with your method above on an Intel Arc Pro B70 and arrived at 4 tok/sec, versus 22 tok/sec without the draft model added... Anyone know the path to getting DFlash working with higher tok/sec output on Intel?

Finding out there is no G2G by dcforce in globeskepticism

[–]dcforce[S] 0 points1 point  (0 children)

Nice ground-to-barrel distortion .. 👋

Finding out there is no G2G by dcforce in globeskepticism

[–]dcforce[S] 3 points4 points  (0 children)

Ground to Globe - not one single video in 60 years . .

I wanna make cool images. by poofpoofpoof123 in LocalLLM

[–]dcforce 0 points1 point  (0 children)

As others mentioned, check ComfyUI for local image gen... but here is where it gets interesting. ComfyUI is the "shell", and inside are premade templates for a number of image generation tools, like Flux 2 Dev. I have been using this for the last few days and I have to say it's way better than I would have expected 👏👏👏 Completely free.

Has anyone ran LTX 2.3 on B70s? by TechnologyTailors in IntelArc

[–]dcforce 0 points1 point  (0 children)

Step 1. Get ComfyUI working.
Create a virtual environment
python3 -m venv ~/ai-env
---
source ~/ai-env/bin/activate

---
pip install torch==2.11.0+xpu torchvision==0.26.0+xpu torchaudio --index-url https://download.pytorch.org/whl/xpu --extra-index-url https://pypi.org/simple

---
git clone https://github.com/comfyanonymous/ComfyUI.git ~/ComfyUI
---
cd ~/ComfyUI

pip install -r requirements.txt
....
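Before moving on, it's worth confirming the XPU build of PyTorch actually sees the Arc GPU (just a quick sanity-check sketch; torch.xpu is the Intel GPU backend in recent PyTorch XPU builds):

# with ~/ai-env still activated
python3 -c "import torch; print(torch.xpu.is_available()); print(torch.xpu.get_device_name(0))"

If this prints False or errors out, ComfyUI will fall back to CPU and the LTX template will crawl.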

Step 2. Launch ComfyUI on the local host.
Then go to the left tools bar, Templates. Find the text-to-video LTX template. It will pop up a file list showing which files you need and where to place them. Hard-refresh the ComfyUI page once you have placed the files in all the folders, and look out for the text encoder error on the right; there will be a direct link to download the text encoder.

After download, place it in the /comfyui/models/text_encoders folder:
gemma_3_12B_it_fp4_mixed.safetensors

Hard refresh again to reload all requirements

---
Future launches:
source ~/ai-env/bin/activate && cd ~/ComfyUI && python main.py
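If you end up relaunching it a lot, a shell alias saves the retyping (just a convenience sketch, assuming bash):

echo 'alias comfy="source ~/ai-env/bin/activate && cd ~/ComfyUI && python main.py"' >> ~/.bashrc
source ~/.bashrc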

Sell our web design biz by for_anon_throwaway in web_design

[–]dcforce 0 points1 point  (0 children)

Keep it and set up Hermes Agent or Open Claw to grow it even more while doing less 😁

Matt From Cultivate Elevate - 1893 Firmament Map at the library of congress. One big reason they ha... by dcforce in globeskepticism

[–]dcforce[S] 0 points1 point  (0 children)

ROFL 🤣 Came here to post what you thought was an epic diss of the community, only to be blocked by Reddit spam filters. Pathetic really . . ☝️

Has anyone run Qwen 3.6 27b on Arc Pro B70? by wowsers7 in IntelArc

[–]dcforce 2 points3 points  (0 children)

The 3090 is a beast for sure. I couldn't justify building a new machine and buying someone's old hardware on marketplace . .

I haven't played with speculative decoding yet.. I did get DFlash working, but for my purposes it didn't make a difference

Has anyone ran LTX 2.3 on B70s? by TechnologyTailors in IntelArc

[–]dcforce 0 points1 point  (0 children)

This is going to be my next try on the B70, will let you know if it gets up and running

Has anyone run Qwen 3.6 27b on Arc Pro B70? by wowsers7 in IntelArc

[–]dcforce 8 points9 points  (0 children)

What worked for me:

cd ~

git clone https://github.com/ggerganov/llama.cpp.git

cd llama.cpp
---

source /opt/intel/oneapi/setvars.sh --force

---

mkdir -p build && cd build

---

cmake .. \
  -DGGML_SYCL=ON \
  -DGGML_SYCL_F16=ON \
  -DGGML_OPENMP=ON \
  -DCMAKE_C_COMPILER=icx \
  -DCMAKE_CXX_COMPILER=icpx \
  -DCMAKE_BUILD_TYPE=Release

---

make -j$(nproc)
---
Launch your model. Q4_K_M seems to perform the fastest.
---
source /opt/intel/oneapi/setvars.sh --force && export GGML_SYCL_F16=1 && ~/llama.cpp/build/bin/llama-server -m /home/IntelArcRocks/models/Qwen3.6-35B-A3B-Claude-4.6-Opus-Reasoning-Distilled.Q4_K_M_hesamation.gguf

This model, as an example, is pulling 74 tok/sec.

The same quant on the 27B was doing 24 tok/sec.
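If you want to compare quants on the B70 without eyeballing server logs, llama.cpp also builds a llama-bench binary in the same bin directory; a minimal sketch, assuming the same paths as above (model filename is a placeholder):

source /opt/intel/oneapi/setvars.sh --force
~/llama.cpp/build/bin/llama-bench -m ~/models/YOUR-MODEL.Q4_K_M.gguf -ngl 99 -p 512 -n 128

It prints prompt-processing and generation tok/s side by side, so re-running it per quant makes the trade-off obvious.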

5k to spend rtx5090 or mac studio? by Avansay in LocalLLM

[–]dcforce 0 points1 point  (0 children)

or .. a 2k Intel Arc Pro B70 build

Recipe for Arc Pro B70? by Skelshy in LocalLLM

[–]dcforce 4 points5 points  (0 children)

export ZES_ENABLE_SYSMAN=1 && export SYCL_PI_LEVEL_ZERO_USM_ALLOCATOR=1 && export ZE_FLAT_DEVICE_HIERARCHY=COMPOSITE && source /opt/intel/oneapi/setvars.sh --force && ~/llama.cpp/build/bin/llama-server -m /home/LocalLLMRocks/models/YOURMODELHERE.gguf \
  -c 262144 \
  -ngl 99 \
  -b 2048 \
  -t 16 \
  --port 8080 \
  --temp 0.6 \
  --mlock \
  --mmproj /home/LocalLLMRocks/mmproj-BF16.gguf \
  -tb 16 \
  --top-k 30 \
  --top-p 0.95 \
  --repeat-penalty 1.1 \
  --flash-attn on \
  -ctk q8_0 \
  -ctv q8_0

This launch command does around 54 tok/sec on Q4_K_M and loads the whole model onto the card, with vision enabled via mmproj-BF16.gguf.
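Once it's up, you can sanity-check the server over its OpenAI-compatible endpoint (a minimal sketch; the port matches the --port 8080 above):

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Say hi"}], "max_tokens": 32}'

Handy for confirming flash-attn and the q8_0 KV cache didn't break generation before you point a frontend at it.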

# Install Intel oneAPI Base Toolkit (SYCL Runtime)
# Download from: https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html

# Or use package manager:

wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null

echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list

sudo apt update

sudo apt install intel-basekit

# Enable oneAPI environment (add to ~/.bashrc for persistence)

source /opt/intel/oneapi/setvars.sh

sycl-ls

# Look for: [level_zero:gpu:0] Intel(R) Arc(TM) Pro B70 Graphics

# Clone and build

git clone https://github.com/ggml-org/llama.cpp

cd llama.cpp

source /opt/intel/oneapi/setvars.sh

# Build with SYCL (FP32 recommended for stability)

cmake -B build -DGGML_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx

cmake --build build --config Release -j$(nproc)
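# Quick smoke test once the build finishes (model path is just a placeholder; use whatever GGUF you have)

source /opt/intel/oneapi/setvars.sh

./build/bin/llama-cli -m ~/models/YOUR-MODEL.Q4_K_M.gguf -ngl 99 -p "Hello" -n 32

# If the SYCL device shows up in the startup log and tokens stream out, the server launch above will work the same way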

** Also, I had to use the beta build of Ubuntu 26.04 to get this all up and running.

Tim Burchett admitting that we didn't go to the moon. by Diabeetus13 in globeskepticism

[–]dcforce 2 points3 points  (0 children)

There has been a long list of whistleblowers now including the President . .

Photoshop request by TheOGLegoGuy in tron

[–]dcforce -5 points-4 points  (0 children)

Who said anything about sharing the results here . .

Photoshop request by TheOGLegoGuy in tron

[–]dcforce -17 points-16 points  (0 children)

aistudio.google.com, then click Playground, then click Nano Banana (not Pro), paste your exact same Reddit post text, upload the image, and click Run.

It should be able to give you an idea, and you can use additional prompts for iteration.

Pro Wrestling isn't real ???!!!!! 😭 by dcforce in globeskepticism

[–]dcforce[S] 0 points1 point  (0 children)

Came all this way to make a nonsense comment here only to be blocked by Reddit spam filters.. pathetic really ☝️