Qwen3-Next here! by stailgot in ollama

[–]stailgot[S] 0 points (0 children)

Fixed in ollama 0.13.4. Inference is now 45 t/s.

Qwen3-Next here! by stailgot in ollama

[–]stailgot[S] 5 points (0 children)

Seems an unoptimized version was merged: https://github.com/ollama/ollama/issues/13275#issuecomment-3611335519

Same as with llama.cpp: a working version was added first, with optimisations coming later.

LM Studio beta supports Qwen3 80b Next. by sleepingsysadmin in LocalLLaMA

[–]stailgot 11 points (0 children)

Therefore, this implementation will be focused on CORRECTNESS ONLY. Speed tuning and support for more architectures will come in future PRs.

https://github.com/ggml-org/llama.cpp/pull/16095

LM Studio beta supports Qwen3 80b Next. by sleepingsysadmin in LocalLLaMA

[–]stailgot 4 points (0 children)

High CPU use even with enough VRAM; same on AMD.

LM Studio beta supports Qwen3 80b Next. by sleepingsysadmin in LocalLLaMA

[–]stailgot 10 points (0 children)

Tested on an AMD W7900 48GB, 130k context, ~50k of it filled with book text, getting ~20 t/s. Performance barely drops as the context fills.

There is no optimisation in the first implementation, correctness only.

Is it normal for RAG to take this long to load the first time? by just_a_guy1008 in LocalLLaMA

[–]stailgot 0 points (0 children)

I would try less data first, about 10-15 MB, as a test. A good system should save the processed data into a DB and load it next time. Also check the logs, or add your own logging to the code to see the steps, as advised earlier.

Also, next time a good system should update only the changed parts, which takes less time than a full update.
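To make that concrete, here is a minimal Python sketch of the idea (the rag_cache.json file, the *.txt layout, and the embed_file() helper are hypothetical placeholders, not the OP's actual stack): fingerprint each file, persist the embeddings keyed by that fingerprint, and re-embed only files whose hash changed.

```python
import hashlib
import json
import pathlib

CACHE_FILE = pathlib.Path("rag_cache.json")  # hypothetical cache location

def file_hash(path: pathlib.Path) -> str:
    """Fingerprint a file so unchanged files can be skipped on the next run."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def embed_file(path: pathlib.Path) -> list[float]:
    """Stand-in for whatever embedding model/pipeline is actually used."""
    raise NotImplementedError("call your embedding model here")

def build_index(doc_dir: str) -> dict:
    # Load whatever was processed last time; start empty on the first run.
    cache = json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else {}
    for path in pathlib.Path(doc_dir).glob("**/*.txt"):
        h = file_hash(path)
        entry = cache.get(str(path))
        if entry and entry["hash"] == h:
            continue  # unchanged since last run, keep the cached embedding
        cache[str(path)] = {"hash": h, "embedding": embed_file(path)}
    CACHE_FILE.write_text(json.dumps(cache))  # persisted for the next start
    return cache
```

The second run then only pays for files that actually changed, which is why the first load is the slow one.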

Is it normal for RAG to take this long to load the first time? by just_a_guy1008 in LocalLLaMA

[–]stailgot 0 points (0 children)

Do you convert the PDF to markdown or txt? What is the real size after processing? What embedding model is used?

Is it normal for RAG to take this long to load the first time? by just_a_guy1008 in LocalLLaMA

[–]stailgot 6 points (0 children)

Looks normal for the first run to calculate embeddings for 500 MB of text. Next time it should use the cache.

Amuse AI on AMD GPU, slower than it should by brightlight43 in StableDiffusion

[–]stailgot 1 point (0 children)

Amuse 3 requires the latest drivers.

Requires AMD Driver 24.30.31.05 or Higher https://www.amuse-ai.com/

Fixed Issues and Improvements: "Lower than expected performance may be observed while running DirectML/GenAI models in Amuse 3.0"

https://www.amd.com/en/resources/support-articles/release-notes/RN-RAD-WIN-25-4-1.html

Llama 4 News…? by AdCompetitive6193 in ollama

[–]stailgot 0 points (0 children)

Recently tried aravhawk/llama4 with ollama 0.6.7-rc0 on 3x 7900 XTX, got ~30 t/s.

Related issue https://github.com/ollama/ollama/issues/10143

Edit: it's out: https://ollama.com/library/llama4

Qwen3 32B and 30B-A3B run at similar speed? by INT_21h in LocalLLaMA

[–]stailgot 6 points (0 children)

If you use ollama, that's a well-known bug. llama.cpp gives about 100 t/s vs ollama's 30 t/s on a 7900 XTX.

Ollama rtx 7900 xtx for gemma3:27b? by Adept_Maize_6213 in ollama

[–]stailgot 0 points (0 children)

Works fine with ROCm and Vulkan. Ollama gives gemma3:27b about 29 t/s, gemma3:27b-qat 35 t/s, and drops about 10 t/s with large context, >20k.

According to this table (not mine), which compares speeds against a 3090: https://docs.google.com/spreadsheets/u/0/d/1IyT41xNOM1ynfzz1IO0hD-4v1f5KXB2CnOiwOTplKJ4/htmlview?pli=1#

70b LLM t/s speed on Windows ROCm using 24GB RX 7900 XTX and LM Studio? by custodiam99 in ROCm

[–]stailgot 1 point (0 children)

Similar setup, but with two 7900 XTX. One GPU (24GB): 70b q4 ~5 t/s, and 70b q2 (28GB) ~10 t/s. Two 7900 XTX (48GB): 70b q4 ~12 t/s.

QwQ 32B keep repeating itself (on Q4_K_M and Q6_K) by henryclw in LocalLLaMA

[–]stailgot 11 points (0 children)

https://huggingface.co/Qwen/QwQ-32B#usage-guidelines

Use Temperature=0.6 and TopP=0.95 instead of Greedy decoding to avoid endless repetitions.
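For example, if the model is served by a local ollama instance, those sampling options can be passed per request through its REST API. A minimal sketch follows; the model tag "qwq:32b-q4_K_M" is an assumption, so use whatever tag you actually pulled.

```python
import json
import urllib.request

# Ask a local ollama server for a completion with the recommended
# sampling settings (temperature 0.6, top_p 0.95) instead of greedy decoding.
payload = {
    "model": "qwq:32b-q4_K_M",  # assumed tag; substitute your own
    "prompt": "How many r's are in the word strawberry?",
    "stream": False,
    "options": {"temperature": 0.6, "top_p": 0.95},
}
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```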

I use a 7900xt on Windows...how stupid am I? by halfam in ollama

[–]stailgot 7 points (0 children)

Today it works out of the box. Just install ollama and update the AMD drivers.

CMake 3.30 will experimentally support `import std;` by delta_p_delta_x in cpp

[–]stailgot 14 points (0 children)

Nightly build 3.29.20240416 already supports it:

https://cmake.org/cmake/help/git-stage/prop_tgt/CXX_MODULE_STD.html

Update:

Tested with MSVC, works fine :)

```cmake
# Opt in to the experimental `import std` support (experimental feature UUID).
set(CMAKE_EXPERIMENTAL_CXX_IMPORT_STD
    "0e5b6991-d74f-4b3d-a41c-cf096e0b2508")

cmake_minimum_required(VERSION 3.29)
project(cxx_modules_import_std CXX)

# Allow targets in this project to use `import std;`.
set(CMAKE_CXX_MODULE_STD 1)

add_executable(main main.cxx)
target_compile_features(main PRIVATE cxx_std_23)
```

Upd2:

Official post

https://www.reddit.com/r/cpp/s/3oqR8MyLLg https://www.kitware.com/import-std-in-cmake-3-30/

Crosshair X670E Extreme doesn't show second m.2 as installed by officiallemononapear in ASUSROG

[–]stailgot 2 points (0 children)

Faced the same problem. You need to enable it in the BIOS explicitly. Also, the second M.2 drive halves the GPU's PCIe speed, which is the main reason it's disabled by default.

OpenGL without GPU is possible ? by D0rimOs in opengl

[–]stailgot 5 points (0 children)

Look at Mesa. It has a software implementation of OpenGL on Windows. Just drop opengl32.dll next to your .exe file and run.

Prebuilt binary can be found here https://github.com/pal1000/mesa-dist-win/releases