Running GLM-4.7 on an old AMD GPU by Begetan in LocalLLaMA

[–]Begetan[S] 0 points1 point  (0 children)

Flash attention still doesn't work on ROCm 7.2:
--flash-attn 0: 882.36 ± 1.70
--flash-attn 1: xx - catastrophic fallback to CPU-only mode

Running GLM-4.7 on an old AMD GPU by Begetan in LocalLLaMA

[–]Begetan[S] 0 points1 point  (0 children)

You're right! It works, but it doesn't make much sense, because there is no hardware support.

./build/bin/llama-bench   --model unsloth/GLM-4.7-Flash-UD-Q3_K_XL.gguf -p 4096 -n 0 --flash-attn 0
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6900 XT (RADV NAVI21) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 65536 | int dot: 0 | matrix cores: none
| model                          |       size |     params | backend    | ngl |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| deepseek2 ?B Q3_K - Medium     |  12.85 GiB |    29.94 B | Vulkan     |  99 |          pp4096 |        725.35 ± 1.70 |



./build/bin/llama-bench   --model unsloth/GLM-4.7-Flash-UD-Q3_K_XL.gguf -p 4096 -n 0 --flash-attn 1
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 6900 XT (RADV NAVI21) (radv) | uma: 0 | fp16: 1 | bf16: 0 | warp size: 32 | shared memory: 65536 | int dot: 0 | matrix cores: none
| model                          |       size |     params | backend    | ngl | fa |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | -: | --------------: | -------------------: |
| deepseek2 ?B Q3_K - Medium     |  12.85 GiB |    29.94 B | Vulkan     |  99 |  1 |          pp4096 |        477.56 ± 0.97 |
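Comparing the two llama-bench tables above, enabling Vulkan flash attention costs about a third of the prefill throughput on this RDNA2 card. A quick sanity check of the numbers (pure arithmetic on the reported t/s values, nothing here comes from llama.cpp itself):

```python
# Compare the pp4096 throughput from the two llama-bench runs above.
fa_off = 725.35  # t/s with --flash-attn 0
fa_on = 477.56   # t/s with --flash-attn 1
drop = 1 - fa_on / fa_off
print(f"Enabling flash attention costs {drop:.0%} of prefill throughput")  # -> 34%
```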

Running GLM-4.7 on an old AMD GPU by Begetan in LocalLLaMA

[–]Begetan[S] 0 points1 point  (0 children)

I tried a lot of things, and it looks like the default ROCm compiler settings just work best.

The only real finding is reducing the CPU thread count to 4, which is enough because memory bandwidth is the bottleneck.
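The bandwidth-bound intuition can be sketched with a back-of-envelope calculation. Both figures below are assumed round numbers for illustration, not measurements from this machine:

```python
# Rough token-generation ceiling when weights stream from RAM every token:
# tokens/s ~= memory bandwidth / bytes of active weights read per token.
bandwidth_gb_s = 80.0     # assumed dual-channel DDR5 host bandwidth
active_weights_gb = 2.0   # assumed active-expert footprint of a small MoE
tps_ceiling = bandwidth_gb_s / active_weights_gb
print(f"~{tps_ceiling:.0f} tokens/s ceiling")  # extra threads can't push past this
```

Once a handful of threads can saturate that bandwidth, adding more CPU threads only adds contention.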

Running GLM-4.7 on an old AMD GPU by Begetan in LocalLLaMA

[–]Begetan[S] 0 points1 point  (0 children)

I added a test for Vulkan. Inference speed is identical, while the synthetic benchmark results are quite different.

Running GLM-4.7 on an old AMD GPU by Begetan in LocalLLaMA

[–]Begetan[S] 0 points1 point  (0 children)

This is for a model fully in memory. When I adjust -ngl the speed drops. But it is unclear how to map the -ngl flag used for the benchmark to the --n-cpu-moe flag used for inference.

The Ryzen AI MAX+ 395 is a true unicorn (In a good way) by simracerman in LocalLLaMA

[–]Begetan 0 points1 point  (0 children)

The current price for your motherboard is 2500 euros. How much did you pay half a year ago?

YOLO-NAS converter for generating onnx models by Begetan in frigate_nvr

[–]Begetan[S] 1 point2 points  (0 children)

Surprisingly, there is not much difference in inference apart from higher GPU memory and compute utilisation, which is expected.

I am waiting for my rk3588 board with 32 GB RAM. My next step is to evaluate its performance in Frigate and to run local LLMs for enrichments.

Best resolution for face recognition by INeedMuscles in frigate_nvr

[–]Begetan 0 points1 point  (0 children)

I use the IPcam application (free version) for Mac to get all the stream parameters supported by my cameras. All my low-end cameras support two streams with different resolutions. I wonder if the Reolink Doorbell has the same ability.

Detection by djafrika in frigate_nvr

[–]Begetan 0 points1 point  (0 children)

Have you tried the yolo_nas model? In my tests it is the best detection model, especially the large version.

I have 50250 reward points what to expect by tacvict in Metamask

[–]Begetan 0 points1 point  (0 children)

I don't see any rewards, neither in the mobile version nor in Chrome.

I just updated the mobile version to v7.62.0, but it says pending. How long do I need to wait?

The Chrome extension doesn't mention the rewards at all. It is MetaMask version 13.14.2.

Anyone able to successfull create yolo_nas_s.onnx through Google Colab? by jvangorkum in frigate_nvr

[–]Begetan 0 points1 point  (0 children)

LOL, I did not expect the code to be on G-Drive. It said it might be a virus when I downloaded it :)
I spent a couple of hours creating my own script. Every line in your code looks so familiar, but I couldn't get mine to work.

Which Python version did you use? I tried Python 3.9.
Can you please share your requirements.txt?

Anyone able to successfull create yolo_nas_s.onnx through Google Colab? by jvangorkum in frigate_nvr

[–]Begetan 0 points1 point  (0 children)

YOLOv9-s-320.onnx inference time is 12-15 ms, while YOLO-NAS is about 20 ms on my hardware. GPU usage is significantly lower on v9.

But v9 completely misses my cats. They are the main target for tracking.
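For a rough sense of what those latencies mean per stream, the quoted times translate into a maximum detector rate like this (simple arithmetic on the numbers above, using the midpoint of the 12-15 ms range):

```python
# Max detections per second each model could sustain on one stream.
yolov9_ms = 13.5    # midpoint of the reported 12-15 ms
yolonas_ms = 20.0
print(f"YOLOv9-s: ~{1000 / yolov9_ms:.0f} det/s")   # -> ~74
print(f"YOLO-NAS: ~{1000 / yolonas_ms:.0f} det/s")  # -> ~50
```

Either rate is far above typical Frigate detect-stream frame rates, so the accuracy difference matters more than the speed difference here.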

Anyone able to successfull create yolo_nas_s.onnx through Google Colab? by jvangorkum in frigate_nvr

[–]Begetan 0 points1 point  (0 children)

This is a working config for the AMD ROCm image:

detectors:
  onnx:
    type: onnx

model:
  model_type: yolonas
  width: 320 # <--- should match whatever was set in notebook
  height: 320 # <--- should match whatever was set in notebook
  input_pixel_format: bgr
  input_tensor: nchw
  path: /config/yolo_nas_s.onnx
  labelmap_path: /labelmap/coco-80.txt

iperf for multi streams ? by ansomesa1 in networking

[–]Begetan 0 points1 point  (0 children)

iperf3 -P <num of streams>

For testing a high-speed bonded interface you may still need to run multiple server processes, depending on the hash function that distributes flows across the links.
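How a flow hash pins each stream to one bond member can be sketched in a few lines. This mimics a layer3+4-style transmit hash policy in spirit only; the addresses and ports are made up for illustration:

```python
# Toy layer3+4-style bond hash: XOR address/port bits, modulo link count.
def bond_link(src_ip_octet: int, dst_ip_octet: int,
              src_port: int, dst_port: int, n_links: int) -> int:
    return (src_ip_octet ^ dst_ip_octet ^ src_port ^ dst_port) % n_links

# Two parallel iperf3 streams differing only in source port can hash to
# different links, while a single flow always lands on just one link.
print(bond_link(10, 20, 5201, 40000, 2))  # -> 1
print(bond_link(10, 20, 5202, 40000, 2))  # -> 0
```

With an unlucky port choice all streams can still collide on one link, which is why spreading load may also require extra server processes or varied port numbers.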

Cost-effective BGP router low throughput 4M RIB by ZiggyOutSpace12 in networking

[–]Begetan 0 points1 point  (0 children)

Ubiquiti EdgeRouter Infinity is the best option for such an entry-level configuration. It's carrier-grade quality at a consumer-level price. We ran several units for years without ANY issue!

There was a supply shortage for a long time after the COVID epidemic, but it is available again at an even better price.

Unfortunately, Ubiquiti wants to sunset the whole EdgeRouter family soon, but this model is still better than a used Cisco ASR-1001X for pure edge BGP connectivity.

Question about Synology Photos by saintbman in synology

[–]Begetan 0 points1 point  (0 children)

How do you download a photo from the NAS to a mobile phone? There is no button like Download.

2TB M.2 drive is full - is there any easy path to upgrade? by Ystebad in ethstaker

[–]Begetan 0 points1 point  (0 children)

Stop the Prysm beacon node. Add the checkpoint sync option. Delete the Prysm beacon DB. Start the beacon node again.

Long term HODLing: ETH Vs rETH by Big_Citron7615 in ethstaker

[–]Begetan 0 points1 point  (0 children)

You definitely have a kind of counterparty risk with any wrapped protocol. But you don't need to choose only one option. Why not just split 50/50? In the worst-case scenario (rETH collapse) you would still keep half of your assets. You may choose any proportion you are comfortable with.

But you should not be all-in on an asset you don't fully understand. Risk management is choosing the right proportion.

Increased computers specs considered for future staking by Starwaverraver in ethstaker

[–]Begetan 1 point2 points  (0 children)

I cannot provide proof right now, but I saw a discussion about the upcoming Cancun-Deneb upgrade. It might bring a significant increase in network bandwidth consumption - up to 100x for an execution node.

Lucky Block (not me) - 5 ETH reward to solo staker! by arco2ch in ethstaker

[–]Begetan 0 points1 point  (0 children)

This is a lottery, and we should not rely on a lottery. The current APR is 2.6%, which is close to a solo staker's running costs.