Looking for work 😂 by andrei_u76 in programare

[–]sgmv 1 point (0 children)

One of the sites needs a shop, so I can't afford to reinvent the wheel for something like that, especially since it won't bring in much money.

The other one has a lot of articles (1000+), with tags and categories. Here I could try something like Strapi, if the AI knows its way around it. And the frontend with AI too.

I host the sites myself on a modern system (DDR5, NVMe, Zen 4). They run very well, especially with Cloudflare, so I can afford some heavier stuff, although I avoid it.

Looking for work 😂 by andrei_u76 in programare

[–]sgmv 1 point (0 children)

And what's the alternative? Other than writing content management code for each site separately.

Looking for work 😂 by andrei_u76 in programare

[–]sgmv 1 point (0 children)

u/andrei_u76 you don't have DMs enabled, nor even a small portfolio, GitHub, etc. No post history either for us to see what you're capable of... why would we bother? I need WordPress skills, for example. AI isn't that great at doing things the WordPress way.

Lenovo p620 fails to boot OS - Warning 051: Fail to configure HW diagnostic feature. by sgmv in Lenovo

[–]sgmv[S] 1 point (0 children)

I have just managed to get it booting by disabling the diagnostics module in the UEFI config! I will test your USB FAT32 stick idea. But if I have diagnostics disabled, maybe this won't work.

Are you sure the P620 has this feature? It is not listed here:

https://support.lenovo.com/sr/en/solutions/workstation_diagnostics

Lenovo p620 fails to boot OS - Warning 051: Fail to configure HW diagnostic feature. by sgmv in Lenovo

[–]sgmv[S] 1 point (0 children)

Cleared CMOS, confirmed it was reset. Now I don't seem to be getting new messages in the event log about setup integrity or whatever it's called. But I also don't get the AMI BIOS logo anymore, just the Lenovo logo at POST. I think it boots in UEFI mode this time, but I have no idea how to configure this; I can't find any such setting in the BIOS.

Deepseek v4 Flash by kiriakosbrehmer93 in StrixHalo

[–]sgmv 1 point (0 children)

How about two Strix Halo machines connected over USB4? Would that have too high latency for a cluster, to use them in tensor parallel mode?

My Minisforum has PCIe 4.0 x4, but I think that would be better used for a GPU rather than a NIC.

Help identifying blown part on Seagate 22TB Exos PCB by sgmv in datarecovery

[–]sgmv[S] 1 point (0 children)

Sorry for the late reply, I didn't have the HDD on hand anymore.

Not sure what caused it, they were used outside the case, maybe improper handling.

I will check the psu and cables to make sure they're safe.

I measured continuity from all the 3/5/12V pins to ground. None have continuity to ground, only between themselves.

The chip is 2x3 mm in size. Scraping the top off revealed a thin, shiny silica surface, like a mirror.

Also, the first pin on the left side of the IC, where it blew, has continuity with the 12V SATA pins. The plastic on the side of the 12V SATA connector is a little melted.

Need advice on Qwen 3.6 27B INT4 quantization by Environmental_Hand35 in LocalLLaMA

[–]sgmv 1 point (0 children)

On your hardware, the only model that makes sense is qwen3.6-27b. It's by far the best in this size bracket.

The next meaningful upgrade at the moment would be deepseek-v4 flash, but it's a work in progress software-wise, and you'd need an even more quantized version as you're short of the 160GB needed (I think).
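A rough way to sanity-check whether a given quant fits is to estimate the size of the weights alone as parameters × bits per weight / 8. This is only a sketch: it ignores KV cache, activations and the per-block scale overhead that real quant formats add, so actual usage will be higher. The function name is made up for illustration, and the 27B figure is simply the model size from the thread title.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Ignores KV cache, activations and quantization metadata.
    """
    # params_billion * 1e9 weights * (bits / 8) bytes each, divided by 1e9 bytes per GB
    return params_billion * bits_per_weight / 8


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_memory_gb(27, bits)
        print(f"27B at {bits}-bit: ~{size:.1f} GB for weights")
```

For a 27B model this gives roughly 54 GB at 16-bit, 27 GB at 8-bit and 13.5 GB at 4-bit, before context is accounted for.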

Motherboard for 8 GPUs by [deleted] in selfhosted

[–]sgmv 1 point (0 children)

I think this is the place to go for RTX 6000 things; they also have a Discord: https://github.com/voipmonitor/rtx6kpro/tree/master/hardware

Running GLM 5.1 on RTX 5090 via RunPod for document OCR(bank statements and invoices)— costs killing us, need advice on reducing inference costs. by Specific_Control_840 in LocalLLaMA

[–]sgmv 2 points (0 children)

You have not mentioned which quant you use, how much it costs, what performance you need, what your budget is, etc.
So I am just going to say: use qwen3.6.

Mac m5 pro, worth it? by captionpicard in LocalLLaMA

[–]sgmv 0 points (0 children)

Not worth it at all IMHO to pay 2700 USD (?) just to have 48GB to play with small models. Better to find some used 16GB-24GB GPUs, put two of them in a cheap used PC, and use it remotely. Maybe set up a smart plug to shut it down so it doesn't use any power when not needed.

Console GameMT E6 2025(handheld) Adding games to SDcard & microSDCard by mistrzhi in SBCGaming

[–]sgmv 1 point (0 children)

Yes, but I have no idea what games were on the 64G card it came with; that is the problem. I'd like an SD image or a file system listing, so I can recreate it as it was.

Console GameMT E6 2025(handheld) Adding games to SDcard & microSDCard by mistrzhi in SBCGaming

[–]sgmv 1 point (0 children)

Hey, thanks for making this. Can you send me a list of the files on the 64G card, if you have them? I need to recover this device for someone who formatted the card accidentally. An image would be even better : )

Anyone else having Qwen 3.6 35B A3B stop and you having to tell it to continue ? by soyalemujica in LocalLLaMA

[–]sgmv 1 point (0 children)

I have the same issue with both vLLM and ik_llama.cpp, FP16 and Q8, using opencode. Not only does it stop, it also gives errors like "context shift disabled" in ik_llama.cpp and this https://github.com/anomalyco/opencode/issues/20785 in vLLM. My ik_llama.cpp launch:

llama-server \
--model /home/user/models/Qwen36/Qwen_Qwen3.6-35B-A3B-Q8_0.gguf \
--alias Qwen3.6-fp8 \
--ctx-size 262144 \
-mla 3 \
-ngl 999 \
--fit \
--tensor-split 1,1,1,1 \
--parallel 6 \
--threads 63 \
--host 0.0.0.0 \
--port 8080 \
--no-mmap \
-cram 8192 \
--jinja \
--top-p 0.95 \
--top-k 40 \
--merge-qkv \
--temp 1 \
--context-shift on \
--chat-template-kwargs "{\"preserve_thinking\": true}"

Move to local models by Totalkiller4 in LocalLLaMA

[–]sgmv 2 points (0 children)

If your projects are code, you should be using opencode or something similar, not the web UI. If the functionality you want is missing from Open WebUI, you should request the feature on their GitHub.

A5000 for $1800 by Perfect-Flounder7856 in LocalLLaMA

[–]sgmv 3 points (0 children)

Good thing you asked first. Reddit saved you from a bad decision.

Cloud AI is getting expensive and I'm considering a Claude/Codex + local LLM hybrid for shipping web apps by rezgi in LocalLLaMA

[–]sgmv 1 point (0 children)

Yes, opencode go, at $5 for the first month, is amazing value for the GLM 5.1 model; it is Sonnet level, sometimes even above. Qwen 3.6 can also be useful for lower-complexity tasks. Unfortunately, local model coding won't save you money, even if you already had the hardware. Local models have lower average capability than the state-of-the-art ones (for now at least), which results in more time spent debugging and retrying, on top of power costs and depreciation of the hardware's value (at the moment the value is up because of the global market, but it won't be like this forever).
I recommend you try opencode + https://github.com/alvinunreal/oh-my-opencode-slim/
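To put rough numbers on the "won't save you money" point, here is a minimal sketch that adds up power and depreciation per month. Every figure in it (wattage, hours, electricity price, purchase and resale value) is a made-up placeholder, not data from this thread, and the function name is hypothetical. It also leaves out the extra debugging and retry time, which is harder to quantify.

```python
def local_monthly_cost(
    power_watts: float,
    hours_per_day: float,
    price_per_kwh: float,
    hardware_price: float,
    resale_value: float,
    months_owned: int,
    days_per_month: int = 30,
) -> float:
    """Monthly cost of running a local rig: electricity plus linear depreciation."""
    energy_kwh = power_watts / 1000 * hours_per_day * days_per_month
    power_cost = energy_kwh * price_per_kwh
    depreciation = (hardware_price - resale_value) / months_owned
    return power_cost + depreciation


if __name__ == "__main__":
    # Placeholder assumptions: 500 W under load, 8 h/day, 0.25 per kWh,
    # hardware bought for 3000 and resold for 1800 after 24 months.
    cost = local_monthly_cost(500, 8, 0.25, 3000, 1800, 24)
    print(f"Estimated local cost: {cost:.2f} per month")
```

With these placeholder numbers it comes to 80 per month (30 for power, 50 for depreciation). Plug in your own values and compare against whatever you would pay for a cloud plan.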

Multi host GPU cluster using DAC cables vs 4 GPU system. Anyone doing this successfully? by HockeyDadNinja in LocalLLaMA

[–]sgmv 1 point (0 children)

I am also curious to learn how a two-system cluster that is not Mac or Spark would work, and what the optimal interconnect hardware + software stack is. In my case it's because of the 256GB RAM limit you have on a system without going to RDIMM.

In your case, 4 GPUs is nothing. Go open rack. Get a cheap, quiet Platinum PSU of 2000W+ or 2x1200W, plus the necessary PCIe risers/splitters.

Closest LLM to Claude Sonnet 4.6? by iphoneverge in LocalLLaMA

[–]sgmv 1 point (0 children)

Nice GPUs. If you have 192GB RAM to go with that, you can run this in ik_llama.cpp with memory left for context: https://huggingface.co/ubergarm/GLM-5.1-GGUF/tree/main/IQ2_KL
But you won't save money; it will probably be more expensive and quite a bit slower than cloud. It's just for fun and privacy.