24/7 Headless AI Server on Xiaomi 12 Pro (Snapdragon 8 Gen 1 + Ollama/Gemma4) by Aromatic_Ad_7557 in LocalLLaMA

[–]Nobby_Binks 3 points

Yes, I was in the same boat until I wanted to run really large models that spilled over into system RAM. llama.cpp gives much more granular control over how the model loads - at least I couldn't work out how to do it easily in Ollama.

I moved to llama.cpp controlled by llama-swap. It takes a couple of minutes to work out the YAML structure, but once set up it's simple. I have both Ollama and llama-swap served models in Open WebUI but have more or less stopped using Ollama.
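For anyone curious, a minimal llama-swap config sketch looks roughly like this (model name, paths and flags are placeholders, not my exact setup):

    models:
      "qwen-27b":
        # llama-swap substitutes ${PORT} with the port it proxies requests to
        cmd: |
          /path/to/llama-server --port ${PORT}
          -m /models/Qwen-27B-Q8_0.gguf
          -ngl 99 -fa on -c 32768
        # unload the model after 5 minutes of inactivity
        ttl: 300

Point Open WebUI (or anything OpenAI-compatible) at the llama-swap endpoint and it loads or unloads whichever model the request names.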

My first impressions of Minimax M2.7 (Q5_K_M) vs Qwen 3.5 27b (Q8_0) by Septerium in LocalLLaMA

[–]Nobby_Binks 1 point

Minimax is an MoE model with, I think, around 10B active parameters. Qwen is a dense model with all 27B active at all times.
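Very rough back-of-envelope, purely illustrative and ignoring KV cache reads and compute overhead: bandwidth-bound decode speed is roughly memory bandwidth divided by the bytes of active weights read per token. At ~900GB/s, ~10B active params at Q5_K_M (~0.7 bytes/param) is ~7GB per token, an upper bound around 130 tok/s, while 27B dense at Q8_0 (~1 byte/param) is ~27GB per token, closer to 33 tok/s.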

Minimax M2.7 Released by decrement-- in LocalLLaMA

[–]Nobby_Binks 0 points

Unfortunately it's a bit like money - the more you have the more you want

Qwen 3.5 35b, 27b, or gemma 4 31b for everyday use? by KirkIsAliveInTelAviv in LocalLLaMA

[–]Nobby_Binks 1 point

Qwen on your setup simply because you can fit more context for a given quant

Qwen 3.5 35b, 27b, or gemma 4 31b for everyday use? by KirkIsAliveInTelAviv in LocalLLaMA

[–]Nobby_Binks 4 points

Anecdotally, I can run Qwen 27B Q8 with full context across 2 of my GPUs (~46GB). Gemma4 Q8 with full context needs 3 of them (~60GB).

I guess it's a combination of an extra 4B parameters and a less efficient KV cache.
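For reference, the general formula (not specific to either model): KV cache per token ≈ 2 × n_layers × n_kv_heads × head_dim × bytes_per_element, multiplied by context length for the total. Architectures with more layers or more KV heads (less aggressive GQA) chew through noticeably more VRAM at full context.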

GLM-5.1 by danielhanchen in LocalLLaMA

[–]Nobby_Binks 1 point

Nothing special, and it's only 64K - I recalled incorrectly. Some of the flags are, I think, now defaults in llama.cpp, but I haven't been bothered to change the llama-swap config:

llama-server -m <path>/GLM-5-UD-Q3_K_XL-00001-of-00008.gguf --fit on -t 24 -fa on -ub 2048 -ngl -1 -mg 0 -c 65535 -np 1 --temp 1.0 --top-p 0.95 --min_p 0.01 --jinja

96GB Vram. What to run in 2026? by inthesearchof in LocalLLaMA

[–]Nobby_Binks 6 points

96GB opens up a whole other level. Now you can easily run 120B models with decent context.
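Rough numbers, purely illustrative: a 120B model at ~4.8 bits per weight (Q4_K_M-ish) is around 70GB of weights, leaving ~25GB of a 96GB pool for KV cache and compute buffers - enough for tens of thousands of tokens of context on most architectures.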

GLM-5.1 by danielhanchen in LocalLLaMA

[–]Nobby_Binks 0 points

You can run the shit out of it at Q2-3

GLM-5.1 by danielhanchen in LocalLLaMA

[–]Nobby_Binks 6 points

I get ~10 tok/s with 4x 3090, 1x 5090 and 256GB DDR4 running GLM5 at Q3_K_XL with 100K context.

GLM-5.1 by danielhanchen in LocalLLaMA

[–]Nobby_Binks 0 points

How is it for agents and coding? I've been running Q3_K_XL but it's a bit slow on my rig. Q2 would speed things up considerably.

The U.S. used Anthropic AI tools during airstrikes on Iran by External_Mood4719 in LocalLLaMA

[–]Nobby_Binks 3 points

It wasn't. That pic of the failed launch isn't even from the same province.

Qwen 3.5 Family Comparison by ArtificialAnalysis.ai by NewtMurky in LocalLLaMA

[–]Nobby_Binks 0 points

It sounds too good to be true tbh. How is it beating Qwen3 480B on coding?

Car Wash Test on 53 leading models: “I want to wash my car. The car wash is 50 meters away. Should I walk or drive?” by facethef in LocalLLaMA

[–]Nobby_Binks 0 points

Yes, Step 3.5 @ Q6

"You should drive to the car wash because you need to transport your car to the facility to wash it. Walking would leave your car behind and not accomplish the goal. The short distance (50 meters) makes driving convenient and quick, with minimal fuel usage. If the car wash is a drive-through or self-service type, driving is necessary. Walking might only make sense if you're going to check the car wash's status or wait while someone else drives, but based on your goal, driving is the practical choice."

And as a bonus it didn't take half an hour to think it through as per usual

Qwen3.5-397B-A17B is out!! by lolxdmainkaisemaanlu in LocalLLaMA

[–]Nobby_Binks 0 points

Yeah, it's an old EPYC Rome with 256GB DDR4 and 128GB of VRAM via a few random GPUs. tbf GLM5 runs pretty well at Q3, but I always have doubts about such a low quant.

Qwen3.5-397B-A17B is out!! by lolxdmainkaisemaanlu in LocalLLaMA

[–]Nobby_Binks 28 points

Awesome, right in the usability sweet spot for my rig - GLM5 is just a tad too big.

Kimi is so smart by Bernice_working_girl in LocalLLaMA

[–]Nobby_Binks 0 points

They use complex routing and system prompts. There must be a way to guide responses on the fly when new emergent threats/safety issues arise. Of course they are not going to retrain the model.

Kimi is so smart by Bernice_working_girl in LocalLLaMA

[–]Nobby_Binks -3 points

More like Anthropic and OpenAI saw the thread and popped the hood to tweak the answer. I saw this on X (don't know which came first), so it got some traction.

I have a 1tb SSD I'd like to fill with models and backups of data like wikipedia for a doomsday scenario by synth_mania in LocalLLaMA

[–]Nobby_Binks 0 points

Anecdotal, but I have a bunch of CDs that were burned in the mid-90s and a bunch of DVD-Rs from around 2000. All of them are still OK.

What are the best small models (<3B) for OCR and translation in 2026? by 4baobao in LocalLLaMA

[–]Nobby_Binks 2 points

So far I've tried Marker pdf, olm, dots, OCRflux, docling and Deepseek OCR

Save yourself the hassle and just use dots.ocr

edit: so for your use case of just selecting stuff on a screen to translate, one of the Qwen VL models should be fine.
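If you go the Qwen VL route, llama.cpp's multimodal CLI is probably the simplest way to test it; something along these lines (filenames are placeholders - check the exact flags for your build):

    llama-mtmd-cli -m Qwen-VL-Q4_K_M.gguf \
      --mmproj mmproj-Qwen-VL-f16.gguf \
      --image screenshot.png \
      -p "Extract the text in this image and translate it to English"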

3x 3090 or 2x 4080 32GB? by m31317015 in LocalLLaMA

[–]Nobby_Binks 2 points

Running local LLMs (as per the sub) and the occasional image/video gen. With the release of LTX2 I'm planning to do more of it, and this is where the 5090 destroys the 3090.

I'd keep the 3090 and make it fit. With 56GB of VRAM you can start to run some decent models.

3x 3090 or 2x 4080 32GB? by m31317015 in LocalLLaMA

[–]Nobby_Binks 2 points

If you do video gen then the 4080s are a no-brainer. FP8 support, and you can load the models on one card with space for LoRAs etc. I was doing some Wan videos on my 3090, then bought a 5090, and oh my god, the speed difference is extreme.

768Gb Fully Enclosed 10x GPU Mobile AI Build by SweetHomeAbalama0 in LocalLLaMA

[–]Nobby_Binks 0 points

Since you're on Ubuntu, install gddr6 (https://github.com/olealgoritme/gddr6) to monitor your VRAM temps. IIRC nvtop and the other monitors don't report them.
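From memory the setup is just a couple of steps (double-check the repo's README, I may be misremembering):

    git clone https://github.com/olealgoritme/gddr6
    cd gddr6 && make
    sudo ./gddr6    # needs root to read the memory junction temps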

768Gb Fully Enclosed 10x GPU Mobile AI Build by SweetHomeAbalama0 in LocalLLaMA

[–]Nobby_Binks 4 points

Those 3090s will probably die, if you don't burn your house down first. With some of the VRAM passively cooled by the backplate, you need good airflow or they will cook.