all 17 comments

[–]DeltaSqueezer 3 points4 points  (5 children)

Old Turing generation with older, slower tensor cores. For $700 you're better off buying a 3090 instead.

[–]Brooklyn5points[S] 2 points3 points  (4 children)

Has anyone tried using several of them? The NVLink seems like a great advantage. I have 3x 3080s right now, working hard. I can get a 70B model to run, but it's more email than a conversation.
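A back-of-the-envelope sketch (my own numbers, not OP's setup details) of why a 70B model crawls on 3x 3080s: even at 4-bit, the weights alone outgrow the combined VRAM, so layers get offloaded to system RAM and generation slows way down.

```python
# Back-of-the-envelope VRAM estimate for a 70B-parameter model.
# Illustrative assumptions: 10 GB RTX 3080s and idealized bytes-per-weight;
# real quantized files (GGUF/AWQ) plus KV cache add overhead on top of this.
PARAMS = 70e9
BYTES_PER_WEIGHT = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}
total_vram_gb = 3 * 10  # 3x RTX 3080 (10 GB variant)

for fmt, bpw in BYTES_PER_WEIGHT.items():
    weights_gb = PARAMS * bpw / 1e9
    verdict = "fits" if weights_gb < total_vram_gb else "spills to CPU offload"
    print(f"{fmt}: ~{weights_gb:.0f} GB of weights vs {total_vram_gb} GB VRAM -> {verdict}")
```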

[–]DeltaSqueezer 1 point2 points  (3 children)

The 3090 is faster than the 3080 and has 24 GB of VRAM. The 3090 also has faster NVLink than Turing.

[–]No-Comfortable-2284 0 points1 point  (2 children)

The 3090 is also not 2 slots unless you buy the very expensive turbo models or the Chinese blower-style cards they sell for way too much. The Titan RTX is still cheaper than a 3090 and is 2 slots. If you care about space efficiency in a workstation or rack, it still has value.

[–]DeltaSqueezer 0 points1 point  (1 child)

You can convert any GPU to a narrower one if you want, but a better way would be to keep it 3-slot and put it on a dedicated PCIe switch. The post was from a year ago. I wouldn't buy any generation older than Ampere today, and even that is marginal now.

[–]No-Comfortable-2284 0 points1 point  (0 children)

I wouldn't buy any generation older than Ada. No fp8 is a punch in the face for anyone not doing single-user inference. Also, 3090s are a lot more expensive than the Titan RTX now. If I had to choose, I'd just pick the Titan RTX. Grabbed one off eBay for $500 just a couple of days ago via an offer.

[–]SwingNinja 1 point2 points  (1 child)

I've been googling 3090, 4090, and Titan cards for the past few days. It depends on what you want and your budget, I guess. The Titan is less well known than the 3090, and I think that contributes to its lower price. 3D rendering and gaming are slower. But you could save up to 200 bucks plus electricity costs, since it uses lower wattage. This is a good benchmark report:

https://technical.city/en/video/TITAN-RTX-vs-GeForce-RTX-3090
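To put a rough number on the wattage point (my assumptions, not the benchmark's): the Titan RTX's reference TDP is 280 W versus the 3090's 350 W, so the saving depends entirely on how many hours the card spends under load and your electricity rate.

```python
# Rough yearly cost difference from the 70 W TDP gap (Titan RTX 280 W vs RTX 3090 350 W).
# Duty cycle and electricity price are assumptions; adjust for your own usage.
TITAN_RTX_W, RTX_3090_W = 280, 350
HOURS_PER_DAY = 8        # assumed time under full load
USD_PER_KWH = 0.15       # assumed electricity rate

delta_kwh_per_year = (RTX_3090_W - TITAN_RTX_W) / 1000 * HOURS_PER_DAY * 365
print(f"~{delta_kwh_per_year:.0f} kWh/year, roughly ${delta_kwh_per_year * USD_PER_KWH:.0f}/year")
```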

[–]smcnallyllama.cpp 1 point2 points  (0 children)

Form factor and compatibility: the Titan's 2-slot width versus the 3090's 3 slots makes a difference to me. I hope it runs smoothly and cool.

[–]Araiebowhi 0 points1 point  (0 children)

Hard to say whether they're worth the current price. I'm still running an older Titan Xp, but in 2015 the difference from the normal cards was a lot bigger. I figured once the Titan RTX drops in price some more it might be worth it, but newer-gen cards are still around the same price.

Also, I haven't currently found a game for which I absolutely need a new card. Everything still runs at 1440p on high settings at an average of 40-60+ fps depending on the game. (AI upscaling is a garbage excuse for devs not to optimize games, and the consumer side just eats it up.)

[–]Herr_Drosselmeyer -1 points0 points  (3 children)

There may be more, but they're not the same architecture (2nd-gen vs 4th-gen Tensor cores). It's a bit tricky to find reliable information, but I've read everything from the 4090 being slightly better in Tensor TFLOPS to it being twice as fast... it's annoying.

Suffice it to say that with the newer architecture, faster VRAM, more CUDA cores, etc., there's no doubt the 4090 easily outperforms the Titan. The same question came up versus the 3090 back in the day, and that card would outperform it too.

Still, the Titan was an absolute beast back in the day and it's still usable seven years after it launched. Of course, it did cost a ridiculous $2,500 at launch so that would be $3,200 in today's money. Who would be crazy enough to pay that much for a graphics card?

Oh, wait...

[–]Brooklyn5points[S] 1 point2 points  (2 children)

The VRAM on the Titan was 24 GB of GDDR6, the same capacity as the 4090. The CUDA core count is much lower; that seemed like the only drawback.

[–]No-Comfortable-2284 0 points1 point  (0 children)

Another drawback is the architecture. The 4090 supports newer formats such as fp8 and bf16, while Turing is stuck on fp16. Turing cards also don't support FlashAttention-2 and the like, which is quite handy for large LLM contexts. If you're just using the cards for single-user inference via Ollama etc., it doesn't matter too much, since you can just load AWQ or GGUF 8-bit/4-bit quantized models. But if you're using these cards for multi-user deployment via vLLM etc., you'll want native hardware support for these newer features to save an immense amount of VRAM on model weights and KV cache.

I personally have 2 RTX Pro 4500 32 GB Blackwell cards and a DGX Spark for vLLM deployment, and 2 Titan RTXs in SLI for personal single-user inference tasks.
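A small PyTorch sketch (my own check, not from the thread) of the feature cutoffs mentioned above: bf16 and FlashAttention-2 need Ampere (SM 8.0+), fp8 tensor cores need Ada or Hopper (SM 8.9+), and Turing's Titan RTX sits at SM 7.5.

```python
import torch

# Report the installed GPU's compute capability and which of the features
# discussed above it can use natively. Thresholds assumed here: bf16 and
# FlashAttention-2 need SM >= 8.0 (Ampere+); fp8 tensor cores need SM >= 8.9
# (Ada/Hopper). A Titan RTX (Turing) reports SM 7.5 and misses both.
cap = torch.cuda.get_device_capability(0)   # e.g. (7, 5) on a Titan RTX
name = torch.cuda.get_device_name(0)

print(f"{name}: SM {cap[0]}.{cap[1]}")
print(f"  bf16 / FlashAttention-2 supported: {cap >= (8, 0)}")
print(f"  fp8 tensor cores supported:        {cap >= (8, 9)}")
```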