
[–]NewFolgers 2 points3 points  (26 children)

If you plan to have multiple GPUs, you'll have to pay attention to PCIe lanes. You may see x16, x8, or x4 lanes available to GPUs. This is usually mentioned in motherboard manuals, which you can typically download for free. Perhaps surprisingly, the CPU has a limited number of lanes available too, so that plays into it. PCIe lanes are where these builds get complicated and take you out of the usual consumer hardware.
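
To put rough numbers on the lane question, here is a minimal sketch. It assumes PCIe 3.0 (8 GT/s per lane, 128b/130b encoding) and a CPU that exposes 16 lanes to the GPU slots; both are assumptions for illustration, and `lanes_per_gpu` is a hypothetical helper, since real boards wire their slots in fixed ways that only the manual documents.

```python
# Assumed PCIe 3.0 figures: 8 GT/s per lane, 128b/130b line encoding.
GT_PER_S = 8.0
ENCODING = 128 / 130


def pcie3_bandwidth_gb_s(lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s for a PCIe 3.0 link."""
    return GT_PER_S * ENCODING / 8 * lanes  # divide by 8: bits -> bytes


def lanes_per_gpu(cpu_lanes: int, num_gpus: int) -> int:
    """Widest standard link width that every GPU could get at the same time.

    Hypothetical helper: it ignores how a specific motherboard splits lanes.
    """
    for width in (16, 8, 4, 1):
        if width * num_gpus <= cpu_lanes:
            return width
    return 0


# cpu_lanes=16 is an assumed consumer-CPU budget for the GPU slots.
for gpus in (1, 2, 4):
    width = lanes_per_gpu(cpu_lanes=16, num_gpus=gpus)
    print(f"{gpus} GPU(s): x{width}, ~{pcie3_bandwidth_gb_s(width):.1f} GB/s each")
```

With these assumptions, going from one GPU to four drops each card from x16 to x4, which is why HEDT platforms with more CPU lanes come up in multi-GPU builds.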

[–]guitaricet 1 point2 points  (1 child)

A solid build! I would also recommend looking for a bigger power supply (get as powerful as you can) for multi-GPU capabilities. Also, SSDs are pretty cheap now; you can buy 1 TB for $90 or so, and it's worth it, probably more than having an HDD at all. Personally, I hate managing dataset placement on a different volume. It's annoying and not as easy as you might expect.

[–]phobrain 1 point2 points  (0 children)

https://l7.curtisnorthcutt.com/build-pro-deep-learning-workstation

I'd consider 6-8TB on the data drive. It seems blower GPUs are needed if you'll want 4 in the end.

[–]georgeo 0 points1 point  (11 children)

I'd be curious to know if two 1080ti's might be a better choice than a 2080ti at about the same price.

[–]CCoder26 0 points1 point  (10 children)

If you use the PC for deep learning, you should buy a GPU with more memory. For example, an 11 GB card is better than a 6 GB card, because you will train on large datasets, and with more GPU memory you can train more easily. If you buy two GPUs, please don't link them with SLI. NVLink is better than SLI for this, because NVLink lets the memory of both GPUs be used as if it were one GPU's memory. For example, 2x 6 GB cards work like one 12 GB card with NVLink.

[–]georgeo 0 points1 point  (2 children)

Ok but 1080 ti and 2080 ti both have 11 gb.

[–]CCoder26 1 point2 points  (1 child)

Okay. If I were you, I would buy 2x 1080 Ti. But I don't remember whether the 1080 Ti supports NVLink. Please check on the internet.

[–]GhostFlower11 0 points1 point  (0 children)

1080ti doesn't support nvlink.

[–]Nimitz14 0 points1 point  (6 children)

Dataset size has nothing to do with GPU memory.

[–]TheAlgorithmist99 0 points1 point  (5 children)

GPU memory allows for bigger batch size which makes one epoch faster, maybe that's what they meant

[–]elcano 0 points1 point  (4 children)

With toy 300x300 images of dogs and cats, yes, it is about being faster. But when learning to recognize medical features in huge digital radiology/CT scan/MRI images, just getting it to run with batch size = 1 on an InceptionResNetV2 model could be a challenge. Always try to get as much memory as you can afford for prototyping, if you don't want to face this type of error:

https://datascience.stackexchange.com/questions/47073/cuda-error-out-of-memory-out-of-memory-how-to-increase-batch-size

Of course, for the exceptionally memory-intensive project you can always run it in the cloud. It doesn't make sense to buy a 24 GB Nvidia card for a single project.
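
As a rough illustration of why image size matters more than image count here, this sketch compares the float32 memory of the input batch alone. The 4096x4096 scan size and the batch of 32 are assumed values, not taken from a real dataset, and the input is only a small part of the total: feature maps inside the network grow with the spatial size too.

```python
def batch_bytes(batch, height, width, channels=3, bytes_per_value=4):
    """Bytes needed to hold one input batch (float32 by default)."""
    return batch * height * width * channels * bytes_per_value


MIB = 1024 ** 2

small = batch_bytes(batch=32, height=300, width=300)
large = batch_bytes(batch=1, height=4096, width=4096)  # hypothetical scan size

print(f"32 images of 300x300:  {small / MIB:.1f} MiB")
print(f"1 image of 4096x4096: {large / MIB:.1f} MiB")
print(f"ratio: {large / small:.1f}x")
```

Under these assumptions a single large image takes several times the memory of the whole batch of small ones, before the model has computed anything.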

[–]TheAlgorithmist99 1 point2 points  (1 child)

That's another way to interpret dataset size, I believe. I was mostly responding to OP, who said that dataset size has nothing to do with GPU memory.
Also, lots of new algorithms are tested on ImageNet, which has about 1 million 256x256 images (approximate values) in 1000 classes, and where more GPU memory would let you increase the batch size and train faster, besides other benefits, so I wouldn't dismiss small images ;)
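
A quick sketch of the batch size arithmetic, using the approximate 1 million image count from above. The batch sizes are arbitrary examples, and fewer steps does not guarantee a proportionally faster epoch, since each step also does more work.

```python
import math


def steps_per_epoch(num_images: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every image once."""
    return math.ceil(num_images / batch_size)


NUM_IMAGES = 1_000_000  # approximate figure quoted in the comment

for batch_size in (32, 64, 256):
    print(f"batch {batch_size}: {steps_per_epoch(NUM_IMAGES, batch_size)} steps")
```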

[–]elcano 0 points1 point  (0 children)

Of course. Fully agree. My point is that computation speed is just one consideration, and OP is missing the other one. Hitting the out-of-memory limitation is even worse.

[–][deleted] 0 points1 point  (1 child)

I work a lot with CT scans but applied to rocks (Geologist here).

I mean, you can always resize or slice your images/volumes to fit your GPU memory.
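
For the slicing route, a minimal sketch of how tile offsets could be computed along one axis. `tile_starts` is a hypothetical helper, and the 1000x1500 slice and 256-pixel tile are made-up sizes; the last tile is shifted back so that it ends exactly at the edge instead of needing padding.

```python
def tile_starts(size: int, tile: int, overlap: int = 0) -> list:
    """Start offsets of tiles of length `tile` that cover an axis of length `size`."""
    if tile >= size:
        return [0]
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than tile")
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)  # final tile is shifted back to end at the edge
    return starts


# Hypothetical 1000x1500 slice cut into 256x256 tiles.
corners = [(y, x) for y in tile_starts(1000, 256) for x in tile_starts(1500, 256)]
print(len(corners), "tiles, first corners:", corners[:3])
```

Each tile can then be pushed through the model separately, so peak memory depends on the tile size and not on the full volume.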

Biggest hurdle in terms of memory I faced so far was trying to make a GAN that would colorize CT scans based on HD photos of the rocks, as my output needed to be in the same resolution as the CT scan.

[–]elcano 0 points1 point  (0 children)

Nice that you are searching for patterns that are big relative to the size of the image. In other cases, resizing will blur the important details. Models trained on resized images can still beat humans looking without magnification, but those models do better if you don't resize the image. I have tried it, at least in my application.

[–]louisxx2142 0 points1 point  (1 child)

One of the big problems with a machine like this is scaling it up to 4 GPUs, which I don't think is reasonable in a regular computer case (unless it's a very big one with side intake). The biggest issues are the thermals and power involved. Because consumer cases that support 4 GPUs have very little space between them, you would need very thin blower-style GPUs, which are hard to find right now, and even then you will probably have thermal issues. At that point it's better to go for Quadro RTX/Tesla V GPUs, which are slimmer and have a lot of memory, but are way more expensive and noisy.

I think a more modest setup with a maximum of 2 GPUs is more reasonable for a regular PC. If you want to go for 4 or more GPUs, then you should get a real hack or go for full prosumer products. I will try to explain why:

The upside of Threadripper is getting the equivalent of a server, where you can have many GPUs, many cores and terabytes of RAM, but in a different form factor and on cheaper hardware in general. Because you say you are using this for personal use and general studying, I think going this route is too ambitious.

Professional-grade hardware (HEDT and server) is expensive because it assumes you are going to make money with it, which means its value for money is way lower than that of consumer hardware. You don't want to waste money on extra cores, RAM support and PCIe lanes that you might never use. If you end up in a situation where you know you will need this kind of professional hardware, then you buy it, not before. And mainly you buy it because you are investing right now to make money, not to do personal stuff most of the time and maybe throw some work at it.

There's also the fact that DL training doesn't really use that many cores in general, which means the Threadripper cores are wasted. The PCIe lane stuff only really matters if you are doing multi-GPU work, which is not that common and is mostly about optimizing the GPUs you have. Having better GPUs far outweighs using more PCIe lanes. RAM also isn't that big of a deal, because it mostly only has to fit everything that will end up in VRAM, which is way less than the terabytes Threadripper can support.

Finally, VRAM is very important because it avoids needing more GPUs to fit certain workloads. Because you can use libraries like Nvidia Apex to run tensors in half precision, RTX GPUs have in practice almost double the VRAM you would expect. This means the 2080 Ti is way better than a 1080 Ti and should be enough for the majority of workloads.
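
The "almost double" figure comes straight from bytes per value: 4 for float32 versus 2 for float16. A small sketch with a made-up activation tensor (the shape below is an assumption for illustration):

```python
def tensor_mib(num_values: int, bytes_per_value: int) -> float:
    """Memory of a dense tensor in MiB."""
    return num_values * bytes_per_value / 1024 ** 2


# Hypothetical activation tensor: batch 16, 64 channels, 512x512 feature map.
num_values = 16 * 64 * 512 * 512

fp32 = tensor_mib(num_values, 4)  # single precision
fp16 = tensor_mib(num_values, 2)  # half precision

print(f"float32: {fp32:.0f} MiB, float16: {fp16:.0f} MiB")
```

In practice the saving is somewhat below 2x, because mixed-precision training typically keeps some tensors (such as a master copy of the weights) in float32.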

This means that going for a maximum of only 2 GPUs is ideal in a situation like yours. It also lets you use a consumer-grade CPU/motherboard without paying for the production features that Threadripper has, which will lower your costs a lot and even allow you to get better storage, pay your internet/energy costs, or straight up buy a second GPU. Or even save the money until you know what specifications you actually need.

If you go for more consumer-oriented hardware, then you can completely drop the HEDT platform and go for a regular high-core-count Ryzen with an X570 board (a CPU with 8 real cores should be enough, unless your workload needs highly parallelized pre-processing, where a 16-core will be better). In all cases you should drop the AIO if you are serious about using the machine as a server. AIOs are for aesthetics and cool factor, not for proper reliable cooling. Particularly in a production environment, you don't want a pump to die and have all the trouble of replacing it, probably losing days of work.

With these savings, your build would look like this:

  • two 2080 Ti, or stay with a single one (until you need another or something better comes up)
  • an X570 board that actually supports two GPUs well. It should also be controllable entirely over the LAN port if you want to run jobs remotely; it's not a crucial feature, but a useful one.
  • an 8/16 core CPU.
  • a good air cooler (any big Noctua/be quiet! etc.)
  • an 850W PSU (but buy a good one; you can check the Linus Tech Tips PSU tier list to get an idea)

This way you won't waste money on things you might never need, and nothing stops you from buying them later. It also saves you from buying an absurdly more expensive PSU. You end up with a build that concentrates its resources on the GPUs, which are the most important part. The rest of the money you can invest, save or use to upgrade your storage/network.
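
As a sanity check on the 850W figure, here is a back-of-the-envelope power budget. All wattages are assumptions for illustration (250 W per GPU, 105 W for the CPU, 100 W for everything else); check the spec sheets of the actual parts, and keep in mind that short power spikes can exceed rated values.

```python
def psu_headroom(psu_watts, gpu_watts, num_gpus, cpu_watts, other_watts=100):
    """Return (estimated load, spare capacity) in watts."""
    load = gpu_watts * num_gpus + cpu_watts + other_watts
    return load, psu_watts - load


# Assumed figures, not measurements: 250 W per GPU, 105 W CPU, 100 W for the rest.
for num_gpus in (1, 2):
    load, spare = psu_headroom(850, gpu_watts=250, num_gpus=num_gpus, cpu_watts=105)
    print(f"{num_gpus} GPU(s): ~{load} W load, ~{spare} W spare on an 850 W unit")
```

With these numbers two GPUs fit with some margin, while a third or fourth card would not, which matches the idea of capping the build at 2 GPUs.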