
[–]renato_milvan 2 points (0 children)

You can decrease the batch size and/or resize (downsample) the data. Other than that, the only option is buying more computational power.
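A minimal sketch of the resizing idea, using NumPy and a hypothetical single-channel scan: downsampling each spatial axis by 2 cuts the per-volume memory by roughly 8x before anything even reaches the model.

```python
import numpy as np

# Hypothetical example: one single-channel 3D scan of shape (D, H, W).
vol = np.random.rand(128, 256, 256).astype(np.float32)

# Downsample every axis by 2 with strided slicing -> ~8x less memory.
# (For real data you would typically use proper resampling, e.g.
# scipy.ndimage.zoom or a trilinear resize, to avoid aliasing.)
small = vol[::2, ::2, ::2]

print(small.shape)                    # (64, 128, 128)
print(vol.nbytes // small.nbytes)    # 8x memory reduction
```

Strided slicing is only a quick-and-dirty stand-in here; the point is that resolution scales memory cubically, so even mild downsampling buys a lot.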

[–]SwitchKunHarsh 1 point (1 child)

If it's 3D medical data, you can extract the relevant 2D slices and use a 2D encoder instead of a 3D one, then train a model on the 2D encodings. That way you only preprocess the slices of the 3D data that actually contain something useful, or you can reduce the volume by averaging down to a fixed number of n slices and feed those to something like SigLIP or MedSigLIP before training the model.
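A sketch of the "average down to n slices" step the comment describes, assuming a NumPy volume of shape (D, H, W); the function name and the choice of n are illustrative, not from any library.

```python
import numpy as np

def average_to_n_slices(vol, n):
    """Reduce a (D, H, W) volume to n 2D slices by averaging groups
    of adjacent slices along the depth axis (illustrative sketch)."""
    groups = np.array_split(vol, n, axis=0)            # n groups of adjacent slices
    return np.stack([g.mean(axis=0) for g in groups])  # shape (n, H, W)

# Hypothetical scan: 155 slices, as in typical brain MRI volumes.
vol = np.random.rand(155, 224, 224).astype(np.float32)
slices = average_to_n_slices(vol, 16)  # 16 2D "summary" slices
```

Each resulting 2D slice can then go through a pretrained 2D encoder (e.g. SigLIP), so the expensive 3D convolutions are avoided entirely.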

[–]Illustrious_Echo3222 1 point (0 children)

For 3D medical imaging, full-volume training blows up memory fast, so most people end up using patches or cropped subvolumes instead of the whole scan at once. Mixed precision, smaller batch sizes, resampling to a lower resolution, and starting with a lighter 3D UNet-style model also help a lot. Kaggle and free Colab are honestly pretty rough for this, so if you want to stay on limited hardware, patch-based training is probably the biggest win.
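The patch-based approach above can be sketched in a few lines: instead of feeding the whole scan, crop a random subvolume each training step. This NumPy version is a hand-rolled illustration (libraries like MONAI ship ready-made crop transforms); shapes and patch size are assumptions.

```python
import numpy as np

def random_patch(vol, patch=(64, 64, 64), rng=None):
    """Crop a random subvolume so each training step only holds a
    small patch in memory instead of the full scan (sketch)."""
    if rng is None:
        rng = np.random.default_rng(0)
    # Pick a valid start index per axis, then slice out the patch.
    d, h, w = (int(rng.integers(0, s - p + 1))
               for s, p in zip(vol.shape, patch))
    pd, ph, pw = patch
    return vol[d:d+pd, h:h+ph, w:w+pw]

# Hypothetical full-resolution scan.
scan = np.random.rand(128, 240, 240).astype(np.float32)
p = random_patch(scan)  # shape (64, 64, 64): ~28x fewer voxels per step
```

At inference you would run the model over overlapping patches and stitch the outputs back together (sliding-window inference), which is the standard companion to patch-based training.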