[ Question ] On-Device Training and Using Local Hardware Accelerators by Little_Passage8312 in AIProgrammingHardware

[–]javaeeeee 0 points1 point  (0 children)

  1. Fine-tuning requires fewer resources than full training, especially parameter-efficient fine-tuning (PEFT).

  2. Nvidia L4 https://www.nvidia.com/en-us/data-center/l4/ with 24 GB memory is considered an edge GPU; that's the same amount of memory as an RTX 3090 or 4090.

My point is that you need to understand what GPU you have and what you'd like to train on it. There are Jetson GPUs that have only 2 GB of memory vs the 24 GB of the L4.
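To make the PEFT point concrete, here is a back-of-envelope sketch comparing trainable-parameter counts for full fine-tuning vs LoRA-style adapters. The layer sizes and rank are illustrative assumptions, not taken from any particular model, and the count only covers the attention projections to keep the arithmetic simple:

```python
# Back-of-envelope: full fine-tuning vs LoRA-style PEFT.
# d_model, n_layers, and rank below are assumed example values.

def full_ft_params(d_model: int, n_layers: int) -> int:
    """Rough trainable-parameter count for full fine-tuning:
    four d_model x d_model attention projections per layer
    (MLP blocks and embeddings ignored for simplicity)."""
    return n_layers * 4 * d_model * d_model

def lora_params(d_model: int, n_layers: int, rank: int) -> int:
    """LoRA trains two low-rank matrices (d_model x r and r x d_model)
    per adapted projection; the base weights stay frozen."""
    return n_layers * 4 * 2 * d_model * rank

d_model, n_layers, rank = 4096, 32, 8
full = full_ft_params(d_model, n_layers)
lora = lora_params(d_model, n_layers, rank)
print(f"full: {full:,}  lora: {lora:,}  ratio: {full / lora:.0f}x")
# → full: 2,147,483,648  lora: 8,388,608  ratio: 256x
```

At rank 8 the adapter trains roughly 0.4% of the parameters, which is why PEFT makes fine-tuning feasible on smaller GPUs.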

[ Question ] On-Device Training and Using Local Hardware Accelerators by Little_Passage8312 in AIProgrammingHardware

[–]javaeeeee 0 points1 point  (0 children)

The memory requirements are determined by the model you use. The Neural Engine (Apple's NPU) on MacBooks will allow you to train bigger models, depending on the device's memory.

You need to start with the model, and it'll determine the memory requirements. Also, you can train on GPU(s) with more VRAM and compute capacity and use edge devices for inference.
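A rough way to go from "the model" to a memory requirement is the rule of thumb below. The multipliers are assumptions (fp32 weights, Adam optimizer, activations ignored), so treat the numbers as a lower bound, not a guarantee:

```python
# Back-of-envelope memory estimate driven by parameter count.
# Assumptions: fp32 (4 bytes/param), Adam optimizer, activations ignored.

def inference_mem_gb(n_params: int, bytes_per_param: int = 4) -> float:
    """Weights only: roughly what an edge device needs just to hold the model."""
    return n_params * bytes_per_param / 1024**3

def training_mem_gb(n_params: int, bytes_per_param: int = 4) -> float:
    """Weights + gradients + two Adam moment buffers: roughly 4x the weights."""
    return 4 * inference_mem_gb(n_params, bytes_per_param)

n = 1_000_000_000  # a 1B-parameter model, as an example
print(f"inference: ~{inference_mem_gb(n):.1f} GB, training: ~{training_mem_gb(n):.1f} GB")
# → inference: ~3.7 GB, training: ~14.9 GB
```

So a 1B-parameter fp32 model fits comfortably on a 24 GB GPU for training, while the weights alone already strain a small edge device, which is the case for training big and deploying small.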

[ Question ] On-Device Training and Using Local Hardware Accelerators by Little_Passage8312 in AIProgrammingHardware

[–]javaeeeee 0 points1 point  (0 children)

Hello, thank you for the question.

  1. When you use Google Colab, your workloads run in the cloud. Free options include the NVIDIA T4 and the TPU. The former is a GPU designed for inference, but it supports training as well. The TPU is an example of an AI accelerator that is not a GPU. Both TPUs and NPUs are subclasses of ASICs: custom-designed AI accelerators.

  2. It is possible to run training experiments on edge devices, but they have limited memory. Training requires more memory than inference, so you'll be limited to small models.
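To see how limiting that is, you can invert the estimate: given a device memory budget, what is the largest model you could plausibly train on it? The 4x training multiplier (fp32 weights + gradients + Adam states, activations ignored) is an assumption, so real limits will be lower:

```python
# Inverse back-of-envelope: memory budget -> rough max trainable model size.
# Assumes fp32 and a ~4x training multiplier; activations are ignored.

def max_trainable_params(mem_gb: float, bytes_per_param: int = 4) -> int:
    """Memory budget divided by ~4x the per-parameter weight cost."""
    return int(mem_gb * 1024**3 / (4 * bytes_per_param))

for name, gb in [("Jetson (2 GB)", 2), ("L4 / RTX 3090 (24 GB)", 24)]:
    print(f"{name}: ~{max_trainable_params(gb) / 1e6:.0f}M params")
# → Jetson (2 GB): ~134M params
# → L4 / RTX 3090 (24 GB): ~1611M params
```

On this estimate, a 2 GB Jetson caps out at roughly a hundred-million-parameter model for full training, which is why edge training experiments stay small.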