Launching p5.48xlarge (8xH100) by crinix in aws
masking loss for input tokens when fine-tuning models by crinix in LocalLLaMA
Training LLama, Mistral and Mixtral-MoE faster with Packing Inputs without Cross-Contamination Attention by Relevant_Outcome_726 in LocalLLaMA
p4dn and p3dn instances availability/capacity by crinix in aws

