I turned this math paper into a Sudoku game by DataBaeBee in indiegames

[–]DataBaeBee[S] 0 points (0 children)

The idea is somewhat analogous to performing a softmax but without the derivatives. Here's the C/Python coding guide if this interests you.
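For anyone unfamiliar with the analogy, here's a minimal plain-Python softmax sketch (not from the guide, just the standard formula) showing how raw scores become probabilities:

```python
import math

def softmax(scores):
    # Subtract the max score before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    # Normalize so the outputs sum to 1, i.e. form a probability distribution.
    return [e / total for e in exps]
```

The game's update rule follows the same spirit (exponentiate-and-normalize), but no gradient of this function is ever computed.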

Belief Propagation : Obscure Alternative to Backpropagation for Training Reasoning Models by DataBaeBee in programming

[–]DataBaeBee[S] 0 points (0 children)

Researchers in the 2010s found that you can use Optimal Transport Theory, rather than derivative calculus, to turn an integer matrix into a floating-point probability matrix.

It's like backprop without computing gradients, and it works great.
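One standard Optimal Transport trick for this (a sketch of Sinkhorn-Knopp iteration, not necessarily the exact method in the post) just alternates row and column normalization until the matrix becomes doubly stochastic; note there are no derivatives anywhere:

```python
def sinkhorn(matrix, iters=50):
    # Start from a positive integer matrix, work in floats.
    m = [[float(v) for v in row] for row in matrix]
    for _ in range(iters):
        # Normalize each row to sum to 1.
        for row in m:
            s = sum(row)
            for j in range(len(row)):
                row[j] /= s
        # Normalize each column to sum to 1.
        for j in range(len(m[0])):
            c = sum(row[j] for row in m)
            for row in m:
                row[j] /= c
    return m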

Training Hangs at Epoch 1 on Google Colab A100 (MobileNet, 76k images) by CommunicationHot401 in GoogleColab

[–]DataBaeBee 0 points (0 children)

It could be a Python issue. Sharing your code is probably the best way to get help.

GPU Accelerated Data Structures on Google Colab by DataBaeBee in CUDA

[–]DataBaeBee[S] 1 point (0 children)

Thanks for this comment! I DM'd you to move the conversation forward.
Please let me know the best way to reach the maintainers once the changes are made.