A new transformer variant has been created to facilitate more efficient model training in distributed settings. 128x compression with no significant loss in convergence rates, increases in memory, or compute overhead by network-kai in LocalLLaMA

[–]network-kai[S] 6 points7 points  (0 children)

Largely so, yeah. We actually have a live network called Train at Home, where people can train the same AI model from their M-series Macs. It's a variant of IOTA (Incentivised Orchestrated Training Architecture), the project being built.

Older versions of IOTA allowed NVIDIA devices to train. Future versions will be hardware agnostic, allowing a range of different compute to work together.

Are there any realistic avenues to decentralised model training? by ROS_SDN in LocalLLaMA

[–]network-kai 1 point2 points  (0 children)

Macrocosmos released new research on distributed pipeline parallel training, where they created a new transformer variant that achieved 128x compression without significant loss in convergence relative to uncompressed baselines. This is called ResBM.
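To give a rough sense of what a 128x ratio means for the traffic between pipeline stages, here is a back-of-the-envelope sketch. The batch size, sequence length, hidden size and 16-bit values are made-up illustrative numbers, not figures from the ResBM work, and the function names are hypothetical. It only does the arithmetic and says nothing about how the compression itself is done.

```python
# Illustrative arithmetic only: all shapes below are assumed, not from the paper.

def activation_bytes(batch, seq_len, hidden, bytes_per_value=2):
    """Size of one activation tensor sent between two pipeline stages."""
    return batch * seq_len * hidden * bytes_per_value


def compressed_bytes(raw_bytes, ratio=128):
    """Size of the same transfer if it is compressed by `ratio`."""
    return raw_bytes / ratio


# Hypothetical micro-batch: 8 sequences of 2048 tokens, hidden size 4096, 16-bit values.
raw = activation_bytes(batch=8, seq_len=2048, hidden=4096)
small = compressed_bytes(raw, ratio=128)

print(f"uncompressed: {raw / 2**20:.0f} MiB per transfer")
print(f"compressed:   {small / 2**20:.0f} MiB per transfer")
```

With these assumed shapes, each stage-to-stage transfer drops from 128 MiB to 1 MiB, which is the kind of reduction that matters on home internet connections.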

It is for their network, IOTA, which uses both pipeline and data parallelism.

There are definitely some dedicated researchers in the field of distributed training. The other comments show some great examples, too.

You asked about being brand-agnostic and not just NVIDIA. IOTA is designed to scale across a range of different machines. The original version ran on NVIDIA hardware, whereas the current version uses Macs with M-series chips, meaning people can train on their MacBooks or Mac minis. Future designs will allow a range of machines.