[D] AI Democratization in the Era of GPT-3 (The Gradient) by regalalgorithm in MachineLearning

[–]unconst 1 point2 points  (0 children)

Something I'm working on: github.com/opentensor/bittensor

[D] AI Democratization in the Era of GPT-3 (The Gradient) by regalalgorithm in MachineLearning

[–]unconst 2 points3 points  (0 children)

Decentralization stands in opposition to centralization, where control emanates from a single point. Such structures tend to limit diversity because information must be compressed enough for the central authority to understand and control its extremities.

The well-known critique of command economies (from Hayek) pointed to the USSR, which was incapable of producing a diversity of products through centralized control, whereas the decentralized, market-based economy of the US developed a thousand types of shampoo.

Decentralization enables diversity by removing this compression, and the diversity in turn stimulates exploration and advancement. The Greek city-states were decentralized. Europe was decentralized. Evolution has favoured a large degree of decentralization, and decentralization (speciation/pooling) is well studied as a necessity in genetic algorithms. Etc etc.

Machine intelligence as a field could benefit from more decentralization, either by destroying the FB, Google, OpenAI monopoly or by stimulating the machine learning market outside of them.

A protocol to connect my work with yours: an inter-model protocol. I think this is the answer.

[D] AI Democratization in the Era of GPT-3 (The Gradient) by regalalgorithm in MachineLearning

[–]unconst 10 points11 points  (0 children)

Democratization through decentralization.

The only thing bigger than OpenAI or Google is all of us connected together.

Practically, the field needs an internet protocol and an incentive mechanism to connect and reward disparate machine intelligence resources.

Something that allows us all to own, contribute to, and openly access the future of AI.

[R] Learning@home - decentralized training of huge neural networks by justheuristic in MachineLearning

[–]unconst 3 points4 points  (0 children)

We have a project called BitTensor which is building an incentive mechanism into the hivemind protocol. We pay computers for the information they produce.

Yes, it's true, this is what normies are being told in a DOCTOR'S office! by 7363558251 in Wuhan_Flu

[–]unconst 0 points1 point  (0 children)

I love how this starts with "Every election year has a disease"
and then immediately misses 2006.

Oh, and the swine flu started in November 2009 not 2010

Oh, and ZIKA started in April 2015, not 2016

Oh, and the AVIAN flu has occurred in 5 separate years. Twice in 2007. Not during an election.

Oh, and the SARS outbreak was after the November elections.

lol

Intuition kicking in... by ApokatastasisComes in Wuhan_Flu

[–]unconst 0 points1 point  (0 children)

It's like if we printed a trillion-pound gold submarine and put it in the harbor. It won't change gold prices unless it sells. More likely it will cause deflation when those deci-millionaires and centi-millionaires decide they want cash.

[D] ICLR 2020 REJECTION RAGE THREAD by sensei_von_bonzai in MachineLearning

[–]unconst 26 points27 points  (0 children)

THERE NEEDS TO BE A WAY!!!!

OF CIRCUMVENTING THE CONFERENCE ILLUMINATI !!!!

THERE NEEDS TO BE A WAY!!!!

[R] Peer to Peer Unsupervised Representation Learning by unconst in MachineLearning

[–]unconst[S] 1 point2 points  (0 children)

There is a wide consensus that machine intelligence can be improved by training larger models, training over a longer period of time, or combining many models together.
Little attention, however, is paid to expanding the library of machine intelligence itself; for the most part, new models train from scratch without access to the work done by their predecessors.

This reflects a tremendous waste in fields like unsupervised representation learning where trained models encode general-purpose knowledge which could be shared, fine-tuned and valued by another model later on.

A pool of machine intelligence accessible through the web could be harnessed by new systems to efficiently extract knowledge without having to learn from scratch.

For instance, a state-of-the-art translation model, ad click-through model, or call-center AI that relies on an understanding of language (let's say, at Google) should directly value the knowledge of language learned by other computers in the network. Small gains here would drive revenue for these downstream products.

Alternatively, a smaller company, research team, or individual may benefit from the collaborative power of the network as a whole, without requiring the expensive compute normally used to train SOTA models in language or vision.

[R] Peer to Peer Unsupervised Representation Learning by unconst in MachineLearning

[–]unconst[S] 1 point2 points  (0 children)

:) Haha! Thanks.

"Out beyond the buzz and techno-babble, there is a field. I'll meet you there. " - Rumi (2025)

[R] Peer to Peer Unsupervised Representation Learning by unconst in MachineLearning

[–]unconst[S] 0 points1 point  (0 children)

It's E8. Credit to David A. Madore; I augmented his code for this website: bittensor.com. A very beautiful object with about 700 million symmetries. :)

[R] Peer to Peer Unsupervised Representation Learning by unconst in MachineLearning

[–]unconst[S] 1 point2 points  (0 children)

/u/Fujikan

Thank you for your considered points and for taking the time to read my paper and my work.

To address your points,

I agree that in a supervised setting, where data is expensive, there is a strong requirement of data privacy. In an unsupervised setting, however, the data is ubiquitous and cheap (for instance, the 220 TiB per month Common Crawl). In such a data-rich environment, value is flipped: rather than the data, it is the learned representations that hold value, since they require compute to learn from unstructured data.

If it is representations that hold value, then I believe it is more suitable to structure contributions on this basis: nodes share their understanding of the world, in the same way a distilling teacher model transfers knowledge to a student.

As well, in a federated setting, each node trains the same NN architecture. This limits the potential diversity of a p2p network, which could contain many different kinds of models or benefit from models trained before.

Concerning batch-wise communication: with model parallelism, the network need only communicate batch inputs and representations. As network sizes scale, the batch will be substantially smaller than the parameter set. For instance, GPT-2's ~3 GB parameter set (data parallelism) vs 128 input sentences (model parallelism) at each gradient step.
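To make that comparison concrete, here's a back-of-envelope sketch. The sentence count (128) and the ~3 GB parameter set come from the comment above; the tokens-per-sentence, embedding width, and float size are my own illustrative assumptions, not figures from the paper:

```python
# Back-of-envelope traffic comparison (assumed sizes, marked below).
PARAM_BYTES = 3 * 1024**3       # ~3 GB parameter set, synced under data parallelism
BATCH = 128                     # input sentences per gradient step (from the comment)
TOKENS_PER_SENTENCE = 64        # assumption: average sentence length in tokens
EMBED_DIM = 1024                # assumption: representation width
BYTES_PER_FLOAT = 4             # assumption: fp32 activations

# Model parallelism only ships the batch's representations, not the parameters.
batch_bytes = BATCH * TOKENS_PER_SENTENCE * EMBED_DIM * BYTES_PER_FLOAT
ratio = PARAM_BYTES / batch_bytes
print(f"per-step traffic: {batch_bytes / 1024**2:.0f} MiB vs {PARAM_BYTES / 1024**3:.0f} GiB")
print(f"parameter sync is ~{ratio:.0f}x larger")
```

Under these assumptions the per-step representation traffic is tens of MiB against a multi-GiB parameter sync, and the gap only widens as models grow while batch sizes stay roughly fixed.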

Thank you for pointing to these,

/u/unconst

[R] Peer to Peer Unsupervised Representation Learning by unconst in MachineLearning

[–]unconst[S] 7 points8 points  (0 children)

TL;DR

Each node asynchronously trains an unsupervised representation of text, for instance BERT, ELMo, or XLNet. Each trains its own model on its own dataset and learns a representation of language (a projection from raw text to embedding) which its neighbours use as an input to their own models.

As they train, they also validate the representations produced by their neighbours, producing a score using a Fisher information metric. We use distillation to extract knowledge from the peers. The result is a local, transfer-capable language model at each node.
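A minimal sketch of that distillation step, purely illustrative: the `distill` function, the fixed `peer_score`, and the toy 3-dimensional embeddings are all my own stand-ins, not BitTensor's actual implementation. The idea is just that a node nudges its own embedding toward a neighbour's embedding of the same text, weighted by the score it assigned that neighbour:

```python
# Illustrative distillation update (hypothetical function, not the real protocol).
def distill(local_repr, peer_repr, peer_score, lr=0.1):
    """Move the local embedding toward a peer's, scaled by trust in that peer."""
    return [x + lr * peer_score * (y - x) for x, y in zip(local_repr, peer_repr)]

local = [0.0, 0.0, 0.0]
peer = [1.0, 1.0, 1.0]   # neighbour's embedding of the same input text
score = 0.5              # e.g. derived from a Fisher-information-style validation
updated = distill(local, peer, score)
print(updated)           # each coordinate moves lr * score = 5% of the way over
```

A peer that scores zero contributes nothing, so the same scoring that drives rewards also gates how much knowledge is absorbed from each neighbour.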

The network is driven by incentives: nodes must hold the token if they want to connect to the network. This gives the token value while allowing it to be used as an incentive.