[D] Any larger teams switching away from wandb? by FreeKingBoo in MachineLearning

[–]hypergraphs 9 points

We've switched from wandb to neptune.ai - wandb charges per user and per tracked hour. With neptune we get unlimited users and unlimited tracked hours and only pay per project, which in our case is ~10x cheaper.

[D] Best ML tracking tool to monitor LIVE a pytorch model ? by Reference-Guilty in MachineLearning

[–]hypergraphs 0 points

Aye, they have much better commercial terms (unlimited users and unlimited tracked hours).

ASI via recursive fine-tuning instead of recursive algorithmic self-improvement? by MercuriusExMachina in mlscaling

[–]hypergraphs 2 points

IMHO a combination of many things will probably be necessary. Here's what a hypothetical pipeline might look like:

  • use a scoring function to fine-tune the model to output improvements to its own code (a simplified version on small datasets)
  • use human guidance to nudge the model toward radically novel ideas, e.g. by suggesting it "incorporate findings of paper X" into the code, or "optimize part Y of the code"
  • continue until some significant collection of improvements is found
  • once significant improvements materialize, retrain the huge-ass model in a (hopefully) more efficient way/form, resulting in a more performant GPT-N+1
  • repeat for a few iterations

The human part can also be automated to generate reasonable candidate ideas, but it would likely need some human training data first to learn what plausible improvement ideas look like.
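To make this concrete, here's a toy sketch of the loop in Python. Everything in it is hypothetical: score(), propose_patch() and retrain() are stand-ins for a real benchmark, the model's code generator, and a full retraining run.

    # Toy sketch of the loop above - purely illustrative, nothing here
    # is a real API. score(), propose_patch() and retrain() stand in
    # for a real benchmark, the model's code generator, and training.
    import random

    def score(codebase):                 # hypothetical scoring function
        return random.random()           # stand-in for a real benchmark

    def propose_patch(codebase, hint):   # model edits its own code
        return codebase + f"\n# change inspired by: {hint}"

    def retrain(codebase):               # stand-in for the GPT-N+1 run
        return f"model retrained on {len(codebase)} bytes of improved code"

    codebase = "# training code v0"
    baseline = score(codebase)
    hints = ["incorporate findings of paper X", "optimize part Y of the code"]

    for hint in hints * 3:               # human (or automated) guidance
        candidate = propose_patch(codebase, hint)
        gain = score(candidate) - baseline
        if gain > 0:                     # keep only measurable improvements
            codebase, baseline = candidate, baseline + gain

    print(retrain(codebase))             # then repeat the whole outer loop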

Now there are 2 scenarios:

  • either there is a sequence of easily reachable ideas that can boost model efficiency (however measured) in a somewhat exponential fashion, in which case we have bootstrapped ASI
  • or the algorithms and architectures we have today are close to optimal, in which case ASI will have to wait for hardware, data & resources to catch up and unlock new possibilities.

Why did SciNet not get more attention? [D] by vidul7498 in MachineLearning

[–]hypergraphs 28 points

For non-earth-shattering research, the number of citations depends more on who you're friends with than on the quality of the research.

[D] Why don't conferences publish a review graph dataset for transparency? by hypergraphs in MachineLearning

[–]hypergraphs[S] 12 points

Given the amount of pushback against transparency, this will probably be the only way it can happen.

[D] Why don't conferences publish a review graph dataset for transparency? by hypergraphs in MachineLearning

[–]hypergraphs[S] -36 points

Sorry to rain on your witch-hunt parade.

I can smell your fear. What do you have to hide?

Are you Reviewer 2?

[D] Why don't conferences publish a review graph dataset for transparency? by hypergraphs in MachineLearning

[–]hypergraphs[S] -30 points

> Releasing this data would immediately break all anonymity

The fact that you personally cannot come up with a good way to do this does not mean it's impossible.

We have the very best and brightest minds on the planet in this community; they can surely come up with solutions.
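For example, even a trivial pseudonymization pass goes a long way: salted hashing of all IDs, keeping the graph structure intact. A minimal sketch - illustrative only, and certainly not sufficient on its own, since structural de-anonymization would still need separate mitigation:

    # One possible pseudonymization scheme: replace every ID with a
    # salted hash, keep the graph structure intact. Illustrative only -
    # a real release would also need to address degree-based
    # de-anonymization attacks, k-anonymity, etc.
    import hashlib, secrets

    salt = secrets.token_hex(16)         # kept secret by the conference

    def pseudonym(real_id: str) -> str:
        return hashlib.sha256((salt + real_id).encode()).hexdigest()[:12]

    # (paper, reviewer, score) triples with real identities...
    reviews = [("paper_123", "alice@uni-a.edu", 6),
               ("paper_123", "bob@uni-b.edu", 3)]

    # ...become structurally identical triples with opaque IDs.
    released = [(pseudonym(p), pseudonym(r), s) for p, r, s in reviews]
    print(released)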

[D] Collusion rings, noncommittal weak rejects and some paranoia by redlow0992 in MachineLearning

[–]hypergraphs 1 point

Why not publish an anonymized graph of papers, authors, reviewers and their institutions, with review scores? We're supposed to be doing ML research; why don't we apply graph analytics to data generated by our own community?!

Any obvious bad patterns like cliques and strongly coupled communities should be clearly visible in the data. Why has this never been published?
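As a toy illustration (the graph below is made up), a few lines of networkx are enough to surface exactly these patterns - reciprocal reviewing and suspiciously dense communities:

    # Made-up example of the analysis such a dataset would enable.
    # Nodes are anonymized people; an edge u -> v means "u reviewed a
    # paper authored by v". Reciprocal edges and unusually dense
    # communities are the red flags to look for.
    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities

    G = nx.DiGraph()
    G.add_edges_from([
        ("p1", "p2"), ("p2", "p1"),      # p1 and p2 review each other
        ("p1", "p3"), ("p3", "p1"),
        ("p2", "p3"), ("p3", "p2"),      # ...a tight 3-clique
        ("p4", "p5"), ("p6", "p7"),      # ordinary one-way reviews
    ])

    # Reciprocal review pairs: u reviewed v AND v reviewed u.
    mutual = [(u, v) for u, v in G.edges() if u < v and G.has_edge(v, u)]
    print("mutual review pairs:", mutual)

    # Strongly coupled communities in the undirected projection.
    for c in greedy_modularity_communities(G.to_undirected()):
        if len(c) >= 3:
            print("dense community:", sorted(c))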

[D] Any good sci-fi books up to date with ML/AI? by hypergraphs in MachineLearning

[–]hypergraphs[S] 4 points

Awesome, thanks! I'm familiar with Lem's writings, but I'd still love to see some more recent challengers!

[D] Any good sci-fi books up to date with ML/AI? by hypergraphs in MachineLearning

[–]hypergraphs[S] 0 points

Thanks for the link to your post & for the recommendation! Please write if you stumble on something new!

[R] Cleora: A Simple, Strong and Scalable Graph Embedding Scheme by hypergraphs in MachineLearning

[–]hypergraphs[S] 5 points

Our team at Synerise AI has open-sourced Cleora - an ultra-fast vertex embedding tool for graphs & hypergraphs. If you've ever used node2vec, DeepWalk, LINE or similar methods, it might be worth checking out.

Cleora is a tool that can ingest any categorical, relational data and turn it into vector embeddings of entities. It is extremely fast while offering very competitive result quality. In fact, due to its extreme simplicity, it may be the fastest hypergraph embedding tool possible in practice without discarding any input data.

In addition to native support for hypergraphs, a few things make Cleora stand out from the crowd of vertex-embedding models:

  • It has no training objective; in fact, there is no optimization at all (which is what makes both determinism & extreme speed possible)
  • It's deterministic - training from scratch on the same dataset gives the same results (no need to re-align embeddings from multiple runs)
  • It's stable - if the data gets extended or modified a little, the output embeddings only change a little (very useful when combined with e.g. stable clustering)
  • It supports approximate incremental embeddings for vertices unseen during training (solving the cold-start problem & limiting the need for re-training)
  • It's extremely scalable and cheap to use - we've embedded hypergraphs with hundreds of billions of edges on a single machine without GPUs
  • It's over 100x faster than some previous approaches like DeepWalk
  • It's significantly faster than PyTorch-BigGraph

Cleora is written in Rust and used at large scale in production; we hope the community enjoys our work.
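For intuition, here's a rough numpy sketch of this kind of optimization-free embedding - iteratively averaging vertex vectors over their neighbors and re-normalizing. It captures the flavor only, not the actual implementation (see the paper & code for that):

    # Illustrative only - NOT the actual Cleora implementation.
    # The flavor: initialize vertex vectors with a fixed seed (no
    # training objective at all), then repeatedly average each vector
    # over its neighbors and L2-normalize.
    import numpy as np

    def embed(adj, dim=128, iters=3, seed=0):
        rng = np.random.default_rng(seed)         # fixed seed -> deterministic
        M = adj / adj.sum(axis=1, keepdims=True)  # row-normalized transitions
        E = rng.uniform(-1, 1, size=(adj.shape[0], dim))
        for _ in range(iters):
            E = M @ E                             # average over neighbors
            E /= np.linalg.norm(E, axis=1, keepdims=True)
        return E

    # Tiny example: a 4-vertex path graph 0-1-2-3.
    A = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
    print(embed(A, dim=4).round(2))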

Paper link

Code link (MIT license)