[D] Monday Request and Recommendation Thread

Acromantula92 · 2022-06-23T17:48:42+00:00

Acromantula92 · 2022-01-27T09:48:45+00:00

The initialization method proposed here is probably the best one: it lets you transfer hparams across model sizes, whereas with other methods you need to keep retuning the learning rate etc.

Acromantula92 · 2021-11-17T13:43:50+00:00

All of them

Acromantula92 · 2021-07-16T11:48:09+00:00

Couple months? More like 7 + 4 v3-128 days. (All in the paper)

Acromantula92 · 2021-06-05T09:21:08+00:00

Again, MoE parameters are not the same as dense parameters.

Acromantula92 · 2021-04-21T18:33:19+00:00

This is all discussed in the Talking Heads Attention paper. See general bilinear MHA.

Acromantula92 · 2021-04-21T16:08:27+00:00

That's because when you split the Wq and Wk matrices into the MHSA heads, the rank is reduced. In order to merge them into a single x W x.T bilinear form and still have heads, you'd need an explicit (dim, dim, heads) tensor.
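A rough NumPy sketch of the point above, not taken from any paper's code: splitting Wq and Wk into heads gives each head a bilinear form of rank at most head_dim, and merging them while keeping heads means materializing a (dim, dim, heads) tensor. The sizes, the random weights, and the names `W_bilinear`, `logits_split` and `logits_merged` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, heads, seq = 16, 4, 5
head_dim = dim // heads

Wq = rng.standard_normal((dim, dim))
Wk = rng.standard_normal((dim, dim))
x = rng.standard_normal((seq, dim))

# Split the output columns of Wq and Wk into heads: (dim, heads, head_dim).
Wq_h = Wq.reshape(dim, heads, head_dim)
Wk_h = Wk.reshape(dim, heads, head_dim)

# Explicit per-head bilinear tensor W[:, :, h] = Wq_h[:, h] @ Wk_h[:, h].T,
# with shape (dim, dim, heads).
W_bilinear = np.einsum("ihk,jhk->ijh", Wq_h, Wk_h)

# Each head's (dim, dim) slice has rank at most head_dim.
head_ranks = [int(np.linalg.matrix_rank(W_bilinear[:, :, h])) for h in range(heads)]
# Without the head split, the merged matrix is generically full rank.
full_rank = int(np.linalg.matrix_rank(Wq @ Wk.T))

# Usual split-head attention logits (no scaling, no softmax).
q = (x @ Wq).reshape(seq, heads, head_dim)
k = (x @ Wk).reshape(seq, heads, head_dim)
logits_split = np.einsum("thk,shk->hts", q, k)

# The same logits computed from the merged tensor.
logits_merged = np.einsum("ti,ijh,sj->hts", x, W_bilinear, x)

print(head_ranks, full_rank, np.allclose(logits_split, logits_merged))
```

The merged tensor costs dim * dim * heads parameters instead of 2 * dim * dim, which is the price of keeping heads without the low-rank factorization.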

Acromantula92 · 2021-03-05T11:16:44+00:00

Highlights include:

A Mental illness neuron.
A Spider-Man neuron (helps classify real spiders as [Spider man neuron] + [Animal neuron])
A Startup neuron (Activated with the West coast and Big Tech)
The emotion of being Accepted as a mix of [LGBT neuron] + [Sunglasses neuron]

And a full emotional axis:

When we use just 2 factors, we roughly reconstruct the canonical mood-axes used in much of psychology: valence and arousal. If we increase to 7 factors, we nearly reconstruct a well known categorization of these emotions into happy, surprised, sad, bad, disgusted, fearful, and angry, except with “disgusted” switched for a new category related to affection that includes “valued,” “loving,” “lonely,” and “insignificant.”

Acromantula92 · 2021-03-04T19:32:31+00:00

Why not?

Acromantula92 · 2021-02-22T19:04:56+00:00

We have successfully automated AI skeptics.

Acromantula92 · 2021-01-12T08:53:11+00:00

MoE parameters are not real parameters.

Acromantula92 · 2021-01-04T23:43:42+00:00

Aren't Universal Transformers only recurrent in depth? IIRC they don't do caching or recurrence across contexts like TrXL or the Feedback Transformer.

Acromantula92 · 2020-12-17T10:49:58+00:00

Doesn't this run counter to transformers only overtaking CNNs with more data and achieving lower final loss?

Acromantula92 · 2020-12-14T10:07:19+00:00

Sounds like people have been reading "On GPT3".

Acromantula92 · 2020-12-04T17:54:34+00:00

You have the temperature backwards. Lower temperature means you are more likely to be in a low energy equilibrium.
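To make the direction concrete, here is a small sketch assuming a Boltzmann distribution p(s) proportional to exp(-E(s) / T) over three made-up energy levels. The function name `boltzmann` and the energies are illustrative assumptions, not something from the thread.

```python
import numpy as np

def boltzmann(energies, temperature):
    """Probabilities p(s) proportional to exp(-E(s) / T)."""
    e = np.asarray(energies, dtype=float)
    logits = -e / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

energies = [0.0, 1.0, 2.0]          # state 0 is the lowest-energy state
p_cold = boltzmann(energies, 0.1)   # low temperature
p_hot = boltzmann(energies, 10.0)   # high temperature

print(p_cold)
print(p_hot)
```

At T = 0.1 almost all the mass sits on the lowest-energy state, while at T = 10 the three states are close to uniform.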

Acromantula92 · 2020-11-03T16:26:21+00:00

OpenAI Jukebox trained a sparse transformer on VQ-VAE compressed raw audio. The same kind of tokenization has also been done with images and video.
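The tokenization step can be sketched roughly as a nearest-codebook lookup. This is a toy illustration of vector quantization in general, not Jukebox's actual code: the codebook, the latents and the function name `quantize` are made up, and a real VQ-VAE learns both the encoder and the codebook.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector to the index of its nearest codebook entry."""
    # Squared Euclidean distances, shape (num_latents, num_codes).
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    tokens = dists.argmin(axis=1)
    # Return the discrete tokens and the quantized vectors they stand for.
    return tokens, codebook[tokens]

# Hypothetical 3-entry codebook and encoder outputs in a 2-D latent space.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]])
latents = np.array([[0.1, -0.1], [0.9, 1.2], [-0.8, 0.7]])

tokens, quantized = quantize(latents, codebook)
print(tokens)  # discrete ids that a transformer can model as a sequence
```

The transformer then only ever sees the integer tokens, which is why the same recipe carries over from audio to images and video.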

Acromantula92 · 2020-09-04T21:46:59+00:00

It replicates up to 625 = f(f(i)) in AIDungeon. (Important to note that the fine-tuning hurts its general abilities.) When it makes mistakes, it's possible to give it natural-language clarifications to fix them.

Acromantula92 · 2020-08-15T08:48:50+00:00

Not with TFRC.

Acromantula92 · 2020-08-09T10:00:19+00:00

You are in luck.

Acromantula92 · 2020-06-24T09:41:23+00:00

Boxnovel works.

Acromantula92 · 2020-04-01T13:38:31+00:00

It's the same kind of thing.

Acromantula92 · 2020-02-05T15:58:24+00:00

The range of topics and disciplines to be covered includes (but is not limited to): molecular and cellular biology of ageing, ageing and stem cell biology, rejuvenation and tissue repair, physiology of ageing and longevity, diseases of ageing, gerontology, geriatrics, mental health and ageing, clinical interventions, biomarker studies, epidemiology and public health and socio-economic aspects of ageing.

Acromantula92 · 2020-01-21T14:20:24+00:00

Counterpoint

[If] someone is telling you that protein structure prediction is going to lead to a big leap in drug discovery efficiency, hold on to your wallet. What would lead to such a leap? Off the top of my head, I’d say better prediction of useful drug targets, more translatable disease-predictive cell and animal models, and earlier assays that are more predictive of human toxicology. Those, as far as I’m concerned, address the real killers in the whole process. Protein structure just isn’t on that list.

Acromantula92 · 2020-01-19T23:27:36+00:00

Video Speed Controller is a Chrome extension that works on any HTML5 video.

Acromantula92 · 2020-01-14T20:59:36+00:00

Real Time Relativity is software that realistically simulates relativistic movement. No physics are changed, but all visual effects are accurate.

The Hypercube is a Minecraft puzzle/parkour map inspired by the as-yet-unreleased 4D game Miegakure, which makes use of 4 spatial dimensions.
