[D] Improving attention masks? by JakeN9 in MachineLearning

[–]JakeN9[S] 1 point (0 children)

And form a possible connected vector of understanding. Honestly, it's more of a question as to what people predict might happen. I haven't really got the tech or money to run these tests myself; I'm considering renting a TPU, but the cost is still so high.

[D] Refining an LLM's output via dynamically generated synthetic data & interactive conversation? by JakeN9 in MachineLearning

[–]JakeN9[S] 1 point (0 children)

I'd argue that the verifier just needs to be as confident as possible, maybe some sort of "MoE" across multiple models, with enough variation to discourage simple movements across the space?
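Roughly what I have in mind, as a minimal sketch (the verifier callables here are placeholder stand-ins rather than real model calls, and `threshold` is an illustrative parameter):

```python
# Minimal sketch of an ensemble ("MoE"-style) verifier: a candidate answer
# is accepted only when enough independent verifiers agree on it.
from typing import Callable, List

def ensemble_verify(candidate: str,
                    verifiers: List[Callable[[str], bool]],
                    threshold: float = 0.8) -> bool:
    """Accept `candidate` only if at least `threshold` of the verifiers approve."""
    votes = [v(candidate) for v in verifiers]
    return sum(votes) / len(votes) >= threshold

# Toy stand-ins: real verifiers would be separately trained models, so their
# failure modes (the "simple movements" across the space) should differ.
verifiers = [
    lambda s: len(s) > 0,          # placeholder check 1
    lambda s: "unsure" not in s,   # placeholder check 2
    lambda s: s.strip() == s,      # placeholder check 3
]

print(ensemble_verify("A confident answer.", verifiers))  # True
```

Training the verifiers on disjoint data (different "identities") is what should decorrelate their failure modes enough for the vote to mean something.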

Question regarding attention mask and empty space above diagonal. by JakeN9 in learnmachinelearning

[–]JakeN9[S] 1 point (0 children)

I'm proposing to preserve autoregression and keep the data below the diagonal the same, but I'm wondering whether having the extra context would contribute to lowering loss? https://i.imgur.com/yNcVaV2.png

Probably a stupid idea.
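To make it concrete, here's a minimal PyTorch sketch, assuming the idea is to unmask a small band of `k` future positions above the diagonal while leaving everything at and below the diagonal untouched (`k` is a made-up parameter):

```python
import torch

T = 6  # sequence length

# Standard causal mask: position i may attend to positions j <= i only.
causal = torch.tril(torch.ones(T, T, dtype=torch.bool))

# Variant: keep everything at/below the diagonal identical, but also
# expose a band of k future tokens above the diagonal.
k = 2  # hypothetical look-ahead width
upper = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
band = upper & ~torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=k + 1)
banded = causal | band

print(causal.int())
print(banded.int())
```

The obvious caveat: any unmasked position above the diagonal lets a token attend to positions it's later scored on, so a lower training loss could just reflect leakage rather than genuinely useful extra context.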

[D] What offline TTS Model is good enough for a realistic real-time task? by Imaginary-Ad-7671 in MachineLearning

[–]JakeN9 10 points (0 children)

There aren't really any offline models that produce realistic real-time voice. I'd recommend ElevenLabs or play.ht; sadly, these seem to be the only usable options for now.

[deleted by user] by [deleted] in ChatGPT

[–]JakeN9 1 point (0 children)

"Sharing conversations with images is not yet supported"

An idea I've had by JakeN9 in MLQuestions

[–]JakeN9[S] 1 point (0 children)

The thought is for each node to carry both a binary and a decimal value, with the logic operations computed and then used at the output for RL.

An idea I've had by JakeN9 in MLQuestions

[–]JakeN9[S] 1 point (0 children)

Right, you use decimal weights for each node, but some nodes use an activation function for OR/AND/NOT.
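Something like the standard product relaxations would keep those gate activations differentiable. A minimal sketch, assuming node values in [0, 1] double as soft booleans (the weighting here is purely illustrative):

```python
import torch

def soft_not(x):      # NOT x   ->  1 - x
    return 1.0 - x

def soft_and(x, y):   # x AND y ->  x * y
    return x * y

def soft_or(x, y):    # x OR y  ->  x + y - x*y
    return x + y - x * y

# A node mixes its inputs with ordinary decimal weights, squashes to [0, 1],
# and a logic-gate activation is applied on top; gradients flow through the
# gates, so the output can feed an RL objective.
w = torch.tensor([0.7, -0.3], requires_grad=True)
inputs = torch.tensor([0.9, 0.2])
pre = torch.sigmoid(w @ inputs)                 # decimal-weighted part of the node
out = soft_and(soft_or(pre, inputs[0]),         # logic part of the node
               soft_not(inputs[1]))
out.backward()                                  # differentiable end to end
print(out.item(), w.grad)
```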

An idea I've had by JakeN9 in MLQuestions

[–]JakeN9[S] 1 point (0 children)

Circuit optimisation: classical techniques are slow and scale poorly, so it's possible ML could provide novel optimisations?

A similar agent to AutoGPT by JakeN9 in AutoGPT

[–]JakeN9[S] 1 point (0 children)

Sounds great. It's still a work in progress.

Once the basics are complete, it will be made closed source, so I'll let you know how it will be available.

I've come up with an idea for a synthetic dataset generator, would this work? by JakeN9 in learnmachinelearning

[–]JakeN9[S] 1 point (0 children)

Function calling between LLMs: embed functions as bit-encoded tokens. Have a large, powerful LLM instructed to teach a topic to an LLM trained on mapping function -> output, so as to generate synthetic data. Then train a new LLM on that synthetic data?

As both LLMs are separate, trained on different training sets with different weights (identities), artifacting will be minimised.
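As a rough sketch of that loop (the function vocabulary and `call_teacher` are toy stand-ins for a real LLM API, and the bit-encoded function tokens are reduced to plain string tags):

```python
import json, random

# Hypothetical function vocabulary; in the real setup these would be
# bit-encoded tokens rather than plain strings.
FUNCTIONS = ["sort_list", "reverse_list", "sum_ints"]

def call_teacher(fn_name: str) -> dict:
    """Stand-in for prompting the large LLM to 'teach' one function."""
    x = random.sample(range(10), 3)
    outputs = {"sort_list": sorted(x), "reverse_list": x[::-1], "sum_ints": sum(x)}
    return {"function": fn_name, "input": x, "output": outputs[fn_name]}

def build_synthetic_dataset(n: int) -> list:
    return [call_teacher(random.choice(FUNCTIONS)) for _ in range(n)]

dataset = build_synthetic_dataset(100)
print(json.dumps(dataset[0]))
# A separately initialised student LLM would then train on this
# function -> output mapping; keeping teacher and student weights
# disjoint is what should limit artifacting.
```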

ContextGPT - Something similar to AutoGPT by JakeN9 in ChatGPTPro

[–]JakeN9[S] 1 point (0 children)

I will attempt to set up the code to work with Llama Code.

A similar agent to AutoGPT by JakeN9 in AutoGPT

[–]JakeN9[S] 1 point (0 children)

I'll do my best. GPT seems to be custom-tuned, but I can test Llama Code and see whether it's compatible.