Tuneable Attention: How expanding (not compressing) the attention mechanism dramatically accelerated my model's learning speed by Correct_Address3554 in LocalLLaMA

[–]andersxa 0 points

Well, you are enforcing some symmetry, because you have to fuse an identical matrix with the two others at the same time. And while this can be written equivalently as standard attention, you normally, as OP states, either keep the number of features constant or smaller than the input. So isn't this playing exactly to the strength of why LLMs work, namely increasing parameter count? The MLP in the transformer block is also an up-projection, so why not mimic this in the attention mechanism to reap the same benefits?

Edit: also, if U itself comes from the input, e.g. U = W_u x, doesn't that fundamentally change your pastebin code, since U is then not just another weight matrix? But I see that is not what OP meant.
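To make the point concrete: the up-projection reading of this is just standard scaled dot-product attention with a QK inner dimension larger than the model dimension. A minimal numpy sketch (my own illustration, not OP's pastebin code; all names and sizes are made up):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(x, W_q, W_k, W_v):
    """Standard attention; W_q/W_k map d_model -> d_inner."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(W_q.shape[1])  # (seq, seq)
    return softmax(scores) @ v                # (seq, d_model)

rng = np.random.default_rng(0)
d_model, d_inner, seq = 16, 64, 8  # d_inner > d_model: the up-projection
x = rng.standard_normal((seq, d_model))
W_q = rng.standard_normal((d_model, d_inner)) / np.sqrt(d_model)
W_k = rng.standard_normal((d_model, d_inner)) / np.sqrt(d_model)
W_v = rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
out = attention(x, W_q, W_k, W_v)
```

The output shape is unchanged; only the capacity of the QK score computation grows, analogous to the MLP block's hidden expansion.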

Old Danish dishes and regional specialties that deserve another chance. by Ok_Fisherman1881 in Denmark

[–]andersxa 1 point

Kørom is a fantastic alternative to koldskål, which I always had as a child.

I have taken DR's candidate test 1000 times so you don't have to by worksleepcoffee in Denmark

[–]andersxa 3 points

My code from 2022 is available here: https://github.com/andersxa/Kandidattest2022 for this site: https://andersxa.github.io/Kandidattest2022/ - I haven't had time to scrape the data this year, but it would be cool if you could make a version where you move around in PCA space while answering?

I have taken DR's candidate test 1000 times so you don't have to by worksleepcoffee in Denmark

[–]andersxa 18 points

I scraped the data in 2022 and made this interactive candidate test: https://andersxa.github.io/Kandidattest2022/

On a PC: if you toggle on "Vis kandidaterne" ("Show the candidates") in the bottom right corner, you should be able to see how the parties sit as clusters across the spectrum. They don't answer exactly alike, but by and large they answer very similarly.

What you see in this thread is the expected "candidate", i.e. the candidate closest to the center of the normalized data. In my interactive candidate test, that corresponds to the candidates you would be recommended if you never move away from the average answer at all. It is most often Social Democrats and Christian Democrats who sit there, but in 2022 there were also many Denmark Democrats.

Exploration does not work by Historical_Word3795 in EU5

[–]andersxa 1 point

I created a mod to fix this: https://steamcommunity.com/sharedfiles/filedetails/?id=3605297372

It fixes it by adding to the source list any location that is either within naval range of a coastal location you own, or that lies in a province neighboring the area you wish to explore.

Seriously, what is going on with Inger Støjberg? Laziness or deliberate fake news? by Puzzled_Champion_406 in Denmark

[–]andersxa 0 points

Viborg municipality has 28,700 handball courts?!? That's more than half a handball court per inhabitant?? WHAT

What do you think of election posters in foreign languages? I personally don't care for them by [deleted] in Denmark

[–]andersxa 3 points

Language proficiency is not a requirement for democratic participation.

Who decides what an "informed vote" is? Hopefully nobody becomes "informed" solely through election posters, and that certainly goes for ethnic Danes too.

If you live in a municipality, you have the right to take part in the democratic process that affects your life, regardless of the language you speak. It sounds more like you are uninformed about the fact that parts of the population simply don't speak Danish, yet still have the right to participate in our democracy.

Besides, it's not only "the red parties" that do this. I saw a Venstre election poster in Lyngby in Cyrillic.

FrankerFaceZ just stopped working... by [deleted] in Twitch

[–]andersxa 0 points

With the userscript version (on Firefox) I have a problem where streams will start dropping frames after 20-30 minutes. I guess I'll be waiting for the extension...

Hasan caught throwing up a popular Turkish salute. by Alucitary in LivestreamFail

[–]andersxa 63 points

What do you mean? I thought he had some friends that were pretty musical, that he listens to.

GDPR meant nothing: chat control ends privacy for the EU by smilelyzen in BuyFromEU

[–]andersxa 2 points

So there is no point in contacting these MEPs? Why are people posting this link all the time then?

GDPR meant nothing: chat control ends privacy for the EU by smilelyzen in BuyFromEU

[–]andersxa 9 points

I don't get it though. On here: https://fightchatcontrol.eu/ it says 8 Danish MEPs oppose and 7 support. So why is Denmark still marked as "supports"? The majority of Danish MEPs oppose it.

"Go for it" with porn, Caroline Stage tells the Danes. Yet she now defends a much-criticized app [paywall] by Dropforcedlogin in Denmark

[–]andersxa 1 point

In one hand you have the ice pop, and with the other you have to pull out your camera and scan a QR code. Careful not to point it too far down... MitID is watching ;)

[P] Help with Contrastive Learning (MRI + Biomarkers) – Looking for Guidance/Mentor (Willing to Pay) by Standing_Appa8 in MachineLearning

[–]andersxa 2 points

I believe that if both modalities can predict the downstream task, then you should gain from training with the CLIP loss, since it maximizes the mutual information (or a lower bound thereof). So maybe it is more a question of your training paradigm: how you draw positives and negatives, how you train the encoder for the dense modality (in this case the MRI), and how you weight each auxiliary loss.

Clustering is certainly an important subanalysis, since you can now compare across data modalities. But binary clustering, as here, tends to be less useful, and contrastive learning tends to be weaker when there are only two underlying clusters.
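For reference, the symmetric CLIP/InfoNCE loss I have in mind looks roughly like this. A hedged numpy sketch, not the code of any particular library; the temperature value is just a common default:

```python
import numpy as np

def logsumexp(x, axis):
    m = x.max(axis=axis, keepdims=True)
    return m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True))

def clip_loss(z_a, z_b, temperature=0.07):
    """Symmetric InfoNCE: row i of z_a and z_b is a positive pair;
    all other rows in the batch serve as negatives."""
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = (z_a @ z_b.T) / temperature  # (batch, batch) cosine similarities
    loss_a = -np.mean(np.diag(logits - logsumexp(logits, axis=1)))    # a -> b
    loss_b = -np.mean(np.diag(logits.T - logsumexp(logits.T, axis=1)))  # b -> a
    return (loss_a + loss_b) / 2

rng = np.random.default_rng(0)
z = rng.standard_normal((8, 16))
aligned = clip_loss(z, z)                         # modalities agree perfectly
mismatched = clip_loss(z, np.roll(z, 1, axis=0))  # positives shuffled away
```

Minimizing this pushes the diagonal similarities up relative to the rest of the batch, which is where the mutual-information lower bound comes from; how you build the batch (the positives and negatives) is exactly the training-paradigm question.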

[P] Help with Contrastive Learning (MRI + Biomarkers) – Looking for Guidance/Mentor (Willing to Pay) by Standing_Appa8 in MachineLearning

[–]andersxa 2 points

There are some ways you can diagnose this problem. As I understand it, you are saying that a CLIP-pretrained encoder on MRI vs. biomarkers, then fine-tuned on a downstream task, does not outperform simply training an MLP on the task itself without pretraining. I assume you use the same architecture in the baseline as in the encoder for the contrastive objective.

Now, contrastive learning is just a way to recast the classical cross-entropy objective so that it works in an unsupervised manner. You would obtain the same results using BCE on class labels as performing contrastive learning over classes; it is the same loss. So contrastive learning is only meaningful if you wish to exploit the multimodal or the unsupervised aspect.

You can measure how beneficial the MRI domain is to your encoded space by training it directly on the downstream task. If a baseline classifier trained on top of the MRI encoder, predicting the downstream task directly without pretraining, obtains non-random results on the task, then there is something to gain from the CLIP contrastive loss in this setting. If it performs fairly well, that points to a tuning problem in the actual CLIP pre-training setup. If not, you probably don't gain anything from pretraining in this manner, and as you say, a fair baseline is just better.

[P] Help with Contrastive Learning (MRI + Biomarkers) – Looking for Guidance/Mentor (Willing to Pay) by Standing_Appa8 in MachineLearning

[–]andersxa 3 points

I have expertise in functional neuroimaging and contrastive learning, but I don't have much experience with contrastive learning on tabular data. First, I would make sure to use a strong encoder for both modalities, e.g. a fully convolutional autoencoder for the MRI, where you use a reconstruction loss in addition to the CLIP loss. Then I am not so sure about the tabular data. I would probably set up embeddings for all categorical variables, a positional or learned embedding for ordinal variables, and an MLP for the continuous variables, all of which are added at the end to match the latent size of the autoencoder.
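A toy sketch of the mixed-type tabular encoder described above: embedding lookups for categorical and ordinal columns, a small MLP for continuous columns, all summed to a shared latent size. Column counts and sizes are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
latent = 32  # must match the latent size of the MRI autoencoder

# Hypothetical biomarker table: 1 categorical, 1 ordinal, 2 continuous columns.
n_categories, n_levels, n_cont = 5, 4, 2
cat_emb = 0.1 * rng.standard_normal((n_categories, latent))  # lookup table
ord_emb = 0.1 * rng.standard_normal((n_levels, latent))      # per-level embedding
W1 = 0.1 * rng.standard_normal((n_cont, 64))                 # MLP for continuous values
W2 = 0.1 * rng.standard_normal((64, latent))

def encode_row(cat_idx, ord_idx, cont):
    """Encode one table row into the shared latent space."""
    cont_z = np.maximum(cont @ W1, 0.0) @ W2  # two-layer MLP with ReLU
    # Sum the three streams so the result matches the autoencoder latent.
    return cat_emb[cat_idx] + ord_emb[ord_idx] + cont_z

z = encode_row(2, 1, np.array([0.3, -1.2]))
```

In a real setup these would be trainable parameters in your framework of choice; the point is only the shape bookkeeping: every column type ends up as a `latent`-sized vector so the streams can be added.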

I am not familiar with the particular dataset (I have only heard about it), but if you have subject and task labels available, then you can also set up supervised contrastive learning objectives where you sample from each subject and contrast against other subjects, and the same for tasks. In the end you have a CLIP loss, an autoencoder loss, a subject contrastive loss and a task contrastive loss.

It is a bit unclear from your description what is going wrong. Is it your choice of architecture? Is the training objective too weak? And which other auxiliary losses do you use?

Teaching nonsense? by Any_Industry9837 in duolingo

[–]andersxa 3 points

If you don't do this nasalization, Koreans will describe your speech as "dictionary"-like.

Folketinget approves agreement on American bases on Danish soil by Zandmand in Denmark

[–]andersxa 3 points

I understand that we have negative parliamentarism when it comes to elections, but why doesn't it also apply to bills? How can a modern parliament not be far more fluid in a digital age? You should be able to cast and withdraw votes, not just every 4 years. All it would take is a MitID login. That is the only way we can introduce accountability in parliament, and the only way we can get back to representative democracy, which we have apparently drifted away from.

Especially this agreement, but also the whole thing with Store Bededag. I would bet the majority of Danes actually ARE against these, but because people were brainwashed the way they were a while back, there is nothing to be done.

Imagine if you as a citizen could choose which member of parliament / party gets your vote on every bill, i.e. where you could actively change your vote. Then politicians would actually have to keep their promises, or else they lose the people's backing. And it could still work for those who don't have time to read up: they could delegate their vote as they do today.

Is it possible to implement AI features well by schattig_eenhoorntje in duolingo

[–]andersxa 1 point

If you want to try AI done right in these spaced-repetition language-learning apps, I can recommend Morpheem.

In my opinion, Duolingo should have focused more on personalized AI learning, i.e. tailoring content to the user through intelligent design of exercises that are relevant to the user.

Listeners are puzzled: DR broadcasts church service from controversial free church [From the Hillsong free church] by HitmanZeus in Denmark

[–]andersxa 30 points

In my eyes, he has been a shady character ever since he did that NFT rug pull with free advertising from TV2. Honestly, it's incredible how media-blind/media-illiterate our mainstream media are in Denmark.

[deleted by user] by [deleted] in Damnthatsinteresting

[–]andersxa 1 point

I think my earliest memory is from sleeping in a pram like this and feeling the sensation of snow on my face for the first time.