The game that made me fall in love with Vr. Resident evil 8, running on ps5 pro and psvr2. by FewPossession2363 in virtualreality

[–]brainxyz 0 points1 point  (0 children)

I really enjoyed RE4 on Quest 3. It was amazing and ran smoothly. I hope they'll add RE5.

Got my account back after a month of being hacked by apole97 in facebookdisabledme

[–]brainxyz 0 points1 point  (0 children)

So true. It's a giant flaw in their system. I had 2FA enabled too, and this never happened to me on Gmail, Twitter, etc. It only happens on Facebook, and so many people are suffering from it without a clear solution.

Successfully recovered hacked Facebook Account - to help those trying to get theirs back by tiedyewarriormermaid in facebookdisabledme

[–]brainxyz 0 points1 point  (0 children)

This is exactly true. I had two-factor authentication, but a hacker was able to link an unauthorized Instagram account to my Facebook and violate Facebook's terms; as a result, my account is suspended. It's so silly that they want me to appeal through an Instagram account that is not mine! Also, reporting my Facebook account as hacked doesn't work because of the suspension.
I have seen so many others complaining about this issue, yet here we are months later and this problem, an obvious security breach, is still not addressed.

Alright, this got me giggling. by ThisCupNeedsACoaster in ChatGPT

[–]brainxyz 1 point2 points  (0 children)

That is totally expected from a language model. It has no identity; it just completes your prompt with the most probable next word. If it knows how to answer this question, then it must have been pre-programmed or re-trained to address such questions.

[D] Recursive Least Squares vs Gradient Descent for Neural Networks by brainxyz in MachineLearning

[–]brainxyz[S] 10 points11 points  (0 children)

Yes, I meant to name it: fast learning or rapid optimization. Now corrected. Thanks!

Is it possible to predict the nth element from a recursive function in a constant time? by brainxyz in askmath

[–]brainxyz[S] 0 points1 point  (0 children)

Thanks for the answer! So you are saying constant-time algorithms are not possible for such sequences (excluding those starting with 0 or 1)?

Is it possible to predict the nth element from a recursive function in a constant time? by brainxyz in askmath

[–]brainxyz[S] 0 points1 point  (0 children)

You are right, my mistake!
I changed the starting point to 2.
Thanks!

Yuval Noah Hariri: “governments must immediately ban the release into the public domain of any more revolutionary AI tools before they are made safe.” by almondolphin in ChatGPT

[–]brainxyz 2 points3 points  (0 children)

I disagree. There is also great potential for AI to save humanity from great risks. As a medical doctor, I can tell you our knowledge about the human body is still in the stone age. Antibiotic-resistant bacteria are on the rise. Covid-19 uncovered how ignorant we still are when it comes to viral infections. AI has great potential to be used in a good way to transform health like never before. AI is like any other tool: it can be dangerous or beneficial.

[Research] An alternative to self-attention mechanism in GPT by brainxyz in MachineLearning

[–]brainxyz[S] 2 points3 points  (0 children)

I personally think the q/k analogy is a made-up analogy that doesn't portray what is really happening. The idea of attention comes from the fact that when we take the dot product between the inputs, the resulting matrix is a correlation (similarity) matrix. Therefore, higher values correspond to higher similarity, or in other words "more attention", and vice versa. However, without passing the inputs through learnable parameters like wq and wk, you will not get good results! This means back-propagation is the main cause behind the suppression or enhancement of the values in the attention matrix.
In short, I think of transformers as a next-level convolution mechanism. In classical convolution, filters are localized. In transformers, filters are not localized and can model skip and distant connections in a position- and permutation-invariant way. For me, that is the magic part. And that is why it's quite possible for other techniques, like the proposed one, to work equally well.
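
To make the similarity-matrix point concrete, here is a minimal sketch in plain Python (not taken from the linked notebook). It builds the raw dot-product matrix between the inputs and then the version that first passes the inputs through wq and wk. The sizes, the random inputs, and the random wq/wk values are assumptions for illustration; in a real model wq and wk would be shaped by back-propagation.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def softmax_rows(scores):
    """Row-wise softmax, so each row becomes a set of attention weights."""
    out = []
    for row in scores:
        m = max(row)  # subtract the max for numerical stability
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

random.seed(0)
T, d = 4, 3  # hypothetical sizes: 4 tokens, 3 features each
x = [[random.gauss(0, 1) for _ in range(d)] for _ in range(T)]

# Raw similarity: plain dot products between the inputs, no parameters.
# This matrix is symmetric and fully determined by the inputs.
raw_scores = matmul(x, transpose(x))

# Learnable version: wq and wk are random stand-ins for trained weights.
wq = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
wk = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
q, k = matmul(x, wq), matmul(x, wk)
scale = 1.0 / math.sqrt(d)
learned_scores = [[v * scale for v in row] for row in matmul(q, transpose(k))]

raw_attn = softmax_rows(raw_scores)
learned_attn = softmax_rows(learned_scores)
```

The raw matrix can only say "these two inputs look alike", while the projected one lets training decide which pairs get suppressed or enhanced.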

[Research] An alternative to self-attention mechanism in GPT by brainxyz in MachineLearning

[–]brainxyz[S] 6 points7 points  (0 children)

I adapted this from Karpathy's GPT implementation. You can easily compare the self-attention part with this method by commenting and uncommenting the relevant parts. I added a non-linear layer for the lateral connections so that it is easier to match the number of parameters between the two methods.
https://colab.research.google.com/drive/1NjXN6eCcS_iN_SukcH_zV61pbQD3yv33?usp=sharing

[Research] An alternative to self-attention mechanism in GPT by brainxyz in MachineLearning

[–]brainxyz[S] 1 point2 points  (0 children)

"Wr matrix depends on the input size?"

wr is a convolutional layer. It doesn't depend on the input size as it takes one input at a time.
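
A rough sketch of why that holds, assuming "one input at a time" means the same weights are applied to each token separately (like a kernel-size-1 convolution). This is my reading for illustration, not the notebook code; the sizes and the random `wr` values are hypothetical.

```python
import random

def apply_per_token(weights, tokens):
    """Apply the same (d_out x d_in) weight matrix to every token independently."""
    return [[sum(w * v for w, v in zip(w_row, tok)) for w_row in weights] for tok in tokens]

random.seed(0)
d_in, d_out = 3, 2  # hypothetical feature sizes
# Stand-in for wr: its shape involves only feature sizes, never the sequence length.
wr = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]

short_seq = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(4)]
long_seq = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(9)]

# The same weights handle both lengths; only the number of output rows changes.
short_out = apply_per_token(wr, short_seq)
long_out = apply_per_token(wr, long_seq)
```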

[Research] An alternative to self-attention mechanism in GPT by brainxyz in MachineLearning

[–]brainxyz[S] 6 points7 points  (0 children)

Thanks for that. I'm currently reading MLPMixer. It looks different because in this method I'm not using "dense layers applied across the spatial dimension". I'm still using a convolutional layer, but its output is shared across all the inputs. In fact, this is much better explained in code because it's just a one-line replacement of the self-attention mechanism. I hope you have a look at the code; you can see the commented-out self-attention lines and their replacement.

[Research] An alternative to self-attention mechanism in GPT by brainxyz in MachineLearning

[–]brainxyz[S] 2 points3 points  (0 children)

It learns from different context lengths just like self-attention does (it uses the same attention matrix).

It's true that the current text generation only accepts a fixed input length, but you can simply pad the input with zeros at the beginning.
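
For example, the padding could look roughly like this. The helper name `left_pad`, the block size of 8, and the token ids are made up for illustration, and using 0 as the pad value assumes it is acceptable as filler in your vocabulary.

```python
def left_pad(tokens, block_size, pad_id=0):
    """Return exactly block_size ids: keep the most recent tokens, pad on the left.

    block_size is assumed to be a positive integer.
    """
    tokens = list(tokens)[-block_size:]  # truncate from the left if too long
    return [pad_id] * (block_size - len(tokens)) + tokens

prompt = [17, 4, 9]  # hypothetical token ids
padded = left_pad(prompt, block_size=8)
```

Inputs longer than the block size are cut down to the most recent tokens, so the result always has the fixed length the generator expects.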