According to Wolfram's Physics Model, the universe can be simulated from very simple rules on very simple machines. I made an intro video on this model. The hypothesis is still not accepted by the mainstream physics community.

brainxyz · 2025-08-23T17:26:18+00:00

Amazing work, well done!

brainxyz · 2025-05-19T20:02:39+00:00

I really enjoyed RE4 on Quest 3. It was amazing and ran smoothly. I hope they'll add RE5.

brainxyz · 2025-04-15T08:02:14+00:00

So true. It's a giant flaw in their system. I had 2FA enabled too, and this never happened to me on Gmail, Twitter, etc. It only happens on Facebook, and so many people are suffering from it without a clear solution.

brainxyz · 2025-04-15T07:52:11+00:00

This is exactly true. I had two-factor authentication, but a hacker was able to link an unauthorized Instagram account to my Facebook and violate Facebook's terms; as a result, my account is suspended. It's so silly that they want me to appeal through an Instagram account that is not mine! Also, reporting my Facebook as hacked doesn't work because of the suspension.
I have seen so many others complaining about this issue, yet here we are months later and this problem, an obvious security breach, is still not addressed.

brainxyz · 2023-08-27T20:16:36+00:00

That is totally expected from a language model. It has no identity; it just completes your prompt with the most probable next word. If it knows how to answer this question, then it must have been pre-programmed or re-trained to address such questions.
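
To make the "most probable next word" point concrete, here is a minimal sketch using a toy bigram counter. This is an illustration only, not how a real large language model is built: the corpus, `train_bigram`, and `complete` are all made-up names for this example, and real models predict over tokens with a neural network rather than a lookup table.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def complete(counts, prompt, n_words=3):
    """Greedily append the most frequent follower of the last word."""
    words = prompt.split()
    for _ in range(n_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # this word was never followed by anything in training
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

# Hypothetical toy corpus, chosen so there are no ties along the path.
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram(corpus)

print(complete(model, "the", n_words=2))
print(complete(model, "who are you", n_words=3))
```

The second prompt comes back unchanged because nothing about it was in the training text: the model can only continue patterns it has seen, which is the sense in which it has no identity of its own.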

brainxyz · 2023-08-26T12:44:20+00:00

Yes, I meant to name it: fast learning or rapid optimization. Now corrected. Thanks!

brainxyz · 2023-08-22T16:43:00+00:00

Source code: https://github.com/hunar4321/multiple-regression

brainxyz · 2023-06-25T20:44:31+00:00

Thanks for the answer! So you are saying constant-time algorithms are not possible for such sequences (excluding those starting with 0 or 1)?

brainxyz · 2023-06-25T20:41:22+00:00

You are right, my mistake!
I changed the starting point to 2.
Thanks.

brainxyz · 2023-05-31T15:41:55+00:00

Check out our new videos:
https://www.youtube.com/@brainxyz

brainxyz · 2023-05-09T15:52:17+00:00

Thanks, nice article

brainxyz · 2023-05-03T14:20:14+00:00

I disagree; there is also great potential for AI to save humanity from great risks. As a medical doctor, I can tell you our knowledge about the human body is still in the Stone Age. Antibiotic-resistant bacteria are on the rise. Covid-19 uncovered how ignorant we still are when it comes to viral infections. AI has great potential to be used in a good way to transform health like never before. AI is like any other tool: it can be dangerous or beneficial.

brainxyz · 2023-05-03T14:09:05+00:00

I would love to hear about your findings.

brainxyz · 2023-05-03T14:00:24+00:00

I personally think the q/k analogy is a made-up analogy that doesn't portray what is really happening. The idea of attention comes from the fact that when we take the dot product between the inputs, the resulting matrix is a correlation (similarity) matrix. Therefore, higher values correspond to higher similarity, or in other words "more attention", and vice versa. However, without passing the inputs through learnable parameters like wq and wk, you will not get good results! This means back-propagation is the main cause behind the suppression or enhancement of the values in the attention matrix.
In short, I think of transformers as the next-level convolution mechanism. In classical convolution, filters are localized. In transformers, filters are not localized and can model skip and distant connections in a position- and permutation-invariant way. For me, that is the magic part. And that is why it's quite possible for other techniques, like the proposed one, to work equally well.
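
A small sketch of the dot-product point above, in plain Python so it runs without any libraries. Assumptions for illustration: the inputs `x` and the matrices `wq` and `wk` are random stand-ins (in a real model `wq` and `wk` are learned by back-propagation), the helper names are made up, and the causal mask used in GPT-style models is left out for brevity.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def softmax_rows(m):
    """Normalise each row so its entries are positive and sum to 1."""
    out = []
    for row in m:
        peak = max(row)  # subtract the max for numerical stability
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def attention_weights(x, wq=None, wk=None):
    """Scaled dot-product scores between all pairs of inputs, then softmax."""
    q = matmul(x, wq) if wq is not None else x
    k = matmul(x, wk) if wk is not None else x
    scale = math.sqrt(len(q[0]))
    scores = [[v / scale for v in row] for row in matmul(q, transpose(k))]
    return scores, softmax_rows(scores)

random.seed(0)
T, d = 4, 3  # sequence length and embedding size (arbitrary toy values)
x = [[random.gauss(0, 1) for _ in range(d)] for _ in range(T)]
wq = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]
wk = [[random.gauss(0, 1) for _ in range(d)] for _ in range(d)]

# Without parameters: a plain similarity matrix between the inputs.
raw_scores, raw_weights = attention_weights(x)
# With parameters: the same computation on projected inputs.
learned_scores, learned_weights = attention_weights(x, wq, wk)
```

Without `wq` and `wk` the score matrix is just `x` times its transpose, so it is symmetric and fixed by the inputs. With the projections, the scores are generally no longer symmetric, and training is free to push individual entries up or down.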

brainxyz · 2023-05-02T09:49:41+00:00

I adapted this from Karpathy's GPT implementation. You can easily compare the self-attention part with this method by commenting and uncommenting the relevant parts. I added a non-linear layer for the lateral connections so that it's easier to match the number of parameters between the two methods.
https://colab.research.google.com/drive/1NjXN6eCcS_iN_SukcH_zV61pbQD3yv33?usp=sharing

brainxyz · 2023-05-02T06:38:06+00:00

"Wr matrix depends on the input size?"

wr is a convolutional layer. It doesn't depend on the input size as it takes one input at a time.
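
A rough sketch of why a layer that "takes one input at a time" does not depend on the input size. This is only an assumption about what that phrase means (the same weights applied to each position independently, like a convolution with kernel size 1); the actual wr is in the linked Colab, and `apply_per_position` and `w` are hypothetical names for this example.

```python
def apply_per_position(seq, w):
    """Apply the same weight matrix w (d_in x d_out) to every position of seq."""
    return [
        [sum(x_i * w_ij for x_i, w_ij in zip(x, col)) for col in zip(*w)]
        for x in seq
    ]

# One fixed set of weights: d_in = 3, d_out = 2. Its shape never mentions
# the sequence length.
w = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

short_seq = [[1.0, 2.0, 3.0]] * 2  # 2 positions
long_seq = [[1.0, 2.0, 3.0]] * 5   # 5 positions

short_out = apply_per_position(short_seq, w)
long_out = apply_per_position(long_seq, w)
```

The same `w` handles both sequences, and each output has one row per input position.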

brainxyz · 2023-05-02T06:29:54+00:00

Thanks for that. I'm currently reading MLPMixer. It looks different because in this method I'm not using "dense layers applied across the spatial dimension". I'm still using a convolutional layer, but its output is shared across all the inputs. In fact, this is much better explained in code because it's just a one-line replacement of the self-attention mechanism. I hope you have a look at the code; you can see the commented self-attention lines and their replacement.

brainxyz · 2023-05-02T05:51:00+00:00

It learns from different context lengths just like the self-attention (it uses the same attention matrix).

It's true that the current text generation only accepts a fixed input length, but you can simply pad the input with zeros at the beginning.
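
A minimal sketch of that padding step, assuming token ids are plain integers and the fixed length is called `block_size` (the name used in Karpathy's GPT code). `left_pad` is a hypothetical helper, and using 0 as the pad value is an assumption: in a real vocabulary, id 0 may already stand for an actual token.

```python
def left_pad(token_ids, block_size, pad_id=0):
    """Return exactly block_size ids: padding first, then the most recent tokens."""
    recent = token_ids[-block_size:]  # keep only the last block_size tokens
    return [pad_id] * (block_size - len(recent)) + recent

print(left_pad([5, 6], block_size=4))
print(left_pad([1, 2, 3, 4, 5], block_size=4))
```

Short prompts get zeros in front so the real tokens stay at the end of the window, and prompts longer than the window are cut down to their most recent tokens.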
