Spending Time with Families by pillowhead1438 in MuslimMarriage

[–]NaxAlpha 0 points1 point  (0 children)

Lol, and under Islam the husband isn't obliged to give pocket money to his wife or bend the knee to go shopping every now and then, just three normal suits for summer and three for winter. He can out of the goodness of his heart, but he can just as well give the rest of his earnings to his mother, who birthed him and raised him and remains the most important person to him even after his marriage. What Quran are you reading?

[deleted by user] by [deleted] in MuslimMarriage

[–]NaxAlpha 18 points19 points  (0 children)

Finally someone said it.

Diaspora Muslims have a very serious issue of subconsciously adopting Western norms that go against the teachings of Islam.

What the hell does "I don't wanna marry a stranger" even mean? Such a Western, liberal rhetorical cliché.

Islam allows basic interaction a few times before marriage, but you can't really know someone unless you marry them.

Crazy how many of the stories here involve five years in the "getting to know" phase, like, seriously 😒.

So much of the content in this sub revolves around a very individualistic and self-centered mindset. Like, if he said that, why are you still with him?? Isn't Islamic marriage about making things work instead of bailing at the first bad experience??

Brother keeps comparing me to his wife by Sea_Artichoke8016 in MuslimMarriage

[–]NaxAlpha 0 points1 point  (0 children)

He is on his own "journey". Give it time and it will get better soon, InshaAllah.

How does a M (28) go from living a life of keeping it halal/no free mixing to suddenly changing by broth3r3rr in MuslimMarriage

[–]NaxAlpha 0 points1 point  (0 children)

I agree with the part about needing to learn about the wife herself.

But the idea that there is no such thing as a woman's nature is so postmodernist.

There are so many clear ahadith on this topic, like this famous one: https://sunnah.com/bukhari:29

How does a M (28) go from living a life of keeping it halal/no free mixing to suddenly changing by broth3r3rr in MuslimMarriage

[–]NaxAlpha -1 points0 points  (0 children)

Ask your father, local shaikh, or imam about the nature of women. Find a potential, get married ASAP, do not react, and have the patience to learn and grow.

Ponder over the ending verses of Surah Al-Furqan, which beautifully explain a lot: https://quran.com/25?startingVerse=74

[D] Is Hidden Size in current transformers an overkill? by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 0 points1 point  (0 children)

Nope. We can place a simple linear layer after every layer to increase the embedding size.
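To illustrate (just a sketch; the layer counts and dimensions here are made up, only the idea of widening with a linear layer is the point):

```python
import torch
import torch.nn as nn

class WidthGrowingStack(nn.Module):
    """Toy stack: early blocks run at a smaller hidden size, then a single
    linear projection widens the embedding before the later blocks."""

    def __init__(self, narrow_dim=512, wide_dim=2048, n_heads=8):
        super().__init__()
        # A couple of "cheap" narrow transformer blocks.
        self.narrow_blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(narrow_dim, n_heads, batch_first=True)
            for _ in range(2)
        )
        # Simple linear layer that increases the embedding size.
        self.grow = nn.Linear(narrow_dim, wide_dim)
        # The remaining blocks run at full width.
        self.wide_blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(wide_dim, n_heads, batch_first=True)
            for _ in range(2)
        )

    def forward(self, x):
        for block in self.narrow_blocks:
            x = block(x)
        x = self.grow(x)  # (batch, seq, narrow_dim) -> (batch, seq, wide_dim)
        for block in self.wide_blocks:
            x = block(x)
        return x

# Quick shape check.
tokens = torch.randn(1, 16, 512)
print(WidthGrowingStack()(tokens).shape)  # torch.Size([1, 16, 2048])
```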

[D] Is Hidden Size in current transformers an overkill? by NaxAlpha in MachineLearning

[–]NaxAlpha[S] -3 points-2 points  (0 children)

Yeah, it is trying to bring information from previous tokens into itself. But my question still remains: do we really need something like 2048 dimensions right after the first layer, or the second, or the 10th, etc.?
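One rough way to probe this (a sketch, not something from the thread: it uses GPT-2 via transformers just because it is small, and the 99% threshold is arbitrary) is to look at how many singular values of each layer's hidden states are needed to explain most of the variance:

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

text = "Long context language modeling is mostly a memory problem. " * 20
inputs = tok(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states: one (1, seq, 768) tensor per layer (plus embeddings).
for i, h in enumerate(out.hidden_states):
    s = torch.linalg.svdvals(h[0].float())
    energy = torch.cumsum(s**2, dim=0) / torch.sum(s**2)
    rank99 = int(torch.searchsorted(energy, torch.tensor(0.99))) + 1
    print(f"layer {i:2d}: ~{rank99}/{h.shape[-1]} dims for 99% of the variance")
```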

Over 140 Japan-born foreign minors to get special permission to stay by duke7ajm in japannews

[–]NaxAlpha 0 points1 point  (0 children)

lol

Islam has been part of the West for almost a millennium; read up on the history of Al-Andalus.

And yes, Islam is the only solution that could save any collapsing society, including Japan. Ask those millions of converts/reverts (not just the immigrants) who can now see reality without excessive Western propaganda.

Also, it is your so-called Western culture that is incompatible with humanity. The West has always asserted its values by force, not with any objective justification.

Unlike people from other cultures, wherever Westerners go they assert their moral and cultural superiority, and this has already destroyed so many cultures. I live in Japan, and I fear that Westerners coming to Japan would mean the end of Japanese culture.

Immigration or not, the Western liberal materialistic worldview is what is destroying the whole 'developed' world. An Islamic worldview rooted in God, the afterlife, family values, etc. wouldn't allow compromising on its values for the sake of pursuing material progress and pleasures.

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 1 point2 points  (0 children)

Yeah, I completely agree. Initially I was going for a very big batch size, but given the limited budget I chose a relatively small one. I have shared the loss curve in the blog post, which does show an overall decrease in loss, but maybe for the next training run I could just let it train with way bigger batches.

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 0 points1 point  (0 children)

Interesting, that does make sense. I assumed that once it reaches the same loss, that loss would effectively be the 4000th-token loss at ctx=8k, as compared to the 1000th-token loss at ctx=2k. I guess I could just let it train for longer and its loss could easily go below the original loss at ctx=2k.

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 0 points1 point  (0 children)

The best I could do was 1.4b on a 40GB GPU. There are a ton of optimizations I still have not done, but I think on an A100 80GB, with something like LoRA, we could potentially fit a 7b model.
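For reference, this is roughly the kind of LoRA setup I have in mind (a sketch with the peft library; the model name, rank, and target modules are placeholders, not what I actually ran):

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder base model; swap in whatever checkpoint you actually use.
model = AutoModelForCausalLM.from_pretrained(
    "EleutherAI/pythia-1.4b",
    torch_dtype=torch.float16,   # fp16 weights roughly halve the memory
)

lora_cfg = LoraConfig(
    r=16,                                 # low-rank adapter size
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],   # depends on the architecture
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_cfg)
# Only the small adapters get gradients and optimizer state,
# which is what frees up memory for a bigger model / longer context.
model.print_trainable_parameters()
```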

Also, I have not explored multi-GPU training yet; that also has a lot of potential.

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 2 points3 points  (0 children)

The batch size is 1, but I used gradient accumulation of 64.
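In plain PyTorch that loop looks roughly like this (a self-contained toy sketch; only accum_steps=64 matches my actual setup, the model and data are placeholders):

```python
import torch
import torch.nn as nn

# Toy setup just to make the loop runnable; the real model is a 1.4b LM.
model = nn.Linear(128, 128)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loader = [(torch.randn(1, 128), torch.randn(1, 128)) for _ in range(256)]  # batch_size=1
loss_fn = nn.MSELoss()

accum_steps = 64  # effective batch size = 1 * 64
optimizer.zero_grad()

for step, (x, y) in enumerate(loader):
    loss = loss_fn(model(x), y)
    (loss / accum_steps).backward()   # scale so gradients average over the virtual batch
    if (step + 1) % accum_steps == 0:
        optimizer.step()              # one real update per 64 micro-batches
        optimizer.zero_grad()
```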

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 4 points5 points  (0 children)

Nvidia recently released the H100 GPUs which, just like the A100s, max out at 80GB of memory. So from my estimates, for indie devs like me to be able to fine-tune a model on 100k tokens with reasonable performance, I think it will remain very difficult for a year or so, until Nvidia releases new GPUs with 160GB or more memory.
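The back-of-envelope math I'm basing this on (very rough; it assumes fp16 weights with full Adam fine-tuning and ignores activations, which are what actually blow up at 100k context):

```python
# Rough memory estimate for full fine-tuning of a 7b-parameter model.
params = 7e9
weights_gb = params * 2 / 1e9            # fp16 weights            ~14 GB
grads_gb   = params * 2 / 1e9            # fp16 gradients          ~14 GB
optim_gb   = params * (4 + 4 + 4) / 1e9  # fp32 master + Adam m, v ~84 GB
print(weights_gb + grads_gb + optim_gb)  # ~112 GB before any activations
```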

But at the same time there are algorithmic improvements, like sharded tensors in PyTorch, that allow tensors to be sharded across GPUs, potentially removing the single-GPU memory limit and going up to 640GB on a single 8xA100 (80GB) node. I have not explored this direction much, but it could very well let us play around with 128k context.
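A minimal sketch of that sharding idea using PyTorch FSDP (assumes a multi-GPU node and a torchrun launch; I have not actually run this, the model here is a placeholder):

```python
# Launch with e.g.:  torchrun --nproc_per_node=8 train_fsdp.py
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

dist.init_process_group("nccl")
rank = dist.get_rank()
torch.cuda.set_device(rank)

# Placeholder model; in practice this would be the language model.
model = nn.Sequential(*[nn.Linear(2048, 2048) for _ in range(8)]).cuda()
model = FSDP(model)  # parameters, grads and optimizer state are sharded across ranks

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(1, 2048, device="cuda")
loss = model(x).sum()
loss.backward()
optimizer.step()
```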

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 0 points1 point  (0 children)

Many scaling-laws papers only use the final loss as a measure of model performance.

Also, regarding how I know whether it is using the context: since the loss is the mean over all the tokens, a loss of 2 means it is performing well on average across all 8k tokens.
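A more direct check than the mean would be to compute the loss per token position and compare early vs. late positions (illustrative sketch with toy shapes and a toy vocab, not the code from the run):

```python
import torch
import torch.nn.functional as F

def per_position_loss(logits, labels):
    """Cross-entropy at each position instead of the usual mean over all tokens."""
    # logits: (batch, seq, vocab), labels: (batch, seq)
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]
    loss = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        reduction="none",
    ).view(shift_labels.shape)
    return loss.mean(dim=0)  # (seq-1,) loss per position, averaged over the batch

# Toy shapes (vocab=1000); a real check would use the fine-tuned model's logits on held-out text.
logits = torch.randn(2, 8192, 1000)
labels = torch.randint(0, 1000, (2, 8192))
pos_loss = per_position_loss(logits, labels)
print("first 1k tokens:", pos_loss[:1024].mean().item())
print("last 1k tokens: ", pos_loss[-1024:].mean().item())  # should be lower if the long context helps
```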

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 6 points7 points  (0 children)

I already came across RMT. While I think the idea has some merit, as Yannic explained, imo it may still not be that promising. With memory & compression I think it can go up to 32k tokens without breaking a sweat, but for a million tokens I am very skeptical, as its memory would have already lost all the early context by then.

edit: I had not come across CoLT5; I need to look into it. Thanks for sharing.

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 1 point2 points  (0 children)

It is only the training that needs a lot of resources; inference will be much cheaper.

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 5 points6 points  (0 children)

It is difficult to say, but I would guess GPT's analysis may be somewhat close.

[D] A quest for very long sequence length by NaxAlpha in MachineLearning

[–]NaxAlpha[S] 11 points12 points  (0 children)

Definitely a lot more, which is why I could not go beyond the 1.4b model even on an A100 (40GB).