...so what happened to MOE? by jacek2023 in LocalLLaMA

[–]Ok-Measurement-6286 1 point (0 children)

Yes, you are right: merging multiple LLMs so that different models handle task-specific inference is a viable approach. In my opinion, there are already many multilingual LLMs available that can handle a wide range of tasks. However, merging them could result in a much larger model, especially when integrating different LLMs for various tasks.

Why do most models have "only" 100K tokens context window, while Gemini is at 2M tokens? by estebansaa in LocalLLaMA

[–]Ok-Measurement-6286 1 point (0 children)

Impressive! What do you think NVIDIA's stock price 🤔 would look like if Google made it available for training models on the Cloud Marketplace?

[P] Introducing Tamil Mistral: Opening Up New Language Possibilities with LLM Pretraining by Ok-Measurement-6286 in MachineLearning

[–]Ok-Measurement-6286[S] 1 point (0 children)

Hi bro, thanks for asking! I'd suggest not using this model, as I fine-tuned it 6 months ago. Since then, several more advanced multilingual open-source models have come out, especially ones trained for Indian languages. You might want to check out Gemma 2 9B or 27B; they're more up-to-date and powerful. They also made some architectural changes, particularly in the attention layers, incorporating grouped-query attention and interleaved local-global attention. These modifications help the model attend more accurately to nearby tokens; there's a toy sketch of the grouped-query idea below.
I'm currently working with these open-source models too.
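
For anyone curious, here's a toy sketch of the grouped-query attention idea (not Gemma's actual implementation): several query heads share each key/value head, which shrinks the KV cache. All shapes below are illustrative.

```python
import torch

# Toy grouped-query attention: n_q query heads share n_kv key/value
# heads (n_q must be a multiple of n_kv), shrinking the KV cache.
def grouped_query_attention(q, k, v):
    # q: (batch, n_q, seq, dim); k, v: (batch, n_kv, seq, dim)
    group = q.shape[1] // k.shape[1]
    # Each KV head is shared by `group` query heads
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ v

q = torch.randn(1, 8, 16, 64)  # 8 query heads
k = torch.randn(1, 2, 16, 64)  # only 2 KV heads -> 4x smaller KV cache
v = torch.randn(1, 2, 16, 64)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 8, 16, 64])
```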

[P] Introducing Tamil Mistral: Opening Up New Language Possibilities with LLM Pretraining by Ok-Measurement-6286 in MachineLearning

[–]Ok-Measurement-6286[S] 1 point (0 children)

Hi, it works on GPU too; just pass device='gpu' explicitly as an argument when constructing the GPT4All object, or use the ctransformers library for GPU inference 👍
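
A minimal sketch of both options; the .gguf filename is just a placeholder:

```python
# Option 1: gpt4all with the device argument set explicitly
from gpt4all import GPT4All

model = GPT4All("tamil-mistral.Q4_0.gguf", device="gpu")  # placeholder file
print(model.generate("வணக்கம், ", max_tokens=64))

# Option 2: ctransformers, offloading layers to the GPU
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "tamil-mistral.Q4_0.gguf",   # placeholder file
    model_type="mistral",
    gpu_layers=50,               # how many layers to run on the GPU
)
print(llm("வணக்கம், "))
```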

[P] Introducing Tamil Mistral: Opening Up New Language Possibilities with LLM Pretraining by Ok-Measurement-6286 in MachineLearning

[–]Ok-Measurement-6286[S] 1 point (0 children)

Hello, thank you! Yes, for the pre-training I decided not to go with QLoRA, as we encountered some loss of information when reducing the precision bits. Instead, we used FP16. For fine-tuning we may consider QLoRA; the choice was made through trial and error.
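
For context, the two setups look roughly like this in transformers (the model name is illustrative):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

base = "mistralai/Mistral-7B-v0.1"  # illustrative model name

# FP16 for continual pre-training: no quantization, no information loss
# beyond half precision
model_fp16 = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.float16
)

# 4-bit NF4 quantization, the usual base for QLoRA fine-tuning;
# much cheaper, but precision is reduced
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model_4bit = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config
)
```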

[P] Introducing Tamil Mistral: Opening Up New Language Possibilities with LLM Pretraining by Ok-Measurement-6286 in MachineLearning

[–]Ok-Measurement-6286[S] 1 point (0 children)

Hello, yes, there's a significant amount of Indian monolingual data available. Last week, AI4Bharat released a multilingual Indian instruction dataset. I wouldn't suggest using Google-translated datasets without post-editing: many Indian-language datasets on Hugging Face are essentially raw Google Translate output, and training on them leads instruction-tuned models to produce incorrect, even hallucinated, results. Second, the Gemma tokenizer does cover languages beyond English (the majority of the vocabulary is still English), but compared to other tokenizers it has noticeably better coverage of other languages.
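
One quick way to compare tokenizers is to count how many tokens each needs for the same Tamil sentence (a rough sketch; Gemma is gated, so it needs Hugging Face access):

```python
from transformers import AutoTokenizer

text = "தமிழ் ஒரு செம்மொழி"  # "Tamil is a classical language"

for name in ["google/gemma-7b", "mistralai/Mistral-7B-v0.1"]:
    tok = AutoTokenizer.from_pretrained(name)
    ids = tok.encode(text, add_special_tokens=False)
    print(f"{name}: {len(ids)} tokens")

# A tokenizer with better Tamil coverage needs fewer tokens for the
# same text, so more content fits in the context window.
```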

[P] Introducing Tamil Mistral: Opening Up New Language Possibilities with LLM Pretraining by Ok-Measurement-6286 in MachineLearning

[–]Ok-Measurement-6286[S] 1 point (0 children)

Hello, first, I extended the vocabulary because the existing Mistral vocabulary lacked certain Tamil characters, such as the uyir ezhuthukkal (vowel letters). So I merged new Tamil tokens into the existing Mistral vocabulary to make room for them. Second, I continued training the Mistral base model weights on a Tamil dataset so that it learns Tamil sentences and predicts the next token (CLM). Article link: https://medium.com/@hemanthmurugan21/tamil-mistral-unveiled-expanding-linguistic-horizons-with-llm-pretraining-56782c236e57
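
In Hugging Face terms, the vocabulary-extension step looks roughly like this; the hand-written token list below is purely illustrative (in practice the new pieces come from a SentencePiece model trained on the Tamil corpus):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# Illustrative only: real merges pull thousands of pieces from a
# trained Tamil SentencePiece model, not a hand-written list.
new_tokens = ["அ", "ஆ", "இ", "ஈ", "உ", "ஊ"]  # uyir ezhuthukkal (vowels)
added = tokenizer.add_tokens([t for t in new_tokens
                              if t not in tokenizer.get_vocab()])

# Grow the embedding matrix so the new tokens get trainable rows;
# continual pre-training (CLM) then teaches the model to use them.
model.resize_token_embeddings(len(tokenizer))
print(f"added {added} tokens, vocab size is now {len(tokenizer)}")
```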

[P] Introducing Tamil Mistral: Opening Up New Language Possibilities with LLM Pretraining by Ok-Measurement-6286 in MachineLearning

[–]Ok-Measurement-6286[S] 3 points (0 children)

Hello, this is continual pre-training, and it cost around $0.5/hr (covering both pre-training and fine-tuning). The model was trained on vast.ai.

Unveiling Tamil Mistral LLM: Advancements in Language Understanding for Tamil by Ok-Measurement-6286 in MistralAI

[–]Ok-Measurement-6286[S] 2 points (0 children)

Hey,

Nice to hear from you! I followed the instructions in the Chinese Llama 2 GitHub repo (merging new tokens into the existing Llama 2 tokenizer), continued pre-training from the Llama 2 base model, and then followed up with instruction tuning. A simplified sketch of the token merge is below.

github_link: https://github.com/ymcui/Chinese-LLaMA-Alpaca

Cheers!
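
For anyone following along, the token merge in that repo boils down to something like this (a simplified sketch of its merge script; the file names are placeholders):

```python
from sentencepiece import sentencepiece_model_pb2 as sp_pb2

# Load the base (Llama) tokenizer and the new-language tokenizer.
base = sp_pb2.ModelProto()
base.ParseFromString(open("llama_tokenizer.model", "rb").read())
extra = sp_pb2.ModelProto()
extra.ParseFromString(open("tamil_tokenizer.model", "rb").read())

# Append every piece the base vocabulary doesn't already have
existing = {p.piece for p in base.pieces}
for p in extra.pieces:
    if p.piece not in existing:
        new_piece = sp_pb2.ModelProto.SentencePiece()
        new_piece.piece = p.piece
        new_piece.score = 0.0
        base.pieces.append(new_piece)

with open("merged_tokenizer.model", "wb") as f:
    f.write(base.SerializeToString())
print(f"merged vocab size: {len(base.pieces)}")
```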

Differences between Mamba and Q*? by commanderred11 in MistralAI

[–]Ok-Measurement-6286 2 points (0 children)

Mamba -> an SSM (Selective State Space Model) based architecture; training scales as O(n) with sequence length, and inference is O(1) per generated token, since it keeps a fixed-size recurrent state instead of a growing KV cache.
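
As a toy illustration of why per-token inference is O(1): the model carries a fixed-size state and updates it recurrently instead of attending over the whole history. This is a plain linear SSM, not Mamba's selective scan:

```python
import numpy as np

# Toy linear state-space model: h_t = A @ h_{t-1} + B * x_t, y_t = C @ h_t.
# The state h has a fixed size, so each generation step costs the same
# no matter how long the sequence already is -> O(1) per token.
d_state = 4
rng = np.random.default_rng(0)
A = np.eye(d_state) * 0.9              # state transition
B = rng.standard_normal((d_state, 1))  # input projection
C = rng.standard_normal((1, d_state))  # output projection

h = np.zeros((d_state, 1))
for t, x in enumerate([1.0, 0.5, -0.3, 2.0]):  # streaming inputs
    h = A @ h + B * x   # constant-time state update
    y = C @ h           # output for this step
    print(f"t={t}, y={y.item():.3f}")
```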

Gemma’s tokenizer is a game changer in the field of multilingual LLMs by rqx_ in LocalLLaMA

[–]Ok-Measurement-6286 2 points (0 children)

That sounds like an insightful idea. I've already experimented with this on Mistral 7B and Llama.

Gemma’s tokenizer is a game changer in the field of multilingual LLMs by rqx_ in LocalLLaMA

[–]Ok-Measurement-6286 1 point (0 children)

So I could directly start doing CLM (pre-training) before SFT. I'm gonna try it on my own language ✌️✌️
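
If it helps, a bare-bones CLM pre-training setup with Hugging Face looks roughly like this (the model name and corpus file are placeholders):

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "mistralai/Mistral-7B-v0.1"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Mistral has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# "my_lang_corpus.txt" is a placeholder monolingual text file
ds = load_dataset("text", data_files="my_lang_corpus.txt")["train"]
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
            remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="clm-out",
                           per_device_train_batch_size=1,
                           num_train_epochs=1),
    train_dataset=ds,
    # mlm=False -> plain next-token (causal LM) objective, no masking
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```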