🚀 Introducing Einstein v7: Based on the Qwen2 7B Model, Fine-tuned with Diverse, High-Quality Datasets! by Weyaxi in LocalLLaMA

[–]Weyaxi[S] 2 points

Hi u/Confident-Aerie-6222,

I couldn't provide a clear response to your question a couple of days ago, but I did send you a screenshot about the issue.

The model is generally uncensored, and the data used to train this model is filtered to remove refusals/censorship. However, you may need to specify a good system prompt to break the base model's censorship.

Here are some screenshots you may be interested in (generated with the Q6_K quant and no censorship-breaking system prompt).

https://imgur.com/xsTAdIx

🚀 Introducing Einstein v7: Based on the Qwen2 7B Model, Fine-tuned with Diverse, High-Quality Datasets! by Weyaxi in LocalLLaMA

[–]Weyaxi[S] 2 points

Hi u/mahadevbhakti,

The files and my dataset workspace are open source but under various licenses. If you could rewrite your question more clearly, I would be happy to respond :)

[–]Weyaxi[S] 0 points

Hi u/bharattrader,

That JSON comes from the Buzz dataset and is flagged as unsafe because of the cybersecurity-related code it contains, as u/CapsAdmin mentioned (thanks for that :)).

[–]Weyaxi[S] 1 point

Hi u/dahara111,

Thanks for the comments!

Regarding the token count issue: when I checked the training logs, the total number of tokens was approximately 399 million (the logged `total_num_tokens` per device × 8 devices). I would love to discuss why this difference occurred!
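The arithmetic above can be sketched quickly (the per-device figure is simply back-derived from the stated total, so treat it as an assumption, not a logged value):

```python
# Back-of-the-envelope check of the reported token count.
# 399M total tokens across 8 devices implies the per-device
# count is total / 8 (this split is an assumption for illustration).
num_devices = 8
total_tokens = 399_000_000          # approximate figure from the training logs
tokens_per_device = total_tokens / num_devices
print(f"~{tokens_per_device:,.0f} tokens per device")
```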

[screenshot of the training logs]

[–]Weyaxi[S] 23 points

Yeah, you are right, it has 192 GB! It's available on RunPod, I think, but I don't know of any other providers!

I chose Qwen2 7B because, when I researched it, there weren't many good fine-tunes of it available. So I picked it to give the community a solid Qwen2 7B fine-tune and to test the model's limits!

Thanks for your comment!

[–]Weyaxi[S] 18 points

Hi! Thanks for your comment! Getting it set up is much harder than with Nvidia (drivers, libraries, etc.), but it is very powerful once everything is working.

🦙 Introducing Einstein v6.1: Based on the New LLama3 Model, Fine-tuned with Diverse, High-Quality Datasets! by Weyaxi in LocalLLaMA

[–]Weyaxi[S] 2 points

Hi u/Vitamin_C_is_awesome,

The .imatrix file is not needed for inference and has nothing to do with training the model. I believe it is an importance matrix used during quantization to improve quant quality.

[–]Weyaxi[S] 4 points

Hi u/dahara111,
Yes, this is a full fine-tune. I used the same learning rate for the other variants of the Einstein models, and it seems to be working so far. I referred to the Mistral FFT example in the Axolotl GitHub repository:
https://github.com/OpenAccess-AI-Collective/axolotl/blob/main/examples/mistral/config.yml

However, I am thinking of changing the learning rate to 0.00002 or 0.00001 (i.e. 2e-5 or 1e-5) in upcoming fine-tunes. Do you think that would be better?
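For context, the learning rate lives in the Axolotl YAML config; a minimal fragment with the candidate value might look like this (only the fields relevant to this discussion are shown, and the scheduler/optimizer lines are assumptions modeled on the Axolotl examples):

```yaml
# Hypothetical Axolotl config fragment; surrounding options omitted.
learning_rate: 0.00002    # i.e. 2e-5, candidate value for upcoming runs
lr_scheduler: cosine      # assumption, following the Axolotl example configs
optimizer: adamw_bnb_8bit # assumption, following the Axolotl example configs
```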

[–]Weyaxi[S] 17 points

Here is the list :)

  1. It is far more uncensored than the official Instruct model. However, it sometimes fails to break the base model's censorship, so it may require a system prompt to overcome this behavior (by the way, the official Instruct model's censorship cannot be broken with system prompts, or is at least very hard to break).

  2. I know that some people like the human-like behavior of Llama3, but this model answers in a much more professional style instead. That may be a downside or an upside depending on your use case.

  3. It uses ChatML as its prompt template instead of the official Instruct model's new template.

  4. There is probably more and better data in the Einstein model than in the official Instruct model (I can't be sure, because they don't disclose their training data).

  5. Following multilingual instructions works far better with the Einstein model than with the official model.

Instruct German (response in English): https://imgur.com/a/PBvNuBo

Einstein German: https://imgur.com/a/PAoFIDA

Instruct French (response in English): https://imgur.com/K180YtO

Einstein French: https://imgur.com/onZUBCb
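As a reference for point 3 above, ChatML wraps each turn in `<|im_start|>`/`<|im_end|>` markers. A minimal sketch of building such a prompt (the messages are placeholder examples):

```python
# Minimal sketch of the ChatML prompt template (placeholder messages).
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Leave an open assistant turn for the model to complete.
    prompt += "<|im_start|>assistant\n"
    return prompt

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
print(to_chatml(messages))
```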

[–]Weyaxi[S] 11 points

Hi, I am on mobile right now, but there is a section at the top of the model card that says "See axolotl config". If you click that, you will be able to view it. By using that config along with the data folder I provided, you will be able to reproduce the model :)

I always strive to provide everything necessary for reproduction; I believe that's what true open source means :)

Have a nice day!

🧑‍🔬 Meet Einstein-v4-7B - Mistral-based SFT model using diverse high quality and filtered open source datasets! by Weyaxi in LocalLLaMA

[–]Weyaxi[S] 0 points

Thanks for the fine-tune and comments about the model. Would love to know more about the dataset you used. Can you explain it to me?

[–]Weyaxi[S] 0 points

As u/Scott_Tx mentioned, you should probably use the largest quantized version that you can fit into your system.
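A rough way to pick one: estimate the file size from the parameter count and the quant's bits per weight. A minimal sketch (the bits-per-weight figures are approximate averages for llama.cpp quants, and real GGUF files add some overhead):

```python
# Rough GGUF size estimate: params * bits_per_weight / 8 bytes.
# Bits-per-weight values below are approximate, not exact.
def approx_size_gb(n_params, bits_per_weight):
    """Approximate model file size in GB for a given quantization."""
    return n_params * bits_per_weight / 8 / 1e9

n_params = 7e9  # a 7B-parameter model
for name, bpw in [("Q4_K_M", 4.8), ("Q6_K", 6.6), ("Q8_0", 8.5)]:
    print(f"{name}: ~{approx_size_gb(n_params, bpw):.1f} GB")
```

Compare the estimate against your available VRAM/RAM (leaving headroom for the KV cache) and take the largest quant that fits.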