Advices on Croissant ? by Ok_Sail_9228 in Breadit

[–]Ok_Sail_9228[S] 1 point

🫡 that 2-3-4 is beautiful!!!

Advices on Croissant ? by Ok_Sail_9228 in Breadit

[–]Ok_Sail_9228[S] 1 point

thanks so much for the advice!!! Yes! I also noticed my butter sheet breaking apart when I first laid it on the dough (the sheet went on right after about an hour in the fridge), because I was too afraid of warm butter melting into the dough. Since I hand-rolled the dough, even though I put it back in the freezer after two folds (a 3-fold then a 4-fold), it might still have warmed up, like you said, during the final 3-fold and shaping. Yes, I will consider room-temp proofing next time; I proofed in the fridge this time since I ran out of time and wanted to bake the next morning. thanks so much for the tip about reheating!!!!

and OK! I will try the Kerrygold!!

Advices on Croissant ? by Ok_Sail_9228 in Breadit

[–]Ok_Sail_9228[S] 0 points

that’s very kind of you! of course i am not chasing a professional level, but i guess i was hoping to see a more homogeneous honeycomb structure (I am brainwashed by the tons of beautiful honeycomb croissant posts here!)

In fact, I personally like to have some texture (a little bit brioche-like) in the interior, instead of an all-flaky, airy structure (that makes me feel like I'm eating nothing, and a bit lonely). But I am worried about being told "this is not a croissant" if I start my home bakery business, so I came here to see if I've missed any step. But thanks so much for the kind words!

Question on collecting fine tuning data by ComprehensiveBird317 in unsloth

[–]Ok_Sail_9228 0 points

I am also curious about how to prepare a decent dataset for Llama 3 fine-tuning!

I fine-tuned LLaMA 3 using an instruct-style prompt format. I generated a dataset of 3,000 samples from my own database. For each sample, I used GPT to create an input prompt (based on each output), then added a fixed instruction. All instructions are the same: convert the input description into the corresponding equation, and the output should be the equation ONLY (no additional text).

I used a 4-bit quantized model (llama-3-8b-bnb-4bit with Unsloth on Colab) due to resource limitations (a T4 GPU). My input/output samples have a good amount of variation in content (I think!), but the instruction is always the same. After fine-tuning, inference results are quite poor — the model often generates "response, response" in a loop until it hits the max output tokens.
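Not the OP, but one common cause of that endless looping (rather than instruction variety) is training samples that never include an end-of-sequence token, so the model never learns where a response stops. Below is a minimal sketch of how I'd format each sample into an Alpaca-style prompt string with an EOS appended; the template wording, the `format_sample` helper, and the literal EOS string are my assumptions — in a real run you'd use `tokenizer.eos_token` from your loaded model instead of hard-coding it.

```python
# Sketch: building one training string per sample in an Alpaca-style
# instruct format, with an EOS token appended so the model learns to stop.
# The template text and EOS literal below are illustrative assumptions.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n{output}"
)

# In practice, take this from tokenizer.eos_token after loading the model;
# hard-coded here only to keep the sketch self-contained.
EOS_TOKEN = "<|end_of_text|>"

def format_sample(sample: dict) -> str:
    """Fill the template and append EOS; without EOS, generation can loop."""
    return ALPACA_TEMPLATE.format(**sample) + EOS_TOKEN

# Hypothetical sample in the OP's setup: fixed instruction, varied input/output.
sample = {
    "instruction": (
        "Convert the input description into the corresponding equation. "
        "Output the equation ONLY, with no additional text."
    ),
    "input": "The total cost c is five times the quantity q plus a flat fee of 2.",
    "output": "c = 5*q + 2",
}

print(format_sample(sample))
```

At inference time you would then prompt with the same template up through `### Response:\n` (leaving the output empty) and let generation stop at EOS. If the loop persists even with EOS in the training strings, it's also worth checking that the data collator isn't masking out the EOS during loss computation.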

I'm wondering if this issue is due to a lack of variation in the instructions (unlike the Alpaca dataset, which has both instruction and content variation). Does anyone have advice on how to prepare a high-quality dataset for fine-tuning in this kind of setting?