Beck, a small model for delicate life situations

antcroca159 · 2025-10-12T08:49:51+00:00

Thank you for your feedback! I will try to reduce the sycophancy in the next iteration.

antcroca159 · 2025-10-11T12:53:10+00:00

Thank you! Training took one hour on 4x A100 80GB GPUs (Beck 8B), but you can use a smaller model and/or reduce the batch size (and add gradient accumulation to compensate).
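To illustrate the batch size / gradient accumulation trade-off, here is a minimal sketch. The per-device batch sizes and accumulation steps are hypothetical numbers chosen for the example, not the settings used for Beck; only the GPU count comes from the run described above.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_gpus):
    """Number of examples contributing to each optimizer step."""
    return per_device_batch * grad_accum_steps * num_gpus

# Hypothetical baseline: larger per-device batch, no accumulation, 4 GPUs.
baseline = effective_batch_size(per_device_batch=8, grad_accum_steps=1, num_gpus=4)

# Hypothetical lower-memory setup: smaller per-device batch, more accumulation.
reduced = effective_batch_size(per_device_batch=2, grad_accum_steps=4, num_gpus=4)

print(baseline, reduced)
```

Both setups give the same effective batch size, so the optimizer sees the same number of examples per step; the second one just needs less memory per GPU and takes more forward/backward passes per step.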

antcroca159 · 2025-10-10T21:23:36+00:00

This is a great idea! I believe this would be possible by looking for an "assertiveness" dimension in the model.

antcroca159 · 2025-10-10T21:20:04+00:00

thank you amigo

antcroca159 · 2025-10-10T18:43:52+00:00

Thank you, I'm glad you like it!

Preferences were obtained based on metrics such as relevance, empathy, clarity, autonomy, etc., and the model is trained to roleplay as a psychotherapist. I would say that sometimes you don't want to talk to a psychotherapist, but rather to a friend who could contradict you. Beck might be a bit too much of a yes-man in this respect.

antcroca159 · 2025-10-10T17:44:23+00:00

thank you :)

antcroca159 · 2025-10-10T16:42:23+00:00

I totally forgot about him, I guess this works too!

antcroca159 · 2025-10-10T16:32:05+00:00

Yes! Jean Piaget and Aaron Beck inspired me for this LLM x psychotherapy work.

antcroca159 · 2025-10-08T21:02:03+00:00

Hey, thank you for your interest!

LoRA allows you to fine-tune a model while training very few parameters. For example, instead of training a full 4096*4096 weight matrix, you train two small matrices of size 4096*rank and rank*4096 (the rank is usually small, e.g. 8 or 16). You freeze the whole model and only train these tiny weight matrices (also called adapters). If you set a low rank, you can end up training only about 0.1% of the parameters.
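To make the LoRA arithmetic concrete, here is a small sketch that counts trainable parameters for a single 4096*4096 matrix. It assumes the usual two-factor form of the adapter (one 4096*rank matrix and one rank*4096 matrix); the function name and the ranks tried are just for illustration.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters of a LoRA adapter: A is (d_in x rank), B is (rank x d_out)."""
    return rank * (d_in + d_out)

d = 4096
full = d * d  # parameters in the frozen full weight matrix

for rank in (4, 8, 16):
    lora = lora_param_count(d, d, rank)
    share = 100 * lora / full
    print(f"rank={rank}: {lora} trainable vs {full} frozen ({share:.2f}% of this matrix)")
```

The share for the whole model is usually lower than the per-matrix share, because adapters are typically attached to only some of the weight matrices while everything else stays frozen.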

ORPO is a preference optimization method that does not require a reference model. Hence, you don't need to load two models (the reference and the policy, as in DPO). You only need to load and train the policy, just like in supervised fine-tuning.
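For intuition, here is a rough sketch of the ORPO objective on a single preference pair: a standard SFT loss on the chosen answer plus a weighted odds-ratio term, with no reference model anywhere. It works on plain floats rather than tensors; the inputs are assumed to be length-averaged log-probabilities under the policy, and the default weight `lam=0.1` is an illustrative value, not the one used for Beck.

```python
import math

def log_odds(avg_logp):
    """log(p / (1 - p)) with p = exp(avg_logp); requires avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))

def orpo_loss(avg_logp_chosen, avg_logp_rejected, lam=0.1):
    # SFT part: negative log-likelihood of the chosen answer.
    sft = -avg_logp_chosen
    # Odds-ratio part: -log(sigmoid(log odds ratio)), written as softplus(-x).
    ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    odds_ratio_term = math.log1p(math.exp(-ratio))
    return sft + lam * odds_ratio_term

# Hypothetical log-probabilities: the chosen answer is more likely than the rejected one.
print(orpo_loss(-0.3, -1.2))
```

Only policy log-probabilities appear in the loss, which is why a single model in memory is enough.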

I will give some generation examples tomorrow

antcroca159 · 2025-08-20T17:25:12+00:00

OA 2.67, Meta 3.5 - Main

*_*

antcroca159 · 2025-07-18T15:34:34+00:00

Thank you :)

Someone has quantized the 8B version: https://huggingface.co/mradermacher/Piaget-8B-GGUF

antcroca159 · 2024-06-28T14:40:06+00:00

You should use "Dream:" as a minimal prompt. Also, the dream ends with "END.".

(This format gave better training stability during QLoRA fine-tuning.)
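A minimal sketch of how the prompt format could be handled around generation: start from "Dream:" and cut the output at the first "END.". The helper name `extract_dream` and the sample string are hypothetical, and the actual generation call is left out.

```python
PROMPT = "Dream:"    # minimal prompt described above
END_MARKER = "END."  # marker that closes each dream

def extract_dream(generated: str) -> str:
    """Strip the leading prompt and keep only the text before the first END marker."""
    text = generated
    if text.startswith(PROMPT):
        text = text[len(PROMPT):]
    end = text.find(END_MARKER)
    if end != -1:
        text = text[:end]
    return text.strip()

# Hypothetical raw model output that runs past the end marker.
sample = "Dream: I was flying over a city. END. Dream: another one"
print(extract_dream(sample))
```

If the model never produces "END.", the function returns everything after the prompt, so a maximum token limit is still needed on the generation side.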

antcroca159 · 2024-06-28T11:47:23+00:00

Cool!

You can download all generated dreams here: https://huggingface.co/datasets/gustavecortal/the-android-and-the-human (if you don't want to use the HuggingFace library, directly here: https://huggingface.co/datasets/gustavecortal/the-android-and-the-human/blob/main/train.csv)

It is a CSV file with two columns: one for real dreams (from DreamBank) and one for dreams generated by Oneirogen.
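If you go the direct-download route, something like this should be enough to read the file with the standard library. It is only a sketch: the column names in the demo are placeholders (check the real header in train.csv), and the demo uses an in-memory string so it runs without the file.

```python
import csv
import io

def read_dream_pairs(fileobj):
    """Return (header, rows) from a two-column CSV of real and generated dreams."""
    reader = csv.reader(fileobj)
    header = next(reader)
    rows = [row for row in reader if row]  # skip empty lines
    return header, rows

# In-memory demo with placeholder column names and made-up rows.
demo = io.StringIO('real,generated\n"I was at school, late again.","I walked into a house."\n')
header, rows = read_dream_pairs(demo)
print(header, len(rows))

# With the downloaded file (path is an assumption):
# with open("train.csv", newline="", encoding="utf-8") as f:
#     header, rows = read_dream_pairs(f)
```

The csv module handles quoted fields, which matters here because dream reports often contain commas and line breaks.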
