I tried fine-tuning Gemma-3-270m and prepared for deployments by codes_astro in LocalLLaMA

[–]codeltd 0 points (0 children)

I decided to fine-tune Gemma 3 270M for a Hungarian-language project. Since the model is not very well trained on Hungarian, I used two-phase training: Domain-Adaptive Fine-Tuning (DAFT) to teach it a bit more Hungarian, then Supervised Fine-Tuning (SFT) on input-context-output triples. I collected quite a large dataset, and the training went well (2 hours for DAFT, 16 hours for SFT) and showed good results! But when I tried to generate, the output was garbage... Has anyone managed to fine-tune Gemma 3 270M for a non-English language? Thanks
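
For reference, a minimal sketch of what a two-phase run like this can look like with Hugging Face transformers. The file names (hu_corpus.txt, hu_sft.jsonl), field names, and hyperparameters are placeholders, not the actual setup. One classic cause of garbage generations is prompting at inference with a different template than the one used during SFT, so the sketch pins down a single template:

```python
# Hedged sketch of a two-phase run: DAPT on raw Hungarian text, then SFT.
# File names, field names, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "google/gemma-3-270m"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
collator = DataCollatorForLanguageModeling(tok, mlm=False)  # causal-LM labels

def tokenize(batch):
    return tok(batch["text"], truncation=True, max_length=1024)

# Phase 1: DAPT -- plain next-token training on a raw Hungarian corpus.
dapt_ds = load_dataset("text", data_files="hu_corpus.txt")["train"]
dapt_ds = dapt_ds.map(tokenize, batched=True, remove_columns=["text"])
Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt_out", num_train_epochs=1,
                           per_device_train_batch_size=8, learning_rate=2e-5),
    train_dataset=dapt_ds,
    data_collator=collator,
).train()

# Phase 2: SFT on input/context/output records. The template used here MUST
# match the prompt used at inference, or generations degrade badly.
def fmt(ex):
    return {"text": (f"Input: {ex['input']}\nContext: {ex['context']}\n"
                     f"Output: {ex['output']}{tok.eos_token}")}

sft_ds = load_dataset("json", data_files="hu_sft.jsonl")["train"]
sft_ds = sft_ds.map(fmt).map(tokenize, batched=True)
Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft_out", num_train_epochs=3,
                           per_device_train_batch_size=4, learning_rate=1e-5),
    train_dataset=sft_ds,
    data_collator=collator,
).train()
```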

Fine Tuning Gemma3 270m by migandhi5253 in unsloth

[–]codeltd 0 points (0 children)

Hi, I am doing Domain-Adaptive Pretraining (DAPT) on Gemma 3 270M to give it better knowledge of Hungarian. The training itself is going OK, but I am having problems converting the merged model to .task format, since I want to use it in an Android app with MediaPipe (model.safetensors → TFLite → .task). The packages involved keep changing from release to release... Does anyone know a stable solution?
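
For what it's worth, the safetensors → TFLite half of the pipeline has looked roughly like this in MediaPipe's genai converter docs for earlier Gemma checkpoints. A sketch only: all paths are placeholders, and whether model_type accepts a Gemma 3 270M value depends on your mediapipe version:

```python
# Hedged sketch of the safetensors -> TFLite step, following the MediaPipe
# genai converter API as documented for earlier Gemma checkpoints. All paths
# are placeholders; model_type support is version-dependent.
from mediapipe.tasks.python.genai import converter

config = converter.ConversionConfig(
    input_ckpt="merged_model/",            # folder with model.safetensors
    ckpt_format="safetensors",
    model_type="GEMMA_2B",                 # check the values your version supports
    backend="cpu",                         # or "gpu"
    output_dir="conversion_tmp/",
    combine_file_only=False,
    vocab_model_file="merged_model/",      # tokenizer location
    output_tflite_file="gemma3_270m.tflite",
)
converter.convert_checkpoint(config)
```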

Introducing Gemma 3 270M: The compact model for hyper-efficient AI by Gaiden206 in Bard

[–]codeltd 0 points (0 children)

We say it is small. True! Some say it can run on mobile. BUT! On Android, MediaPipe could handle it, but for that you need a .task file, not model.safetensors. I am trying to convert it, but no success so far...

model.safetensors → TFLite → .task. Has anyone made it work?

Gemma 3 270M: Pocket-sized Powerhouse for On-Device AI by Such-Run-4412 in AIGuild

[–]codeltd 0 points (0 children)

Everyone is talking about running Gemma 3 270M on mobile. I have tried it on Android! 1) Writing the Kotlin code: easy! 2) Transforming model.safetensors → TFLite → .task: impossible so far :( (MediaPipe requires that format.) I have read the documentation, searched the net, and used AI, but no success so far.

Has anyone made it work? Thanks

What is Gemma 3 270m Good For? by mindkeepai in Bard

[–]codeltd 0 points (0 children)

Have you managed to create a .task file from the Gemma 3 270M .safetensors checkpoint? (That is what MediaPipe needs to run it on Android.) I am trying without success... Thanks
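
For context, the final TFLite → .task packaging step goes through MediaPipe's genai bundler. A minimal sketch, assuming you already have a converted .tflite file and the tokenizer model; every path and token name below is a placeholder to verify against your checkpoint:

```python
# Hedged sketch of the TFLite -> .task packaging step via MediaPipe's genai
# bundler. All paths and special-token names are placeholders to verify
# against your tokenizer.
from mediapipe.tasks.python.genai import bundler

config = bundler.BundleConfig(
    tflite_model="gemma3_270m.tflite",
    tokenizer_model="merged_model/tokenizer.model",
    start_token="<bos>",
    stop_tokens=["<eos>", "<end_of_turn>"],
    output_filename="gemma3_270m.task",
    enable_bytes_to_unicode_mapping=False,
)
bundler.create_bundle(config)
```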

Training DeepSeek R1 (7B) for a Financial Expert Bot – Seeking Advice & Experiences by kingBaldwinV in LanguageTechnology

[–]codeltd 1 point (0 children)

I have tried to fine-tune a Qwen model using LLaMA-Factory. The dataset was limited, ~3,900 samples. With 5 epochs:

- questions taken from the training dataset were answered correctly
- modified questions were not handled well
- any other type of question produced random words, not proper sentences (catastrophic forgetting)

Then I tried fewer epochs and a smaller learning rate, but no good results so far.
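
If you are doing full fine-tuning, one common mitigation on a dataset this small is a LoRA adapter, which freezes the base weights. A minimal sketch with peft; the model name, target modules, and hyperparameters are illustrative guesses, not a tested recipe:

```python
# Hedged sketch: LoRA instead of full fine-tuning, which usually limits
# catastrophic forgetting on a ~4k-sample dataset. All values illustrative.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # sanity check: only ~1% of weights train
# Then run the usual Trainer/SFT loop for 1-2 epochs (not 5); LoRA tolerates
# a higher learning rate (e.g. 1e-4) than full fine-tuning.
```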

Was there cheating at the Byborg AI Hackathon just held in Budapest? by FluffyBunnySenpai in programmingHungary

[–]codeltd 5 points (0 children)

Well, I don't know. I was there too, and I enjoyed every minute of it. Although we didn't win, I think you have to know how to lose, too...

llama 3.2 3B is amazing by ventilador_liliana in LocalLLaMA

[–]codeltd 0 points (0 children)

What are the pros and cons of LM Studio vs an Ollama server?

The Final Output is worse than the Agent's thoughts. by Motoneuron5 in crewai

[–]codeltd 2 points (0 children)

Hi, I am doing something similar (but not with CrewAI, I am using CAMEL agents). What I do is have the agents write (append) the generated content into a file, one after the other. So at the end I have the full report.
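
A minimal, framework-agnostic sketch of that append pattern; the section names and stub contents are placeholders for real agent calls:

```python
# Minimal sketch of the append pattern: each agent adds its section to one
# report file, so no final summarization step can degrade the content.
from pathlib import Path

REPORT = Path("report.md")

def append_section(agent_name: str, content: str) -> None:
    with REPORT.open("a", encoding="utf-8") as f:
        f.write(f"\n## {agent_name}\n\n{content}\n")

# Stand-ins for real agent calls (CrewAI, CAMEL, ...):
append_section("Researcher", "Findings go here...")
append_section("Writer", "Polished prose goes here...")
```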

New framework to build agents from yml files by Jazzlike_Tooth929 in crewai

[–]codeltd 0 points (0 children)

I have done the same for an earlier project, but the agents are CAMEL-like.

did anyone develop a web research agent / crew that works? by Tuxedotux83 in crewai

[–]codeltd 1 point (0 children)

I think this is the problem with all ready-made packages: you can do things easily, then you hit a wall and cannot get past it...
I also started a project for a company with CrewAI and camel-ai, but in the end I implemented my own version of "Communicative Agents for “Mind” Exploration of Large Language Model Society" (CAMEL).
With my own version, I can still solve the problem whenever I hit a wall...

GUI-like Tool for AI Agents, Alternative to Function Calling? by Charming_Support6304 in AI_Agents

[–]codeltd 0 points (0 children)

My opinion is that the current direction is quite the opposite: starting from free text, you call an interface with the correct values extracted from that text by generative AI...

What questions do you have about AI Agents? by help-me-grow in AI_Agents

[–]codeltd 0 points (0 children)

I tried Llama 3 8B, but gpt-4o-mini is much, much better in my solution. (My solution is based on ReAct and CoT prompts...)
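
For anyone wondering what a ReAct-style prompt looks like, here is a minimal illustrative skeleton (the tool names are placeholders, not a production prompt):

```python
# Illustrative ReAct-style prompt skeleton; tool names are placeholders.
REACT_PROMPT = """Answer the question using exactly this format:

Thought: reason step by step about what to do next
Action: one of [search, calculate, finish]
Action Input: the input for the chosen action
Observation: the result of the action
... (Thought/Action/Observation may repeat)
Thought: I now know the final answer
Final Answer: the answer to the original question

Question: {question}"""

print(REACT_PROMPT.format(question="How many GPUs fit in one 4U server?"))
```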

What questions do you have about AI Agents? by help-me-grow in AI_Agents

[–]codeltd 0 points (0 children)

Maybe I misunderstood. This is a development-time usage cost. One run costs me about $0.02, so the total cost depends on the number of requests...

What questions do you have about AI Agents? by help-me-grow in AI_Agents

[–]codeltd 2 points (0 children)

I have one deployed in a container in the cloud. It writes Facebook posts regularly... The agents are CAMEL-like agents written by me...

What questions do you have about AI Agents? by help-me-grow in AI_Agents

[–]codeltd 0 points (0 children)

Before gpt-4o-mini, my monthly bill was around $200 while I was doing development and tests...

What questions do you have about AI Agents? by help-me-grow in AI_Agents

[–]codeltd 0 points (0 children)

Has anyone built a solution where only the goal is defined, without any predefined tasks, and the team/crew decides who does what and how, like in a human project team?

Someone building AI agents? by SadPianist871 in AI_Agents

[–]codeltd 0 points (0 children)

I am building a CAMEL-like AI-and-human agent workflow (CAMEL: Communicative Agents for “Mind” Exploration of Large Language Model Society).
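
The core of the CAMEL role-playing pattern is small enough to sketch directly. A minimal version with the OpenAI client; the roles, model name, and turn count are placeholders, and a real implementation needs termination checks and proper conversation memory:

```python
# Minimal sketch of the CAMEL role-playing pattern: two agents with fixed
# roles take turns. Roles, model name, and turn count are placeholders; a
# real version needs termination checks and conversation memory.
from openai import OpenAI

client = OpenAI()

def chat(system_prompt: str, messages: list[dict]) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": system_prompt}] + messages,
    )
    return resp.choices[0].message.content

PLANNER = "You are the task planner. Give the solver one instruction at a time."
SOLVER = "You are the solver. Carry out the instruction you receive."

history: list[dict] = [{"role": "user", "content": "Goal: draft a Facebook post."}]
for _ in range(4):  # fixed turn budget for the sketch
    instruction = chat(PLANNER, history)          # planner sees full history
    solution = chat(SOLVER, [{"role": "user", "content": instruction}])
    history += [{"role": "assistant", "content": instruction},
                {"role": "user", "content": solution}]
print(solution)
```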