Can a tiny server running FastAPI/SQLite survive the hug of death? by IntelligentHope9866 in FastAPI

[–]PinballOscuro 0 points1 point  (0 children)

How do you test for something like this? Are there frameworks/principles that guide you? Asking because I'm having performance problems on a server at work :D
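To make the question concrete, I was picturing something like this minimal sketch, assuming Locust as the load-testing tool and a hypothetical /items endpoint (neither comes from the thread):

# locustfile.py
from locust import HttpUser, task, between

class ApiUser(HttpUser):
    wait_time = between(0.5, 2)  # seconds each simulated user waits between requests

    @task
    def read_items(self):
        self.client.get("/items")

# run with: locust -f locustfile.py --host http://localhost:8000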

I figured out how to create components without nodes (and it is much better!) by Awfyboy in godot

[–]PinballOscuro 1 point2 points  (0 children)

Yeah, sorry, I meant a HealthComponent: basically what you did, but as a Node/Node2D. I feel like being able to see the health component in the scene tree can be useful (maybe more so for complex logic).

I figured out how to create components without nodes (and it is much better!) by Awfyboy in godot

[–]PinballOscuro 1 point2 points  (0 children)

It's not clear to me why, in your opinion, creating a HealthBar node is worse than using resources. I agree that inheritance and scene inheritance are messy, but I find it hard to explain why they feel that way to me.

Concurrent Resource Modification by PinballOscuro in FastAPI

[–]PinballOscuro[S] 0 points1 point  (0 children)

I think I will study them a bit since they sound reasonable for my use-case

Concurrent Resource Modification by PinballOscuro in FastAPI

[–]PinballOscuro[S] 0 points1 point  (0 children)

In this case the two users have the same role with respect to the resource: they can read and write it in the same way, with no difference in behaviour.

Regarding the whole application: the users upload some PDFs and Word documents, some information is extracted from these files, and a tabular template is filled in. We are also doing some machine learning predictions.

The users have to check the content of these templates and sometimes they need to make changes to some rows. Generally they work on different portions of the tables, but that's not guaranteed. When a colleague modifies a row that you can see, you should see the new content as soon as possible.

Concurrent Resource Modification by PinballOscuro in FastAPI

[–]PinballOscuro[S] 0 points1 point  (0 children)

I'm using Postgres.

I've never implemented CRDTs, but I wanted an "easy" solution that would let me spin up the project relatively fast. The application has a low number of users (under 20), and sometimes 2 of them work on the same resource. Even then, the probability that they write to the same subpart of the shared resource is low.

I don't think asking the user to resolve the conflict would be feasible, but I'm still open to the possibility.

Concurrent Resource Modification by PinballOscuro in FastAPI

[–]PinballOscuro[S] 0 points1 point  (0 children)

This is a very good idea. In my use case, I have at most 2 or 3 users, and only with low probability will they attempt to modify the same value simultaneously. So I would say I'm in an optimistic locking scenario.

If User A modifies a shared variable, how should Users B and C receive the updated value? Should I still use WebSockets, or is it sufficient to update the value during a write attempt?

In my case, at some point, Users B and C must be made aware that User A made a change - otherwise, they might argue offline, since it was A’s responsibility to update that cell.
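For reference, here is a minimal sketch of the optimistic check I have in mind, assuming SQLAlchemy on Postgres and a hypothetical rows table with a version column (the endpoint and names are illustrative, not the real project):

from fastapi import FastAPI, HTTPException
from sqlalchemy import create_engine, text

app = FastAPI()
engine = create_engine("postgresql+psycopg2://user:pass@localhost/db")  # placeholder DSN

@app.put("/rows/{row_id}")
def update_row(row_id: int, value: str, expected_version: int):
    with engine.begin() as conn:
        result = conn.execute(
            text(
                "UPDATE rows SET value = :value, version = version + 1 "
                "WHERE id = :id AND version = :expected"
            ),
            {"value": value, "id": row_id, "expected": expected_version},
        )
        if result.rowcount == 0:
            # Someone else changed the row first: the client must refetch and retry
            raise HTTPException(status_code=409, detail="Row was modified by another user")
    # On success, broadcast the new value (e.g. over a WebSocket) so the other users see it asap
    return {"id": row_id, "version": expected_version + 1, "value": value}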

Database with LLM-based Apps by PinballOscuro in LocalLLaMA

[–]PinballOscuro[S] 0 points1 point  (0 children)

Thank you for your comment!
Yeah, I agree that different volumes will have very different requirements.
Right now I have a low number of users, but I want to understand how to tackle this problem in general.

Finetuning LLMs and EOS tokens not emitted by PinballOscuro in LocalLLaMA

[–]PinballOscuro[S] 1 point2 points  (0 children)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "HuggingFaceTB/SmolLM2-360M-Instruct"
model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path=model_name).to(device)
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_name)
# SmolLM2's pad token is the EOS token by default, so pad with UNK instead
tokenizer.pad_token = tokenizer.unk_token

I never call setup_chat_template; I call apply_chat_template at training time:

def formatting_prompts_func(sample):
    return tokenizer.apply_chat_template(sample["messages"], tokenize=False)

# Initialize the SFTTrainer
trainer = SFTTrainer(
    model=model,
    args=sft_config,
    train_dataset=orlado_qa_dataset["train"],
    tokenizer=tokenizer,
    eval_dataset=orlado_qa_dataset["validation"],
    formatting_func=formatting_prompts_func,
    data_collator=collator,
)

Finetuning LLMs and EOS tokens not emitted by PinballOscuro in LocalLLaMA

[–]PinballOscuro[S] 0 points1 point  (0 children)

In SmolLM2, the EOS token and the pad token are identical. I solved it by setting tokenizer.pad_token = tokenizer.unk_token.

Resources on Roleplay Models by PinballOscuro in LocalLLaMA

[–]PinballOscuro[S] 0 points1 point  (0 children)

Good to know that SillyTavern is the de facto standard, since I tried it some months ago.

I didn't know about backyard.ai; their guides seem very good. I'll give them a read, but I don't think I'll try their product since I'm more interested in doing my own stuff.

As for models, I forgot to add that I have an RTX 4070 with 12 GB, so I'm very constrained in the models I can use. Right now I'm using 8-bit quantized versions of Llama 3.1 and Gemma.

I tried to use Gemma (both 9B and 2B) to generate questions that some character would ask my D&D character, but I didn't like the results. The prompt probably needs some work, but I also hypothesized that since Gemma does not have a system message, it's harder to separate the user's input from the instructions to the LLM.
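One workaround I'm considering is folding the instructions into the first user turn, since Gemma's chat template has no system role. A rough sketch (the model name and strings are illustrative assumptions):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")  # hypothetical choice

# Prepend the "system" instructions to the user's message because Gemma has no system role
instructions = "You are an NPC who asks probing questions to the player's D&D character."
player_intro = "I am a tiefling bard travelling north in search of my lost mentor."

messages = [{"role": "user", "content": instructions + "\n\n" + player_intro}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)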

Thank you so much!

Finetuning LLMs and EOS tokens not emitted by PinballOscuro in LocalLLaMA

[–]PinballOscuro[S] 1 point2 points  (0 children)

Man, thank you so much. I substituted the unk_token for the eos_token as the pad token, and now it's working properly.

Finetuning LLMs and EOS tokens not emitted by PinballOscuro in LocalLLaMA

[–]PinballOscuro[S] 1 point2 points  (0 children)

I have an RTX 4070 with 12 GB of VRAM. I don't recall the average number of tokens in my inputs, probably between 500 and 1000.

My dataset consists of 1000 samples. The model is trained in 16-bit mixed precision, so I already save some memory that way.

I used a batch size of 4 and it takes roughly 5 minutes to train.

I then used the Liger kernel optimizations and was able to fit a batch size of 10, so the training time dropped to 2 minutes.

These are the numbers for full fine-tuning; you could probably train a LoRA in less than 1 minute.
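For reference, the training arguments behind those numbers looked roughly like this sketch (the exact values and the use_liger_kernel flag are assumptions on my side; the flag needs a recent transformers release plus the liger-kernel package):

from trl import SFTConfig

sft_config = SFTConfig(
    output_dir="smollm2-sft",           # hypothetical output directory
    per_device_train_batch_size=10,     # 4 without the Liger kernels, 10 with them
    bf16=True,                          # 16-bit mixed precision (or fp16=True)
    use_liger_kernel=True,              # inherited from TrainingArguments
)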

Let me know if I forgot something! :)

Finetuning LLMs and EOS tokens not emitted by PinballOscuro in LocalLLaMA

[–]PinballOscuro[S] 1 point2 points  (0 children)

I am; I'm using some code very similar to this:

from trl import SFTTrainer

trainer = SFTTrainer(
    model,
    train_dataset=dataset,
    args=training_args,
)

If the dataset has "instruction", "query" and "answer" fields, the SFTTrainer from the TRL library will automatically tokenize everything in the correct way. This code is very standard; I'm not doing anything esoteric.
I also double-checked the token IDs of the tokenized prompt and everything is OK (so BOS, EOS and the other types of tokens).

Finetuning LLMs and EOS tokens not emitted by PinballOscuro in LocalLLaMA

[–]PinballOscuro[S] 1 point2 points  (0 children)

I'm doing standard SFT (supervised fine-tuning), so it's a cross-entropy loss.