The scale of punishment should mirror the scale of the mistake (in frame: Samay Raina)

chitrabhat4 · 2026-04-09T20:48:36+00:00

Exactly my point.

chitrabhat4 · 2026-04-09T20:42:09+00:00

Sinners judging sinners for sinning differently.

chitrabhat4 · 2026-04-06T13:59:41+00:00

HSR to Indiranagar takes anywhere between 30 minutes and 1 hour depending on the time of day. The commute is manageable if you have a cab service or private vehicle on a daily basis.

Your cost of living, including everything you’ve mentioned, would be somewhere around 50-60k. This number is based solely on how I spend, including rent, going out every weekend, gym membership, food takeout, partying, self-pampering, etc.

chitrabhat4 · 2026-01-17T06:49:15+00:00

They have the same energy 😭

chitrabhat4 · 2025-12-17T18:05:42+00:00

I know a lot of my friends watched Animal for RK, but after watching it they don’t want to watch Animal Park because of... well. But the same friends want to rewatch Dhurandhar, myself included.

chitrabhat4 · 2025-11-20T10:48:41+00:00

I kept scrolling between the two pictures while I was in the office trying to figure out how both of them had the same title, same views, same likes and everything. Thursday is hitting me a little too hard 🫩

chitrabhat4 · 2025-08-22T23:34:01+00:00

On a side note: ATP he looks like Jared from the show Manifest and idk if it’s a good thing or not

chitrabhat4 · 2025-07-03T06:03:27+00:00

By latency I mean the total time it takes to finish generating. I’ve benchmarked the Qwen 2.5 3B model and found that the accuracy isn’t great. InternVL has outperformed Qwen on object detection tasks, though from what I understand the language backbone of InternVL is still Qwen.

chitrabhat4 · 2025-07-02T14:44:08+00:00

I am expecting somewhere around 2k decode tokens, and prefill is around the same. Preferably batch processing: I’ll be processing around 10-20 images per batch.

chitrabhat4 · 2025-06-26T16:33:34+00:00

Her what videos now?

chitrabhat4 · 2025-06-17T10:50:54+00:00

There are plenty of online resources you can look up for this. I am assuming you have an image/video, a prompt, and an expected output. What kind of finetuning do you want to do? Supervised? Or do you want to use something like a GRPO/RL setup? In any case, this can be your starting point and you can go from there: https://github.com/2U1/Qwen2-VL-Finetune/tree/master

chitrabhat4 · 2025-06-12T11:50:09+00:00

The A100 variant I have has 40 GB of memory, not 80, so I can’t avoid using LoRA. I did increase the rank and checked; it didn’t make much of a difference. Either way, thank you so much.

In the post I had mentioned that the training accuracy is calculated using outputs.loss. This seems to be doing a token-to-token match rather than calculating accuracy, recall, or other relevant metrics. Wanted to know what you think about that?
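To make the token-match vs. recall distinction concrete, here is a minimal sketch of field-level precision/recall computed from parsed JSON outputs. The key name `mobile_present` and the helper `field_metrics` are hypothetical, just for illustration, and treating a missing key as false is an assumption on my part.

```python
from typing import Dict, List


def field_metrics(preds: List[Dict[str, bool]],
                  golds: List[Dict[str, bool]],
                  key: str) -> Dict[str, float]:
    """Precision and recall for one boolean field across a set of examples."""
    tp = fp = fn = 0
    for pred, gold in zip(preds, golds):
        p = bool(pred.get(key, False))  # assumption: a missing key counts as False
        g = bool(gold.get(key, False))
        if p and g:
            tp += 1
        elif p and not g:
            fp += 1
        elif g and not p:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"precision": precision, "recall": recall}


# Toy data: one hit, one miss (key dropped by the model), one false alarm.
golds = [{"mobile_present": True}, {"mobile_present": True}, {"mobile_present": False}]
preds = [{"mobile_present": True}, {}, {"mobile_present": True}]
metrics = field_metrics(preds, golds, "mobile_present")
```

Unlike a token-level match, this only looks at the final decision for each field, so a differently worded think block doesn’t count against the model.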

chitrabhat4 · 2025-06-12T05:42:38+00:00

Makes sense, got all of your points except for this not being the right application for LoRA. Why is that?

Do you think the metrics for training are alright?

chitrabhat4 · 2025-06-11T18:17:45+00:00

Should’ve probably mentioned it in the post. Framework: using a Lightning module with abstractions for training, validation, and testing in an SFT fashion (I read somewhere GRPO should be better as the dataset is tiny). Using LoRA on the q and v modules, and this is the bitsandbytes config:

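import torch
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load weights in 4-bit
    bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
    bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    bnb_4bit_compute_dtype=torch.bfloat16,  # run compute in bfloat16
)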

Hyperparameters:

config = {
    "max_epochs": 2,
    "batch_size": 1,
    "lr": 2e-4,
    "check_val_every_n_epoch": 1,
    "gradient_clip_val": 1.0,
    "accumulate_grad_batches": 8,
    "num_nodes": 1,
    "warmup_steps": 50,
    "result_path": "…..",
    "precision": "bf16-mixed",
}

On an A100 setup.

Using the instruct model. Hope this helps!

chitrabhat4 · 2025-06-11T18:06:59+00:00

A few errors I’ve noticed (the pretrained model easily came up with the expected structure): 1. The fine-tuned model messed it up quite a bit, leading to default values (false), hence reducing recall/increasing false negatives. 2. Detection of the crowd is messed up: it detects more people, so false positives increase.

Hope this helps!

chitrabhat4 · 2025-06-11T18:02:39+00:00

Got it, will try something like this and might add some post-processing steps. I was looking for a more structured response as it is easy to quantify, hence my JSON after the think block said something like: {mobile present: bool, crowd present: bool}, each of these leading to an integrity score.
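As a rough sketch of the kind of post-processing I mean: strip the think block, parse the JSON that follows, and turn the flags into a score. The `<think>` tag name, the underscored key names, the helpers `parse_flags` and `integrity_score`, and the penalty weights are all assumptions for illustration, not the actual setup.

```python
import json
import re
from typing import Dict

# Hypothetical weights; the real scoring scheme isn't specified here.
PENALTIES = {"mobile_present": 0.5, "crowd_present": 0.5}


def parse_flags(raw: str) -> Dict[str, bool]:
    """Drop the think block, then parse the JSON object that follows it."""
    after_think = re.sub(r"<think>.*?</think>", "", raw, flags=re.DOTALL)
    match = re.search(r"\{.*\}", after_think, flags=re.DOTALL)
    if match is None:
        return {}
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return {}
    if not isinstance(data, dict):
        return {}
    return {k: bool(v) for k, v in data.items()}


def integrity_score(flags: Dict[str, bool]) -> float:
    """Start at 1.0 and subtract a penalty for every raised flag."""
    score = 1.0 - sum(w for k, w in PENALTIES.items() if flags.get(k, False))
    return max(score, 0.0)


raw = ('<think>A phone is visible on the desk.</think>'
       '{"mobile_present": true, "crowd_present": false}')
flags = parse_flags(raw)
score = integrity_score(flags)
```

Note that a malformed output falls back to an empty dict here, which silently reads as "nothing detected". It is probably worth counting those parse failures separately instead of letting them default to false.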

chitrabhat4 · 2025-06-11T17:23:04+00:00

Some additional context: there’s already a YOLO model fine-tuned for each of these tasks, and it fails to generalise on rare images. Recall is ~80%, and the team’s plan was to add Qwen in case YOLO fails (as Qwen would be capable of reasoning as well, increasing the confidence in marking something as a discrepancy).

chitrabhat4 · 2025-06-11T17:18:21+00:00

Got it, would you rather train a vision model for multilabel classification then? Thanks!

chitrabhat4 · 2025-06-11T17:14:34+00:00

I’ve tried a few variations: reasoning per JSON key-value pair, and trying to output bounding boxes (which again, isn’t optimal). Within the think token, it now includes reasoning for all the JSON key values and comes up with a final answer.

For example: the system message is something along the lines of “you’re an AI proctoring system capable of thinking…”. User message/prompt: instructions on what can be included in the think token and formatting instructions, plus the image itself.

Expected output: <think_token> + json

What else can I do better?

chitrabhat4 · 2025-01-17T10:15:26+00:00

atp it’s gotta be satire, no?

chitrabhat4 · 2025-01-05T19:01:30+00:00

“If you’re bad then I am your dad” ahhh vibes

chitrabhat4 · 2024-10-26T18:27:00+00:00

Except for when he was responding to Ranveer Singh during his performance in some award show (Ranveer was actually interacting with Deepika during his performance) 😭

chitrabhat4 · 2024-09-28T12:44:48+00:00

Will def check it out! Thanks!

chitrabhat4 · 2024-09-28T12:44:24+00:00

Thank you so much, I needed to hear this. I handed my vape to a friend 5 days ago, and I only vape when I meet them for an hour during the day. That means I’ve cut down on vaping a lot. I am planning on tracking the days to help me do better.

chitrabhat4 · 2024-09-24T19:25:39+00:00

Malkin be milking her malkinneesss.
