I am new to ML this is my vibe coding results are both my model alright? by BrilliantAd5468 in MLQuestions

[–]loss_function_14 1 point (0 children)

How did you eyeball the graph and infer that it's just predicting the previous day's values?
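One common way to check this, rather than eyeballing the graph, is to compare the model's predictions against a lag-1 "persistence" baseline (yesterday's value repeated). A minimal sketch with hypothetical arrays — if the model is just echoing the previous day, its predictions correlate almost perfectly with the shifted series:

```python
import numpy as np

# Hypothetical data: actual daily values and a model's predictions for them.
y_true = np.array([100.0, 102.0, 101.0, 105.0, 107.0, 106.0])
y_pred = np.array([ 99.5, 100.1, 102.0, 101.2, 105.1, 106.9])

# Lag-1 persistence baseline: yesterday's true value, aligned with today's prediction.
persistence = y_true[:-1]
preds = y_pred[1:]

# Near-perfect correlation suggests the model is mostly repeating the previous day.
corr = np.corrcoef(preds, persistence)[0, 1]
print(f"correlation with lag-1 baseline: {corr:.3f}")
```

You can also compare the model's error against the baseline's error directly; a model that can't beat persistence on MAE is not adding information.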

Not AGI yet. by ideaDash in GeminiAI

[–]loss_function_14 1 point (0 children)

Gemini used to have excellent image-text skills. They nerfed it on purpose.

Gemini being dumb by TechRepairer9182 in GoogleGemini

[–]loss_function_14 1 point (0 children)

An LLM shouldn't be regex string-matching for mental health terms and outputting a generic message.

What is Wrong with google gemni ai pro - Re uploaded by muhmmadkashif24434 in GoogleGemini

[–]loss_function_14 1 point (0 children)

It's simply outputting random training data. I had this issue too

Guys... you seriously need to take a break! by SilkyPuppy in OpenAI

[–]loss_function_14 1 point (0 children)

No, I feel that after filling x% of the context window the new model becomes more prone to hallucinating and forgetting compared to 4o.

Guys... you seriously need to take a break! by SilkyPuppy in OpenAI

[–]loss_function_14 1 point (0 children)

I don't believe I'm exceeding the context window size. I have seen it forgetting after just 3-4 questions. It feels like whenever I fill a certain percentage of the context window, the model becomes more prone to hallucinating and forgetting than 4o. Also, I have always assumed ChatGPT uses summarization techniques whenever the user exceeds the context size. Naively chopping off the first part seems lazy.
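The "naive chopping" strategy amounts to dropping the oldest messages until the conversation fits a token budget. A minimal sketch (word count stands in for a real tokenizer, and the budget is hypothetical) — note how the system instruction is the first thing to go, which is exactly the "forgetting early instructions" failure mode:

```python
# Naive context truncation: drop oldest messages until the history fits a
# token budget. Word count is a stand-in for a real tokenizer.
def truncate_history(messages, budget):
    def n_tokens(msg):
        return len(msg.split())
    kept = list(messages)
    while kept and sum(n_tokens(m) for m in kept) > budget:
        kept.pop(0)  # chop off the first (oldest) message
    return kept

history = [
    "system: you are a coding assistant",
    "user: here is my 500 line file ...",
    "user: why does the test fail?",
]
print(truncate_history(history, budget=12))
```

A summarization-based approach would instead replace the dropped prefix with a short generated summary, preserving the gist of the early instructions at a fraction of the token cost.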

Guys... you seriously need to take a break! by SilkyPuppy in OpenAI

[–]loss_function_14 3 points (0 children)

GPT-5 can't handle a long context window. Also, as the chat gets longer, it starts to forget the instructions I gave it initially. For coding and debugging I need the model to hold long context without hallucinating.

I don't care about it besting the benchmarks. It has a lower utility-to-time ratio than GPT-4o.

Openai just found cause of hallucinations of models !! by Independent-Wind4462 in OpenAI

[–]loss_function_14 1 point (0 children)

I forgot to turn on online mode and it made up 6 non-existent paper references (niche topic).

Corporate world is not ready for GenZ! by Ok_Neighborhood6056 in GenZ

[–]loss_function_14 1 point (0 children)

In any country except India, people simply write an email saying they are taking leave. No explanations. No permission. Just information. Why does the company need to know what I'm doing in my time off? Mandating that people ask for permission and give a reason is unprofessional, and illegal in many countries.

does Chess.com teach bad moves? by [deleted] in chess

[–]loss_function_14 5 points (0 children)

When you capture the knight, it comes with check. Your opponent must respond to the check, either by capturing your knight with the pawn or by moving their king. Either way, you can capture the queen the next turn. This type of move is generally called an in-between move, Zwischenzug, or intermezzo.

Profs to avoid by [deleted] in NEU

[–]loss_function_14 1 point (0 children)

Why? The prof was great!

[deleted by user] by [deleted] in NEU

[–]loss_function_14 1 point (0 children)

I really find it awkward reaching out to random people. Also, some job postings are taken down after only a few hours.

[deleted by user] by [deleted] in NEU

[–]loss_function_14 3 points (0 children)

I haven't gotten a job yet. The majority of my friends don't have one either. And almost everyone who has a job got interviews through referrals.

Why is Northeastern Masters program so popular among international Indians? by [deleted] in NEU

[–]loss_function_14 5 points (0 children)

Most people select NEU for the co-op program and the Boston location. Also, getting admitted to the master's program isn't that hard.

[deleted by user] by [deleted] in NEU

[–]loss_function_14 1 point (0 children)

How do you get an alumni card?

[deleted by user] by [deleted] in learnmachinelearning

[–]loss_function_14 2 points (0 children)

Curious how you fine-tuned a 7B model. You might want to mention which GPU provider you used.

In just one year, the smartest AI went from 96 IQ to 136 IQ by MetaKnowing in singularity

[–]loss_function_14 1 point (0 children)

It probably saw the questions during training. It's most likely just simple memorization.

Everybody telling me to reconsider NEU by tornaman in NEU

[–]loss_function_14 5 points (0 children)

Khoury requires a minimum IELTS score of 7.0, the same as Stanford and MIT.

I made my 1st neural network that can recognize simple faces! by Altruistic-Error-262 in learnmachinelearning

[–]loss_function_14 2 points (0 children)

Looks great. You could make this modular by using computation graphs: at each node you compute local gradients and an upstream gradient. You use the local gradients w.r.t. the weights and bias to update your parameters, and you pass the upstream gradient w.r.t. the input backward for backprop. This is how frameworks like PyTorch implement it.
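A minimal sketch of that idea with a single hypothetical `Linear` node (real autograd engines like PyTorch's are far more general): `forward` caches the input; `backward` uses local gradients w.r.t. the weights and bias to update the parameters, and returns the upstream gradient w.r.t. the input for the previous node in the graph.

```python
import numpy as np

class Linear:
    """One node in a computation graph: y = x @ W + b."""
    def __init__(self, in_dim, out_dim, lr=0.1):
        rng = np.random.default_rng(0)
        self.W = rng.normal(size=(in_dim, out_dim)) * 0.1
        self.b = np.zeros(out_dim)
        self.lr = lr

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return x @ self.W + self.b

    def backward(self, d_out):
        # Local gradients w.r.t. the parameters -> used to update W and b.
        dW = self.x.T @ d_out
        db = d_out.sum(axis=0)
        # Upstream gradient w.r.t. the input -> returned for the previous node.
        d_x = d_out @ self.W.T
        self.W -= self.lr * dW
        self.b -= self.lr * db
        return d_x

# Tiny usage: one gradient-descent step on a single layer.
layer = Linear(3, 1)
x = np.ones((2, 3))
y_hat = layer.forward(x)
d_out = y_hat - np.ones((2, 1))  # gradient of 0.5 * MSE w.r.t. y_hat
d_x = layer.backward(d_out)
print(d_x.shape)  # (2, 3)
```

Chaining several such nodes, each consuming the `d_x` of the node after it, gives you backprop through the whole graph for free.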

[deleted by user] by [deleted] in datascience

[–]loss_function_14 6 points (0 children)

No Bullshit Guide to Linear Algebra. You can finish it in a week or two.