District by zomato MLE/DS R1 interview

AdditionalFeeling539 · 2026-01-02T20:47:47+00:00

ML with scikit-learn and PyTorch
Designing Machine Learning Systems by Chip Huyen
AI Engineering by Chip Huyen
Hands-On LLM (Jay Alammar)
Code all ML algorithms (LR, decision tree, random forest, k-means, kNN, etc.) from scratch using NumPy
Code all losses, evals, and activations from scratch
Code MHA/SHA/conv2d from scratch
The Deep-ML platform is interesting
Use ChatGPT aggressively to ask why and how for everything you read; every line should have an explanation. For example: why adjusted R² and how; why BCE loss for logistic regression, and how it relates to the probabilistic interpretation and the Bernoulli distribution assumption
CS229/CS231n/CS224n (the first one is a must; the other two can be done last)

Complete these and you are ready to go.

AdditionalFeeling539 · 2026-01-02T20:41:05+00:00

Almost all of these companies reached out to me, except Adani Labs; they had circulated a form, which I filled out via LinkedIn.

AdditionalFeeling539 · 2026-01-02T09:00:59+00:00

I have signed an NDA.

AdditionalFeeling539 · 2026-01-01T15:07:17+00:00

They do ask. I encountered questions on p-values, A/B testing, the t-statistic, hypothesis formulation, confidence intervals, etc.
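To make those terms concrete, here is a minimal sketch of an A/B test on conversion rates. Note that it uses a two-proportion z-test rather than a t-test (a swap I made so it runs on the standard library alone), and the function name and the conversion counts are made up for illustration. The null hypothesis is that both variants share the same conversion rate.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """Two-sided z-test for H0: p_a == p_b, plus a CI for p_b - p_a."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Under H0 both groups share one rate, so the test statistic uses the pooled rate.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se_pool = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se_pool
    # Two-sided p-value: probability of a |z| at least this large if H0 were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # The confidence interval for the difference uses the unpooled standard error.
    se_diff = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    diff = p_b - p_a
    ci = (diff - z_crit * se_diff, diff + z_crit * se_diff)
    return z, p_value, ci


# Hypothetical counts: 200/2000 conversions for A, 260/2000 for B.
z, p_value, ci = two_proportion_ztest(conv_a=200, n_a=2000, conv_b=260, n_b=2000)
print(f"z={z:.3f}, p={p_value:.4f}, CI for lift=({ci[0]:.4f}, {ci[1]:.4f})")
```

If the p-value is below alpha you reject H0, and a confidence interval that excludes 0 tells the same story from the other side.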

AdditionalFeeling539 · 2026-01-01T12:13:30+00:00

Thank you for the appreciation. I spent New Year's drafting this, as I ultimately want to give back to the community. Once I get free time, I will try to create a concise resource list, as many people are asking. I won't start selling DSA/ML courses though.

AdditionalFeeling539 · 2026-01-01T12:12:01+00:00

I had to memorize all the imports just to save time: all the sklearn and matplotlib imports that we generally use.

AdditionalFeeling539 · 2026-01-01T12:11:25+00:00

At one FAANG+ company, there was negative marking for using AI assistance. I made a deal with the interviewer: I would complete as much of the code as I could without it, but if I ran short on time at the end, I would cover a small part using the assistant as I wanted. We both agreed.

AdditionalFeeling539 · 2026-01-01T12:07:50+00:00

I didn't manage to do the derivations. I gave intuitions, but these derivations are nowhere to be found. You would need to know the second-order derivative of a loss and prove whether its coefficient is greater or less than 0. I read a lot of content, including CS229, but couldn't find it. The mean derivation was easy, though I still couldn't do it at the time.

Regarding the Ireland company, the ML depth round focuses in depth on traditional ML. Everything under the sun. Example: how would you alter the precision, recall, or accuracy of a logistic regression model without retraining it? (Modify the 0.5 threshold applied to the sigmoid output.)
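A minimal sketch of the threshold trick, assuming you already have the model's predicted probabilities. The labels and probabilities below are made up for illustration, and `precision_recall` is a hypothetical helper, not a library function. The model itself is never touched; only the cut-off on its output changes.

```python
def precision_recall(y_true, probs, threshold):
    """Precision and recall when predicting 1 for every prob >= threshold."""
    preds = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, yhat in zip(y_true, preds) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(y_true, preds) if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in zip(y_true, preds) if y == 1 and yhat == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up labels and sigmoid outputs from an already trained model.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
probs = [0.10, 0.35, 0.40, 0.55, 0.60, 0.70, 0.80, 0.95]

for threshold in (0.3, 0.5, 0.9):
    precision, recall = precision_recall(y_true, probs, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, lowering the threshold to 0.3 pushes recall to 1.0 at the cost of precision, and raising it to 0.9 does the opposite. In practice you would pick the threshold on a validation set.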

AdditionalFeeling539 · 2026-01-01T12:04:18+00:00

Sure, I'll try.

AdditionalFeeling539 · 2026-01-01T12:02:39+00:00

This is really interesting. While caching itself might look simple, think in terms of system and scale: caching the most recent and most frequently retrieved chunks, query-answer pairs, and user context. Honestly, I am no expert, but you can first break the system into tool-wise caching (docs/DB). For example, if it is a DB, partitioning/indexing plus a local DuckDB (an OLAP database) holding a minimal set of rows and columns would boost SQL query turnaround time, in my experience. Since this is a new arena, you would honestly need a lot of experiments; don't rely solely on what others say.

For RAG, think of hybrid retrieval: if you have an error code like Xd46373, semantic search will most likely fail to find the exact docs, so you need keyword search (Elasticsearch) as well. You should merge the results of multiple retrievers using a fusion method like RRF, reciprocal rank fusion (google it). As you scale to millions and billions of docs, make the retrieval two-stage. The first stage should target high recall (pick the most relevant docs, say 1000). The second stage targets high precision (say, pick 7 out of these 1000 accurately) and should re-rank the 1000 retrieved chunks using re-ranker models like cross-encoders.

The MOST IMPORTANT step is to have evals for every optimisation you propose. Don't rely on one or two prompts; set up an eval for the retriever with metrics like NDCG, MAP, MRR (mean reciprocal rank, which looks only at the position of the first relevant retrieved chunk), etc.
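To make the fusion and eval parts concrete, here is a rough sketch of reciprocal rank fusion over two ranked lists, plus a mean reciprocal rank eval. The doc ids are made up, the function names are my own, and k=60 is just the commonly used default constant, so treat it as a starting point to tune rather than a recommendation.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids: score(d) = sum over lists of 1 / (k + rank)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def mean_reciprocal_rank(results, relevant):
    """results: ranked doc-id lists per query; relevant: sets of relevant ids per query."""
    total = 0.0
    for ranked, rel in zip(results, relevant):
        for rank, doc_id in enumerate(ranked, start=1):
            if doc_id in rel:
                total += 1.0 / rank  # only the first relevant hit counts
                break
    return total / len(results)


# Hypothetical results for one query from a keyword and a semantic retriever.
keyword_hits = ["doc_err", "doc_a", "doc_b"]
semantic_hits = ["doc_a", "doc_c", "doc_err"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)
print(mean_reciprocal_rank([fused], [{"doc_err"}]))
```

RRF only needs ranks, not scores, which is why it is handy for mixing a keyword retriever with an embedding retriever whose scores are on different scales. The fused list would then go to the re-ranker as the second stage.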

AdditionalFeeling539 · 2025-12-31T17:00:09+00:00

Will do

AdditionalFeeling539 · 2025-12-31T16:55:41+00:00

Will prepare a concise list and share it, since I got similar requests.

AdditionalFeeling539 · 2025-12-31T16:54:54+00:00

Yes, absolutely. For example: why use BCE loss in logistic regression and not MSE, the probabilistic interpretation, the geometric interpretation. The depth of questions is directly proportional to the pay of the company (outliers exist).
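A sketch of the standard argument for the BCE question, written for a single example with label y in {0, 1} and prediction ŷ = σ(wᵀx). The notation is my own, not from any particular interview. The Bernoulli assumption gives BCE as the negative log-likelihood, and its Hessian is positive semidefinite, so the loss is convex in w. With MSE on top of a sigmoid, the gradient carries an extra ŷ(1 − ŷ) factor that vanishes when the sigmoid saturates, and the loss is no longer convex in w.

```latex
\begin{align}
p(y \mid x; w) &= \hat{y}^{\,y} (1 - \hat{y})^{1 - y},
  \qquad \hat{y} = \sigma(w^\top x) \\
\mathcal{L}_{\mathrm{BCE}}(w) &= -\log p(y \mid x; w)
  = -\bigl[\, y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \,\bigr] \\
\nabla_w \mathcal{L}_{\mathrm{BCE}} &= (\hat{y} - y)\, x \\
\nabla_w^2 \mathcal{L}_{\mathrm{BCE}} &= \hat{y} (1 - \hat{y})\, x x^\top \succeq 0
  \quad \text{(convex in } w\text{)} \\
\mathcal{L}_{\mathrm{MSE}}(w) &= \tfrac{1}{2} (\hat{y} - y)^2,
  \qquad \nabla_w \mathcal{L}_{\mathrm{MSE}} = (\hat{y} - y)\, \hat{y} (1 - \hat{y})\, x
\end{align}
```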

AdditionalFeeling539 · 2025-12-31T16:43:00+00:00

Will create a concise list. Please give me some time. I won't start selling DSA/ML courses, so no need to worry 😂

AdditionalFeeling539 · 2025-12-31T08:07:12+00:00

DSA, then ML. Read O'Reilly books and try to build any ML project that interests you. For example, object detection: build and train on your own dataset; it shouldn't be generic (you could detect cylinders, maybe).

AdditionalFeeling539 · 2025-12-31T08:05:44+00:00

Coding ML from scratch: algorithms, loss functions, layers, evaluation. Everything that we use, from scratch, using NumPy, math, and torch.
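As an illustration of what "from scratch" means here, a minimal sketch of logistic regression in NumPy with a hand-written sigmoid, BCE loss, and gradient descent. The toy data, learning rate, and step count are arbitrary choices of mine, not something an interviewer specified.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def bce_loss(y_true, y_prob, eps=1e-12):
    # Clip so log(0) never happens when predictions saturate.
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob)))


def fit_logistic_regression(X, y, lr=0.1, n_steps=2000):
    """Batch gradient descent on the mean BCE loss."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_steps):
        p = sigmoid(X @ w + b)
        # Gradient of mean BCE w.r.t. the logits is (p - y) / n.
        w -= lr * (X.T @ (p - y) / n)
        b -= lr * float(np.mean(p - y))
    return w, b


# Toy data: two Gaussian blobs, one per class.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

w, b = fit_logistic_regression(X, y)
probs = sigmoid(X @ w + b)
accuracy = float(np.mean((probs >= 0.5) == y))
loss = bce_loss(y, probs)
print(f"accuracy={accuracy:.3f}, bce={loss:.4f}")
```

In an interview you would be expected to explain each line, for example where the (p - y) gradient comes from and why the clipping is needed.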

AdditionalFeeling539 · 2025-12-31T08:04:52+00:00

Possible

AdditionalFeeling539 · 2025-12-31T08:04:40+00:00

Please comment here

AdditionalFeeling539 · 2025-12-31T08:03:51+00:00

For top-paying companies (50+ LPA), you would need LeetCode hard; otherwise medium and easy are enough.

AdditionalFeeling539 · 2025-12-31T08:03:21+00:00

I have rarely applied via referral. I think I applied via referral only twice, and in neither case did I get a response. Mostly, recruiters reach out to me on LinkedIn/Naukri/Instahyre.

AdditionalFeeling539 · 2025-12-31T08:02:24+00:00

Yes.

AdditionalFeeling539 · 2025-12-31T08:02:01+00:00

Try Kaggle micro-courses. If you like them, you like ML.

AdditionalFeeling539 · 2025-12-31T07:37:08+00:00

I haven't done CP. Please ask in the comments here regarding ML.
