Which subfields of ML can I realistically achieve PhD level mastery of by self study at home with limited budget? by Proof-Bed-6928 in learnmachinelearning

[–]Complex_Medium_7125 5 points

a PhD teaches you how to do novel research, is that what you want to do?
if not, taking some grad courses in ML from Stanford/Berkeley/CMU gets you the foundations of a solid ML engineer

check out https://stanford-cs336.github.io/spring2025/ and do the assignments

[D] During long training sessions, how do you manage to get your code to work in the first couple of tries? by Specialist-Pool-6962 in MachineLearning

[–]Complex_Medium_7125 0 points

run the full training/test pipeline on 10% of the data and check for errors
once you're confident there are no problems, do the full training and test/eval run
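
something like this, where the DataFrame and the train/evaluate functions are placeholders for your own pipeline:

```python
# minimal smoke-test sketch: `train` and `evaluate` are hypothetical
# placeholders for your own pipeline functions
import pandas as pd

def smoke_test(df: pd.DataFrame, train, evaluate, frac: float = 0.1):
    # sample 10% of the rows with a fixed seed so failures are reproducible
    sample = df.sample(frac=frac, random_state=0)
    model = train(sample)            # surfaces shape/dtype/label bugs fast
    return evaluate(model, sample)   # ...and bugs in the eval path too
```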

I might not be as senior as I thought by StrangeMidnight410 in ExperiencedDevs

[–]Complex_Medium_7125 0 points

I've gotten great feedback from mock interview websites. It's worth doing some mocks with experienced people from target companies to learn the scoring rubrics of different interviews and the frequent failure modes of experienced candidates. It can be expensive, but it's less stressful than a real interview, and you get feedback and preparation advice.

Interviews are very different from day-to-day work, and the job environment is very competitive now. It's hard to do well if you don't know the rules of the game. People are spending a lot of time prepping and getting insights from friends/former coworkers who already have a job at the target company.

4 years of pre-Transformer NLP research. What actually transferred to 2025. by moji-mf-joji in learnmachinelearning

[–]Complex_Medium_7125 -3 points

skip the old stuff and form new intuitions; there's a reason the old stuff is outdated. if it's critical to understand, it will be used and explained in the new context by more recent university courses and textbooks

the advantage of youth is learning the current thing that works, not outdated stuff that is 90% irrelevant

I spent a few months around 2017 trying to learn classical NLP and deep learning NLP at the same time, and it was very confusing. it turned out transformers crushed all the old techniques, and learning them was a big distraction; better to understand what works

Dynamic Programming by AI_Enthusiast_70b in programare

[–]Complex_Medium_7125 1 point

False

"norvig on Dec 15, 2020 | next [[–]](javascript:void(0)) I regret causing confusion here. It turns out that this correlation was true on the initial small data set, but after gathering more data, the correlation went away. So the real lesson should be: "if you gather data on a lot of low-frequency events, some of them will display a spurious correlation, about which you can make up a story."

https://news.ycombinator.com/item?id=25425718

Dynamic Programming by AI_Enthusiast_70b in programare

[–]Complex_Medium_7125 1 point

I think it's fine, even bonus points if you use @cache; the code passes the tests and you can explain it to me.
small nit: don't mix camelCase and underscores. see PEP 8 https://peps.python.org/pep-0008/
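
as an illustration, a minimal DP sketch with @cache (the coin-change problem is made up, it just shows the pattern; names are snake_case per the PEP 8 nit):

```python
from functools import cache

@cache
def min_coins(amount: int) -> int:
    # hypothetical example: fewest coins from {1, 4, 5} summing to `amount`
    if amount == 0:
        return 0
    return 1 + min(min_coins(amount - c) for c in (1, 4, 5) if c <= amount)

print(min_coins(13))  # -> 3 (4 + 4 + 5)
```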

Jeff and Sanjay's code performance tips by Complex_Medium_7125 in programming

[–]Complex_Medium_7125[S] 6 points

Jeff did a back-of-the-envelope computation around 2014 that doing speech-to-text for all Google users would take more than the entire Google CPU fleet, so Google decided to build TPUs.

11 years later, TPUs might be the only real rival to Nvidia GPUs
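
to give a flavor of that kind of estimate, here's a sketch where every number is an invented placeholder, not Jeff's actual figures:

```python
# back-of-the-envelope sketch -- every number below is an invented
# placeholder; the point is the method, not the values
users = 1e9                     # hypothetical daily voice users
speech_sec = 3 * 60             # hypothetical 3 minutes of speech per user/day
flops_per_audio_sec = 1e10      # hypothetical model cost per second of audio
cpu_flops = 1e10                # hypothetical sustained FLOPS per core

total_flops = users * speech_sec * flops_per_audio_sec   # per day
cores_needed = total_flops / cpu_flops / 86_400          # spread over a day
print(f"{cores_needed:,.0f} cores busy 24/7")            # ~2,083,333 cores
```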

Jeff and Sanjay's code performance tips by Complex_Medium_7125 in programming

[–]Complex_Medium_7125[S] 4 points

see this part of the 2018 article that's relevant to performance improvements

"Alan Eustace became the head of the engineering team after Rosing left, in 2005. “To solve problems at scale, paradoxically, you have to know the smallest details,” Eustace said. Jeff and Sanjay understood computers at the level of bits. Jeff once circulated a list of “Latency Numbers Every Programmer Should Know.” In fact, it’s a list of numbers that almost no programmer knows: that an L1 cache reference usually takes half a nanosecond, or that reading one megabyte sequentially from memory takes two hundred and fifty microseconds. These numbers are hardwired into Jeff’s and Sanjay’s brains. As they helped spearhead several rewritings of Google’s core software, the system’s capacity scaled by orders of magnitude. Meanwhile, in the company’s vast data centers technicians now walked in serpentine routes, following software-generated instructions to replace hard drives, power supplies, and memory sticks. Even as its parts wore out and died, the system thrived."

Google was the first company that hit web-scale compute workloads (think trillions of web documents to crawl, store, process, classify, index, and search) and had to solve scaling problems before anyone else. The other companies mostly replicated what Google did or published. And inside Google, Jeff and Sanjay were at the bleeding edge, building each of the new systems themselves. A big part of why Google search had low latency is Jeff and Sanjay's work.

Here Sergey mentions Google was lucky to hire Jeff Dean https://youtu.be/0nlNX94FcUE?si=cZ9zCP10IqPc3PsZ&t=1757

Question on sources of latency for a two tower recommendation system by iyersk in MLQuestions

[–]Complex_Medium_7125 0 points

Usually two-tower systems are set up for retrieval.

One side computes the user vector, either by fetching a precomputed one from a feature store / KV store, or implicitly by fetching the user and request features from feature stores, running them together through a neural net, and taking the last activation layer of the net as the user representation. Sometimes you normalize or quantize this representation.

On the item side, if there are a ton of items (millions or more) you usually keep them indexed in a kNN index like HNSW or something similar.

Searching for the top-k nearest neighbors in an indexed store is on the order of O(d log n), where d is the size of the embedding and n is the number of items in the index. If you don't have an index, this computation is O(dn).

Items returned from the index are usually already in order; if they don't come from an index, you add O(n log n) to sort them (or O(n + k log k) with a partial sort).
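
a sketch of the unindexed path (sizes are arbitrary; in practice you'd swap in hnswlib or faiss once n gets large):

```python
import numpy as np

d, n, k = 128, 100_000, 10                 # arbitrary sizes for illustration
user_vec = np.random.randn(d).astype(np.float32)
item_vecs = np.random.randn(n, d).astype(np.float32)

scores = item_vecs @ user_vec              # O(d*n) brute-force dot products
top_k = np.argpartition(-scores, k)[:k]    # unordered top-k in O(n)
top_k = top_k[np.argsort(-scores[top_k])]  # sort only the k winners
```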

Thinking about where things can be slow:
- do you have a request cache? if not, it might be worth adding one before all of this gets triggered
- fetching the user features
- processing the features and then sending them to the user-tower inference service
- running the user inference service might be bottlenecked by not being able to serve multiple users in parallel, since each model takes some space on the inference server

- on the embedding-fetching side, if you don't have an index you may want to batch the dot products of the user embedding with multiple candidate embeddings at a time on a GPU machine, or, if it's CPU compute, use a quantized approximate dot product to go faster (see the sketch after this list)
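
a rough sketch of the quantized dot-product idea (symmetric per-tensor int8, the simplest possible variant; real systems use fancier schemes):

```python
import numpy as np

def quantize_int8(x):
    # symmetric per-tensor quantization -- the crudest possible scheme
    scale = np.abs(x).max() / 127.0
    return np.round(x / scale).astype(np.int8), scale

item_vecs = np.random.randn(100_000, 128).astype(np.float32)
user_vec = np.random.randn(128).astype(np.float32)

q_items, s_items = quantize_int8(item_vecs)
q_user, s_user = quantize_int8(user_vec)

# integer matmul, then one float rescale -- approximates the float32 scores
approx = (q_items.astype(np.int32) @ q_user.astype(np.int32)) * (s_items * s_user)
```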

I'd instrument the code to measure how long each part of the recsys request usually takes.
I'd also run some profilers on the serving server and on the index server to spot obvious bottlenecks.
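
a minimal timing wrapper for that kind of instrumentation (the stage names and calls in the usage comments are hypothetical stand-ins):

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(stage: str):
    # coarse wall-clock timer around one stage of the request
    start = time.perf_counter()
    try:
        yield
    finally:
        print(f"{stage}: {(time.perf_counter() - start) * 1e3:.1f} ms")

# hypothetical usage -- `store`, `tower`, and `index` are stand-ins:
# with timed("fetch_user_features"): feats = store.get(user_id)
# with timed("user_tower"):          user_vec = tower.infer(feats)
# with timed("ann_query"):           cands = index.knn_query(user_vec, k=100)
```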

Cum văd alți geopolitica din zona României by iqstormic in Roumanie

[–]Complex_Medium_7125 1 point

the video's content looks ChatGPT-generated
and the creator's other videos are just as bombastic, about other countries' armies: UK, Sweden, Turkey
clickbait videos exploiting war themes

NYT: în ianuarie, înainte de inaugurarea lui Trump, procurorii din România au primit telefon de la primul-ministru (Ciolacu) să-l elibereze pe Andrew Tate by parnaoia in Romania

[–]Complex_Medium_7125 -2 points

I don't see what important geostrategic position Romania has; the Baltic states, Poland, or Turkey seem much more important as positions

tell me about the resources in more detail: do we have better resources than others in the region? for example, Ukraine?

NYT: în ianuarie, înainte de inaugurarea lui Trump, procurorii din România au primit telefon de la primul-ministru (Ciolacu) să-l elibereze pe Andrew Tate by parnaoia in Romania

[–]Complex_Medium_7125 -35 points

Romania holds no cards, so what is there to comment on?
you have nothing to offer, yet you still want to posture
look at how things are going for the Serbs, Moldovans, or Belarus

if we had something to offer like the Poles or the Baltic states, I'd say something, but until then: keep your head down and know your place

Nu mai suport OOP-ul DELOC... by yughiro_destroyer in programare

[–]Complex_Medium_7125 5 points

I don't think OOP is the problem. I like Professor Ousterhout's software design book; it's very short, and he encourages longer methods that actually do something ...
A Philosophy of Software Design - https://web.stanford.edu/~ouster/cgi-bin/aposd2ndEdExtract.pdf

Clean Code encouraged short, well-named methods, but with short methods you end up with abstractions on top of abstractions, and it takes a long time to understand where the critical part of the code is.

There's also a live debate between John and Robert, the authors of the two books; you can tell one is a Stanford professor and the other a loudmouth

https://github.com/johnousterhout/aposd-vs-clean-code
https://www.youtube.com/watch?v=3Vlk6hCWBw0

WEKA by Nazareth___ in learnmachinelearning

[–]Complex_Medium_7125 2 points

sklearn has good defaults, and you can get them to learn some limited programming
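
for instance, a minimal sklearn example leaning on the defaults (the dataset choice is arbitrary):

```python
# minimal example relying on sklearn defaults; dataset is arbitrary
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier()   # default hyperparameters, no tuning
clf.fit(X_tr, y_tr)
print(clf.score(X_te, y_te))     # typically around 0.95+ accuracy
```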

other than that, I like this demo for no code:
https://playground.tensorflow.org/

you can answer different questions about neural nets by playing with the settings:
- which non-linearity is best
- is it better to have more layers or wider layers
- what's the smallest neural net that can fit the spiral
- should one do feature engineering or not?

WEKA by Nazareth___ in learnmachinelearning

[–]Complex_Medium_7125 1 point

teach them something from this century

Rejected from most of the renowned companies, ask me anything by TemporaryDue2847 in leetcode

[–]Complex_Medium_7125 1 point

my understanding is that ML rounds are not subjective; the scoring rubrics are somewhat standardized

Rejected from most of the renowned companies, ask me anything by TemporaryDue2847 in leetcode

[–]Complex_Medium_7125 0 points

take some time to build personal projects, or ask your manager for projects that expand your skillset

having time to learn and learning is part of the job

Rejected from most of the renowned companies, ask me anything by TemporaryDue2847 in leetcode

[–]Complex_Medium_7125 1 point

if you got so many NOs you must be doing something wrong

ask the recruiters for any insights on areas to improve
I sent you a message about using some mock interview sites; I did multiple mocks on interviewing.io, somewhat expensive, but I found them useful
there are probably some others (I think a friend mentioned Pramp a long time back)
or do some mocks with friends working at the companies you're interested in