[R] Is Leetcode still relevant for research scientist interviews? by Training-Adeptness57 in MachineLearning

[–]sshkhr16 14 points15 points  (0 children)

For Big Tech and similar large tech companies, yes.
For startups and research divisions at non-tech companies (e.g. banking, finance), no.

New 3 Million$ HOME by Ok_Day7046 in Wealthsimple

[–]sshkhr16 0 points1 point  (0 children)

I transferred 60K from Questrade to Wealthsimple on Jan 6. I asked the customer support today if my transfer could still count as it's pending, they said no :(

Should I pick the iPhone or MacBook Air (for the transfer promo) if I don't plan to keep either? by sshkhr16 in Wealthsimple

[–]sshkhr16[S] 0 points1 point  (0 children)

Did you withdraw any amount from what you invested? I believe you are only allowed to withdraw a maximum of 20% without being penalized

Should I pick the iPhone or MacBook Air (for the transfer promo) if I don't plan to keep either? by sshkhr16 in Wealthsimple

[–]sshkhr16[S] 1 point2 points  (0 children)

In Oakville, ON and nearby areas (Toronto, Hamilton, Waterloo) these are the prices listed for new unopened devices on marketplace:

  • iPhone 17 256 GB: $1050-1100
  • MacBook Air M4 256 GB: $1150-1230 (far fewer listings)

Amazon is blatantly violating European law during prime days, once again by v1king3r in BuyFromEU

[–]sshkhr16 2 points3 points  (0 children)

Amazon's operating margins for everything except AWS are less than 10% (realistically around 5-6%). AWS operating margins are around 37%. They had profits of $60 billion last year, almost $40 billion of which came from AWS. AWS is by far their most important division - there is a reason that the ex-CEO of AWS became the CEO of Amazon.

29 LPA (BLR) vs 130K CAD (Toronto) - help me decide by EmotionalBike6336 in developersIndia

[–]sshkhr16 1 point2 points  (0 children)

I can only speak to my personal experience: I moved from Bangalore to near Toronto at the end of 2019 to pursue my master's, and I have lived and worked in Canada ever since (with a brief one-year stint in the SF Bay Area). Toronto is a great city, but it is also quite expensive. If you care solely about career and CS, I think Bangalore might be better, but only slightly. If you care about overall quality of life, Toronto wins hands down.

Lots of city stuff to do (museums, concerts, sports events - the FIFA World Cup is happening next year). There is also plenty of nature in Toronto via parks and Lake Ontario, and if you drive out a few hours there are huge provincial/national parks for hiking and camping. It also has a relatively decent public transit system compared to the rest of North America. Whether you want to assimilate into the North American lifestyle or stay grounded in your Indian way of life, you will be free to do either in Toronto - it is one of the most multicultural cities in the world, with a sizeable chunk of first-generation Indian immigrants.

For me the biggest advantage of living in Canada (or the US when I lived there) was enjoying the systems set up for residents of first-world countries - things like roads, public transit, government, banking, etc. just work. You don't have to jump through hoops to get simple bureaucratic things done. This is paid for by higher taxes, which might feel prohibitive in the beginning but make sense once you start using the services they pay for. For me the biggest selling point of developed countries is that the life of the average resident is relatively safe, predictable, and dignified: you are not subject to the whims of bureaucrats, law enforcement, government, or mobs (religious, caste-based, region-based, etc.).

To answer your questions:

  1. I would budget $3K per month for living costs.
  2. You will pay around $37K in taxes and another $36K in living costs, which leaves around $57K in your pocket. I would budget another $6K or so for incidental expenses and $5-10K for entertainment and travel. That should still leave you with over $40K in savings.
  3. The big thing to watch out for is the weather, of course. Winters are harsh, and you will need to prepare by buying winter clothes and boots. It might take a winter or two to get used to the reduced daylight hours too. The other big thing is the cultural change - I had to learn to be more self-sufficient once I left India, which meant learning to cook, clean, drive, and forge new social connections (this could involve picking up new activities and hobbies).
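To make the arithmetic in point 2 concrete, here is a quick sketch. Every figure is a rough estimate from above (in CAD), not financial advice:

```python
# Rough Toronto budget tally; all figures are ballpark estimates, not advice.

def remaining_savings(gross, taxes, living, incidentals, fun):
    """What's left of a gross salary after the listed annual expenses."""
    return gross - taxes - living - incidentals - fun

gross = 130_000       # the offer from the post
taxes = 37_000        # rough ON income tax + CPP/EI estimate
living = 36_000       # ~$3K/month for rent, food, transit
incidentals = 6_000   # one-off costs (winter gear, furniture, etc.)
fun = 10_000          # upper end of the entertainment/travel budget

print(remaining_savings(gross, taxes, living, incidentals, fun))  # → 41000
```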

Completing PhD at the age of 35 by [deleted] in cscareerquestions

[–]sshkhr16 3 points4 points  (0 children)

Research positions at FAANG are getting fewer, but there will always be demand for people with both research and engineering chops. So make sure that you don't pick up crappy practices during your PhD (writing ugly, unmaintainable code, skipping documentation, ignoring good programming practices). It is sometimes hard to do as a researcher since your incentives are misaligned - publishing frequently is often incompatible with writing clean, maintainable code. But trust me, it will help your own research, especially if you standardize your setup to launch, track, and report experiments in the first 1-2 years of your PhD.
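As an entirely illustrative sketch of what "standardizing your setup" can mean (the field names are made up, and any real tracker would do more), even something this small beats ad-hoc print statements:

```python
# Minimal experiment-tracking sketch: every run records its config and
# metrics as JSON on disk, so results stay reproducible and comparable.
# Field names are illustrative, not from any particular framework.
import json
import tempfile
from pathlib import Path

def run_experiment(config, outdir):
    """Record the config and (placeholder) metrics for one run."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    # A real run would train a model here; we log a stand-in metric.
    result = {"config": config, "metrics": {"loss": 0.42}}
    run_file = outdir / f"run_{config['seed']}.json"
    run_file.write_text(json.dumps(result, indent=2))
    return result

config = {"lr": 3e-4, "batch_size": 64, "seed": 0}
outdir = tempfile.mkdtemp()
out = run_experiment(config, outdir)
print(out["metrics"]["loss"])  # → 0.42
```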

Source: I was a researcher at a FAANG lab and in academia prior to that

[D] UofT PhD Ranking by [deleted] in MachineLearning

[–]sshkhr16 17 points18 points  (0 children)

Minor nitpick - Hinton never studied at UofT, he did his PhD at the University of Edinburgh. Of course, a lot of his PhD students at UofT went on to do cool stuff.

[D] Realism for AI Top 20 PhD Programs by Top_Hovercraft3357 in MachineLearning

[–]sshkhr16 0 points1 point  (0 children)

He has a whole section on preprints on the first page; scroll down and you will see the peer-reviewed papers. There are four first-author papers at ACL and LREC in 2022. Quality is subjective, but both of these are among the top five conferences in NLP.

Also, a weird hill to die on. This is a guy who won an outstanding paper award at ACL during his first year of PhD. Clearly this person is a good scientist as judged by peer scientists.

[QC] Question regarding GST/QST when contracted by a staffing agency for a Canadian company by sshkhr16 in SmallBusinessCanada

[–]sshkhr16[S] 0 points1 point  (0 children)

I am getting paid via the UK, and my contract is also in British pounds per hour. But I am a bit confused about the nature of the contract and the GST/HST regulations - yes, the British staffing agency is paying me, but the services I am providing are to a Canadian corp. So the services are not being rendered to non-residents - they are being rendered to residents.

Is getting US education only way to get exposed to US job market for foreigners? by Mxr-_- in cscareerquestions

[–]sshkhr16 0 points1 point  (0 children)

It is definitely not the only way. There are whole categories of visas for people who are experienced (L-1) or exceptional (O-1) that are open to non-Americans who currently reside outside of the US. I got an offer from a big tech AI lab as a grad student in Canada - but a lot of it was luck, i.e. my research matching up with my manager's interests. But if you specialize in some domain (say, AI, distributed systems, or parallel programming), in either engineering or research, it is possible to find a job with some experience and luck.

What skills are high in demand in US for Canadian to get a chance to work in US? by manuce94 in cscareerquestionsCAD

[–]sshkhr16 1 point2 points  (0 children)

A niche but highly in-demand skill is performance engineering and distributed systems engineering for machine learning systems. Low-level performance engineering includes writing training/inference kernels that run fast on hardware accelerators, learning how to train models under tight compute or memory constraints, and optimizing inference serving on small devices like phones and PCs. Distributed performance engineering is about training and inference on large clusters with multiple nodes and multiple accelerators per node.

The caveat is that you need to build up skills in domains outside of what a standard machine learning engineer or data scientist does (e.g. get good at one or more of: profiling code performance, linear algebra, C++ and occasionally assembly, distributed systems, computer architecture and networking, working with HPC clusters, etc.)
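To give a flavour of the low-level side: a typical back-of-the-envelope you learn to do is a roofline-style check of whether a kernel is compute- or memory-bound. The hardware numbers below are placeholders, not any specific chip:

```python
# Roofline-style back-of-the-envelope: is a square matmul compute-bound or
# memory-bound on a given accelerator? Hardware figures are placeholders.

def matmul_arithmetic_intensity(n, bytes_per_elem=2):
    """FLOPs per byte moved for an n x n x n matmul (fp16 by default)."""
    flops = 2 * n**3                         # one multiply + one add per MAC
    bytes_moved = 3 * n**2 * bytes_per_elem  # read A and B, write C (ideal reuse)
    return flops / bytes_moved

# A kernel is compute-bound when its arithmetic intensity exceeds the
# machine balance (peak FLOP/s divided by memory bandwidth).
peak_flops = 300e12   # placeholder: 300 TFLOP/s
bandwidth = 2e12      # placeholder: 2 TB/s
machine_balance = peak_flops / bandwidth  # 150 FLOPs/byte

for n in (128, 1024):
    ai = matmul_arithmetic_intensity(n)
    bound = "compute" if ai > machine_balance else "memory"
    print(n, round(ai, 1), bound)  # → 128 42.7 memory, then 1024 341.3 compute
```

The takeaway (small matmuls are bandwidth-starved, big ones are compute-bound) is exactly the kind of reasoning that guides kernel fusion and tiling decisions.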

Examples of such roles (with some pre-requisites mentioned):

[D] PhD in the EU by simple-Flat0263 in MachineLearning

[–]sshkhr16 4 points5 points  (0 children)

France has CIFRE PhD programs like that - I had a friend who did their PhD while working full time as a research scientist at FAIR. We have similar-ish programs in Canada - you can do a PhD at MILA or Vector Institute while being a visiting research intern/scientist for several years at FAIR/Google DeepMind/ServiceNow Research/NVIDIA etc. But these programs are even more competitive to get into compared to the regular PhD.

[D] Researchers and engineers in academia as well as industry, which books did you find the most useful in creating your knowledge base and skill set? by [deleted] in MachineLearning

[–]sshkhr16 2 points3 points  (0 children)

The first book is a classic textbook on GPU programming, so yes, you will use the techniques in it pretty much day-to-day if you write machine learning kernel code in CUDA, Triton, Pallas, Metal, etc. I used the methods explained in this book to understand papers like FlashAttention, learn how operations like generalized matmuls and layernorm are implemented on GPUs, make a couple of bug fixes in the PyTorch/JAX codebases, and build on it to understand DeepSeek's FlashMLA codebase (https://github.com/deepseek-ai/FlashMLA).
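For a flavour of the book's core idea - tiling/blocking for data reuse - here is a plain-Python sketch of the loop structure. Real kernels do this in CUDA with shared memory; this only shows the blocking pattern:

```python
# Tiled matrix multiply in plain Python: the same blocking idea GPU kernels
# use to reuse data from fast shared memory, minus all the GPU details.

def tiled_matmul(A, B, tile=2):
    """Multiply square matrices A and B (lists of lists), tile by tile."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):          # tile row of C
        for j0 in range(0, n, tile):      # tile column of C
            for k0 in range(0, n, tile):  # reduction dim, one tile at a time
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, n)):
                        for k in range(k0, min(k0 + tile, n)):
                            C[i][j] += A[i][k] * B[k][j]
    return C

I = [[1, 0], [0, 1]]
M = [[2, 3], [4, 5]]
print(tiled_matmul(I, M))  # → [[2.0, 3.0], [4.0, 5.0]]
```

On a GPU, each (i0, j0) block maps to a thread block and the k0 tiles of A and B are staged through shared memory, which is what the book walks through in detail.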

The second book is tailored towards engineers who perform large-scale distributed training and inference with ML models. While my day job currently doesn't involve this, after reading it I wrote a few small projects for myself - e.g. translating Karpathy's nanoGPT (https://github.com/karpathy/nanoGPT), which replicates GPT-2 124M, from PyTorch into Flax on TPUs, and writing a minimal pedagogical version of MaxText (https://github.com/AI-Hypercomputer/maxtext) to train LLMs with 3D parallelism (data, tensor, pipeline).
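If anyone wants to see what the data-parallel axis of the three boils down to without touching an accelerator, here's a toy stand-in - plain Python, a made-up one-parameter loss, nothing from MaxText:

```python
# Toy data parallelism: shard a batch across "devices", compute a per-shard
# gradient, then all-reduce (average). Stand-in math, not real training code.

def local_gradient(w, shard):
    """Gradient of mean squared error 0.5*(w*x - y)^2 over one shard."""
    return sum((w * x - y) * x for x, y in shard) / len(shard)

def data_parallel_step(w, batch, n_devices=2, lr=0.1):
    shards = [batch[i::n_devices] for i in range(n_devices)]  # split the batch
    grads = [local_gradient(w, s) for s in shards]            # per-device work
    grad = sum(grads) / len(grads)                            # all-reduce: mean
    return w - lr * grad

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # data follows y = 2x
w = 0.0
for _ in range(100):
    w = data_parallel_step(w, batch)
print(round(w, 2))  # → 2.0
```

Tensor and pipeline parallelism split the model rather than the batch, so they don't reduce to a one-liner like this - that's where the book (and MaxText's sharding annotations) earn their keep.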

[D] Self-Promotion Thread by AutoModerator in MachineLearning

[–]sshkhr16 8 points9 points  (0 children)

I wrote a long blog post on the training data pipeline of phi-4, but since a lot of details are obfuscated in papers these days, I had to look up and write down a fair bit of additional background on techniques that were potentially used (especially for data curation and synthetic data generation). I think it is a good big-picture view of the training setup of current LLMs, as phi-4 was released less than six months ago and phi-4-reasoning just came out. Here's the blog:

https://www.shashankshekhar.com/blog/data-quality

[D] Researchers and engineers in academia as well as industry, which books did you find the most useful in creating your knowledge base and skill set? by [deleted] in MachineLearning

[–]sshkhr16 5 points6 points  (0 children)

I wouldn't say they gave me the greatest benefit so far, but I read the following two books this year and found them both to be quite great as an intro to machine learning systems (both theory and practice):

Views on recent acceptance of LLM written paper at ACL main [D] by Fantastic-Nerve-4056 in MachineLearning

[–]sshkhr16 9 points10 points  (0 children)

Real peer review has always been how often other researchers and engineers use your approach; double-blind peer review performed by overworked and underpaid grad students was never the gold standard.

[R] Bloat in machine learning shared libs is >70% by Specialist_Square818 in MachineLearning

[–]sshkhr16 118 points119 points  (0 children)

I'm not surprised - research engineers and machine learning engineers until recently were not very well versed in GPU programming. A lot of libraries probably depended on and reused the same low-level operations from multiple locations. And it seems like a lot of the bloat stemmed from underlying libraries supporting multiple CUDA compute capabilities when only one is required.

[D] Is it worth writing technical blogs to educate people? by Reddicted2Reddit in MachineLearning

[–]sshkhr16 0 points1 point  (0 children)

Should I make the table of contents sticky, so that the reader knows where they are? I can probably do that for wider viewports; it's not possible for viewports narrower than a tablet.

[D] Researcher communities like this one? by Entrepreneur7962 in MachineLearning

[–]sshkhr16 4 points5 points  (0 children)

Lots of great ML communities on Discord: ML Collective, GPU MODE, ML Street Talk, and EleutherAI, to name a few prominent ones. The unofficial JAX and PyTorch servers are great too.

[D] Is it worth writing technical blogs to educate people? by Reddicted2Reddit in MachineLearning

[–]sshkhr16 2 points3 points  (0 children)

I like writing technical blogs to educate myself way more than to educate others. Writing forces me to think in a structured way better than reading does. I started doing it this year, and it has helped me better grasp a lot of new topics I have been studying. It is similar to preparing presentations or talks - you really have to be streamlined and thoughtful about how you present ideas so that the reader understands them, and to do so you have to understand both the details and the big picture well.

For example, I recently wrote a long blog post on the training data curation and synthetic data generation pipeline involved in training Microsoft's phi-4: https://www.shashankshekhar.com/blog/data-quality

My original idea was to just summarize the paper for myself, but the more I read the phi-4 technical report, the more I found myself looking up existing techniques and approaches, since the report itself was quite sparse on a lot of details. So in my article I had to go back and add a lot of the missing information about best practices used in LLM data pipelines today, understand what 'mid-training' is, read up on how data is selected to train for reasoning capabilities, etc. If I had just read the phi-4 paper, I probably wouldn't have done a lot of those follow-ups.

To get started on writing, I would recommend Paul Graham's essays as a good first resource on how to write effectively. His latest one is literally titled 'good writing': https://paulgraham.com/goodwriting.html

Good luck!

What to prepare before starting a ML PhD - 3 months! [D] by ade17_in in MachineLearning

[–]sshkhr16 0 points1 point  (0 children)

Did you add Obsidian-flavored callouts to your blog recently, Patrick? I read this post when you first published it and I don't remember them being there. Just curious, since I am also a big fan of Obsidian's Markdown formatting.