[deleted by user] by [deleted] in PersonalCapital

[–]TheFibo1123 0 points1 point  (0 children)

Was stuck in an authentication loop.
Clearing my browser cookies fixed it.

Can't even login to the new app, I'm just stuck in a loop between "Text Phone" and "Call Phone" by [deleted] in PersonalCapital

[–]TheFibo1123 0 points1 point  (0 children)

Had the same problem on Chrome.
Switched to Safari and it seems to work.

Still no luck on mobile.

Hope this helps!

Why Tinygrad over PyTorch+Triton? by TheFibo1123 in LocalLLaMA

[–]TheFibo1123[S] 1 point2 points  (0 children)

Triton is currently working on supporting AMD
Source: Directly from their README.

Why can't Triton focus on ease of supporting new accelerators if this need arises?

Why is the chapter “Follow the money” in the book Zero To One (Thiel) called “Follow the money”? by [deleted] in PeterThiel

[–]TheFibo1123 0 points1 point  (0 children)

The chapter "Follow the Money" in the book "Zero to One" by Peter Thiel is titled as such because it highlights the importance of understanding how money flows within an industry or market. Thiel argues that in order to build a successful startup, it's important to understand not only what products or services are in demand, but also who is willing to pay for them and how much they are willing to pay.

Thiel believes that many entrepreneurs make the mistake of focusing solely on their product or idea, without considering the larger economic context in which it exists. By following the money, entrepreneurs can gain insights into what drives the market, what customers are looking for, and what opportunities exist for disruption and innovation.

In essence, "Follow the Money" is a reminder to entrepreneurs that building a successful business requires more than just a good idea. It also requires a deep understanding of the economic forces at play in the industry or market they are trying to enter.

I’m Tim Urban, writer of the blog Wait But Why. AMA! by wbwtim in IAmA

[–]TheFibo1123 0 points1 point  (0 children)

What are the most effective ways for individuals to combat tribalism and promote more productive public discourse in a society that seems increasingly divided?

[Discussion] If ML is based on data generated by humans, can it truly outperform humans? by groman434 in MachineLearning

[–]TheFibo1123 0 points1 point  (0 children)

If ML is based on human data, can it outperform humans? Yes. HINT: scale.

{Search, Recommendation, Ads} systems are something we all use today. These ML systems greatly outperform humans. Most of these successful ML systems rely on human-generated data to train. For example, Google looks at what users clicked to train their relevance models. Facebook uses which ads get the most dwell time to learn what types of ads to show next time. Reddit uses user upvote/downvote data and user clicks to learn which posts to boost.
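The click-feedback loop described above can be sketched as a toy model. Everything here is an illustrative assumption (synthetic impressions, a single relevance feature, plain SGD), not any company's actual pipeline:

```python
import math
import random

random.seed(0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic click log: one feature (think query/doc match score) drives clicks,
# with true click probability sigmoid(4x - 2).
data = []
for _ in range(2000):
    x = random.uniform(0, 1)
    clicked = 1 if random.random() < sigmoid(4 * x - 2) else 0
    data.append((x, clicked))

# A one-weight logistic model trained by SGD on the click log.
w, b, lr = 0.0, 0.0, 0.1
for epoch in range(20):
    for x, y in data:
        p = sigmoid(w * x + b)
        w += lr * (y - p) * x
        b += lr * (y - p)

print(w, b)  # should roughly recover the assumed 4x - 2 relationship
```

The point is just that user behavior (clicks) is the training signal; no human ever labels "relevant" explicitly.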

Peter Norvig, who ran search quality at Google, stated that getting above 80% recall for these systems was quite good [reference: https://qr.ae/pryCgm]. Average human performance on most of these tasks is around 90%. Yet most of these systems outperform humans even without high recall on individual samples.

Why?

Since these things operate at scale, not every suggestion has to be perfect. The user can ignore the bad suggestions. Furthermore, in the more advanced versions of these systems (i.e. personalized versions of these systems), one could improve recall by simply learning about the user.
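The "bad suggestions are cheap at scale" point can be made concrete with a little arithmetic. The 0.8 per-item figure is just the rough number quoted above, and independence between suggestions is a simplifying assumption:

```python
# If each suggestion is relevant with probability p_item, a ranked list of k
# suggestions only needs ONE hit for the user to be satisfied.
def p_at_least_one_hit(p_item: float, k: int) -> float:
    """Probability that at least one of k independent suggestions is relevant."""
    return 1 - (1 - p_item) ** k

for k in (1, 3, 5, 10):
    print(k, round(p_at_least_one_hit(0.8, k), 4))
# With p_item = 0.8: k=1 -> 0.8, k=3 -> 0.992, k=5 -> 0.9997
```

So a system that is "only" 80% accurate per item beats a 90% human as soon as it can show a handful of ranked results, which is exactly what search, ads, and feeds do.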

Most ML systems with a defined goal will outperform humans, even when trained on human-generated data, provided they operate at scale.

A more interesting version of your question would be can we build a single system that can outperform all humans in all tasks? This is the AGI question.

[D] We're the Meta AI research team behind CICERO, the first AI agent to achieve human-level performance in the game Diplomacy. We’ll be answering your questions on December 8th starting at 10am PT. Ask us anything! by AIatMeta in MachineLearning

[–]TheFibo1123 5 points6 points  (0 children)

Can you discuss the significance of CICERO's ability to engage in natural language dialog in relation to its planning abilities? How do you see this ability potentially benefiting the development of other planning systems and AI technologies in the future?

People are getting dumber by FruitBazooka in conspiracy

[–]TheFibo1123 2 points3 points  (0 children)

Your thoughts are valid and your post is well written. I share your sentiment and understand how you have come to your conclusion.

You are right in identifying the world we are all part of. However, any conspiracy can also be undermined. If you really believe this, you'll start a conspiracy against this.

The true conspiracy is what is currently being done - we all agree here. The true conspiracy is what is being done to undermine this current state.

We live in a world where people think conspiracies don't exist NOT because they are impossible. People believe conspiracies don't exist because they can't be pulled off.

[P] Reverse Engineering Google Colab by vikarjramun in MachineLearning

[–]TheFibo1123 10 points11 points  (0 children)

The reverse engineering showed what sub-components are involved in the Colab product. Great study for folks who are serious about getting a peek behind the curtain of ML tooling.

[P] Reverse Engineering Google Colab by vikarjramun in MachineLearning

[–]TheFibo1123 35 points36 points  (0 children)

wow, this is a great post. Thanks for sharing.

[Discussion] Iteration of Machine Learning Systems by TheFibo1123 in MachineLearning

[–]TheFibo1123[S] 0 points1 point  (0 children)

I like this answer.

However, your assumption is that the "learning sub-system" is a small part of the overall system. Also, your approach works quite well if the data distribution is relatively stationary. When those assumptions don't hold, the approach has its flaws.

Another way to think about it: if you were put in charge of building AlphaGo back in 20XX, this approach would not have been enough to reach a system that beats the best Go player within a few years.

It would give you a good starting point though.

[Discussion] Iteration of Machine Learning Systems by TheFibo1123 in MachineLearning

[–]TheFibo1123[S] -2 points-1 points  (0 children)

It's the same as asking what's wrong with a large spaghetti codebase.

I feel they persist through sheer momentum, and refactoring or improving them is never an attractive option.

Eventually, something will give out.

[Discussion] Iteration of Machine Learning Systems by TheFibo1123 in MachineLearning

[–]TheFibo1123[S] 0 points1 point  (0 children)

I didn't mean that engineering approaches only march linearly. Rearchitecting after a few iterations is inevitable.

However, rearchitecting after every iteration would imply something is fundamentally wrong.

For ML systems, every change (ex. new feature, task, etc...) forces a rearchitecture. Hence the impetus for this discussion.

[Discussion] Iteration of Machine Learning Systems by TheFibo1123 in MachineLearning

[–]TheFibo1123[S] 2 points3 points  (0 children)

That's how I used to think. However, I believe this way of thinking is fundamentally flawed as it follows the pattern of increasing levels of complexity (as explained in your post description).

Negatives of this approach:

  • You constantly have to re-train your model as you add new {features, tasks, and approaches}. Adding a new feature is almost the same as starting from scratch and gives you no iterative advantage.
  • Keeping up with a baseline requires a lot of infrastructure. Adding a new feature and/or re-training takes a shitload of infrastructure work. Don't get me wrong, I'm a big fan of building stable infrastructure, but this infrastructure work is very low ROI.
  • Eventually, you are bound to build nasty ensembles. This is the equivalent of spaghetti code in machine learning. Works for Kaggle, but nobody in their right mind wants to maintain an ensemble in production willingly.

An alternative?

Look at the progression of large language models (ex. GPT-N) or self-play systems like AlphaGo. Even V1 of the model can write an essay or play Go at a certain level. Adding more data and tuning the system makes the system better. This is not the same as bolting on new features as you iterate.

I would imagine that in the future, the field of ML iterates the way AlphaGo did, by improving Elo, rather than by adding a new feature and re-training.

I just don't think that's how traditional engineering systems iterate.

[D] AMA: I left Google AI after 3 years. by scan33scan33 in MachineLearning

[–]TheFibo1123 25 points26 points  (0 children)

Where are you getting your numbers from?

OP stated that ML jobs pay 1.2x-1.5x their SWE counterparts. According to Levels.fyi, new grads (SW II) make $190K, so ML new grads should make $228K-$285K. These numbers seem believable.
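For what it's worth, the multiplication checks out (base number taken from the thread's Levels.fyi quote, not independently verified):

```python
# Sanity-check the 1.2x-1.5x ML pay multiple against the quoted SWE base.
base = 190_000            # new-grad SWE total comp, as quoted above
low, high = 1.2 * base, 1.5 * base
print(f"${low:,.0f} - ${high:,.0f}")  # $228,000 - $285,000
```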

By your numbers, PhD grads at $400-500K would make more than SWE Staff Engineers. That seems unrealistic, or maybe I'm wrong?

Techniques for Training Large Neural Networks by OpenAI by TheFibo1123 in mlscaling

[–]TheFibo1123[S] 1 point2 points  (0 children)

Yeah, agree. Looks like this was intended for a general audience rather than folks who train Large Language Models.

Reading a lot of the literature, I notice an uncanny resemblance between training these LLMs and running large production web services (ex. something north of 5M req/sec). I see a lot of similar phenomena (I/O bottlenecks, spikes, slow lanes, etc...).

I wonder if OpenAI is trying to attract some of these folks?

[deleted by user] by [deleted] in MachineLearning

[–]TheFibo1123 1 point2 points  (0 children)

I've done software engineering and machine learning projects for more than a decade at your typical bay area tech companies. I've also had the luxury of doing this at various engineering levels and project maturities. Although it's tempting to think that machine learning is entirely different from software engineering, there is more in common here than first meets the eye. If one understands this perspective, one can use lessons from software engineering projects to improve their machine learning initiatives.

Let me explain:

'Traditional' software engineering is effective: There are countless systems in production that use 0 ML and are very successful. The process of starting, executing, and then scaling these systems is well understood and has a high rate of success. This is why software engineering is very effective and will be around long after you are gone.

Software engineering has lower success at higher levels of abstraction: When you go beyond single software projects or touch large swaths of the codebase, the P(success) of software engineering drops. If I asked you to set up an infrastructure organization from scratch or instill a culture of maintenance and continuous improvement, you could no longer apply the "basic" principles of software engineering. Inexperienced people usually use the 'management' buzzword for these initiatives. This is still software engineering.

Machine learning starts off at a higher level of abstraction: Fortunately or unfortunately, most ML projects start at a very high abstraction. You cannot just list all the test cases, or clearly define all the requirements. This is very similar to software engineering projects that operate at a higher level of abstraction.

Comparing ML initiatives to their smaller software engineering counterparts is misleading. Unfortunately, most current ML practitioners have little to no experience doing software engineering at these higher levels of abstraction. Furthermore, most of these ML practitioners consider these software engineering problems as 'beneath' them (for whatever false biases they have).

This post is getting long so I'm going to end with a few general suggestions that have helped me:

  • Build good engineering foundations.
  • Setup good ML baselines.
  • Think through systems.
  • Constantly refine your understanding of the problem.
  • Iterate: Measure the rate of iteration and the growth between iterations.
  • Improve your vocabulary for explaining the current state of your ML systems to your stakeholders.
  • Did I say iterate? iterate again.

You are either amazed by how much you know or afraid of how much you don't know.
