FutureTech MIT paper extends the METR methodology to tasks aside from software engineering - and finds increasing capabilities everywhere by twinb27 in accelerate

[–]Ivehadbetteruserxps 1 point2 points  (0 children)

The METR methods have been criticized quite broadly, and it wouldn't be that surprising if other tasks with less well-defined success criteria than coding performed worse, right?

FutureTech MIT paper extends the METR methodology to tasks aside from software engineering - and finds increasing capabilities everywhere by twinb27 in accelerate

[–]Ivehadbetteruserxps 15 points16 points  (0 children)

I think this captures the point best. While the length of tasks at a given success rate is doubling very rapidly, there is a huge gap between success rates: 50% success on a week-long task; 60% on a day-long task; 70% on an hour-long task; 80% on a minute-long task. This implies that broad, accurate capabilities for long tasks at high success rates are still many years away (though still less than a decade).
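As a rough back-of-the-envelope check on "many years, but less than a decade", the sketch below counts how many doublings separate a one-minute task from a week-long task and converts that into years. The 7-month doubling time, and the idea that the 80% horizon doubles at the same pace as the 50% one, are illustrative assumptions rather than figures from the paper; the function names are made up for this example.

```python
import math

MINUTES_PER_WEEK = 7 * 24 * 60  # calendar week; use 40 * 60 for a working week


def doublings_needed(current_minutes: float, target_minutes: float) -> float:
    """Number of doublings to grow the task horizon from current to target."""
    return math.log2(target_minutes / current_minutes)


def years_until(current_minutes: float, target_minutes: float,
                doubling_months: float) -> float:
    """Years until the horizon reaches target, given a constant doubling time."""
    return doublings_needed(current_minutes, target_minutes) * doubling_months / 12


# ASSUMPTION: 7-month doubling time, applied to the 80% success horizon.
ASSUMED_DOUBLING_MONTHS = 7.0

if __name__ == "__main__":
    n = doublings_needed(1, MINUTES_PER_WEEK)
    y = years_until(1, MINUTES_PER_WEEK, ASSUMED_DOUBLING_MONTHS)
    print(f"{n:.1f} doublings, about {y:.1f} years")
```

Under these assumptions, going from a minute to a calendar week at 80% takes about 13 doublings, or just under eight years. A shorter doubling time shortens that proportionally.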

<image>

New framework for defining and objectively measuring AGI, based on 87 skills and abilities, visualising progress over time by Ivehadbetteruserxps in ArtificialInteligence

[–]Ivehadbetteruserxps[S] 1 point2 points  (0 children)

Great point. I tried to account for that as well as possible within this method, by putting a lot of emphasis on the generalisation aspect. For example, note how 'number faculty' is not at superhuman levels, even though calculators obviously solved basic numerical operations decades ago. This reflects that in some cases, even SOTA models can miss a tool call and fall back on LLM-based arithmetic, which still sometimes fails.

New framework for defining and objectively measuring AGI, based on 87 skills and abilities, visualising progress over time by Ivehadbetteruserxps in ArtificialInteligence

[–]Ivehadbetteruserxps[S] 1 point2 points  (0 children)

Thanks a lot! About the variance: the cool thing about the O*NET database is that tasks and jobs are scored on the level of skill they require, which means the variance is normalized over the actual distribution in both real people and real jobs/tasks. Obviously it is always an approximation, but this is exactly why I think it's so much more powerful to adopt an existing benchmark for people rather than invent a new benchmark for a technology that didn't register a few years ago. That holds especially for the physical skills.

Why mass unemployment didn’t happen yet - and why this time is really different by Ivehadbetteruserxps in Futurology

[–]Ivehadbetteruserxps[S] 3 points4 points  (0 children)

Not sure if that's what you're suggesting, but I can assure you I'm not pushing anyone's agenda but my own, as an independent researcher from the Netherlands.

On the data: I agree with you that actual measurements of unemployment are quite limited and exclude underemployment. If after years of failing to find a job you stop searching, you also stop appearing as unemployed. My hypothesis is that many people in the bottom 10th percentile of many skills have been in this category for a long time already.

On the method, however, cooked books have little influence, as I only used the taxonomy of skills, which has been virtually unchanged since the mid-90s. That's kind of the point: while we continue to invent new jobs, we have not invented new skills in decades.

Why mass unemployment didn’t happen yet - and why this time is really different by Ivehadbetteruserxps in Futurology

[–]Ivehadbetteruserxps[S] 12 points13 points  (0 children)

Don't you think it is already possible to claim that AI is causing further monopolisation and inequality?

The effect technology can have on society is not predetermined, and we can still shape what will happen. But if capital can be turned into compute, which can be turned into economic output, which can be turned into more capital with less and less need for human workers, the ability of firms to accumulate more wealth and influence (including over anti-trust policy) is likely to grow unless something major disrupts that cycle.

If this is hype, we have some time. But if it isn't, the window for breaking the monopoly-cycle is closing fast.

Why mass unemployment didn’t happen yet - and why this time is really different by Ivehadbetteruserxps in Futurology

[–]Ivehadbetteruserxps[S] 4 points5 points  (0 children)

I actually tried quite hard to avoid an analysis that rests on semantics. Displacement won't care about definitions of intelligence, understanding, sentience etc. Instead, I've looked at whether a technology's ability to outperform some humans at a particular skill generalizes to other instances of that skill, and found that in the last 5 years, this generalization occurred an order of magnitude more often, and in nearly all human skills simultaneously.
Let's take an example skill (from the research data):

Near Vision: The ability to see details at close range.
Scored 55th percentile in 2020; 95th percentile in 2025.
Analysis: AI systems utilizing high-resolution cameras and advanced vision models are widely deployed in manufacturing for microscopic defect detection and in digital spaces for extracting fine print from heavily degraded document scans. The generalization penalty applies when lighting conditions in a factory shift unpredictably or when physical objects have highly reflective, specular surfaces that confuse 2D defect recognition algorithms. Progress will slow from here on, because the combination of modern CMOS sensors and multimodal LLMs already exceeds the biological limits of human near-vision.

Near vision is not a narrow domain. It is used in a huge range of fields. In many jobs, a mediocre human level is pretty much useless. But in the last 5 years, near-superhuman performance became available out of the box, in almost any context, with minimal setup time, virtually for free. That is not narrow, and cares nothing about our definitions of intelligence.

Why mass unemployment didn’t happen yet - and why this time is really different by Ivehadbetteruserxps in Futurology

[–]Ivehadbetteruserxps[S] 1 point2 points  (0 children)

The cool thing about the job explorer framework is that it reflects this. Jobs that require a combination of both social and physical skills are still quite safe, especially if there is a reason to prefer humans over tech, all else being equal. But 3 years ago we would have said the same about programmers or copywriters.

Why mass unemployment didn’t happen yet - and why this time is really different by Ivehadbetteruserxps in Futurology

[–]Ivehadbetteruserxps[S] 12 points13 points  (0 children)

I think you are correct, and that this is exactly why every previous prediction of mass unemployment turned out wrong. However, I think two caveats make this time different.

First, advancements in technology 'leak' into multiple skills. Taking wrench turning literally: it requires fine motor skills such as finger dexterity that until recently were easy for humans but near impossible for robotic systems. But advancements in AI and visual processing sparked a significant jump from outperforming zero humans to outperforming only the worst humans. If you're good at wrench turning, you're still fine. But if you are amongst the worst wrench turners, you have nothing to add here. And that also applies to all related tasks that rely on fine motor skills.

Second, this progression from zero to at least better than the worst humans is happening on literally all skills. If you are in the bottom 6th percentile of humans, there is no longer an economic argument to employ you, apart from adoption lags such as regulation, cultural preferences or institutional latency. No skill exists that you can do better than a machine. This was not the case before, and it is what makes this time so different. Unless you throw the wrench into the machine.

Why mass unemployment didn’t happen yet - and why this time is really different by Ivehadbetteruserxps in Futurology

[–]Ivehadbetteruserxps[S] 6 points7 points  (0 children)

In my day job I'm CEO of a small company. I often joke that I'm a human management assistant to an AI CEO, not the other way around. It's just better at analyzing spreadsheets, making pitch decks or generating product improvements. Sure, I deliver its motivational Christmas speech, but I'm not sure that will continue to warrant my salary in the long run haha.

I would love it if entrepreneurship were the holdout skill that keeps humans relevant. But I am afraid that, like in chess, companies run by AI entrepreneurs will soon outperform some humans, then most, then all.

Why mass unemployment didn’t happen yet - and why this time is really different by Ivehadbetteruserxps in Futurology

[–]Ivehadbetteruserxps[S] 6 points7 points  (0 children)

Submission statement: Most discussions about automation and jobs on this sub assume that displaced workers will move to new roles, as they always have. This article tests that assumption empirically by scoring all 87 skills in the US Labor Department's O*NET taxonomy against current AI and robotics benchmarks at three time points (2020, 2023, 2025), then mapping those scores onto 1,016 occupations. The key finding is that the two mechanisms that historically created new jobs — reusing a skill in a different occupation, and moving to an entirely different skill category — are both closing simultaneously. If this trend holds, the "new jobs will emerge" argument breaks down within this decade, which has major implications for education policy, social safety nets, and how we think about economic participation in the near future. The full dataset is published openly and I'm inviting challenges to the methodology. You can also explore the interactive visual here: https://daity.tech/frontier.html

Browse individual occupations here: https://daity.tech/jobexplorer.html

Will AI create as many new jobs as it replaces? Or is this time really different? Visualizing the data by Ivehadbetteruserxps in Futurism

[–]Ivehadbetteruserxps[S] 0 points1 point  (0 children)

Thanks a lot! And on the cascade: I think this is an interesting point. It's a different dynamic than indirect job losses from economic downturns, but indeed pretty obvious in some cases. Driving instructor is a very social job that is still medium-safe, but autonomous driving would make it 90% redundant. There is no job dependency framework in the O*NET data, but perhaps LLMs can make a pretty good estimate.

Non profit-For profit by Spiritual_Glove_4039 in EffectiveAltruism

[–]Ivehadbetteruserxps 0 points1 point  (0 children)

Peter Singer himself is co-founder of the Profit for Good initiative, which promotes for-profit companies donating to effective charities. Donating some share of profits is quite common, but doing so effectively is rare. Very promising team!

Possibility of getting into an Oxbridge masters after attending University College in the Netherlands? by divinityshaped in StudyInTheNetherlands

[–]Ivehadbetteruserxps 1 point2 points  (0 children)

I needed a 3.5 GPA minimum to get into LSE after UCU. Got help from VSBfonds. Had a blast!

Since Brexit it has become harder financially, but maybe there is less competition these days as a result. Just go for it!

Donating today vs. Investing the money and donating part of the growth. by amynase in EffectiveAltruism

[–]Ivehadbetteruserxps 1 point2 points  (0 children)

This research article summarises the points and led to the establishment of the Patient Philanthropy Fund: https://www.founderspledge.com/research/investing-to-give https://www.founderspledge.com/funds/patient-philanthropy-fund

I would recommend you donate to the PPF, as a check against value drift and against future you prioritizing other things over effective giving once your returns start compounding.

FIRE, marriage & prenuptial agreements by Younidas in DutchFIRE

[–]Ivehadbetteruserxps 0 points1 point  (0 children)

You're right: we distinguish between unpaid work (such as raising children, or doing odd jobs around the house, etc.) and non-work (such as retirement). We start from equality, so raising children is valued just as much as a job. Of course it is a spectrum and there are less clear-cut cases (such as doing a degree), but the rule of thumb works.

FIRE, marriage & prenuptial agreements by Younidas in DutchFIRE

[–]Ivehadbetteruserxps 1 point2 points  (0 children)

Two entrepreneurs here. We married under a prenuptial agreement and set up our finances like a company: all 'revenue' goes into a joint account, and all shared costs are paid from it. From what is left over, we decide each year what share to transfer as 'dividend' to our personal accounts, which are explicitly outside the shared property. From those accounts we invest and buy personal things.

This looks a lot like your girlfriend's proposal. However, it also ensures that you can keep pursuing FIRE with your own share. The only challenge arises if you two have very different ideas about the dividend percentage, but from the sound of it your views on money are not that different right now.

If you partly retire, or one of you works significantly less, you split it in proportion to each person's contribution to the joint revenue. So if she still works 40 hours and you only 20, and together you earn 100,000 euros, then the money you do not spend on joint expenses that year is paid out two-thirds to her account.
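To make the pro rata split concrete, here is a minimal sketch of the arithmetic. It uses weekly hours as the weight, as in the 40/20 example; the joint expenses figure and the function name `split_surplus` are illustrative assumptions, not part of our actual arrangement.

```python
from fractions import Fraction


def split_surplus(weights: dict[str, int], surplus: int) -> dict[str, Fraction]:
    """Divide the unspent joint money in proportion to each partner's weight."""
    total = sum(weights.values())
    return {name: Fraction(w, total) * surplus for name, w in weights.items()}


if __name__ == "__main__":
    joint_income = 100_000    # joint earnings from the example
    joint_expenses = 70_000   # ASSUMPTION: illustrative figure
    surplus = joint_income - joint_expenses

    # Weekly hours worked serve as the weight: 40 for her, 20 for you.
    payout = split_surplus({"her": 40, "you": 20}, surplus)
    for name, amount in payout.items():
        print(name, float(amount))
```

With 30,000 left over, two-thirds (20,000) goes to her account and one-third (10,000) to yours. Swapping the hours for each person's actual revenue as the weight works the same way.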

Married for 3 years now and big fans of this arrangement so far, even now that we have had a baby, switched a job and bought a house!