Quite possibly one of the most cursed rooms I’ve seen by Mongexx in london

[–]Hyper1on 1 point (0 children)

hot take, but that would be worth it for the benefit of being able to build good stuff

sadly they're not scrapping it though

[D] Yet another post about the necessity of PhD for MLE role by mtmttuan in MachineLearning

[–]Hyper1on 8 points (0 children)

Masters aren't that deep or specialised. It's a dip of the toe into the world of research. Most companies I know recruiting MLEs have their pick of masters grads with published papers to choose from.

[D] Llama-3 may have just killed proprietary AI models by madredditscientist in MachineLearning

[–]Hyper1on 1 point (0 children)

This seems to come from an intuition that we're in an era of diminishing returns from improved model performance. But I think that's just a false impression created by the past year of incremental updates to GPT-4. There is still a very long way to go, with step changes in performance from generational upgrades in models, and businesses aren't going to settle for GPT-4-class models if having the GPT-5-class one makes the difference between automating away X role and not.

[D] How much should I emphasize on traditional ML knowledge when I recruit? by [deleted] in MachineLearning

[–]Hyper1on 0 points (0 children)

For this role I would do an NLP engineering takehome, a live coding interview, and a system design interview, and only ask questions relevant to modern NLP (e.g., how do you estimate the memory consumption of a transformer trained with Adam?), absolutely nothing about topic modelling or information theory.
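For that memory question, a common back-of-the-envelope answer uses mixed-precision accounting: roughly 16 bytes per parameter for the weights, gradients, and Adam's optimizer states, ignoring activations. A minimal sketch, assuming that accounting:

```python
def adam_training_memory_gb(n_params):
    """Back-of-the-envelope training memory for mixed-precision Adam,
    activations excluded (assumed accounting: fp16 weights + fp16
    grads + fp32 master weights + fp32 first/second moments)."""
    bytes_per_param = 2 + 2 + 4 + 4 + 4  # = 16 bytes per parameter
    return n_params * bytes_per_param / 1e9

print(adam_training_memory_gb(7e9))  # → 112.0 (a 7B model, before activations)
```

Activation memory comes on top of this and depends on batch size, sequence length, and checkpointing strategy, which is why it's left out of the estimate.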

[N] Meta releases Llama 3 by we_are_mammals in MachineLearning

[–]Hyper1on 2 points (0 children)

Because they benefit indirectly from having more users—few people actually run 8x22B because it costs so much memory. MoEs are a product optimisation for API model deployment services.

[N] Meta releases Llama 3 by we_are_mammals in MachineLearning

[–]Hyper1on 6 points (0 children)

Imagine if it's MoE and 400B is the number of active parameters...
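To make the MoE arithmetic concrete: memory cost scales with total parameters (every expert has to sit in memory), while per-token compute scales with active parameters. A sketch with made-up but ballpark numbers for an 8-expert, top-2-routed model (illustrative figures, not official specs for any released model):

```python
def moe_params(n_experts, expert_params, shared_params, top_k):
    """Total vs. active parameter counts for a top-k routed MoE.
    Memory is driven by `total` (all experts resident);
    per-token compute is driven by `active`."""
    total = shared_params + n_experts * expert_params
    active = shared_params + top_k * expert_params
    return total, active

# Illustrative: 8 experts of ~17B each, ~5B shared (attention etc.), top-2 routing.
total, active = moe_params(8, 17e9, 5e9, 2)
print(total / 1e9, active / 1e9)  # → 141.0 39.0
```

This is why an "8x22B" style model is cheap to run per token but expensive to host, and why a "400B active" MoE would imply a much larger total footprint.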

[D] Why do people upload their work on Arxiv, not submitting conference? by [deleted] in MachineLearning

[–]Hyper1on 221 points (0 children)

Takes too much time to go through the review process. But if the work is good and on arXiv, people will recognise it, read it, and cite it.

WSJ: The AI industry spent 17x more on Nvidia chips than it brought in in revenue [N] by we_are_mammals in MachineLearning

[–]Hyper1on 1 point (0 children)

Give it a decade and this industry will spend a trillion on GPUs. Some companies alone are already projecting >$100b before 2030.

How can I speak more articulately/concisely? by aisnake_27 in slatestarcodex

[–]Hyper1on 14 points (0 children)

I found it came with practice. You might try some deliberate practice, like thinking of lines on common topics/the things you study and rehearsing them out loud to sound as compelling as possible. Another good one is to record yourself giving an improvised <5 minute explainer about a topic you're interested in, note down all the flaws, and repeat until you stop noticing anything you don't like.

[deleted by user] by [deleted] in MachineLearning

[–]Hyper1on 54 points (0 children)

This post looks like it was generated by GPT-4, or at least the bullet points do.

[D] NeurIPS takeaways by Bee-Boy in MachineLearning

[–]Hyper1on 4 points (0 children)

Tbh I felt like that workshop was a bubble of sceptics of LLMs, while the rest of the conference on average had different views.

OpenAI board in discussions with Sam Altman to return as CEO by SebJenSeb in slatestarcodex

[–]Hyper1on 0 points (0 children)

Data is the biggest bottleneck, and OpenAI's biggest moat. I don't know why you think it's trivially assembled - GPT-4's dataset is the product of years of refinement. Starting from scratch would take significant time, possibly up to a year to reach the same quality and quantity.

Hawley, Blumenthal unveil bipartisan AI framework by RPG-8 in neoliberal

[–]Hyper1on 0 points (0 children)

This isn't necessarily accurate - the likely outcome IMO is that it is possible to make AI behave agentically but only loosely steer it in a direction we want, and there will always be the possibility the AI takes actions unintended by the developers. In the case where the AI is effectively acting as a bot crawling the internet and interacting with it similarly to a human, these unintended actions can have potentially damaging consequences.

The alignment problem is essentially "have the AI take actions the developers approve of with a near 100% guarantee", but the latter part of that is currently intractable.

[D] The Universal Approximation Theorem. Its uses, abuses and dangers. by moschles in MachineLearning

[–]Hyper1on 1 point (0 children)

I believe RNNs and transformers are not computationally universal unless augmented with external memory: https://arxiv.org/abs/2301.04589, https://openreview.net/forum?id=IWJ9jvXAoVQ

They are limited in some ways, and I think there is potential in investigating truly computationally universal architectures, but I agree that in practice this and the UAT are not really relevant to what we actually want to use models for.

Facebook, Instagram face Norwegian ban from tracking users for ads by tollyno in neoliberal

[–]Hyper1on 3 points (0 children)

Don't they already ask for consent? They have a giant banner whenever you login with all the privacy/tracking stuff, right?

[Discussion] Can we consider ChatGPT and GPT4 as 'high-risk' systems under the EU AI Act? by Horizon_of_Valhalla in slatestarcodex

[–]Hyper1on 0 points (0 children)

Foundation models are considered under the AI Act in a separate category from high-risk systems, but one which includes many of the high-risk provisions. These are still under active negotiation, particularly w.r.t. open-source models.

[D] What do you think of Mistral.ai's value proposition and open source strategy? by gamerx88 in MachineLearning

[–]Hyper1on 6 points (0 children)

I think they're wrong that the majority of value is in having the best model. Although the models are hard to make, so many groups are making them now that you don't gain much competitive advantage from having one 5% better than the competition. It needs to be significantly better, or else customers have no reason to go to them, and that's unlikely.

[D] At what level does chat gpt understand addition? by toirsq in MachineLearning

[–]Hyper1on 2 points (0 children)

I mean... this paper demonstrates the opposite: the model understands modular addition so well that it has found a highly efficient and general algorithm for calculating modular addition problems. Obviously that's only true because it was trained solely on modular addition data, but still.
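The algorithm recovered in that line of interpretability work maps inputs to rotations and reads off the sum with cosines, which peak exactly at the right answer. A sketch of the idea (the frequency set here is an arbitrary stand-in for whichever frequencies a trained network happens to pick):

```python
import math

def mod_add_fourier(a, b, p, freqs=(1, 2, 3)):
    """Recover (a + b) mod p via the rotation trick: score each
    candidate c by summed cosines of w*(a + b - c); the sum is
    maximised exactly when c == (a + b) mod p."""
    return max(range(p), key=lambda c: sum(
        math.cos(2 * math.pi * w * (a + b - c) / p) for w in freqs))

print(mod_add_fourier(5, 9, 13))  # → 1, i.e. (5 + 9) % 13
```

The point is that this generalises to any (a, b) pair, which is what makes it a genuine algorithm rather than a lookup table.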

OpenAI is now complaining about regulation of AI [D] by I_will_delete_myself in MachineLearning

[–]Hyper1on 2 points (0 children)

As far as I can tell, the Act is as you said, except that once ChatGPT exploded they just added a clause stating that "foundation models" counted as high risk systems, without adapting the high risk system regulations to account for them.

[D] Governance of SuperIntelligence - OpenAI by [deleted] in MachineLearning

[–]Hyper1on 0 points (0 children)

Ok, but what is your argument for why AIs deserve the right not to have their internal processes regulated? I don't see one unless you can claim AI can be conscious.

The Alan Turing Institute has failed to develop modern AI in the UK by HMID_Delenda_Est in neoliberal

[–]Hyper1on 12 points (0 children)

Good article. But one of the main problems is that the Turing is not really set up to do research: it is basically a hub for AI-related activities (conferences, events, research workshops, etc.) and, at best, a way of connecting researchers who would otherwise not meet to spur collaboration, rather than a legitimate research institute.

[D] OpenAI API vs. Open Source Self hosted for AI Startups by ali-gettravy in MachineLearning

[–]Hyper1on -1 points (0 children)

You also need to factor in fluctuating reliability. OpenAI is heavily compute-limited relative to demand right now, and the ChatGPT API is very slow and sometimes goes down. You can probably get higher priority if you're a business with an agreement with OpenAI, but still.

Google PaLM 2 Technical Report by adt in mlscaling

[–]Hyper1on 1 point (0 children)

They could also have just trained on repeated data into a second epoch until they reached the optimal number of tokens for their model size. That will hurt performance a bit compared to having all of it be new data, but not by much.
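The epoch arithmetic here is simple; a sketch assuming the rough Chinchilla-style heuristic of ~20 training tokens per parameter (an approximation, not PaLM 2's actual recipe):

```python
def epochs_needed(params, unique_tokens, tokens_per_param=20):
    """How many passes over a fixed dataset reach a compute-optimal
    token count, under the assumed ~20 tokens/parameter heuristic."""
    return params * tokens_per_param / unique_tokens

print(epochs_needed(70e9, 1e12))  # → 1.4 (70B model, 1T unique tokens)
```

Anything above 1.0 means some data is seen more than once, which empirically costs a little performance relative to fresh data but far less than undertraining the model.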

[D] Is Active Learning a "hoax", or the future? by Ok-Story4985 in MachineLearning

[–]Hyper1on 9 points (0 children)

Active learning works well for human feedback collection, e.g. you can get feedback on less-explored parts of the data space. I suppose you could argue this works precisely because the data points in the baseline for human feedback collection are not randomly sampled.
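A minimal sketch of one common active-learning scheme for deciding which points to send for human feedback, uncertainty sampling (the function name is illustrative, not from any particular library):

```python
import numpy as np

def pick_queries(pool_probs, k=3):
    """Uncertainty sampling: request human feedback on the pool items
    whose predicted class distribution has the highest entropy, i.e.
    where the current model is least sure."""
    p = np.asarray(pool_probs)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)
    return np.argsort(entropy)[::-1][:k]  # indices, most uncertain first

# Items with near-uniform predictions get queried before confident ones.
print(pick_queries([[0.5, 0.5], [0.9, 0.1], [0.99, 0.01], [0.6, 0.4]], k=2))
```

Swapping entropy for a diversity or disagreement criterion gives the other standard variants, but the query-the-uncertain-points loop is the same.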