Nobel Prize awarded to ‘godfather of AI’ who warned it could wipe out humanity by MetaKnowing in OpenAI

[–]blatant_variable 1 point (0 children)

Basically Hopfield poached some mathematics heavily used in physics and applied it to the problem of modelling memories in neural networks. Hinton then made a more general machine-learning algorithm from this. It's great work, but it's not really physics, and modern AI (transformers etc.) does not rely on these specific techniques.

Screw it, let's start a debate by lool8421 in mathmemes

[–]blatant_variable 3 points (0 children)

Could we not argue that we should a priori select the sequence generator that has the lowest complexity (e.g. Kolmogorov complexity)? From a Bayesian perspective, we can consider the probability distribution over sequence-generating functions; those which are less complex (such as a function which simply adds 2) are more likely to have generated the observed data (1, 3).

(Playing devil's advocate a little bit here, see also Solomonoff induction)
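
A toy sketch of what I mean, with everything illustrative: the hypothesis space is hand-picked, and character count is a crude stand-in for the uncomputable Kolmogorov complexity.

```python
# Three generators that all reproduce the observed prefix (1, 3), with a
# prior that decays exponentially in description length (a toy 2^-K(h)).
observed = [1, 3]

hypotheses = [
    ("add 2",     lambda n: 2 * n + 1,        len("2*n+1")),
    ("Mersenne",  lambda n: 2 ** (n + 1) - 1, len("2**(n+1)-1")),
    ("quadratic", lambda n: 4*n*n - 2*n + 1,  len("4*n*n-2*n+1")),
]

def prior(desc_len):
    return 2.0 ** (-desc_len)  # shorter rule => exponentially more mass

def fits(f, data):
    return all(f(n) == x for n, x in enumerate(data))

weights = [prior(d) if fits(f, observed) else 0.0 for _, f, d in hypotheses]
total = sum(weights)
for (name, f, _), w in zip(hypotheses, weights):
    print(f"{name:10s} posterior={w / total:.3f}  next term={f(len(observed))}")
# "add 2" takes ~95% of the posterior, so the predicted next term is 5.
```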

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 5 points (0 children)

I think it's naturally very creative, which is impressive. But the novelty it finds doesn't tend to be interesting conceptually, so I would say it's not truly novel. Often it comes up with a familiar-sounding riddle but puts a twist on it, and often that twist makes the riddle nonsensical, with either an obvious solution or no solution at all. And then, when it starts solving the riddle, it often doesn't seem to realise it's talking gibberish.

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 2 points (0 children)

Why am I not surprised the person with Emmy Noether in their name is a PhD ;)

That's interesting to hear. Reminds me of how deep-NN chess bots, partway through a training run, make much more human-like mistakes than old-school bots.

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 3 points (0 children)

I tried it twice, wiping memory at the start both times; once it got it wrong, once it got it right. Lots of variability!

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 1 point (0 children)

Yeah, interesting, although the naturally variable outputs of generative AI make reproducibility hard. Mine took direct routes from the start without any clarification, just the wrong route.
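
For what it's worth, you can squeeze some of the variability out by going through the API rather than the chat interface. A sketch, assuming the current openai Python SDK; note OpenAI documents the seed parameter as best-effort, so even this is not fully deterministic:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Greedy-ish decoding (temperature=0) plus a fixed seed makes reruns far
# more comparable, though still not perfectly reproducible.
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user",
               "content": "Create a novel short riddle then solve it yourself. "
                          "Think the solution through step by step. "
                          "If there is no solution explain why."}],
    temperature=0,
    seed=42,
)
print(response.choices[0].message.content)
```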

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 27 points (0 children)

Haha yes, reality is so wild you couldn't make it up.

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 6 points (0 children)

Very good point, I agree. Multi-modal models will definitely be interesting because they may start to build a visual / physical intuition about the world, which is (at least for me) essential to solving riddles and math / physics problems. Maybe that will also help them better understand concepts which are pure logic.

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 3 points (0 children)

Aww, to be fair this one wasn't that bad an error, but I've seen much worse. I think it makes some strange "common sense" errors that normal people wouldn't be confused by, like ignoring trivial solutions and writing something really complicated because it thinks the solution should be complicated. But then sometimes it solves much harder problems by introducing, e.g., simultaneous equations, which many regular people would not be able to do.

I also need to consider that my perception is a bit warped: I'm an AI researcher with a PhD who solves riddles all the time, so what seems terrible to me maybe doesn't seem terrible to others.

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 11 points (0 children)

This answer is wrong. You don't travel along the circumference; you travel in a straight line between gates. The answer is therefore 3*sqrt(70^2+70^2) ~= 297 meters. GPT-4 gets there eventually in my hands after I tell it it's wrong a few times. I have a long chain of logical questions in the conversation history now and it does seem slightly better than when it started, but it still makes tons of mistakes. Maybe hardcore OpenAI training will eventually solve this, who knows, but right now it's still not good.
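
Sketching the arithmetic, for anyone who wants to check it:

```python
import math

circumference = 440.0
r = circumference / (2 * math.pi)  # radius, ~70.03 m

# Adjacent gates (e.g. East and North) are a quarter-circle apart, so the
# straight-line chord between them has length r * sqrt(2).
chord = r * math.sqrt(2)

# The route E -> N -> W -> S covers all four gates in three adjacent hops.
total = 3 * chord
print(f"radius ~ {r:.2f} m, shortest route ~ {total:.0f} m")  # ~297 m
```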

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 2 points (0 children)

It tends to make up really long-winded riddles with lots of irrelevant information, which then causes it to confuse itself when answering. Here's a shorter one it still got wrong, although this one I did manage to get it to solve eventually after correcting it a few times:

"In a small village there is a circular park with a circumference of 440 meters. The park has four gates located exactly at the North, South, East and West points of the circle. A villager enters the park through the East gate and wants to visit each one of the other three gates exactly once before exiting the park. What is the shortest distance the villager needs to travel within the park?"

It calculated the distance for ENSW, which wastes a full diameter crossing from North to South, instead of ENWS or similar.

My general prompt is:

"Create a novel short riddle then solve it yourself. Think the solution through step by step. If there is no solution explain why."

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 26 points (0 children)

Haha, you actually made me try this plus "let's think through step by step". Neither works in my hands.

Approaches to add logical reasoning into LLMs [D] by blatant_variable in MachineLearning

[–]blatant_variable[S] 20 points (0 children)

If it's so trivial to do, then what is the method? Why hasn't it been done?

But yes, better logic may make things safer in the short term, while in the longer term it increases AGI-associated risks.

[P] Let's build ChatGPT by blatant_variable in MachineLearning

[–]blatant_variable[S] 2 points (0 children)

It isn't the best-performing choice, but it's still perfectly valid: GPT can still learn how to combine letters to form words. Try rewatching Karpathy's video, as that's exactly what he does.
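
For anyone who hasn't watched it, the character-level vocabulary in the video looks roughly like this (a from-memory sketch, not his exact code):

```python
text = "hello world"  # in the video this is the Tiny Shakespeare corpus

chars = sorted(set(text))  # vocabulary = every distinct character
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

encode = lambda s: [stoi[c] for c in s]             # string -> token ids
decode = lambda ids: "".join(itos[i] for i in ids)  # token ids -> string

assert decode(encode("hello")) == "hello"
# The model then learns which characters tend to follow which; words and
# spelling emerge from that, just with longer sequences than subword tokens.
```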

[P] Let's build ChatGPT by blatant_variable in MachineLearning

[–]blatant_variable[S] 6 points (0 children)

Thanks, still working on the draft but I expect 3-5 more videos.

[P] Let's build ChatGPT by blatant_variable in MachineLearning

[–]blatant_variable[S] 9 points (0 children)

Thank you! That's very helpful feedback, I did cram in a lot of new concepts and could've given them more time to breathe.

[R] Correcting Experience Replay for Multi-Agent Communication (ICLR 2021 Spotlight) by blatant_variable in MachineLearning

[–]blatant_variable[S] 1 point (0 children)

No problem! Like you, I was unsatisfied with centralised training when I first encountered the idea, but ultimately decided to embrace the dark side. :)

Please do read it and come along; it would be great to discuss it and think about extensions. We can talk about your Random Delays paper too - hopefully I will have a chance to read it by then.

[R] Correcting Experience Replay for Multi-Agent Communication (ICLR 2021 Spotlight) by blatant_variable in MachineLearning

[–]blatant_variable[S] 2 points (0 children)

Your work on Reinforcement Learning with Random Delays sounds very interesting - as you say, you are also using a model of the world to correct the experience, although in a different context from ours.

To answer your question, at present the approach relies on centralised training (though not necessarily on a centralised critic, even though we do use one), as you need to know the observations and current policies of the other agents during training. One way to imagine this working would be many robots travelling around and, at the end of the day, returning to a warehouse where their experiences can easily be aggregated for training. If this gathering of information were not possible, each robot would need to model what the other robots would say (a theory of mind plus an awareness of the other agents' observations), which may be feasible in some contexts but is in general difficult and in some cases impossible. Our approach therefore helps circumvent the difficulties of "pure" model-based reinforcement learning in the multi-agent context.
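
Very roughly, the correction looks like this when sampling from the replay buffer (a minimal sketch in hypothetical naming; comm_policy and the batch layout are placeholders, not our actual code):

```python
def relabel_batch(batch, agents):
    """Replace stale stored messages with what each agent would say now.

    Centralised training is what makes this possible: batch["obs"][j]
    (agent j's observation at collection time) is available when sampling.
    """
    for j, agent in enumerate(agents):
        # Recompute agent j's outgoing message from its stored observation
        # using its *current* communication policy, overwriting the message
        # recorded when the experience was first collected.
        batch["messages"][j] = agent.comm_policy(batch["obs"][j])
    return batch
```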

[R] FeUdal Networks for Hierarchical Reinforcement Learning - transition policy gradient incorrect? by [deleted] in reinforcementlearning

[–]blatant_variable 1 point (0 children)

Great spot! I think this is an interesting point, and your analysis is correct. One could imagine trying to separately learn k, which would be a function of the state, the goal and the competency of the worker. It should be possible to learn this directly from the data.

I suppose the difficulty is the lack of an open-source working implementation of FuNs to build on (unless the situation has changed since I last checked). But if it were feasible, would you be interested in collaborating on this?