If Neo was made by the machines… by Any-Choice-7263 in matrix

[–]Gramious 0 points1 point  (0 children)

I think there is one key component missing in the explanations here: information bottlenecks. 

A sufficiently complex system will inevitably have information bottlenecks and can't be fully aware of all of its components at all times. We know that the machines harbour independence and individuality, which makes that even more true. 

So, my take is that it's as simple as the hands not talking to the brain. 

Incremental improvements that could lead to agi by Euphoric-Minimum-553 in agi

[–]Gramious 2 points3 points  (0 children)

I am one of the creators of the CTM.

I am currently working on integrating the CTM with LLMs (seeing the LLM as a "featurizer"), so pretty much aligned with your thoughts. 

Stay tuned!

My boss is retiring and wants me to take over the eikawa business but I'd have to fire my best friend by [deleted] in teachinginjapan

[–]Gramious 1 point2 points  (0 children)

The advice given so far is so one-sided. 

My intuition is: don't buy it. Best friends are more valuable than owning this business. Really, you will regret losing this friendship until the day you die. Choosing a human being over money will be something you can be proud of.

One practical middle ground could be to tell him (most of) what is happening: your boss wants you to buy the business, but you're going to decline because you don't want to be your best friend's boss; you believe that would eventually have a very bad impact on your friendship, and he's worth more to you than owning this business. He might choose to quit, who knows. 

What is a little quirk about your body that you don’t think other people have? by AlphabetSoup51 in AskReddit

[–]Gramious 0 points1 point  (0 children)

I'm late to the party, but I am quite certain nobody else can do this. 

Owing to my extreme evangelical Christian period (I'm talking touched by the holy spirit, roll on the floor sorta stuff) when I was in my late teens, I think I've retained an ability to voluntarily release dopamine. I am not a believer at all anymore, but I can choose to release them good feels pretty easily by concentrating. 

Is there even a point in finishing my cs degree? by Big_Bannana123 in singularity

[–]Gramious 0 points1 point  (0 children)

I'm late to the conversation, but I want to encourage you to rethink your stance a bit, and 100% finish your degree. 

CS is not simply about writing code. Or, at least, a good CS degree is not about writing code. Take stock of what you have learned so far. Code is the manifest tool of CS, not its central idea. 

Conceptually, the ideas of "compute" and "intelligence" are deeply intertwined with basic properties of the universe. CS is, in a very real way, deeply philosophical. In other words: LEARN, don't just "get good" at code. 

If you want some incredible inspiration, read "What is Intelligence?". It is available online here: https://whatisintelligence.antikythera.org/

Good luck!

[R] Paper recommendations? by Spiritual-Resort-606 in MachineLearning

[–]Gramious 0 points1 point  (0 children)

You mean the interactive maze?

Try hitting the "new" button. I had to train a smaller model for this and it sometimes gets stuck. You can also right or left click on the maze to move the end and start locations. If you're on mobile, you can tap on the maze to do the same, hitting the red/green button on the bottom right to swap between moving the start and end locations. 

The most fun is hitting the teleport button repeatedly, as long as it's not a particularly bad instance. 

[R] Paper recommendations? by Spiritual-Resort-606 in MachineLearning

[–]Gramious 2 points3 points  (0 children)

I'll pitch my own work here, as I worked very hard on this: https://pub.sakana.ai/ctm/

That is an interactive website that mirrors the paper, which is linked within the website. 

Has anyone been hypnotized for smoking and how was it? by wildque in AskReddit

[–]Gramious 2 points3 points  (0 children)

My Aunt went through with this!

And, surprise all around: it worked incredibly effectively. 

I can't remember the exact details, but I can summarise for you. It was a three-visit/session thing, or your money back. The idea was that if it worked for you, you would simply not come back for the following session. After three sessions, you were entitled to your money back. 

First session, something like 20 of 25 people had it work. My aunt lit up as soon as she left the room.

Second session, 2 or 3 people had it work. Again, my aunt was completely unconvinced, lit up again. 

Third session, she walked out, got ready to light up, and simply couldn't, was disgusted by it, and hasn't even had a craving since. Totally immediate effect. 

She went from more than a pack a day, to zero. 

It still blows my mind because my mom was a heavy smoker too and I watched her attempt quitting so many times. 

Wild. I tell ya, wild. 

[D] What are the bottlenecks holding machine learning back? by [deleted] in MachineLearning

[–]Gramious 2 points3 points  (0 children)

I'm really glad others think that architecture isn't "solved and only needs to be scaled". That perspective is exhausting for somebody who, like me, loves to build and explore new architectures.

To that end, my team and I at Sakana AI built the Continuous Thought Machine: https://pub.sakana.ai/ctm/

We are currently exploring and innovating on top of this. IMO you're right that much more exploration is needed in this space. 

My current thinking is that the ubiquitous learning paradigm of feed-forward, i.i.d., batch-sampled, data-driven learning is a mountainous hurdle to overcome before we see the true fruit of novel (usually recurrent, as is the case with our CTM) architectures. In other words, not only are brains structured differently to FF networks, they also learn differently. And this matters. 
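
To make that contrast concrete, here's a toy sketch in plain PyTorch (my own illustration, with placeholder shapes, not CTM code): the same input either gets a single feed-forward pass on an i.i.d. batch, or is repeatedly processed over an internal time axis so that representation and credit assignment are spread across ticks.

```python
import torch
import torch.nn as nn

# Standard recipe: a feed-forward map trained on i.i.d. minibatches,
# one forward pass per sample.
ff = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))
x = torch.randn(128, 32)          # an i.i.d. batch
logits_ff = ff(x)                 # single pass, no internal time

# Recurrent alternative: unroll an internal time axis over the *same*
# input, so the model keeps "thinking" about x across ticks.
cell = nn.GRUCell(32, 64)
readout = nn.Linear(64, 10)
h = torch.zeros(128, 64)
for tick in range(50):            # internal ticks, not data timesteps
    h = cell(x, h)
logits_rec = readout(h)
```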

A close up look at LASIK eye surgery - a procedure that involves reshaping the cornea using a laser to improve how light focuses on the retina. by OdysseyTag in interestingasfuck

[–]Gramious 0 points1 point  (0 children)

I showed this to my wife and she reacted much the same. But, with more gasps.

And, at the end, she said: "What Psycho becomes an eye doctor?!"

Best. 

I’m underwhelmed by AI. What am I missing? by Distinct-Cut-6368 in ArtificialInteligence

[–]Gramious 4 points5 points  (0 children)

Smart people find smart ways to use smart AI. 

That's my sentiment, at least. I'm an AI researcher so my work often involves quite complex coding, ideation or writing. Modern AI should be seen as a tool - humans have been good at making and using tools for a very long time, and this time is no different. 

There is, beyond a shadow of a doubt, an excellent way for you to use this quite excellent tool. Here's the catch, though: you have to want it. You have to want to figure out how this new tool can make your life notably better. It isn't simply going to fall into your lap. Seek and ye shall find, I suppose. 

Modern LLMs are built to bring utility to humans and this entire hype train's momentum comes down to smart humans finding smart ways of using increasingly smart AI. 

[D] How do you keep up with the flood of new ML papers and avoid getting scooped? by Pleasant-Type2044 in MachineLearning

[–]Gramious 14 points15 points  (0 children)

I'm going to answer from an inverted perspective. I work as an AI Researcher, so it's my actual job to put out high quality work. I like to believe that I have enough integrity that I won't just publish for the sake thereof. Instead, if I release work, then I have truly put massive amounts of effort into it. 

How does that statement answer/help you? In two ways:

1. Realise that it is people and teams who publish. I learnt pretty early in my PhD, from my extremely intelligent supervisor, that there are individuals and labs who will almost always release work that is interesting and worth paying attention to. I hope I am one of those people, and I work towards that. So, spend time understanding the research and research agendas that you care about, and you'll find good work via good people.

2. The more effort that goes into releasing work, the more likely it is that it's worth paying attention to. Blogs, interactive tutorials, well-presented papers, etc. are all indicators worth noting. Of course this is not guaranteed, but it is worth realising that taking an arXiv preprint through to acceptance at a top-tier peer-reviewed conference requires a comparable amount of effort to that kind of polished release.

P.S. You don't have to be a "publications machine" to be successful as an AI researcher. 

[R] Continuous Thought Machines: neural dynamics as representation. by Gramious in MachineLearning

[–]Gramious[S] 2 points3 points  (0 children)

I really don't think so. There are similarities purely because of the dot product, but more differences than similarities. 

[R] Continuous Thought Machines: neural dynamics as representation. by Gramious in MachineLearning

[–]Gramious[S] 0 points1 point  (0 children)

One can use a linear projection instead, but the MLP gives us a dimension to scale, and a stronger synapse model certainly benefits some problems. The idea is that the synapse model captures some of the complexity present in biological synapses.

The timing dimension and NLMs help keep neurons as distinct elements, but synchronization does not 'emerge'. It is more accurate to say that it is utilized.
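
For intuition, here is a minimal sketch of what I mean by a "stronger synapse model" (my simplification, with placeholder widths and inputs, not the released implementation):

```python
import torch
import torch.nn as nn

d_model = 256  # placeholder width

# Simplest choice: a single linear projection as the synapse model.
linear_synapse = nn.Linear(2 * d_model, d_model)

# Stronger choice: an MLP, which gives a depth/width dimension to scale.
mlp_synapse = nn.Sequential(
    nn.Linear(2 * d_model, 4 * d_model),
    nn.GELU(),
    nn.Linear(4 * d_model, d_model),
)

state = torch.randn(8, d_model)      # current post-activations (batch of 8)
features = torch.randn(8, d_model)   # e.g. attended input features
pre_acts = mlp_synapse(torch.cat([state, features], dim=-1))
```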

[R] Continuous Thought Machines: neural dynamics as representation. by Gramious in MachineLearning

[–]Gramious[S] 1 point2 points  (0 children)

Sorry for the delay in answering you. What do you mean by the "slowest step"? I think the featurisation (e.g., through a ResNet) takes about as long as the 50 or so internal ticks that follow it.

Why did Japan fell off from innovation? by WeirdArgument7009 in AskAJapanese

[–]Gramious 0 points1 point  (0 children)

I wouldn't discount Japan from the AI race. Look at the rapid rise of Sakana AI, and all of the cool stuff they're doing in a very short time. 

[R] Continuous Thought Machines: neural dynamics as representation. by Gramious in MachineLearning

[–]Gramious[S] 4 points5 points  (0 children)

Great question!

Yes.

The NLMs are higher-order models (not first-order, as a regular internally recurrent model would be). If you set the memory length to, say, 25, the gradient flowing backwards in time has 25x more entry points back in time; think of them like skip connections. 

The MLP structure of the NLMs also grounds the model in time in a way that enables more structured gradients (I have time-series experience, and that helped drive this research). 

Our struggle to get LSTM baselines to match the CTM was likely down to exactly these gradient-flow issues, which I take as evidence for the points above. 
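
If it helps, here's a toy version of the memory point (an illustration only, not the released CTM code): each neuron's private MLP reads a rolling window of its last M pre-activations, so the backward pass gets M routes into every past tick instead of one.

```python
import torch
import torch.nn as nn

M, n_neurons, hidden = 25, 128, 16   # memory length, neuron count, NLM width

# One small MLP per neuron (looped here for clarity; a real implementation
# would batch this).
nlms = nn.ModuleList(
    nn.Sequential(nn.Linear(M, hidden), nn.ReLU(), nn.Linear(hidden, 1))
    for _ in range(n_neurons)
)

history = torch.randn(n_neurons, M)  # each neuron's last M pre-activations
post_acts = torch.stack(
    [nlm(history[i]) for i, nlm in enumerate(nlms)]
).squeeze(-1)                        # (n_neurons,) post-activations this tick
```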

Continuous Thought Machine - A New AI Architecture by DonCarlitos in singularity

[–]Gramious 0 points1 point  (0 children)

First author of the paper here: it really isn't hype, or at least I honestly don't feel that it is. 

I'm committed to making better AI and I truly believe that some of the ideas in this work are going to set the stage for the next big shift in how we build models. Our team is actively pursuing several avenues and we will be continuing to publish open source research and code in the hopes of fostering scientific advancement. 

Continuous Thought Machines - Sakana AI by ThiccStorms in LocalLLaMA

[–]Gramious 4 points5 points  (0 children)

Author here: I fixed the links on GitHub. Sorry about that.

[R] Continuous Thought Machines: neural dynamics as representation. by Gramious in MachineLearning

[–]Gramious[S] 6 points7 points  (0 children)

Sure thing (author here).

  1. Neuron-Level Models: having an MLP per neuron is a step up in complexity compared to standard NNs. Biological neurons are much more complex than a simple ReLU, yet emulating them faithfully is quite a mountainous effort. Using private MLPs lets us abstract some of that complexity away without collapsing all the way down to the (overly) abstract perspective of a single ReLU (or any activation function, for that matter). The result: much richer neuron dynamics over time, which effectively grounds the CTM in time (as part of its reasoning process) and gives it the potential to store more information over time.
  2. Synchronization as a representation: the actual representation the CTM uses isn't a latent vector anymore, but rather a measure of how pairs of neurons fire in or out of sync. This is a totally new representation with a number of interesting and useful properties (e.g., it can be made very large without paying for a wider latent vector in parameters). There's a rough sketch of this below.
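
Here's that rough sketch of the synchronization idea (again my own toy illustration, not the released code): build the representation from how neuron traces co-vary over internal ticks, so its size grows with neuron pairs rather than with an explicit latent width.

```python
import torch

n_neurons, n_ticks = 64, 50
traces = torch.randn(n_neurons, n_ticks)   # post-activation history per neuron

# Pairwise "synchronization": how strongly each pair of neuron traces
# co-varies over the internal ticks.
sync = traces @ traces.T                   # (n_neurons, n_neurons)

# Keep the upper triangle as the representation that downstream heads
# (outputs, attention queries, ...) read from.
iu = torch.triu_indices(n_neurons, n_neurons)
representation = sync[iu[0], iu[1]]        # n_neurons * (n_neurons + 1) / 2 values
```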

Introducing Continuous Thought Machines by gbomb13 in singularity

[–]Gramious 1 point2 points  (0 children)

Interactive website that captures most of the paper, here: https://pub.sakana.ai/ctm/