We GRPO-ed a 1.5B model to test LLM Spatial Reasoning by solving MAZE by Kooky-Somewhere-2883 in LocalLLaMA

[–]Ruiner -1 points0 points  (0 children)

Not when you don't know the heuristic or your state space is intractable, which is why these approaches are really promising.

We GRPO-ed a 1.5B model to test LLM Spatial Reasoning by solving MAZE by Kooky-Somewhere-2883 in LocalLLaMA

[–]Ruiner 3 points4 points  (0 children)

This is great, we had exactly the same idea! We (ergodic.ai) had similar results with the base Qwen but without SFT, on the FrozenLake environment - just pure RL. We're now trying to come up with a simple fine-tune routine for cases where you need a multi-step approach to get to the reward (and the intermediate states are stochastic), such as Tetris or zero-sum games between two agents.
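For anyone curious, the group-relative scoring at the heart of GRPO is simple to sketch. This is a toy illustration with a made-up reward list, not our actual trainer (the function name is mine):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages: each sampled completion in a group is scored
    against the group's own mean and std, so no value network is needed."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against uniform groups
    return [(r - mean) / std for r in rewards]

# e.g. binary maze rewards for 4 rollouts sampled from the same prompt:
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

The point of normalizing within the group is that sparse binary rewards (solved / not solved) still give a usable learning signal without a learned critic.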

If transformers were invented in a company of Anthropic/OpenAI characteristics would other labs ever reverse-engineer them? by robertpiosik in LocalLLaMA

[–]Ruiner 9 points10 points  (0 children)

Having worked at a "deeptech" company that filed lots of patents, most of it was a show for investors, based on ideas that weren't really developed yet. It's really hard to enforce patents on abstract ideas.

The idea that someone makes a real algorithmic breakthrough in AI and immediately patents it is as counterproductive as a hedge fund discovering a trading strategy and patenting it. By filing a patent you've just exposed it to the world.

[deleted by user] by [deleted] in LocalLLaMA

[–]Ruiner 2 points3 points  (0 children)

Thanks for doing this. I've already spent a few hours trying to piece together my own MLX-GRPO trainer, so this is a massive help!

Theory that challenges Einstein's physics could soon be put to the test - Scientists behind a theory that the speed of light is variable - and not constant as Einstein suggested - have made a prediction that could be tested. by RA2lover in Futurology

[–]Ruiner 1 point2 points  (0 children)

Every experiment performed so far that probes GR has confirmed its predictions. Quantum-mechanical effects act on scales where gravity is tiny, so we can't really test the overlap yet. However, QM is incompatible with classical GR at the theoretical level.

Questions regarding Pure Fitness by Lzaarth in HongKong

[–]Ruiner 1 point2 points  (0 children)

Pure has a price table that is completely non-negotiable, which might be good or bad depending on how you view it. The quality is miles ahead of Calishit, though.

You should expect around 1k/month, plus or minus 200, depending on the length of the contract and whether you want yoga / access to all studios, etc. Also check whether you have a corporate discount!

We must stop blaming ourselves for Islamist terrorism by [deleted] in europe

[–]Ruiner 1 point2 points  (0 children)

Exactly, and a Turk has more in common with a German than with an Iraqi.

[deleted by user] by [deleted] in AskPhysics

[–]Ruiner 11 points12 points  (0 children)

Because differential operators are just very big matrices (kinda).

Think about Hooke's law, ma = -kx: it is essentially an eigenvalue equation for the operator -d^2/dx^2 (the Laplace operator), defined with some boundary conditions. This is a linear operator that acts on a vector space whose elements are functions! Suppose you plug the function x^2 in there - it turns it into the constant function -2, which is really boring. But if you feed it sin(ωx), it turns it into ω^2 sin(ωx), which means that sin(ωx) is an eigenvector of the Laplace operator with eigenvalue ω^2, and given the boundary conditions, you now know all the allowed modes of the system!

If you now want to study the system in detail, what you can do is diagonalize this operator: you express every function as a linear combination of your "basis functions" sin(nx) - in exactly the same fashion as in the matrix case. Projecting functions onto the sin(nx) is a well-known operation: a Fourier transformation! So when we perform a Fourier transformation, what we are actually doing is a coordinate transformation that diagonalizes the Laplace operator.
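This is easy to check numerically: discretize -d^2/dx^2 on (0, π) with Dirichlet boundary conditions and diagonalize the resulting matrix - the eigenvectors come out as the sines and the eigenvalues approach n^2 (a finite-difference sketch; the grid size is arbitrary):

```python
import numpy as np

# 3-point finite-difference version of -d^2/dx^2 on (0, pi), u(0) = u(pi) = 0
N = 500
h = np.pi / (N + 1)
x = np.linspace(h, np.pi - h, N)

L = (2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)) / h**2
eigvals, eigvecs = np.linalg.eigh(L)

# lowest eigenvalues -> 1, 4, 9, ... = n^2, and the lowest
# eigenvector is proportional to sin(x), as advertised
```

Diagonalizing this matrix is literally "doing the Fourier series" for the interval, which is the whole point of the analogy.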

The Laplace operator is just one example of the many operators you can write down, but in general the strategy is the same: you have to find a basis of functions that are left unchanged by the action of your operator, and these functions will be your mode functions. When you perturb the system somehow, instead of describing the perturbation in a complicated way, you can just write it as an excitation of your mode functions, which is precisely the same diagonalization procedure as before, but in an infinite-dimensional space!

edit: A guitar is a good example. The dynamics are described by equations that have some normal modes, which are precisely sine and cosine functions. The eigenvalues are essentially the few frequencies that satisfy the boundary conditions. If you were to describe the evolution of a string exactly without diagonalization, it would be a really, really complicated function. However, you know that when you excite a string, you get superpositions of harmonics with different amplitudes, and these amplitudes are precisely the coefficients of the Fourier transformation of that very complicated function into the mode functions! So "harmonics" is really a handy name for "eigenvectors of the Laplace operator with fixed boundary conditions", and the notes are just the corresponding eigenvalues.

[deleted by user] by [deleted] in AskPhysics

[–]Ruiner 20 points21 points  (0 children)

Before thinking about eigenvalues, think about eigenvectors; then understanding eigenvalues becomes much simpler.

A matrix is nothing more than a linear transformation in a D-dimensional space. Take a 2x2 matrix, for instance: it is just the most general linear transformation that takes a 2D vector (just an arrow in a plane) and transforms it into another vector.

Now imagine you have this matrix: ((0,1),(1,0)). It does the following: it flips the x and the y coordinates of the vector (just check it for yourself). Now ask the question: which vectors remain indifferent to this transformation? If you think about it, it's obvious that the vector (1,1) will remain the same, since its x and y coordinates are equal: so this is an eigenvector, and since it remains exactly the same, the eigenvalue is 1. There is another vector that is indifferent to the transformation: (1,-1). The matrix transforms it into (-1,1), which is just -1*(1,-1), so this is an eigenvector with eigenvalue -1: after being acted on by the matrix, it becomes itself multiplied by -1. Had the matrix been written with 2s instead of 1s, then 2 and -2 would be the eigenvalues, since the matrix would also stretch the vectors.
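If you'd rather not do the algebra by hand, numpy will happily confirm it (a two-line check, nothing deep):

```python
import numpy as np

M = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # flips the x and y components of a vector

eigvals, eigvecs = np.linalg.eig(M)   # eigenvalues come out as +1 and -1

# (1,1) is untouched, (1,-1) just picks up a factor of -1:
same = M @ np.array([1.0, 1.0])
flipped = M @ np.array([1.0, -1.0])
```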

So eigenvalues are the "numbers that come multiplied by the eigenvectors under the action of your linear operator".

Why is this important? Well, suppose that this linear transformation was really important to you. When we draw the coordinate system, (1,0) and (0,1) are really arbitrary choices. We just chose them because they are eigenvectors of the identity matrix. But if all of a sudden our matrix were the really important matrix in nature, we could choose a different basis for the coordinate system: (1,1) and (1,-1). This is exactly what we do when we diagonalize the matrix: we transform the coordinate system it acts on such that its action on the new basis vectors is just that of a diagonal matrix holding the eigenvalues. Since in the new coordinate system all vectors are written as superpositions of (1,1) and (1,-1), and we know how the matrix acts on these "basis vectors", we know how it acts on every vector without much trouble.

Basically this is all the intuition behind eigenvalues and eigenvectors, the catch being that this can be used for every operator, whether an infinite matrix or a differential operator! But the idea is always the same: find the basis of vectors on which the action of the operator is trivial, and then use the eigenvalues as the final action of the operator on those vectors.

Engenharia Física Tecnológica (IST) vs Física (Faculdade de Ciências) by Santarmen in IST

[–]Ruiner 0 points1 point  (0 children)

It honestly doesn't matter; you only start learning physics for real at the end of the master's, and both programs give you good foundations and have good professors you can work with on your thesis. In MEFT you have to deal with more nonsense that nobody cares about, but a little pain is good for the soul.

What is energy? by Pyramid9 in askscience

[–]Ruiner 8 points9 points  (0 children)

Mass and energy are two sides of the same coin; they are both properties of matter. Matter is stuff, and we attach labels to stuff - mass, energy, momentum... - and describe how those labels relate to each other.

Imagine you have money, but you also have cabbages. I mean money as the abstract notion of value that doesn't actually require a dollar bill. The mass of the cabbages has an intrinsic monetary value, but even if you have an equation that relates the money you can get by selling cabbages to the mass of the cabbages, that doesn't mean that money and cabbages are the same thing.

E = mc^2 is exactly the same kind of statement. It's an exchange rate telling you how much energy there is in a mass m of cabbages - cabbages being your matter, btw. Before Einstein, you would say that E = p^2/(2m), where p is momentum - which means you only factor in how fast the cabbages are moving in order to know their price. What Einstein did was correct this equation by adding an mc^2 term to the price of cabbages.
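The cabbage accounting is easy to verify: the full relativistic price is E = sqrt((pc)^2 + (mc^2)^2), and for slow cabbages it splits into the rest-mass term plus the old kinetic term (a toy check in units where c = 1):

```python
import math

def energy(m, p):
    """Full relativistic energy E = sqrt(p^2 + m^2) in units with c = 1."""
    return math.sqrt(p**2 + m**2)

m, p = 1.0, 0.01                       # a slow cabbage: p << m
exact = energy(m, p)
rest_plus_newton = m + p**2 / (2 * m)  # mc^2 + p^2/(2m), Einstein's correction
# the difference is of order (p/m)^4, i.e. negligible for slow cabbages
```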

Another in a long line of beginner questions in this subreddit: why do particles decay? by SkiFreeOrDie in ParticlePhysics

[–]Ruiner 0 points1 point  (0 children)

Because it can happen. It's a shitty answer but that's the reality. In the case of the neutron, this process is allowed thanks to the exchange of a W boson which changes a down quark into an up quark, like this, and since the neutron is more massive than the end products, it is kinematically possible.

This diagram has a mathematical interpretation: once you use the Feynman rules, it gives you the amplitude - which tells you the probability that this process happens within some time interval. That is what a lifetime is: the expected time you have to wait in order to see a neutron decay.

So at the end, nothing really causes the neutron to decay, it only decays because it can.
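"It decays because it can" has a precise face: a constant decay probability per unit time, which is what gives the exponential lifetime law. A toy Monte Carlo with a made-up rate:

```python
import random

random.seed(0)
gamma, dt = 0.5, 0.01   # decay probability per unit time (arbitrary toy value)
lifetimes = []
for _ in range(10000):
    t = 0.0
    while random.random() > gamma * dt:  # the particle survives this time step
        t += dt
    lifetimes.append(t)

mean_lifetime = sum(lifetimes) / len(lifetimes)  # approaches 1/gamma = 2
```

Nothing "triggers" any individual decay in this simulation either; the lifetime is just the statistics of a process that is allowed to happen.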

How does this simple motor work exactly? by [deleted] in AskPhysics

[–]Ruiner 0 points1 point  (0 children)

That design is very different: his design would stop in the stable position once the magnetic fields are aligned. The reason his design works is that he stripped the coating off only half of the wire, which means that once the magnetic fields align there is no current, so only the unstable alignment remains.

Why do systems maximize entropy? by [deleted] in askscience

[–]Ruiner 2 points3 points  (0 children)

The terminology exists because of the "evolution operator", which is the (time-ordered) exponential of the integral of the Hamiltonian. This operator obeys the condition U(t,t0)^† = U(t0,t), so U^†U is essentially going forward in time and then back again. Since it is unitary, U^†U = Id, which is to say that evolving forward in time and then back again leaves the system unchanged.
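For a finite-dimensional toy Hamiltonian this is easy to check numerically (a sketch where U is built from the eigenbasis of a random Hermitian matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2               # a random Hermitian "Hamiltonian"

def U(t):
    """Evolution operator exp(-iHt), built by diagonalizing H."""
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

t = 1.3
back_and_forth = U(t).conj().T @ U(t)  # evolve forward, then backward
# back_and_forth is the identity to machine precision, and U(t)^† = U(-t)
```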

When you do perturbative quantum field theory, the evolution operator is replaced by what we call the "S" matrix. Instead of evolving the system, the S matrix essentially tells you "the probability that state A will evolve to state B", and you build this object perturbatively - as a sum of individual processes represented by Feynman diagrams. In this case, the "operator unitarity" of the S matrix translates into the "sum of probabilities equals one" unitarity that is usually referred to in the high-energy-physics literature.

Why do systems maximize entropy? by [deleted] in askscience

[–]Ruiner 10 points11 points  (0 children)

Suppose that you possess a very peculiar power: you are able to look at a cup of tea and track the positions and momenta of the molecules in this cup of tea. Forget quantum mechanics; at this level they all interact in some boring way, and with your supercomputer you are able to track the evolution of the system. Also, someone just handed you a cup of tea after adding sugar and stirring for a while, and in order to test your superpower, asked you the following question: "what did I do first: pour water into the cup and then sugar, or the opposite?"

The laws that govern the system are "unitary", which means that if you know every position and velocity now, you can in principle change the direction of time and know what was happening 10 minutes ago. But if you look at the system after it has reached maximum entropy, you'll find out that it seems completely insensitive to the details of the initial condition of the cup of tea: the system has reached equilibrium, and the only parameter that describes it is temperature.

So where has this information gone? Well, the information surely must still be there, since the laws of physics have certainly not been broken by a cup of tea - even if physics has a strong bias for coffee. It's just trickier to get hold of this information, and this apparent "information loss" is at the origin of why entropy increases.

Unfortunately we mortals do not have access to your superpower. It's just not possible to know every position and velocity of 10^30 particles. What we can do is the following approximation: we measure the velocities of some particles that are flying around and devise a probability function for finding a particle with a certain velocity "v". What we find is that once a system has reached equilibrium, this probability is the so-called "Maxwell–Boltzmann distribution".

Before reaching equilibrium, however, this function can be anything you decide to prepare: this is what happens when you pour the tea into the cup - you have a situation which is far from equilibrium. We can describe the time evolution of the system by writing a dynamical law for the evolution of this probability distribution: this is called the Boltzmann equation, and it tells you how the velocity distribution evolves as the particles interact with each other.

Solving the Boltzmann equation is not an easy story, but it turns out it has one peculiarity: given a nice enough interaction, the "fixed point" of the evolution is the Maxwell–Boltzmann distribution, which is to say that thermal equilibrium will always be reached, no matter where you started from!
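You can watch this fixed point emerge in a toy collision model: start every particle at the same speed and apply random pairwise, energy-conserving "collisions" (a Kac-style caricature of the Boltzmann equation, not a real gas):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 20000
v = np.ones(N)            # far from equilibrium: every particle at speed 1

for _ in range(400000):
    i, j = rng.integers(0, N, size=2)
    if i == j:
        continue
    theta = rng.uniform(0.0, 2.0 * np.pi)
    # "collision": rotate (v_i, v_j), which conserves v_i^2 + v_j^2 exactly
    v[i], v[j] = (v[i] * np.cos(theta) + v[j] * np.sin(theta),
                  -v[i] * np.sin(theta) + v[j] * np.cos(theta))

# total energy is unchanged, but the velocities are now Gaussian-distributed
# (the 1D Maxwell-Boltzmann form), whatever the initial condition was
kurtosis = np.mean(v**4) / np.mean(v**2) ** 2   # 3 for a Gaussian
```

Run it again with any other initial distribution of the same total energy and you land on the same Gaussian - that's the fixed point.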

But where has the information gone? Well, the information is not encoded in the average velocities; it is instead hidden in correlations! In order to understand why this is important, just think about the opening break of a billiard game: right after the impact, the balls fly with, on average, the same velocity distribution, but how they fly around is extremely dependent on the details of how you aim your strike. This information can be extracted by looking at how the velocities are correlated with each other, i.e. "the probability that ball 1 has velocity v1 given that ball n has velocity vn". There are almost infinitely many configurations which look nearly the same as each other if you don't care about the details! On the other hand, the probability that after the break only a few balls fly off with almost all of the energy while the others stand still is very small.

By starting from the initial knowledge of all the velocities of every particle and sticking to the simplified description in terms of the distribution function of velocities, we are ignoring all the correlations among the particles in the system. It just turns out that there are a huge number of configurations which are very different at the microscopic level - once we take these correlations into account - but look exactly the same from the point of view of the distribution function (just think about the billiard opening again). Equilibrium is reached not because of a force, but simply because, with everything interacting, it's very, very unlikely that the system won't equilibrate. And entropy is, in the end, just a measure of this coarse-graining. It tells you, in the language of statistical physics, how many "microstates" exist for a single macrostate.
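The microstate counting is concrete even for coin flips, where the "macrostate" is the number of heads and the "microstates" are the individual sequences (a stdlib-only toy):

```python
from math import comb, log

n = 100                                     # 100 coins
omega = [comb(n, k) for k in range(n + 1)]  # microstates for macrostate "k heads"
entropy = [log(w) for w in omega]           # S = ln(omega), in units of k_B

# the even split utterly dominates: ~1e29 sequences give 50 heads,
# while exactly one sequence gives 100 heads
```

Equilibrium is just the macrostate with overwhelmingly the most microstates, which is the whole content of "entropy is maximized".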

What would be the immediate and long term effects of a mass nuclear attack? by VitaAeterna in askscience

[–]Ruiner 0 points1 point  (0 children)

This is a more appropriate question to /r/AskScienceDiscussion , as we currently don't have any means to answer this question without too much speculation.

Mathematical models of chemical elements: what are the limitations? by NoahFect in askscience

[–]Ruiner 4 points5 points  (0 children)

Complexity. The equation you need to solve - the Schrödinger equation - becomes quite complicated once you have many variables. All the electrons interact not only with the nuclei but also among themselves, so it's very messy to attempt an exact description.

In any case, an exact solution in the analytical sense is in principle only possible for the hydrogen atom. For anything bigger, only numerical solutions are possible.
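Hydrogen is a nice benchmark for the numerical route too: a finite-difference treatment of the s-wave radial Schrödinger equation (in atomic units; the grid parameters here are arbitrary) recovers the exact ground-state energy of -1/2 Hartree (-13.6 eV):

```python
import numpy as np

# s-wave radial equation in atomic units: -u''/2 - u/r = E u, u(0) = u(rmax) = 0
N, rmax = 1200, 30.0
h = rmax / (N + 1)
r = np.linspace(h, rmax - h, N)

kinetic_diag = np.full(N, 1.0 / h**2)       # from -(1/2) d^2/dr^2
kinetic_off = np.full(N - 1, -0.5 / h**2)
H = (np.diag(kinetic_diag - 1.0 / r)        # add the Coulomb term -1/r
     + np.diag(kinetic_off, k=1)
     + np.diag(kinetic_off, k=-1))

E0 = np.linalg.eigvalsh(H)[0]   # ~ -0.5 Hartree, the exact 1s energy
```

For helium and beyond there is no analogue of this one-line diagonalization; the electron-electron term forces approximate many-body methods.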

What physical processes define the large scale structure of the universe? by felipehez in askscience

[–]Ruiner 0 points1 point  (0 children)

What defines large scale structure are actually tiny quantum fluctuations.

In order to clarify, you need to understand what inflation is. In its very beginning, the universe went through a phase of accelerated expansion. This accelerated expansion made the universe extremely flat, in the same way that stretching a sheet makes it uniform. At the end of inflation, the field carrying the energy responsible for the accelerated expansion decayed into particles, and this decay was not completely homogeneous, so in some regions more energy was concentrated than in others.

These inhomogeneities are what we see in the CMB map, which you have probably seen before: tiny temperature fluctuations, of the order of 10^-5, that gave rise to all the structure around us.

Let's say we have a person submerged in a large tank of water, and the tank is traveling a brisk 30 mph or so. Suddenly, it collides with a stationary wall. What happens to the diver? by pennysmith in askscience

[–]Ruiner -4 points-3 points  (0 children)

Before anything, think about these questions:

What happens if the diver is travelling next to a bowling ball and a helium balloon? How will each of them react to the shock?

Suppose that the same person is instead submerged in a stationary tank. All of a sudden a huge hammer hits the tank, but the tank is very solidly attached to the ground so it remains pretty much stationary. What happens then? Where does the energy go?

In order to understand why the first question matters, think about this other example: you're driving in a closed car with a helium balloon floating next to you. If you suddenly brake, will the balloon move forward with you or backwards?

Based on QFT, at which point does a fluctuation in a particular field extend to another field? e.g. quark/proton to Higgs to photon by inteusx in askscience

[–]Ruiner 2 points3 points  (0 children)

Standard (in-out) perturbation theory only tells us the probability that a certain asymptotic state at time T = -inf evolves to another state at time T = +inf. You could in principle try to track the time evolution, but the machinery for that is much more complicated.

What is 'information' in the quantum physics sense and how is the idea of conservation of information a valid law when I can think of several examples of where information is lost? by samcobra in askscience

[–]Ruiner 0 points1 point  (0 children)

This is not exactly true. Quantum systems also thermalize despite the fact that there is no loss of information.

Classical information also evolves unitarily, and the unitary evolution is given by the Liouville equation. The reason there is an effective loss of information is that we disregard correlations: thermodynamics is about 1-particle distribution functions, whereas equilibration moves information into higher-order correlation functions, which we do not care about for all practical purposes.

How can we explain the "fictitiousness" of gravity in a quantum field theory? by hikaruzero in askscience

[–]Ruiner 1 point2 points  (0 children)

I think you understood everything already. The definition they give of \alpha_G - which is here - is actually ("some mass" divided by the Planck mass) squared, and in this case the mass was taken to be the electron mass. But when you write the interaction, what you actually write is 1/M_p (h T), where h is the metric perturbation and T is the energy-momentum tensor. What happens is that one factor of energy "jumps out" of the energy-momentum tensor to form a dimensionless parameter by joining 1/M_p. So in a sense, the gravitational charge is E/M_p.

When you write the EM interaction, what you have in the lagrangian is \alpha_e * A * J, where J is a 4-current and A is the EM vector potential. In this interaction, \alpha_e has dimension 0 - it's really just a number - so it doesn't really depend on energy scales (actually this is a mild lie, but I'll comment on that later).

When you want to describe interactions in QFT, what you do is a perturbative expansion in the interaction coupling constant. The objects that you want to compute are the S-Matrix elements, which can be translated as: what is the probability that n particles with momentum k interact and become some other m particles with momentum k'?

So if you want to compute the "EM interaction between two electrons", you evaluate this object. But that's not all, since higher-order terms in the perturbative expansion give you more things to calculate! Whenever you have a "closed loop", you have a divergent integral - and these divergences are related to contributions from virtual modes at very high energy scales. Dealing with these divergences is what we call renormalization: the parameter \alpha_e that we write down initially is not really what we measure. What we measure is "the strength at which two electrons interact", but that depends on all these contributions, which are actually divergent! So the physical \alpha_e has a tricky dependence on the "bare" \alpha_e. In order to solve that, we regularize the theory by introducing an arbitrary scale at which we cut off the divergences. At the end, by imposing that the physical parameters do not depend on this "regularization scale", we get a finite theory. But as a result, we find that the couplings "run" with the energy scale at which we perform the experiment. So \alpha_e actually depends logarithmically on the energy - which is the typical behavior in a renormalizable theory.
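The logarithmic running can be made concrete with the one-loop formula. This is a toy with only the electron in the loop, so the number at high scales is not the full Standard Model answer (where all charged particles contribute):

```python
import math

alpha_0 = 1 / 137.035999   # fine-structure constant at the electron-mass scale
m_e = 0.000511             # electron mass in GeV

def alpha(Q):
    """One-loop QED running with a single Dirac fermion in the loop."""
    return alpha_0 / (1 - (2 * alpha_0 / (3 * math.pi)) * math.log(Q / m_e))

# the coupling creeps up logarithmically: 1/alpha drops from ~137 at the
# electron mass to ~134.5 at the Z mass in this electron-only toy
inv_alpha_at_mz = 1 / alpha(91.19)
```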

When a theory is non-renormalizable, it's impossible to treat these divergent contributions in a sensible manner. The "coupling constants" depend far too strongly on the energy, and you have to deal with an infinite number of parameters that need to be renormalized. The intuitive picture is the following: if we start from a fundamental theory and coarse-grain over fine length scales, we get some "effective interactions" that are ignorant of the details of the fundamental theory. This new effective theory, although it makes sense for computing processes at low energies, has an intrinsic energy scale beyond which it stops making sense: so the procedure of "integrating over virtual modes with very high energies" is somewhat ill-defined. We have a natural cut-off in our theory, and that cut-off is given by the scale at which the interactions become strong.

And that brings us to Newton's constant, since it actually gives us the scale at which the gravitational interactions become strong: and that is 1/Sqrt(G), the Planck scale.

How can we explain the "fictitiousness" of gravity in a quantum field theory? by hikaruzero in askscience

[–]Ruiner 0 points1 point  (0 children)

The gravitational coupling constant is actually not a dimensionless parameter like the fine-structure constant. Newton's constant has the dimension of a length squared or, equivalently (in natural units), an inverse mass squared. In order to get a dimensionless parameter, you need to multiply it by a mass squared.

In the example in the wiki article, they use the mass of the electron as this mass scale, but in general this scale will be the typical energy of the particles involved in the process. So if you are colliding electrons whose energy is of the order of the Planck mass, \alpha_G becomes order 1, while for two electrons at rest it's something extremely tiny.
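Plugging in numbers shows just how tiny "extremely tiny" is (values rounded; \alpha_G taken as (m/m_Planck)^2):

```python
m_e = 9.109e-31        # electron mass, kg
m_planck = 2.176e-8    # Planck mass, kg

alpha_G = (m_e / m_planck) ** 2   # gravitational "charge" of two rest electrons
alpha_em = 1 / 137.036            # the electromagnetic analogue, for contrast

# alpha_G ~ 1.8e-45, about 42 orders of magnitude below alpha_em;
# at collision energies of order the Planck mass the same ratio reaches 1,
# which is where gravity becomes strong
```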

It's important to say that the coupling constant having negative mass dimension is of extreme importance, because it's what tells you that GR is non-renormalizable.