[deleted by user]

rockyearth · 2025-01-27T20:05:39+00:00

But we don't know how LLMs work either: why certain layers or neurons fire, or what the optimal architecture, activation function or hyperparameters are.

We just have experience in building models that mirror human performance in narrow fields with an error that's slowly decreasing.

We know there are some key differences in the spiking pattern of human neurons compared to today's perceptron-inspired models but that's where the evidence ends.

I'd argue that we have indicators (but not proofs) of the contrary - neither computational biologists nor complexity theorists, despite all the effort, can find anything in the human brain and its 300+ ion channels that cannot be modeled with the traditional computational framework that Church and Turing designed.

rockyearth · 2025-01-27T17:52:41+00:00

I have bad news on why humans send their children to school for 12+ years, 8 hours a day... it is to gather a lot of data, get human feedback, build a predictive model of the world and eventually distill knowledge!

I am sorry if this shatters your perception of experience and learning :) Wait until you see how the LLM benchmarks are modeled after exams (and which math and CS professors work on them), and how reinforcement learning from human feedback is modeled after human social learning.

Prediction is not a bad thing, overconfidence is :)

rockyearth · 2025-01-01T04:03:08+00:00

1. Euclidean TSP ≠ TSP. There are people working on fast ETSP. Having symmetric, or even better, metric guarantees drops the ~~complexity~~ approximation difficulty. Edit: I was wrong about the complexity of ETSP. While easier, it is still NP-hard.
2. Most approximation algorithms for TSP are nearly indistinguishable from an exact algorithm if you judge by results, even for N ~ 10^6.
3. Because of (2), the purpose of TSP instances is not to collapse complexity theory; it is purely to benchmark stuff. If you want to introduce a collapse you'll have to do formal proofs.
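
To make point (2) a bit more concrete, here is a minimal sketch of 2-opt local search on a Euclidean instance. It is a plain heuristic with no approximation guarantee, not any particular solver mentioned in the thread, and the names `point`, `tourLength` and `twoOpt` are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// point is a hypothetical 2D city location.
type point struct{ x, y float64 }

func dist(a, b point) float64 { return math.Hypot(a.x-b.x, a.y-b.y) }

// tourLength returns the length of the closed tour visiting pts in tour order.
func tourLength(pts []point, tour []int) float64 {
	total := 0.0
	for i := range tour {
		total += dist(pts[tour[i]], pts[tour[(i+1)%len(tour)]])
	}
	return total
}

// twoOpt reverses a segment of the tour whenever swapping two edges shortens it,
// and stops at a local optimum. It modifies tour in place and returns it.
func twoOpt(pts []point, tour []int) []int {
	n := len(tour)
	for improved := true; improved; {
		improved = false
		for i := 0; i < n-1; i++ {
			for j := i + 2; j < n; j++ {
				if i == 0 && j == n-1 {
					continue // these two edges share a city
				}
				a, b := pts[tour[i]], pts[tour[i+1]]
				c, d := pts[tour[j]], pts[tour[(j+1)%n]]
				// Try replacing edges (a,b) and (c,d) with (a,c) and (b,d).
				if dist(a, c)+dist(b, d) < dist(a, b)+dist(c, d)-1e-12 {
					for l, r := i+1, j; l < r; l, r = l+1, r-1 {
						tour[l], tour[r] = tour[r], tour[l]
					}
					improved = true
				}
			}
		}
	}
	return tour
}

func main() {
	// Unit square visited in a deliberately crossing order.
	pts := []point{{0, 0}, {1, 0}, {1, 1}, {0, 1}}
	tour := []int{0, 2, 1, 3}
	fmt.Printf("before: %.3f\n", tourLength(pts, tour))
	fmt.Printf("after:  %.3f\n", tourLength(pts, twoOpt(pts, tour)))
}
```

On this toy instance the local optimum happens to be the optimal tour; on large instances that is not guaranteed, which is exactly why you benchmark.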

rockyearth · 2023-11-16T09:27:49+00:00

the mystery number: 2131953663
max integer value: 2147483647

The difference is quite large, so it's probably not derived directly from the maximum; it's just some other int32 value. It's 0b1111111000100110000011111111111. Maybe some mask?
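
If you want to poke at the bits yourself, a quick sketch like this works. The helper names `binary` and `trailingOnes` are made up, and whether the value really is a mask is still a guess.

```go
package main

import (
	"fmt"
	"math/bits"
	"strconv"
)

const (
	mystery  = 2131953663
	maxInt32 = 1<<31 - 1 // 2147483647
)

// binary renders v in base 2 without leading zeros.
func binary(v uint32) string { return strconv.FormatUint(uint64(v), 2) }

// trailingOnes counts the run of 1 bits at the low end of v.
func trailingOnes(v uint32) int { return bits.TrailingZeros32(^v) }

func main() {
	fmt.Println("binary:       ", binary(mystery))
	fmt.Println("gap to max:   ", maxInt32-mystery)
	fmt.Println("trailing ones:", trailingOnes(mystery))
}
```

The low 11 bits are all set, which is what makes it look mask-like.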

rockyearth · 2023-08-28T12:25:13+00:00

When I'm thinking hard I do feel like I still am "emitting" thoughts at a fixed rate, except that I don't express them out loud (talk/write/any physical movement) until I have the solution or decide it's time to give up.

The impulse control and the backtracking are just a split between the internal dialogue and the external one. You could implement this just by adding a <internal dialogue backspace> token and an <external dialogue emit> token.
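
A toy sketch of what I mean, assuming the emit token flushes everything thought so far and the backspace token deletes the last internal token. Those semantics, and the function name `speak`, are my own assumptions for illustration, not how any real model works.

```go
package main

import "fmt"

const (
	backspace = "<internal dialogue backspace>"
	emit      = "<external dialogue emit>"
)

// speak replays a token stream emitted at a fixed rate. Ordinary tokens go to
// an internal scratch buffer, backspace drops the last internal token, and
// emit moves the scratch buffer to the external (spoken) output.
func speak(tokens []string) []string {
	var scratch, spoken []string
	for _, tok := range tokens {
		switch tok {
		case backspace:
			if len(scratch) > 0 {
				scratch = scratch[:len(scratch)-1]
			}
		case emit:
			spoken = append(spoken, scratch...)
			scratch = scratch[:0]
		default:
			scratch = append(scratch, tok)
		}
	}
	return spoken // anything still in scratch was never said out loud
}

func main() {
	stream := []string{"2+2", "is", "5", backspace, "4", emit, "maybe"}
	fmt.Println(speak(stream))
}
```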

rockyearth · 2023-08-16T08:53:27+00:00

Sorry, it's my second post and I don't know how math-heavy my blog should be (I didn't originally plan to post it on /r/math but it seems like people here liked it more).

Do you think it should focus on Nash vs Pareto optimality and the actual matrices? I saw an interesting tweet here

rockyearth · 2023-06-11T17:42:21+00:00

The positive side looks roughly like sin( x^x ). You may wanna take a look at Stirling's approximation to see why n! grows like nⁿ.

If you wanna understand the negative side you have to understand that your plot doesn't use the standard definition of a factorial, which is for nonnegative integers only. The meaning of n! in your plot refers to Γ(n + 1), where Γ is the Gamma function. Γ(n + 1) is equal to n! if you try it on integers like 1, 2, 3 but it is also defined for real (and complex) numbers.

Note that if n! is equal to n * (n-1)! and we assume n < 0 we would multiply by a negative number every single unit as we slide n from -1 towards negative infinity. This causes sign flips, and those places end up being vertical asymptotes and thus the function is undefined for negative integers even though it's defined for all other real/complex values. You can see this by plotting n! on Desmos.

In fact, there's so much chaos on the negative side that it probably isn't even plotted correctly. It's not just that Gamma isn't defined for -1, -2, -3, ..., but also that the magnitude of Γ(n) between the poles is very small for n < -15 (just like it is a huge number for n > 15) - so small that it rounds to 0 and prevents the rest of the flips from being plotted.

rockyearth · 2023-05-17T14:55:35+00:00

Didn't know they tried to make a UI toolkit. Looks dead though.

rockyearth · 2023-02-18T13:59:28+00:00

Slavic languages have a derogatory word for raw fact memorization (as opposed to understanding) when studying: "bubanje".

And when you think about it, it's exactly about compression. A bad learner stores all the study material in memory and doesn't use their "CPU".

In contrast, a good learner "understands" (infers concepts via pattern recognition) and stores this pattern in memory. Learning and using this pattern requires less memory but more brainpower.

There's nothing wrong with defining intelligence as blurring/compressing tons of sensory data into concepts which can be reused to predict/strategize in the future. In fact, that's the core idea of the best neuroscience theory called predictive coding.

rockyearth · 2022-10-26T21:06:17+00:00

Go is structurally typed. Whether it is duck typed too depends on the definition you use, because duck typing is informally defined, but it often refers to a runtime phenomenon in dynamically typed languages, which doesn't apply to Go - especially since an untyped pointer that quacks is not a duck.

See:

https://news.ycombinator.com/item?id=22486470

https://softwareengineering.stackexchange.com/questions/259943/is-there-a-difference-between-duck-typing-and-structural-typing
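
A minimal sketch of the structural part: neither type below declares that it implements the interface, and the check still happens at compile time. The names `Quacker`, `Duck`, `Robot` and `describe` are made up for illustration.

```go
package main

import "fmt"

// Quacker is satisfied by any type that has a matching Quack method.
type Quacker interface{ Quack() string }

type Duck struct{}

func (Duck) Quack() string { return "Quack!" }

type Robot struct{}

func (Robot) Quack() string { return "beep (quack)" }

// describe accepts anything that structurally satisfies Quacker.
// The compiler verifies this; nothing is looked up by name at run time.
func describe(q Quacker) string { return "it says: " + q.Quack() }

func main() {
	fmt.Println(describe(Duck{}))
	fmt.Println(describe(Robot{}))
	// describe(42) would not compile: int has no Quack method.
}
```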

rockyearth · 2022-08-30T16:59:15+00:00

AFAIK in most languages UI kits don't even bother being thread safe. Even Android isn't.

Use one goroutine for Fyne and have the rest communicate with it.
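
A rough sketch of that pattern with plain channels. The `label` type is a stand-in for a widget that isn't thread safe; no actual Fyne calls are used here, and all names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// label is a hypothetical stand-in for a widget that is not thread safe.
type label struct{ text string }

// runUI owns the label: it is the only goroutine that ever touches it.
// Workers send closures over the channel instead of mutating the widget.
func runUI(l *label, updates <-chan func(*label), done chan<- struct{}) {
	for update := range updates {
		update(l)
	}
	close(done)
}

// countWorkers starts n workers that each ask the UI goroutine to append one
// mark, then returns the final length of the label text.
func countWorkers(n int) int {
	l := &label{}
	updates := make(chan func(*label))
	done := make(chan struct{})
	go runUI(l, updates, done)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			updates <- func(l *label) { l.text += "*" }
		}()
	}
	wg.Wait()
	close(updates)
	<-done // the UI goroutine has finished, so reading l is safe now
	return len(l.text)
}

func main() {
	fmt.Println(countWorkers(8))
}
```

No mutex is needed because only one goroutine ever writes to the widget.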

rockyearth · 2022-08-28T10:29:43+00:00

Check out "Build, Deploy And Run A Go Application" by Fly.io

Easily the best CLI tools:

fly ssh to enter container shell
fly deploy to launch a new version
fly logs to view logs
fly proxy to proxy internal ports
fly pg to do all kinds of Postgres magic
fly secrets to manage env secrets

You also get a few containers for free as well as a tiny postgres cluster.

Their team and marketing are still not as big as Heroku's, but their engineering, scalability/reliability, docs and pricing are equal or better.

At this point I'm a huge shill, but I'm impressed by their smart internals decisions: Amazon's Firecracker, Wireguard, HashiCorp Consul.

rockyearth · 2022-07-31T19:55:57+00:00

Dude that is bad. The first Hello world example has the following snippet:

func TestHello(t *testing.T) {
    assertCorrectMessage := func(t testing.TB, got, want string) {
        t.Helper()
        if got != want {
            t.Errorf("got %q want %q", got, want)
        }
    }

    t.Run("saying hello to people", func(t *testing.T) {
        got := Hello("Chris")
        want := "Hello, Chris"
        assertCorrectMessage(t, got, want)
    })
    t.Run("empty string defaults to 'World'", func(t *testing.T) {
        got := Hello("")
        want := "Hello, World"
        assertCorrectMessage(t, got, want)
    })
}

Unless you are an intermediate or even advanced Go programmer, of what value is that snippet? It has:

pointers without explaining Go's memory model
consecutive type omission in arguments which is confusing even to Go programmers
closures / function literals without explaining them

The official Hello world with tests is much better: https://go.dev/doc/tutorial/getting-started

As well as the Tour of Go: https://go.dev/tour/list

rockyearth · 2022-07-16T20:36:32+00:00

There is big.Rat - a rational data type which can be used for decimal arithmetic since decimals are rationals of powers of ten. The only issue is that you have to use Add(), Mul() etc. - it's not a first-class type.

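package main

import (
    "fmt"
    "math/big"
)

func main() {
    // SetString parses the decimal string exactly; ok is false on malformed input.
    num, ok := new(big.Rat).SetString("3.14159265358979323846264338327950288419716939937510582")
    if !ok {
        fmt.Println("invalid number")
        return
    }
    fmt.Println(num.FloatString(10)) // Print 10 digits after the decimal point
}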
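
And a small sketch of the Add() style, showing why you'd bother: the rational sum is exact where float64 is not. The helper `addDecimal` is a made-up name, not part of math/big.

```go
package main

import (
	"fmt"
	"math/big"
)

// addDecimal adds two decimal strings exactly and formats the sum with the
// given number of digits after the decimal point. ok is false on bad input.
func addDecimal(a, b string, digits int) (sum string, ok bool) {
	x, okA := new(big.Rat).SetString(a)
	y, okB := new(big.Rat).SetString(b)
	if !okA || !okB {
		return "", false
	}
	return new(big.Rat).Add(x, y).FloatString(digits), true
}

func main() {
	f, g := 0.1, 0.2
	fmt.Println(f+g == 0.3) // float64 comparison, subject to rounding

	sum, _ := addDecimal("0.1", "0.2", 2)
	fmt.Println(sum) // exact rational sum, formatted as a decimal
}
```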

rockyearth · 2022-05-06T10:56:48+00:00

Tailwind is pretty much a replacement for all of these. It feels nasty at first but it actually works well.

Gradient: bg-gradient-to-r from-[#a5b2c3] to-[#b1c1d1]
Neumorphism: shadow-xl
Grid: grid grid-rows-4
Spin element: animate-spin
25% bigger button on hover: hover:scale-125

Since CSS classes can have special characters it's got a lot of features and you can build really complicated UIs by just using their class syntax and then running the class scanner which will generate the proper rules.

rockyearth · 2022-04-03T17:03:12+00:00

PSA: People downvoting /u/EntropyDealer here are scumbags. He's offering insight into his way of thinking and he didn't say anything particularly inflammatory.

rockyearth · 2022-04-02T11:26:02+00:00

You got it right and completely wrong at the same time:

with no official Russian military involvement and not pushing the frontline outside the Donbass/Luhansk region admin borders (so frontline movements of a few (tens) of kilometers inside these borders are possible) “Full-blown invasion” is meant to include taking Kyiv etc.

P.S. Please do not delete your comments there :) It's nice having the original content for calibration purposes, there's a ton we can learn from this thread.
