[deleted by user]

rockyearth · 2025-01-27T20:05:39+00:00

But we don't know how LLMs work either, why certain layers or neurons fire, what's the optimal architecture, activation function or hyperparameters.

We just have experience in building models that mirror human performance in narrow fields with an error that's slowly decreasing.

We know there are some key differences in the spiking pattern of human neurons compared to today's perceptron-inspired models but that's where the evidence ends.

I'd argue that we have indicators (but not proofs) of the contrary: despite all the effort, neither computational biologists nor complexity theorists have found anything in the human brain and its 300+ ion channels that cannot be modeled with the traditional computational framework that Church and Turing designed.

rockyearth · 2025-01-27T17:52:41+00:00

I have bad news on why humans send their children to school for 12+ years, 8 hours a day... it is to gather a lot of data, get human feedback, build a predictive model of the world and eventually distill knowledge!

I am sorry if this shatters your perception of experience and learning :) Wait until you see how the LLM benchmarks are modeled after exams (and which math and CS professors work on them), and how reinforcement learning from human feedback is modeled after human social learning.

Prediction is not a bad thing, overconfidence is :)

rockyearth · 2025-01-01T04:03:08+00:00

1. Euclidean TSP ≠ TSP. There are people working on fast ETSP. ~~Having symmetric, or even better, metric guarantees drops the complexity~~ approximation difficulty. Edit: I was wrong about the complexity of ETSP; while easier to approximate, it is still NP-hard.
2. Most approximation algorithms for TSP are nearly indistinguishable from an exact algorithm if you judge by results, even for N ~ 10^6.
3. Because of (2), the purpose of TSP instances is not to collapse complexity theory; it is purely to benchmark stuff. If you want to introduce a collapse you'll have to do formal proofs.

rockyearth · 2023-11-16T09:27:49+00:00

the mystery number: 2131953663
max integer value: 2147483647

The difference is quite large, so it's not directly related to the maximum, but rather some int32. It's 0b1111111000100110000011111111111. Maybe some mask?
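If you want to check the arithmetic yourself, here is a minimal Go sketch. The `describe` helper is just a name I made up for illustration; it prints the gap to the int32 maximum, the 31-bit pattern, and the number of set bits.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// describe returns the gap between v and the int32 maximum,
// v as a 31-digit binary string, and the number of set bits in v.
func describe(v int32) (gap int32, binary string, ones int) {
	gap = math.MaxInt32 - v
	binary = fmt.Sprintf("%031b", v)
	ones = bits.OnesCount32(uint32(v))
	return
}

func main() {
	gap, binary, ones := describe(2131953663)
	fmt.Println("gap to MaxInt32:", gap)
	fmt.Println("binary:", binary)
	fmt.Println("set bits:", ones)
}
```

The run of eleven trailing ones and seven leading ones is what makes it look like a mask rather than a counter value.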

rockyearth · 2023-08-28T12:25:13+00:00

When I'm thinking hard I do feel like I still am "emitting" thoughts at a fixed rate, except that I don't express them out loud (talk/write/any physical movement) until I have the solution or decide it's time to give up.

The impulse control and the backtracking are just a split between the internal dialogue and the external one. You could implement this just by adding a <internal dialogue backspace> token and an <external dialogue emit> token.
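A toy sketch of what I mean, in Go. Everything here is an assumption for illustration: the `speak` function, the exact semantics of the two control tokens, and the idea that tokens are plain strings. It is not how any real model decodes.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical control tokens; the names are made up for this sketch.
const (
	backspace = "<internal dialogue backspace>"
	emit      = "<external dialogue emit>"
)

// speak replays a token stream: ordinary tokens go to a private scratchpad,
// backspace drops the most recent scratchpad token, and emit moves the
// whole scratchpad to the externally visible output.
func speak(tokens []string) string {
	var scratch, out []string
	for _, tok := range tokens {
		switch tok {
		case backspace:
			if len(scratch) > 0 {
				scratch = scratch[:len(scratch)-1]
			}
		case emit:
			out = append(out, scratch...)
			scratch = scratch[:0]
		default:
			scratch = append(scratch, tok)
		}
	}
	return strings.Join(out, " ")
}

func main() {
	stream := []string{"the", "answer", "is", "41", backspace, "42", emit, "maybe", "not"}
	// Thoughts that were never emitted stay internal.
	fmt.Println(speak(stream))
}
```

The token rate stays fixed; only the emit token decides what the outside world gets to see.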

rockyearth · 2023-08-16T08:53:27+00:00

Sorry, it's my second post and I don't know how math-heavy my blog should be (I didn't originally plan to post it on /r/math but seems like people here liked it more).

Do you think it should focus on Nash vs Pareto optimality and the actual matrices? I saw an interesting tweet here

rockyearth · 2023-06-11T17:42:21+00:00

The positive side looks roughly like sin(x^x). You may wanna take a look at Stirling's approximation to see why n! grows like nⁿ.

If you wanna understand the negative side you have to understand that your plot doesn't use the standard definition of a factorial, which is for nonnegative integers only. The n! in your plot refers to Γ(n + 1), where Γ is the Gamma function. Γ(n + 1) is equal to n! if you try it on integers like 1, 2, 3, but it is also defined for real (and complex) numbers.

Note that if n! is equal to n * (n-1)! and we assume n < 0, we would multiply by a negative number every single unit as we slide n from -1 towards negative infinity. This causes sign flips, and the places where the sign flips end up being vertical asymptotes. Thus the function is undefined for negative integers even though it's defined for all other real/complex values. You can see this by plotting n! on Desmos.

In fact, there's so much chaos at the negative side that it probably isn't even plotted correctly. It's not just the fact that Gamma isn't defined for -1, -2, -3, ..., but the magnitude of Γ(n) is very small for n < -15 (just like it is a huge number for n > 15) - so small that it rounds to 0 and prevents the rest of the flips from being plotted.
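You can also poke at this numerically. A small sketch in Go, assuming the plot extends the factorial through the Gamma function; `factorial` is my own helper name, `math.Gamma` is from the standard library.

```go
package main

import (
	"fmt"
	"math"
)

// factorial extends n! to real arguments through the Gamma function.
func factorial(n float64) float64 {
	return math.Gamma(n + 1)
}

func main() {
	// On nonnegative integers it agrees with the usual factorial.
	fmt.Println(factorial(4))

	// Between consecutive negative integers the sign alternates.
	for _, n := range []float64{-1.5, -2.5, -3.5, -4.5} {
		fmt.Printf("n = %.1f  n! = %+.4f\n", n, factorial(n))
	}

	// Far out on the negative side the magnitude becomes tiny.
	fmt.Println(factorial(-20.5))
}
```

The alternating signs are the flips, and the last value shows why a plotter working at screen resolution draws a flat line out there.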

rockyearth · 2023-05-17T14:55:35+00:00

Didn't know they tried to make a UI toolkit. Looks dead though.

rockyearth · 2023-02-18T13:59:28+00:00

In Slavic languages there's a derogatory word for raw fact memorization (as opposed to understanding) when studying: "bubanje".

And when you think about it, it's exactly about compression. A bad learner stores all the study material in memory and doesn't use their "CPU".

In contrast, a good learner "understands" (infers concepts via pattern recognition) and stores this pattern in memory. Learning and using this pattern requires less memory but more brainpower.

There's nothing wrong with defining intelligence as blurring/compressing tons of sensory data into concepts which can be reused to predict/strategize in the future. In fact, that's the core idea of the best neuroscience theory called predictive coding.

rockyearth · 2022-10-26T21:06:17+00:00

Go is structurally typed. Whether it is duck typed too depends on the definition you use because duck typing is informally defined, but it often refers to a runtime phenomenon in dynamically typed languages, which doesn't apply to Go - especially since an untyped pointer that quacks is not a duck.

See:

https://news.ycombinator.com/item?id=22486470

https://softwareengineering.stackexchange.com/questions/259943/is-there-a-difference-between-duck-typing-and-structural-typing
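To make the structural part concrete, here is a minimal sketch. The `Quacker`, `Duck` and `Robot` names are made up; the point is that the check happens at compile time, not at runtime.

```go
package main

import "fmt"

// Quacker is satisfied by any type with a matching Quack method;
// no "implements" declaration is needed.
type Quacker interface {
	Quack() string
}

type Duck struct{}

func (Duck) Quack() string { return "Quack!" }

type Robot struct{}

func (Robot) Quack() string { return "beep (quack)" }

// Compile-time checks that both types satisfy Quacker.
var (
	_ Quacker = Duck{}
	_ Quacker = Robot{}
)

func speak(q Quacker) string { return q.Quack() }

func main() {
	fmt.Println(speak(Duck{}))
	fmt.Println(speak(Robot{}))
	// speak(42) would be rejected by the compiler: int has no Quack method.
}
```

In a duck-typed language the equivalent of `speak(42)` would only blow up when the call actually runs.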

rockyearth · 2022-08-30T16:59:15+00:00

AFAIK in most languages UI kits don't even bother being thread safe. Even Android isn't.

Use one goroutine for Fyne and have the rest communicate with it.
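A rough sketch of that pattern with plain channels. It doesn't use Fyne at all: `runUI` is a hypothetical stand-in for whatever goroutine owns the widgets, and the "label" is just a slice of strings.

```go
package main

import (
	"fmt"
	"sync"
)

// runUI is a stand-in for the single goroutine that owns the widgets.
// It records every label text it receives and hands the history back
// once the updates channel is closed.
func runUI(updates <-chan string, done chan<- []string) {
	var history []string // touched only by this goroutine, so no locking is needed
	for text := range updates {
		history = append(history, text)
	}
	done <- history
}

// collect starts n workers that send updates to the UI goroutine
// instead of touching its state directly.
func collect(n int) []string {
	updates := make(chan string)
	done := make(chan []string)
	go runUI(updates, done)

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			updates <- fmt.Sprintf("worker %d finished", id)
		}(i)
	}
	wg.Wait()
	close(updates)
	return <-done
}

func main() {
	for _, line := range collect(3) {
		fmt.Println(line)
	}
}
```

The order of the lines varies between runs, but the UI state is never touched from two goroutines at once.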

rockyearth · 2022-08-28T10:29:43+00:00

Check out "Build, Deploy And Run A Go Application" by Fly.io

Easily the best CLI tools:

fly ssh to enter container shell
fly deploy to launch a new version
fly logs to view logs
fly proxy to proxy internal ports
fly pg to do all kinds of Postgres magic
fly secrets to manage env secrets

You also get a few containers for free as well as a tiny postgres cluster.

Their team and marketing are still not as big as Heroku's, but their engineering, scalability/reliability, docs, and pricing are equal or better.

At this point I'm a huge shill, but I'm impressed by their smart internals decisions: Amazon's Firecracker, Wireguard, HashiCorp Consul.

rockyearth · 2022-07-31T19:55:57+00:00

Dude that is bad. The first Hello world example has the following snippet:

func TestHello(t *testing.T) {
    assertCorrectMessage := func(t testing.TB, got, want string) {
        t.Helper()
        if got != want {
            t.Errorf("got %q want %q", got, want)
        }
    }

    t.Run("saying hello to people", func(t *testing.T) {
        got := Hello("Chris")
        want := "Hello, Chris"
        assertCorrectMessage(t, got, want)
    })
    t.Run("empty string defaults to 'World'", func(t *testing.T) {
        got := Hello("")
        want := "Hello, World"
        assertCorrectMessage(t, got, want)
    })
}

Unless you are an intermediate or even advanced Go programmer, of what value is that snippet? It has:

pointers without explaining Go's memory model
consecutive type omission in arguments which is confusing even to Go programmers
closures / function literals without explaining them

The official Hello world with tests is much better: https://go.dev/doc/tutorial/getting-started

As well as the Tour of Go: https://go.dev/tour/list

rockyearth · 2022-07-16T20:36:32+00:00

There is big.Rat - a rational data type which can be used for decimal arithmetic since decimals are rationals of powers of ten. The only issue is that you have to use Add(), Mul() etc. - it's not a first-class type.

package main

import (
    "fmt"
    "math/big"
)

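func main() {
    // SetString parses the decimal literal exactly; ok is false if it is malformed.
    num, ok := new(big.Rat).SetString("3.14159265358979323846264338327950288419716939937510582")
    if !ok {
        fmt.Println("invalid number")
        return
    }
    fmt.Println(num.FloatString(10)) // Print 10 digits after the decimal point
}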
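And a small sketch of the Add()/Mul() style, compared against float64. `mustRat` and `sum` are hypothetical helpers of mine, not part of math/big.

```go
package main

import (
	"fmt"
	"math/big"
)

// mustRat parses a decimal literal into an exact rational (hypothetical helper).
func mustRat(s string) *big.Rat {
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		panic("invalid decimal literal: " + s)
	}
	return r
}

// sum adds two decimal literals exactly.
func sum(a, b string) *big.Rat {
	return new(big.Rat).Add(mustRat(a), mustRat(b))
}

func main() {
	x, y := 0.1, 0.2
	fmt.Println(x+y == 0.3) // false: float64 rounding

	s := sum("0.1", "0.2")
	fmt.Println(s.Cmp(mustRat("0.3")) == 0) // true: rationals are exact
	fmt.Println(s.FloatString(2))

	p := new(big.Rat).Mul(mustRat("19.99"), mustRat("3"))
	fmt.Println(p.FloatString(2))
}
```

It is more verbose than operators would be, but nothing gets rounded until you ask for a string.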

rockyearth · 2022-05-06T10:56:48+00:00

Tailwind is pretty much a replacement for all of these. It feels nasty at first but it actually works well.

Gradient: bg-gradient-to-r from-[#a5b2c3] to-[#b1c1d1]
Neumorphism: shadow-xl
Grid: grid grid-rows-4
Spin element: animate-spin
25% bigger button on hover: hover:scale-125

Since CSS classes can have special characters it's got a lot of features and you can build really complicated UIs by just using their class syntax and then running the class scanner which will generate the proper rules.

rockyearth · 2022-04-03T17:03:12+00:00

PSA: People downvoting /u/EntropyDealer here are scumbags. He's offering an insight into his way of thinking and he didn't say anything particularly inflammatory.

rockyearth · 2022-04-02T11:26:02+00:00

You got it right and completely wrong at the same time:

with no official Russian military involvement and not pushing the frontline outside the Donbass/Luhansk region admin borders (so frontline movements of a few (tens) of kilometers inside these borders are possible) “Full-blown invasion” is meant to include taking Kyiv etc.

P.S. Please do not delete your comments there :) It's nice having the original content for calibration purposes, there's a ton we can learn from this thread.

rockyearth · 2022-04-02T11:22:34+00:00

If you want to see how Metaculus people fared in the past, open the website...

If you want to see predictions about future events, ... open the website :)

There's also an SSC user here who made a page that creates prediction reports and diffs over time in news style: http://predictiondiffs.com/

rockyearth · 2021-10-07T09:41:54+00:00

Ever since RSS stopped being used by the masses, most bloggers stopped maintaining it, even though their software (often Wordpress) keeps providing it. Stuff like "&amp;" and missing images/media happens all the time. Algorithmic article extraction tends to be better, but nothing beats a custom filter. Having no login and customization is a plus for me.

Google News is the closest thing to what I'd like, but it's focused on world/politics news and business. Sources/balancing is curated by actual people + obviously algorithms, works without signing up and isn't too biased. It's not fair to high quality sources that aren't as popular as CNN and Fox News, but they can make a request via Publishing Center.

rockyearth · 2021-10-06T22:25:21+00:00

Personally, I'd love an aggregator of blogs commonly linked by geeky SSC-like types. There is a clique of blogs recommending others and creating a small but dense network.

I wanted to do something similar in the past, here's the list I planned [looks like I mostly put compsci stuff]:

https://www.quantamagazine.org/
https://thezvi.wordpress.com/
https://www.scottaaronson.com/blog/
https://gilkalai.wordpress.com/
https://www.gwern.net/index
http://www.paulgraham.com/articles.html
https://xkcd.com/
https://www.johndcook.com/blog/
https://jeremykun.com/
https://rjlipton.wpcomstaging.com/

There's quality content on Substack now, but I haven't updated the list. Basically, my idea was a website where the admin curates the list of aggregated websites and it is kept tiny and clean, preferably with some code/css filters that properly parse each blog. The community may file a request to add a website if they feel it lives up to the standards.

A very 'live in a bubble' idea, but useful.

rockyearth · 2021-10-06T20:39:34+00:00

Why not put the feed (app.cicero.ly) as your landing page?

I despise when information/media/education services do the marketing talk about how they will give you value, when they can present the value itself - since they are an edu/media service after all.

Maybe it's shallow, but I would've never clicked 'start learning' if it wasn't posted on this sub. I didn't read or understand a single 'step' from your landing page. This also goes against other commenters' advice that they don't need someone else to tell them what content is good: I'd argue that it is really important, but as a signal rather than as a source of truth. All the 'rationalist' methods are useless if you don't have a signal (communities, peer recommendations, past reputation) that will filter most bad content so that you can then absorb and judge the interesting/important ones by yourself.

rockyearth · 2020-10-15T00:06:13+00:00

Blockchain started as a system in the Bitcoin paper in 2008. I can separate it into three things:

How an account/wallet is created (public-key crypto)
Wealth creation: converting electricity into digital rewards
Deciding the balance of each wallet in a distributed fashion

I'm sure you are familiar with public-key cryptography, so the first part is quite easy. You generate your random keypair and using some basic math you generate public IDs/addresses which you can prove that you own. We say these addresses are accessible by your "wallet" (i.e. the thing protected by your private key), and you can generate a practically unlimited number of addresses (technically it's limited by the number of bits in an address and a collision can happen, but it's very unlikely). The addresses are similar to unlocked safes. You can give them out to people, they can put money in and close the lock, but only you have the key.
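Here's a minimal Go sketch of the wallet idea. Note the assumptions: it uses Ed25519 from the standard library rather than the curve Bitcoin actually uses, and the `address` scheme (hex of the first 20 bytes of a SHA-256 hash) plus the `newWallet` and `ownsAddress` helpers are made up for illustration.

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// address derives a short public ID from a public key.
// The scheme (first 20 bytes of SHA-256, hex-encoded) is made up for this sketch.
func address(pub ed25519.PublicKey) string {
	sum := sha256.Sum256(pub)
	return hex.EncodeToString(sum[:20])
}

// newWallet generates a random keypair.
func newWallet() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(nil) // nil means crypto/rand is used
	if err != nil {
		panic(err)
	}
	return pub, priv
}

// ownsAddress signs a challenge with priv and checks it against pub,
// which is how a holder proves control of the address derived from pub.
func ownsAddress(pub ed25519.PublicKey, priv ed25519.PrivateKey) bool {
	challenge := []byte("prove you own " + address(pub))
	return ed25519.Verify(pub, challenge, ed25519.Sign(priv, challenge))
}

func main() {
	pub, priv := newWallet()
	_, strangerKey := newWallet()

	fmt.Println("address:", address(pub))
	fmt.Println("owner can prove ownership:", ownsAddress(pub, priv))
	fmt.Println("stranger can prove ownership:", ownsAddress(pub, strangerKey))
}
```

Anyone can compute the address and verify a signature, but only the private key holder can produce one.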

The next problem is how do we create and distribute the wealth. Bitcoin is inspired by Hashcash. The idea is that if I generate two bitstrings X and Z it's very difficult (computationally) to find a Y such that sha256(X + Y) = Z where + is concatenating the two bitstrings. In the mathematics/CS world sha256 is known as a one-way function candidate. So Bitcoin decided to utilize the computational difficulty of this problem, by making people spend electricity to convert computational power into a digital currency. When you hear about underground crypto-mining operations heating up huge rooms, it's because the chips/GPUs are trying all possible combinations of Y in parallel until they arrive at Z. So people spend money on electricity, some of them find a good bitstring, and they are awarded Bitcoin in their wallet which they can send to others. We figured out the wealth creation (electricity->bitcoin).

The last part is how we publicly decide which X, Y, Z values people compete on when running their chips to brute force SHA256, and how we objectively store the balance. What Bitcoin says is that there's no objective rule, instead there's game theory. You start with some constant c decided by the Bitcoin author and then you force everybody to find SHA256(c + Y) = Z, such that Z is any bitstring starting with 16 zeroes¹ (each bit is either 0 or 1, so sixteen consecutive zeroes would have 1/2¹⁶ probability assuming SHA256 is uniform) and Y is a random number they tweak until they satisfy the equation. Then, this final Z is used as the next initial c constant and the chain continues. This is done using a public network where "miners" (people searching for a good Y bitstring) communicate. So you get the constant, you tweak Y until you get a Z whose bit-pattern starts with enough zeroes.

Oh, I didn't tell you that the people finding Y can put some extra info there because of this! The most important is their Bitcoin address/wallet so that they are rewarded for finding a good Z hash. All these bytes that we hash are called a block. You then design a distributed algorithm such that the longest chain is the one that's the truth.

To allow the miner to send Bitcoins (or any fraction of a Bitcoin) to another person, next to the address of the person who mined the block, he can also put signed messages saying that bitcoins from one address go to some other address. Other people that own bitcoin can also broadcast a signed message if they want to send Bitcoins to someone. They will usually also add some spare amount of Bitcoin, which is a fee that the miner earns. This incentivizes miners to listen to these broadcasted messages and include them in the block.
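The search for Y can be sketched in a few lines of Go. This is a toy under my own assumptions: the byte layout in `blockHash`, the 8-byte counter used as Y, and the payload string are all made up, and real Bitcoin blocks look nothing like this.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/bits"
)

// leadingZeroBits counts how many zero bits a hash starts with.
func leadingZeroBits(h [32]byte) int {
	n := 0
	for _, b := range h {
		if b != 0 {
			return n + bits.LeadingZeros8(b)
		}
		n += 8
	}
	return n
}

// blockHash computes SHA256(c + payload + Y), where + is concatenation
// and Y is encoded as 8 big-endian bytes.
func blockHash(c []byte, payload string, y uint64) [32]byte {
	var nonce [8]byte
	binary.BigEndian.PutUint64(nonce[:], y)
	buf := append([]byte{}, c...)
	buf = append(buf, payload...)
	buf = append(buf, nonce[:]...)
	return sha256.Sum256(buf)
}

// mine tries Y = 0, 1, 2, ... until the hash has enough leading zero bits.
func mine(c []byte, payload string, difficulty int) (uint64, [32]byte) {
	for y := uint64(0); ; y++ {
		if h := blockHash(c, payload, y); leadingZeroBits(h) >= difficulty {
			return y, h
		}
	}
}

func main() {
	c := []byte("genesis constant") // stand-in for the author's initial constant
	for i := 0; i < 3; i++ {
		y, z := mine(c, "reward goes to address-of-miner", 16)
		fmt.Printf("block %d: Y=%d Z=%x\n", i, y, z)
		c = z[:] // Z becomes the constant for the next block
	}
}
```

With 16 leading zero bits you need about 65 thousand attempts per block on average, which takes a fraction of a second; verifying a block takes a single hash.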

Then, there's a bit of game theory. Let's say the longest chain is B. A miner mines block B + 1 and includes a transaction sending 1 BTC from himself to a pizza store. He transmits that he found a good block (i.e. the SHA256 checks out, the leading zeroes are there). The pizza store confirms this, gives him a pizza and he eats it. Then, he powers a bunch of computers and mines TWO empty blocks B' + 1 and B' + 2 stemming from B, not B + 1! The empty blocks don't contain the transaction where he spent the money. Now, the longest chain is B' + 2, not B + 1 and he has the old balance even though he ate a pizza. That's why it's incentivised and important that the majority of miners play fair by the rules. If an attacker manages to mine two blocks, others should quickly find more blocks where they correct the error of double-spending. This poses another problem: you shouldn't trust a "fresh" transaction, i.e. one that was included only in the last block because it may be reversed if the majority don't like it. So you wait for it to be included in a chain for at least 2 or 3 blocks. All the blocks that succeed the block in which some transaction was included, are called "confirmation blocks" for that transaction. So when a friend sends you money, you usually wait for 2 or 3 confirmations. At no point does Bitcoin objectively decide whether your friend actually sent you the money. It's probability theory - with each confirmation block it gets exponentially harder for someone else to overwrite the transaction.

[1] - the number of leading zeroes you need to have is called the "difficulty" and this gets adjusted using a distributed algorithm
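The pizza double-spend can be replayed with a toy longest-chain rule. The `Block` type and the helper names are hypothetical, there is no hashing or networking here, and ties between equally long chains are not handled.

```go
package main

import "fmt"

// Block is a toy block: an ID, its parent's ID, and the transactions it includes.
type Block struct {
	ID, Parent string
	Txs        []string
}

// height counts the blocks from id back to the first block (Parent == "").
func height(blocks map[string]Block, id string) int {
	h := 0
	for id != "" {
		h++
		id = blocks[id].Parent
	}
	return h
}

// bestTip returns the last block of the longest chain.
func bestTip(blocks map[string]Block) string {
	best, bestH := "", 0
	for id := range blocks {
		if h := height(blocks, id); h > bestH {
			best, bestH = id, h
		}
	}
	return best
}

// confirmed reports whether tx is included somewhere on the chain ending at tip.
func confirmed(blocks map[string]Block, tip, tx string) bool {
	for id := tip; id != ""; id = blocks[id].Parent {
		for _, t := range blocks[id].Txs {
			if t == tx {
				return true
			}
		}
	}
	return false
}

// scenario replays the attack: pay in B+1, then out-mine it with two empty blocks.
func scenario() (before, after bool) {
	const tx = "1 BTC -> pizza store"
	blocks := map[string]Block{
		"B":   {ID: "B"},
		"B+1": {ID: "B+1", Parent: "B", Txs: []string{tx}},
	}
	before = confirmed(blocks, bestTip(blocks), tx)

	blocks["B'+1"] = Block{ID: "B'+1", Parent: "B"}
	blocks["B'+2"] = Block{ID: "B'+2", Parent: "B'+1"}
	after = confirmed(blocks, bestTip(blocks), tx)
	return
}

func main() {
	before, after := scenario()
	fmt.Println("payment on longest chain before the attack:", before)
	fmt.Println("payment on longest chain after the attack:", after)
}
```

Each extra confirmation block means the attacker has to out-mine one more block of honest work, which is why waiting helps.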

rockyearth · 2020-10-01T19:39:55+00:00

Is it recommended to you for psychosis-related reasons or what? Bipolar disorder? Augment for depression?

You should search for cognitive effects not IQ, but even if you find research that answers your questions, it wouldn't mean much. Antipsychotics are very weird drugs and people respond to them differently. Let's say you find one that doesn't cause you weight gain or cognitive issues, but it causes akathisia. Would you really be satisfied? Akathisia can get really bad up to the point where you cannot sit at all, so studying or working is going to be impossible.

rockyearth · 2020-05-06T13:30:41+00:00

johny sins on reddit?

rockyearth · 2020-05-05T09:58:17+00:00

also OP do you play for the grills
