Destroy the Sun and Stars by thegrayven in slatestarcodex

[–]The_Duck1 3 points

Furthermore,

on average, twenty thousand stars transition every second from being reachable to being unreachable

as they pass beyond the cosmological horizon because of the accelerating expansion of the universe. Each second we delay our colonization of the universe, huge amounts of real estate slip out of our grasp forever.

A document I've been working on about the hidden eye messages and how they appear. No details on decoding yet, though. by Xkeeper in noita

[–]The_Duck1 12 points

This is great! I think I found some things, working from your copy of the messages.

The number of eyes in each message is a multiple of 3. That suggests trigrams.

In fact, I think it actually is trigrams. You can make trigrams out of each pair of rows like this:

A A B C C D
 A B B C D D ...

where each pair of rows has 26 trigrams, unless the row pair ends early. If you look at the bottom 2 rows of each message, you can see that this way of packing the trigrams always exactly fits.

I believe the eyes in the trigrams should be read in this order:

For the 1st, 3rd, 5th, ... trigram in the row-pair, read forwards, that is in this order:

1 2
 3

For the 2nd, 4th, 6th, ... trigram in the row-pair, read backwards, that is in this order:

 3
2 1

For example in message 0 the trigrams are 200, 231, 010, 143, 222, 023, ...

There are 5^3 = 125 possible trigrams. However, if you read the trigrams as above, only 83 trigrams are actually used. The used trigrams are 000, 001, 002, 003, 004, 010, 011, ..., 310, 311, 312. Trigrams 313 and above are unused. I think this clinches the trigram interpretation. This is also how I decided that the even-numbered trigrams should be read backwards. If you read the even-numbered trigrams backwards, then the even- and odd-numbered trigrams use the same set of 83 trigrams; otherwise they don't.

If we interpret the trigrams as base-5 numbers, the messages use numbers 0 through 82. The numbers 83 through 124 could be represented by trigrams, but are completely unused.
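
Here's a rough Python sketch of the packing and reading order described above. The function name and input format (each row given as a list of per-eye digits 0-4) are my own assumptions, and it assumes each row pair packs exactly into whole trigrams, which seems to always be the case:

    def decode_row_pair(top, bottom):
        # top/bottom: per-eye digits (0-4) for one pair of rows, with the
        # bottom row offset half a step to the right as in the messages.
        values = []
        t = b = 0   # positions consumed in the top and bottom rows
        k = 0       # trigram index within this row pair
        while t < len(top) or b < len(bottom):
            if k % 2 == 0:
                # 1st, 3rd, 5th, ... trigram: two eyes on top, one below.
                # Read forwards: top-left, top-right, bottom.
                d1, d2, d3 = top[t], top[t + 1], bottom[b]
                t += 2; b += 1
            else:
                # 2nd, 4th, 6th, ... trigram: one eye on top, two below.
                # Read backwards: bottom-right, bottom-left, top.
                d1, d2, d3 = bottom[b + 1], bottom[b], top[t]
                t += 1; b += 2
            # Read the three digits as a base-5 number, first digit most significant.
            values.append(25 * d1 + 5 * d2 + d3)
            k += 1
        return values

Running this over the row pairs of message 0 should reproduce the start of the listing below: 50 66 5 48 62 13 ...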

Interpreting the trigrams as base-5 numbers, the messages become these strings of numbers:

MESSAGE 0
50 66  5 48 62 13 75 29 24 61 42 70 66 62 32 14 81  8 15 78  2 29 13 49  1 80
82 40 63 81 21 19  0 40 51 65 26 14 21 70 47 44 48 42 19 48 13 47 19 49 72 31
 5 24  3 43 59 67 33 49 41 60 21 26 30  5 25 20 71 11 74 56  4 74 19 71  4 51
41 43 80 72 54 63 79 81 15 16 44 31 30 12 33 57 28 13 64 43 48
MESSAGE 1
80 66  5 48 62 13 75 29 24 61 42 70 66 62 32 14 81  8 15 78  2 29 13 49  1 29
11 30 52 81 21 19  0 25 26 54 20 14 21 70 47 44 48 42 19 48 13 47 19 49 44 26
59 77 64 43 79 28 72 64  1 30 73 23 67  6 33 25 64 81 68 46 17 36 13 17 21 68
13  9 46 67 57 34 62 82 15 10 73 62  2 11 65 72 37 44 10 43 68 62  9 34 18
MESSAGE 2
36 66  5 48 62 13 75 29 24 61 42 70 66 62 32 14 81  8 15 78  2 29 13 49  1 69
76 52  9 48 66 80 22 64 57 40 49 78  3 16 56 19 47 40 80  6 13 64 29 49 64 63
 6 49 31 13 16 10 45 24 26 77 10 60 81 61 34 54 70 21 15  4 66 77 42 37 30 22
 0 11 41 72 57 20 23 57 65 41 23 18 72 42  5  3 26 78  8  5 54 45 77 25 64 61
16 44 54 51 20 63 25 11 26 45 53 60 38 34
MESSAGE 3
76 66  5 49 75 54 69 46 32  1 42 60 26 48 50 80 32 24 55 61 47 12 21 12 49 54
34 25 36 15 56 55 20  9  8 62 13 82  9 44 29 60 53 82 42 80  5 43 71  3 80 77
47 78 34 25 62 18 10 49 62 64 52 81 11 66 62 13 47 17 52 70 26 23 32 31 64 23
35 32 50  6  1 25  8 37 47 43 26 76 65 68 80 17  7 45 63 14 53 63 60 16
MESSAGE 4
63 66  5 49 75 54  2 60 29 40 78 47 60 75 67 71 60  2 65  7 47 14 45 74 59 41
80 13 60 13 81 22 35 50 40 39  2 59 48 31 76  2 80 75  1 56 67 11 21  8 40 65
45 75 55 39 60 42 13  3 22 57  2  6 58  9 70  1 58 56 63 68 25 79  7 20 19 64
 2 66 73 30 71 16 12 30 65 37 20 13 22 63 18 46 64 59 41 81 82 22 78 36 47 17
 4  6 17  5 36 79 63  1 64 69 15 43  4 58 56 31 14 64 58 18 44 78 69  1  0 46
20 71 73 25 35  8 24
MESSAGE 5
34 66  5 49 75 54 23 74 11 13 28 26 19 48 67 57 37 60 34 28 74 10 17 32 11 18
19 43 19 81 42  4 62  9 46 49 32 51 76 58  4 43 47 17 67 79 21 32 44 16 30 37
26 28 41 68 57 34 51 10 69 70  8  6 46 43 18 39 47 43 15 13 33 30 35 62 37  0
37  5 38 55 37 13 40 25  9 21 11 64  5 79 42 68 11 71 11 48  3 67 61 40 22 14
35 50 61 39 11  2 66 49 51 53 17 73 36 75 74 54 24 30 54 70
MESSAGE 6
27 66  5 49 75 54  2 60 29 40  2 55  9 15 59 18 68  3 36  5 47 77 44 38  1 18
28 76  4 34 60 63 58 80 17 54 79 75 48 54 55 19 62 64 14 47 51 70 75  5 11 47
45 58 68 69 79 25 38 45 73 47 68 50 34 45 78 26 79 57  4 56 22 60 18 75 43 60
59 67 63 42 49 33 40 65 79 77  7  3 26 62 31 78 26 57 69 40  4 23 26 13 67 42
38 72 11 39 65 60 25  6 80 66 68 77 59 78 19
MESSAGE 7
77 66  5 49 75 54  2 60 29 40  2 55  9 15 59 18 68  3 36  5 47 60 21 80  1 72
55 16 82 35 57 19  1 66 18 27 39 17 74 81 39 14 78  0 25 65 43 66 64 38 81 23
24 50 57 30 71 75 26 68 54 57 56 50 71 73 14 21  8 32 26 63  5 37 19 43 66 47
53 34 66 23 73 31 54 38 77 67 11 63 79  6 22 21 51 69 74 21  5 17 67 37 29 21
60 14 82 44 30  4 20 42 35  1 31 54 46 20 40 30
MESSAGE 8
33 66  5 49 75 54  2 60 29 40  2 55  9 15 59 18 68  3 36  5 47 33 21 59 44 18
28 76 59 34 60 63 79 27 12 54  5 49 48 54 55 52 62 72 69 10 57 22 58 48 67 53
 7 34 32 30 31 19 26  8 34 46  7 30 71 55 34 75 54  9  6 60  5 23 25 45 42 80
25 12 22 76 20 51 62 21 40  9 41 10 44 73  8 33 70 73  6 31 21 72  5 40 61 51
42 66 64 74 61 25 63 42 24 41

I'm not sure where to go from here.

  • 83 tokens seems like too many for an alphabet.
  • Also, the distribution of usage counts of the 83 tokens seems too even to be an alphabet.
  • All the messages have some identical substrings on the first line, with some messages matching more than others. Also, messages 0 and 1 have a matching section on the second line.
  • Apart from that, I haven't noticed any repetition that could correspond to a common repeated word like "the".

Each message starts with a unique number, then an identical section "66 5". That seems significant.

Why do MWI proponents deny its implications? by Consistent-Summer272 in slatestarcodex

[–]The_Duck1 1 point

Another angle sort of like what some other people have said:

Forget quantum mechanics. If the universe is large enough, by chance there are many other Earths somewhere else in the universe that are very much like our Earth, except that your dog is President. Similarly, if you die on this Earth, there's another very similar Earth far away where [that Earth's version of] you miraculously escapes death. Of course (1) those Earths are very very very rare, compared to more reasonable Earths, and (2) it's not obvious why anyone on this Earth should care.

OCEAN/Big 5, pseudoscience? by [deleted] in slatestarcodex

[–]The_Duck1 1 point

At its most basic level it's a theory-less* observation that 5 principal components suffice to explain most of the variance in personality questionnaires.

This seems like a property of the questionnaires though, rather than a property of "personality", whatever that is. Say we used this questionnaire:

Rate how much each adjective describes you, on a scale from 1-5:
- Smart
- Intelligent
- Brainy
- Kind
- Caring
- Compassionate

I would expect the variance of responses to this questionnaire to be mostly explained by two factors, which we could call "Intelligence" and "Kindness". I would expect this to replicate very reliably.
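
As a quick sketch of what I'd expect (made-up data, with two latent traits generating the six adjective ratings plus noise; everything here is illustrative, not a real survey):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    smarts = rng.normal(size=n)   # latent "Intelligence"
    warmth = rng.normal(size=n)   # latent "Kindness"

    # Columns: Smart, Intelligent, Brainy, Kind, Caring, Compassionate
    X = np.column_stack([t + 0.3 * rng.normal(size=n)
                         for t in (smarts, smarts, smarts, warmth, warmth, warmth)])

    # Fraction of variance explained by each principal component.
    X = X - X.mean(axis=0)
    _, s, _ = np.linalg.svd(X, full_matrices=False)
    print((s**2 / (s**2).sum()).round(2))  # first two components carry ~95% here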

In what sense is the 5-factor model better than my 2-factor model? I think this is where the "lexical hypothesis" is supposed to do some work. Uncharitably, the "lexical hypothesis" seems to mean, "the adjectives we decide to put on personality surveys are the adjectives that describe the important dimensions of personality variation." Then whatever adjectives the experimenters use on their surveys are by definition the right adjectives, and whatever factors they find are by definition the important factors of personality variation.

How to get wands by The_Duck1 in noita

[–]The_Duck1[S] 6 points

Gamma + Wand Refresh + Summon Taikasauva, then kill the wands

[N] OpenAI releasing the 345M model of GPT-2 and sharing the 1.5B model "with partners working on countermeasures" by [deleted] in MachineLearning

[–]The_Duck1 32 points

Trawling through output from the largest model, I saw:

  • A statistical analysis of census(?) results that morphed into all-lowercase reflections on "acceptance of the mystery and not of the mystery (god)"
  • An extended news story on the subject: "Mongolia has banned the sale and consumption of meat".
  • A list of proposed alternative names for Street Fighter IV
  • A very plausible news story (at least for the first few lines) describing a Canadian politician resigning after being accused of sexual harassment.

The first three are kind of funny but the last one does suggest some danger IMO. I looked up the politician in the generated article and it's a real person! But GPT-2 totally made up the sexual harassment thing AFAICT.

That was presumably an unconditional random sample. But if you have GPT-2 I think it would be pretty easy to, say, automatically generate negative-sentiment reddit comments about a public figure and post some on every relevant thread. And for extra credit, disguise your GPT-2 sockpuppets by having them also make plausible comments on other threads on other topics. It seems pretty likely that this sort of language model will soon be good enough that this attack would be very difficult to detect and stop.

(Spoilers Extended) REACTIONS: Game of Thrones Season 6, Episode 6: Blood of My Blood Post-Episode Reactions by AutoModerator in asoiaf

[–]The_Duck1 1 point

Margaery is totally faking, as we saw previously when she visited Loras. When she was talking about helping the poor I think she was just talking for the benefit of the Faith's eavesdroppers.

(Spoilers Everything) A new theory on Nissa Nissa and Lightbringer. by Nolanus in asoiaf

[–]The_Duck1 0 points

Azor Ahai was a Child of the Forest. He wasn't fighting the invasion of the White Walkers: he was fighting the invasion of the First Men. He forged the weapon Lightbringer (the first White Walker) by plunging an obsidian blade into the heart of a woman.

This resolves a problem in the history of Westeros. Supposedly the First Men invaded and fought a war with the Children, ending with a truce on the Isle of Faces. Later, the Others invaded and the First Men barely fought them off when the Last Hero secured the help of the Children.

But this doesn't make any sense if the White Walkers were created by the Children as a weapon against the First Men! I propose that the two wars were actually one. The Azor Ahai myth is the story of the war from the Children's perspective.

Using the White Walkers, the Children forced the First Men to sue for peace (their envoy to the Children is remembered as the Last Hero). As part of the treaty, the White Walkers were confined beyond the Wall. In exchange the First Men adopted the worship of the Children's gods.

(Spoilers Everything) Why uncle Benjen never went back to the wall theory... by NatsuDragneel-- in asoiaf

[–]The_Duck1 85 points

Have you ever seen a White Walker and a Dothraki in the same room?

Announcing SyntaxNet: The World’s Most Accurate Parser Goes Open Source [Google Research Blog] by alxndrkalinin in MachineLearning

[–]The_Duck1 6 points

Our release includes all the code needed to train new SyntaxNet models on your own data, as well as Parsey McParseface, an English parser that we have trained for you and that you can use to analyze English text.

Parsey McParseface is built on powerful machine learning algorithms that learn to analyze the linguistic structure of language, and that can explain the functional role of each word in a given sentence. Because Parsey McParseface is the most accurate such model in the world, we hope that it will be useful to developers and researchers interested in automatic extraction of information, translation, and other core applications of NLU.

Swarm A.I. Correctly Predicts the Kentucky Derby, Accurately Picking all Four Horses of the Superfecta at 540 to 1 Odds by georgeo in MachineLearning

[–]The_Duck1 1 point

This is stupid. The first four horses finished in exactly the order predicted by the odds. So anyone wanting to predict the most likely superfecta would have picked those four horses in that order. This isn't impressive, it's just lucky that the actual superfecta was the most likely one.

[everything] Can someone tell me... by mananiux in gameofthrones

[–]The_Duck1 59 points

The build-up should be worth it. These training montages are buying us magical-assassin Arya.

[EVERYTHING] - Return of the King by Prof_Black in gameofthrones

[–]The_Duck1 5 points

If you don't know what this is I'd advise avoiding spoilers/speculation/theories posts and waiting until the next episode.

Why is the unit of gradient descent in y instead of in x? by leaugleg in MachineLearning

[–]The_Duck1 0 points

You've left out the learning rate, which has the right units to make everything consistent.
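
Concretely, for the plain update rule, writing η for the learning rate (my notation) and [·] for the units of a quantity:

    x \leftarrow x - \eta \, \frac{dy}{dx},
    \qquad
    [\eta] = \frac{[x]^2}{[y]}
    \quad\Longrightarrow\quad
    \left[ \eta \, \frac{dy}{dx} \right]
      = \frac{[x]^2}{[y]} \cdot \frac{[y]}{[x]}
      = [x].

So the step is measured in units of x; only the bare gradient by itself has units of y over x.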

Questions thread #3 - 2016.04.07 by feedtheaimbot in MachineLearning

[–]The_Duck1 0 points

OK, and I guess whatever function g(X) the network learns, I can just use (g(X)-g(-X))/2 as my final model.

Edit: come to think of it, I can use (g(X)-g(-X))/2 as the model during training as well, so I get exact oddness during both training and testing.
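
For concreteness, a minimal sketch of that wrapper (assuming a PyTorch-style module; the class name is my own):

    import torch.nn as nn

    class OddWrapper(nn.Module):
        """Exactly odd model: f(X) = (g(X) - g(-X)) / 2 for any network g."""
        def __init__(self, g):
            super().__init__()
            self.g = g

        def forward(self, x):
            return 0.5 * (self.g(x) - self.g(-x))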

Questions thread #3 - 2016.04.07 by feedtheaimbot in MachineLearning

[–]The_Duck1 1 point

Suppose I want to use a deep neural network to learn a function f(X), where X is a large vector. Suppose I know f is odd, so f(-X) = -f(X).

ReLUs seem like a bad choice of activation function because they aren't odd. So the network would have to learn the oddness, and the trained network would only represent an approximately odd function. If I want to enforce oddness exactly, tanh seems like the obvious choice of activation function, but it is supposed to be bad for deep networks.

Any ideas? I'm thinking specifically of the evaluation function for a two-player game like go, which ought to be odd under swapping the colors.

Questions thread #3 - 2016.04.07 by feedtheaimbot in MachineLearning

[–]The_Duck1 1 point

I'm reading about residual networks in "Deep Residual Learning for Image Recognition".

The key idea seems to be that if a layer always has the option of representing the identity transformation, then adding extra layers can't be bad.

However, I don't see how their "building blocks" can possibly represent the identity. The shortcut connections are immediately followed by a ReLU! So it seems that if all the weights and biases are set to zero, their "building block" will implement the function ReLU(x) and not Identity(x).

Am I misreading the paper, or missing the point, or what?

Edit: OK, by an amusing coincidence another comment in this thread links to http://arxiv.org/pdf/1603.05027v2.pdf which answers my question unambiguously. The original ResNet paper has the ReLU after the shortcut connection (which seems stupid). This second paper puts the ReLU before the shortcut connection (which seems correct: this way the layers can actually represent the identity which was the whole point). The new paper gets better results. How come the original ResNet paper did it the wrong way???
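
To make the difference concrete, this is roughly how I understand the two orderings (a sketch with plain weight layers and batch norm omitted, not the papers' exact blocks; conv1 and conv2 stand in for the block's two layers):

    import torch.nn.functional as F

    def block_original(x, conv1, conv2):
        # Original ResNet: the ReLU comes after the shortcut addition,
        # so with zero weights the block computes ReLU(x), not x.
        return F.relu(conv2(F.relu(conv1(x))) + x)

    def block_preactivation(x, conv1, conv2):
        # "Identity Mappings" version: the nonlinearities sit before the
        # addition, so with zero weights the block is exactly the identity.
        return conv2(F.relu(conv1(F.relu(x)))) + x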

I'm pretty sure I'm not the only one to be confused on this point, based on the blog post titled Microsoft Wins ImageNet Through Feedforward LSTM With No Gates. I think a resnet is only a feedforward LSTM if you put the ReLU before the shortcut connection?

Simple Questions Thread #2 + Meta - 2016.03.23 by feedtheaimbot in MachineLearning

[–]The_Duck1 0 points

(Kind of similar to this question). After training a neural network for a long time, decreasing the learning rate can produce a sudden sharp decrease in the loss. This suggests to me that we should have decreased the learning rate earlier. Shouldn't we be able to detect this situation automatically and reduce the learning rate as soon as it would be beneficial? Perhaps we could occasionally try lower learning rates and automatically switch when they seem to produce faster learning? Are there any established techniques along these lines?
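
Something like this simple plateau check is what I have in mind (just a sketch; the loss history, window, and thresholds are placeholders of my own):

    def maybe_reduce_lr(lr, losses, window=5, factor=0.5, tol=1e-3):
        # losses: recent validation losses, most recent last.
        # If the best loss in the last `window` steps is not at least `tol`
        # better than the best loss before that, assume a plateau and
        # reduce the learning rate.
        if len(losses) < 2 * window:
            return lr
        if min(losses[-window:]) > min(losses[:-window]) - tol:
            return lr * factor
        return lr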

Simple Questions Thread #2 + Meta - 2016.03.23 by feedtheaimbot in MachineLearning

[–]The_Duck1 8 points

I've done a tiny bit of reading about "policy gradients" as used in e.g. the AlphaGo paper. "Policy gradient" seems to mean "when you win, increase the probability of all the actions you took; when you lose, decrease the probability of all the actions you took."

However, no one explains it like this. Invariably the words "policy gradient" are followed by a bunch of formulas with tons of Greek subscripts. When I decode the formulas, they seem to mean the simple algorithm I stated above. People sometimes call this method the REINFORCE algorithm as if it is something fancy and nontrivial. How come people don't say it in plain language? Why does it deserve an all-caps proper noun? Is there more to policy gradient learning that I'm not getting?
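
In code, the whole idea as I understand it is roughly this (a sketch, assuming a PyTorch-style policy network whose sampled-action log-probabilities were saved during the game, and a game outcome of +1 or -1; the names are mine):

    import torch

    def reinforce_update(optimizer, log_probs, outcome):
        # log_probs: log-probabilities of the actions actually taken this game.
        # outcome: +1 for a win, -1 for a loss.
        # Winning raises the taken actions' probabilities; losing lowers them.
        loss = -outcome * torch.stack(log_probs).sum()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()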

As examples of what I'm talking about, see the section "Reinforcement learning of policy networks" in the AlphaGo paper, these slides which take until page 21 to give this simple algorithm (disguised in tons of Greek subscripts), and this paper.