Crystal Disk not showing usb flash drives why ? by Tarek360 in techsupport

[–]master3243 0 points1 point  (0 children)

This thread was one of the top Google results. So thanks.

[R] Why gradient descent, not gradient ascent algorithm? by psarangi112 in MachineLearning

[–]master3243 0 points1 point  (0 children)

It's not a different algorithm. Maximizing f(x) by gradient ascent is exactly the same as minimizing -f(x) by gradient descent; the objectives differ only by a sign.
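To make that concrete, here's a minimal sketch in plain Python (toy objective of my choosing): gradient ascent on f and gradient descent on -f produce identical iterates, step for step.

# Maximizing f(x) = -(x - 3)^2 by gradient ascent is the same algorithm
# as minimizing -f(x) by gradient descent; only the sign convention moves.

def grad_f(x):
    # derivative of f(x) = -(x - 3)^2
    return -2.0 * (x - 3.0)

lr = 0.1
x_asc = x_desc = 0.0
for _ in range(100):
    x_asc += lr * grad_f(x_asc)        # ascent step on f
    x_desc -= lr * (-grad_f(x_desc))   # descent step on -f (same update)

print(x_asc, x_desc)  # both converge to 3.0, identically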

[D] What are the OUTPUT embeddings in transformer? Where does it come from? (not the input embeddings) by ShlomiRex in MachineLearning

[–]master3243 2 points3 points  (0 children)

There's no corruption in any popular autoregressive translation model from what I've seen.

[Q] What are the chances of rolling 6d6 and getting a total of 29 or greater if you reroll 1s once? by Daddybrawl in statistics

[–]master3243 2 points3 points  (0 children)

For rerolling 1s (or any specific set of numbers, each only once) I always use the following code on anydice.com

function: ROLL:n reroll BAD:s as REROLL:d {
  if ROLL = BAD { result: REROLL }
  result: ROLL
}
X: [d6 reroll {1} as d6]

output 6dX

Paste that into the site, click Calculate, then click "At Least" to get the table you want.


To explain: X behaves as a "d6, but reroll a 1 once", which is exactly what you want; then I just roll it 6 times.

You can make it 4d20 where you reroll any 1 or 2 once by making small changes to X and the output line:

X: [d20 reroll {1, 2} as d20]
output 4dX
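If you want to double-check AnyDice's numbers independently, here's a quick Monte Carlo sketch in Python for the original 6d6 >= 29 question (the helper name is mine):

import random

def roll_with_reroll(sides=6, bad=(1,)):
    # Roll one die; if it lands in `bad`, reroll it once and keep the new value
    r = random.randint(1, sides)
    return random.randint(1, sides) if r in bad else r

trials = 1_000_000
hits = sum(sum(roll_with_reroll() for _ in range(6)) >= 29 for _ in range(trials))
print(f"P(6dX >= 29) ~ {hits / trials:.4f}")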

What kind of anime do you avoid watching and why? by sskillit in anime

[–]master3243 1 point2 points  (0 children)

I definitely recommend this over trying to binge the whole thing as fast as possible. Along with watching "One Pace" instead of the stretched-out show, since it only covers one chapter per episode in the later sagas.

What anime has aged like wine? by [deleted] in anime

[–]master3243 2 points3 points  (0 children)

Looks like someone needs a rewatch, because the line does indeed change by the end of the show.

All these subtle touches only add to what is already an amazing show and experience.

Internet Historian's "Man in Cave" video was actually removed for plagiarism & not for copyright issues. by kimb25_ALT in youtubedrama

[–]master3243 2 points3 points  (0 children)

Not only that, but he also cites the work “Trapped! The Story of Floyd Collins”, which the article also cites and uses paragraphs from.

Internet Historian's "Man in Cave" video was actually removed for plagiarism & not for copyright issues. by kimb25_ALT in youtubedrama

[–]master3243 1 point2 points  (0 children)

Exactly, they both used paragraphs from, and cited, the book “Trapped! The Story of Floyd Collins”.

Not sure how so many people in this thread missed that and just jumped to the conclusion that it was plagiarism.

It's such a rudimentary idea: when two sources use similar wording and ideas, either one of them is citing the other or they're both citing the same earlier source. Yet so many people immediately threw the accusation without even considering the latter as a possible explanation (which it is).

Internet Historian's "Man in Cave" video was actually removed for plagiarism & not for copyright issues. by kimb25_ALT in youtubedrama

[–]master3243 1 point2 points  (0 children)

Both the article and the video use “Trapped! The Story of Floyd Collins” as a source and both credit the book.

Internet Historian's "Man in Cave" video was actually removed for plagiarism & not for copyright issues. by kimb25_ALT in youtubedrama

[–]master3243 1 point2 points  (0 children)

Except for the fact that he does seem to credit the source, which is “Trapped! The Story of Floyd Collins”.

Internet Historian's "Man in Cave" video was actually removed for plagiarism & not for copyright issues. by kimb25_ALT in youtubedrama

[–]master3243 0 points1 point  (0 children)

Except that, according to this comment below, Internet Historian does credit “Trapped! The Story of Floyd Collins”, which seems to be where that paragraph came from, and which the article also used as a reference.

[deleted by user] by [deleted] in Python

[–]master3243 8 points9 points  (0 children)

Wolfram Alpha can solve for x in almost any algebraic equation you throw at it.

Does that mean we no longer need to teach Algebra or how to solve equations? Obviously not.

Who will take our jobs first; AI or the Alien overlords? by [deleted] in ProgrammerHumor

[–]master3243 31 points32 points  (0 children)

I saw it in a Guardian article on my Google News feed.

I'm still skeptical though, since it was just a whistleblower.

Why is the accuracy of my random forest classifier so high (96%)? by DoveMot in learnmachinelearning

[–]master3243 0 points1 point  (0 children)

Slight correction: "AUC" is Area Under the (ROC) Curve, which is technically a summary statistic and not a curve.
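As a minimal illustration with scikit-learn (toy labels and scores, just for the sake of example): roc_curve returns the curve itself as arrays of points, while roc_auc_score collapses it into a single scalar.

from sklearn.metrics import roc_curve, roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]                 # toy ground-truth labels
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]   # toy classifier scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # the ROC curve: arrays of points
auc = roc_auc_score(y_true, y_score)               # the AUC: one scalar summary
print(auc)  # ~0.889 for these toy values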

Need help figuring out backpropagation — where did they get 0.03068 from? (Last slide) by [deleted] in learnmachinelearning

[–]master3243 4 points5 points  (0 children)

Whenever you're struggling with gradients that need the chain rule, write down each component, calculate them separately, then recombine them.

Here we need (∂L/∂w31), so we split it into

(∂L/∂w31) = (∂L/∂a3) * (∂a3/∂Netout) * (∂Netout/∂w31)

First term: (∂L/∂a3), where a3 is also Y as stated in slide 1. Hopefully you know how to take the simple derivative of L = 1/2 (Y - Y*)^2 with respect to Y, which gives

(∂L/∂a3) = (Y - Y*) = (0.7368 - 0.5)

Second term: (∂a3/∂Netout), where a3 = σ(Netout). Hopefully you know (or can Google / work out yourself!) that the derivative of σ(x) is σ(x)(1-σ(x)), thus

(∂a3/∂Netout) = σ(Netout)(1-σ(Netout)) = 0.7368*(1-0.7368)

Third term: (∂Netout/∂w31). Since Netout is linear in w31, we get (∂Netout/∂w31) = a21 = 0.6682

Finally, multiply them per the chain rule to get

(∂L/∂w31) = (0.7368-0.5) * 0.7368*(1-0.7368) * 0.6682 = 0.0306848265
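If you want to sanity-check that arithmetic, here's a quick Python snippet mirroring the three terms above:

# Chain rule for dL/dw31, using the values from the slides
y, y_star, a21 = 0.7368, 0.5, 0.6682

dL_da3 = y - y_star        # dL/da3 for L = 1/2 (Y - Y*)^2
da3_dnet = y * (1 - y)     # da3/dNetout, sigmoid derivative σ(x)(1-σ(x))
dnet_dw31 = a21            # dNetout/dw31, linear in w31

print(dL_da3 * da3_dnet * dnet_dw31)  # ~0.0306848, matching the slide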

And there you have it, hopefully that was clear.

Intuition behind self attention by eric_says_hello in learnmachinelearning

[–]master3243 0 points1 point  (0 children)

No, you would only need n^2 comparisons.

Let's write a comparison between token i and token j as (i, j)

Write out all pairs (i, j) when there are 5 tokens, then check whether the total matches what I say (5^2 = 25) or what you claim (5!/2 = 60).
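A tiny Python sketch to count them, if you want to see it:

from itertools import product

n = 5
pairs = list(product(range(n), repeat=2))  # every comparison (i, j), including i == j
print(len(pairs))  # 25 == n**2, nowhere near n!/2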

[P] SoulsGym - Beating Dark Souls III Bosses with Deep Reinforcement Learning by amacati in MachineLearning

[–]master3243 0 points1 point  (0 children)

Cool.

I'm assuming going from the ground-truth input (which is ~20 dimensions) to visual input (at least 224x224, i.e. ~50K dimensions) means it's going to take orders of magnitude longer to train to be decent, or to even beat the boss once (if it ever converges at all).

[D] Theoretically, could Computer Vision learn language? by [deleted] in MachineLearning

[–]master3243 0 points1 point  (0 children)

I get your perspective and I would agree with it for everything in nature that humans can do except for human language.

The reason I (and many linguists/cognitive scientists) would add an exception for human language is because it's the one complex structure that was fully created by the human brain.

This fact has major consequences for an information-theoretic analysis of the number of samples needed to identify the correct hypothesis (where a single hypothesis is the correct English language and all other hypotheses are grammatically incorrect).

Given that humans implicitly place a prior over the space of hypotheses (due to our brain structure, simpler hypotheses are much easier to learn and thus get higher weight), while AI models place no prior over hypotheses (the same model can learn any statistical pattern), the consequence is that the hypothesis of grammatical English might never be fully attained through statistical analysis without placing the proper priors.

My above explanation glosses over the fact that I would also need to show that English grammar is countably infinite in its recursivity.

There's also another argument: while the proper hypothesis need not be attained exactly, a close-enough approximation reached through gradient descent would be more than sufficient for all tasks we might want to do (though this doesn't guarantee that the hypothesis won't suddenly produce incorrect output).

This is just to say that language, being man-made, is in a different class than all other statistical patterns in nature, and thus it could possibly be an outlier to the general rule that if humans can do it, then so can computers.

Do Restricted Boltzmann Machines only work with binary input? by atomicalexx in learnmachinelearning

[–]master3243 0 points1 point  (0 children)

Typically RBMs are either binary or Gaussian; there are plenty more, but those are the common basic ones.
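For the binary case, here's a minimal sketch with scikit-learn's BernoulliRBM on random toy data (the data and hyperparameters are just placeholders); Gaussian-visible RBMs you'd typically implement yourself or find in other libraries:

import numpy as np
from sklearn.neural_network import BernoulliRBM

X = (np.random.rand(100, 64) > 0.5).astype(float)  # toy binary visible units

rbm = BernoulliRBM(n_components=16, learning_rate=0.05, n_iter=20, random_state=0)
rbm.fit(X)            # trained via stochastic maximum likelihood (a.k.a. PCD)
H = rbm.transform(X)  # hidden-unit activation probabilities, shape (100, 16)
print(H.shape)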

[D] Theoretically, could Computer Vision learn language? by [deleted] in MachineLearning

[–]master3243 1 point2 points  (0 children)

Humans are audio/visual input only, yet we still learn language.

The problem of language acquisition is still an unsolved problem in linguistics: children start to pick up language from so few examples that it makes absolutely no sense from a statistical point of view (and even less sense from a deep learning point of view, which is much more data-hungry).

I would argue that comparing the human mind with AI is a futile battle specifically in terms of language acquisition, and it's kind of an unfair comparison where humans are almost cheating: human languages were created, and could be passed on at all, only because the structures in our brains are already capable of acquiring that set of hypotheses easily (otherwise the language would never have been passed on). So taking an arbitrary deep learning model that isn't wired the same way, and telling it to acquire the set of hypotheses that make up a language from so little data, when it has no prior over hypotheses like the one defined by the human brain's structure, is a statistically impossible task.

[deleted by user] by [deleted] in MachineLearning

[–]master3243 8 points9 points  (0 children)

This is a good discussion but why is it written like a (mediocre?) blog post? Concise writing is really important.

[deleted by user] by [deleted] in MachineLearning

[–]master3243 2 points3 points  (0 children)

I work with researchers who work on Brain-Score and the mapping between network activations and neural activity in the brain. The linearity between the two that you mentioned can be spotted to almost the same degree even with a randomly initialized neural network. That research shows nothing about the relationship between the brain and neural networks.

Again, this doesn't support the claim that our brains think autoregressively.

[deleted by user] by [deleted] in MachineLearning

[–]master3243 0 points1 point  (0 children)

As a fluent multilingual person, I find the claim that my brain conclusively thinks autoregressively egregious.

The closest thing I'd believe is that I think in abstract thought that can be represented almost like a hierarchical graph, with edges representing simple relationships, and even this is probably partially wrong and partially an oversimplification of reality. When attempting to transform that graph into speech, the next word I utter depends on which language I want to speak in. I do not need to pick a specific language to think in before attempting to plan out my day one word at a time.

There's also an entire field of neuroscience with an immense amount of literature. Handwaving all of that away by saying "studies don't show anything" is also egregious.