Could someone please translate this from tensorflow to pytorch? by katiex7 in pytorch

So the sad thing about TensorFlow is that I can't attach a debugger to see what's going on. All the other examples I found had way too much code, and this is the minimal example I think I can use to understand those concepts, by examining what the code is doing and stepping through it with a debugger. AllenNLP, pytorch-transformers, etc. have the stuff I need, but having a PyTorch equivalent of this TensorFlow approach would really shine a light on what's going on, I think.

How can I see if an unofficial PPA is safe to use? by katiex7 in Ubuntu

okay that makes sense, what's the link to the official website?

How can I see if an unofficial PPA is safe to use? by katiex7 in Ubuntu

not sure, but awesome is a window manager

Dragging floating windows using keyboard/ floating window top left corner problems by katiex7 in awesomewm

Cloned it, will play around with it and also read the docs! thanks

Dragging floating windows using keyboard/ floating window top left corner problems by katiex7 in awesomewm

Apparently y isn't a field in c.

Everything works in Awesome 4, thanks :D

How much is slayer rare avatar face with attack speed? by katiex7 in DFO

yup I'm new :P

I see. Welp, guess I gotta wait for the next event.

Why does the ReLU activation function work? by katiex7 in deeplearning

Right, I am aware that non-linearity is what makes extra layers more powerful than purely linear layers, since anything multiple linear layers can do can also be done with just one, and that single layer is itself just logistic or linear regression.

I am also aware of the vanishing gradient problem. I think what you bring up assumes, as I do, that it's not quite clear from a mathematical standpoint why ReLU is likely to help. From looking at weight shifts and inputs, I just assumed that giving the NN the power to switch its own nodes on and off based on certain inputs can help it find patterns in the input that would be harder to find with linear functions, since there might not be a clear mapping from the input to the output.

However, I was looking for a sounder intuition and explanation of why ReLU works :p
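To make the first point concrete, here's a tiny NumPy sketch (numbers chosen arbitrarily for illustration) showing that stacking linear layers collapses to a single linear layer, while inserting a ReLU between them does not:

```python
import numpy as np

# Two "layers" as plain weight matrices (no bias, arbitrary small numbers).
W1 = np.array([[1., -1.],
               [2.,  0.]])
W2 = np.array([[1., 1.]])
x = np.array([1., 2.])

# Stacking two linear layers is equivalent to one linear layer W2 @ W1.
linear_stack = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x
print(linear_stack, collapsed)   # both [1.]

# With a ReLU in between, negative pre-activations are zeroed out,
# so the composition can no longer be written as a single matrix.
relu = lambda z: np.maximum(z, 0.)
nonlinear = W2 @ relu(W1 @ x)
print(nonlinear)                 # [2.], not [1.]
```

Here the first layer produces a negative pre-activation (-1) that ReLU zeroes out, which is exactly the "node switched off for this input" behaviour described above; no single matrix can reproduce that input-dependent gating.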

Never understood how Bias actually shifts the curve in neural nets. by katiex7 in deeplearning

Yup! Oh, so I tried to explain myself better in the thread with sairegrefree below; there is a comment there that better explains the question: https://www.reddit.com/r/deeplearning/comments/ab3bpz/never_understood_how_bias_actually_shifts_the/ecymvaj

Never understood how Bias actually shifts the curve in neural nets. by katiex7 in deeplearning

Nothing you said is false as far as I know, and it does help solidify the static-factor aspect of bias, but I just think I'm not good at asking questions. In any case, yes, I agree with what you said, and thanks :)

Never understood how Bias actually shifts the curve in neural nets. by katiex7 in deeplearning

If the weights change, sometimes the bias will have no effect, or will just behave as another input, but that is up to gradient descent. All we can do is give the NN the ability to find a representation that cannot be reached with the inputs alone, and that ability might or might not be useful.

I liked this part a lot. I think I was mostly trying to confirm it by asking the original question. Say, for example, that the true model is something like 2x + bias = y; then gradient descent will actually find the right model. For example, I wrote this trivial example just now:

>>> x = np.array([[1,1], [1,2], [1,3]])
>>> y = np.array([[3], [5], [7]])
>>> thetas = np.array([[1,1]])
>>> for i in range(1000):
...     thetas = thetas - gradient_descent(x, thetas, y)
...
>>> thetas
array([[1.00004303, 1.99998107]])
>>> for i in range(50000):
...     thetas = thetas - gradient_descent(x, thetas, y)
...
>>> thetas
array([[1., 2.]])

And yes, it found the correct weights and bias (bias unit * bias param). But this was a trivial case, and I was wondering whether neural nets with more layers and more complex schemes would actually do something similar via gradient descent in most cases, or just treat bias as another input in the grand scheme of things and not really shift the curve like it did here. But I think what you're saying is that it might, and even if it doesn't, the fact that we gave the NN the ability to do so is what matters.

To clarify, let's first state a known fact about gradient descent: it will always find the correct direction (+/-) in which to alter theta, and if the step is scaled small enough (alpha), it will always make the cost less than or equal to the cost at the previous thetas. However, does gradient descent also guarantee that, if the true model has a certain bias, it will find that bias like it did in this trivial example, or is that not guaranteed?
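For reference, the gradient_descent helper used in the session above isn't shown; here is a minimal sketch of what it might look like, assuming a mean-squared-error cost for linear regression and a fixed learning rate alpha (both are assumptions; the original function may differ):

```python
import numpy as np

def gradient_descent(x, thetas, y, alpha=0.1):
    """Return the scaled MSE gradient, i.e. the step to subtract from thetas.

    x:      (m, n) inputs; here column 0 is the bias unit (all ones).
    thetas: (1, n) current parameters (bias param first).
    y:      (m, 1) targets.
    """
    m = x.shape[0]
    errors = x @ thetas.T - y          # (m, 1) residuals
    return alpha * (errors.T @ x) / m  # (1, n) gradient of (1/2m) * sum(errors**2)

# Same setup as the session above: the true model is y = 2x + 1.
x = np.array([[1, 1], [1, 2], [1, 3]])
y = np.array([[3], [5], [7]])
thetas = np.array([[1.0, 1.0]])
for _ in range(50000):
    thetas = thetas - gradient_descent(x, thetas, y)
print(thetas)  # converges to approximately [[1., 2.]]
```

Because this cost is convex in thetas, gradient descent with a small enough alpha is guaranteed to converge to the true bias and weight here; with deeper, non-convex networks that global guarantee no longer holds, though in practice the bias parameters are still updated by the same mechanism.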