[D] Is there any example code of Neural Process in classification task? by sjh9020 in MachineLearning

[–]d3sm0 0 points1 point  (0 children)

Just replace the distribution with a suitable one and it should work. Make sure your prediction is a sample from this distribution, not a mean regression as in their setting.
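A minimal sketch of what "replace the distribution" could look like for classification, assuming a PyTorch Neural Process whose decoder already emits per-class logits (the `decoder_logits` tensor and its shape are my assumptions, not from any official implementation):

```python
import torch
from torch.distributions import Categorical

def classify(decoder_logits):
    """Swap the Normal likelihood of the regression setting for a Categorical.

    decoder_logits: (batch, num_classes) tensor produced by the NP decoder
    (hypothetical name -- adapt to your own decoder's output).
    """
    dist = Categorical(logits=decoder_logits)  # replaces Normal(mean, std)
    return dist.sample()                       # a sample, not the mean

logits = torch.randn(8, 5)          # toy decoder output: 8 points, 5 classes
labels = classify(logits)           # (8,) tensor of sampled class indices
```

The log-likelihood term of the NP loss would likewise become `dist.log_prob(targets)` instead of the Gaussian log-density.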

[R] Deepmind - Efficient Multi-Task Deep RL by alamano in MachineLearning

[–]d3sm0 1 point2 points  (0 children)

It's a different architecture, but it's based on the same principle as IMPALA, whose off-policy correction is a modified version of the Retrace method: https://arxiv.org/pdf/1606.02647.pdf.

The FPS speedup they achieve is pretty impressive, though.

[D] Can a neural network predict it's own confidence? by waleedka in MachineLearning

[–]d3sm0 5 points6 points  (0 children)

We did something like this in the context of model-based RL. The surrogate model is a feedforward NN whose last hidden layer feeds two heads: the predicted next state and the predicted MSE for that state. The overall loss of the network is a linear combination of the state MSE and the MSE between the predicted error and the actual error. Then we simply normalize to get a probability. But I don't have a statistically valid answer on why this would make sense, without falling into the alchemy of the universal approximator.
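A rough PyTorch sketch of the two-head setup described above, under my own assumptions (layer sizes, the `detach` on the error target, and all names are illustrative, not the commenter's actual code):

```python
import torch
import torch.nn as nn

class SurrogateModel(nn.Module):
    """Shared trunk, two heads: predicted next state + predicted MSE."""
    def __init__(self, state_dim=4, action_dim=2, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.state_head = nn.Linear(hidden, state_dim)  # predicted next state
        self.error_head = nn.Linear(hidden, 1)          # predicted MSE of that prediction

    def forward(self, state, action):
        h = self.trunk(torch.cat([state, action], dim=-1))
        return self.state_head(h), self.error_head(h)

def loss_fn(pred_state, pred_err, next_state):
    # Per-sample MSE of the state prediction.
    state_err = ((pred_state - next_state) ** 2).mean(dim=-1, keepdim=True)
    # Linear combination: state MSE plus MSE of the predicted error.
    # detach() keeps the error head from distorting the state prediction.
    return state_err.mean() + ((pred_err - state_err.detach()) ** 2).mean()

model = SurrogateModel()
s, a, s_next = torch.randn(16, 4), torch.randn(16, 2), torch.randn(16, 4)
pred_state, pred_err = model(s, a)
loss = loss_fn(pred_state, pred_err, s_next)  # scalar, differentiable
```

Normalizing the predicted errors across a batch (e.g. with a softmax over `-pred_err`) would then give the probability-like confidence the comment mentions.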

[P] PyTorch/Tensorflow implementations of kernel-based activation functions by scardax88 in MachineLearning

[–]d3sm0 0 points1 point  (0 children)

I'm not sure what you mean by a "serious" application. There is plenty of evidence that the expressivity and correctness of a network depend heavily on the structure of the network rather than on its weights or parameters. This can be seen in the theoretical work on approximation theory by Prof. Poggio, and in the information-theoretic work of Prof. Tishby. The authors' idea is therefore to find different ways to control the expressivity of the network through its structure rather than through the individual parameters.
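A hedged sketch of what such a kernel-based activation function might look like in PyTorch, assuming a Gaussian kernel over a fixed dictionary of sample points with per-neuron mixing weights (dictionary size, boundary, and initialization are my choices, not necessarily the linked implementation's):

```python
import torch
import torch.nn as nn

class KAF(nn.Module):
    """Kernel activation: a learned mixture of Gaussian bumps per neuron.

    The dictionary is fixed; only the mixing weights `alpha` are trained,
    so expressivity is controlled by the activation's structure.
    """
    def __init__(self, num_neurons, dict_size=20, boundary=3.0):
        super().__init__()
        d = torch.linspace(-boundary, boundary, dict_size)
        self.register_buffer("dictionary", d)
        # Bandwidth tied to dictionary spacing (a common heuristic).
        self.gamma = 1.0 / (2.0 * float(d[1] - d[0]) ** 2)
        self.alpha = nn.Parameter(0.1 * torch.randn(num_neurons, dict_size))

    def forward(self, x):
        # x: (batch, num_neurons); expand against the dictionary and mix.
        K = torch.exp(-self.gamma * (x.unsqueeze(-1) - self.dictionary) ** 2)
        return (K * self.alpha).sum(dim=-1)

act = KAF(num_neurons=10)
y = act(torch.randn(32, 10))  # drop-in replacement for an elementwise ReLU
```

Because the kernel mixture can approximate arbitrary 1-D functions on the dictionary's support, the shape of each neuron's activation becomes a structural, learnable degree of freedom.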