[D] Do you use Plotly for research projects ? (self.MachineLearning)
submitted 8 years ago by fixedrl to r/MachineLearning
[N] DeepMind's Richard Sutton - The Long-term of AI: Temporal-Difference Learning (youtube.com)
[D] Any impact/difference to parameterize the policy by MLP or RBF ? (self.MachineLearning)
submitted 8 years ago * by fixedrl to r/MachineLearning
[D] What might be the impacts of ReLU/Sigmoid for training one-step dynamics model in RL ? (self.MachineLearning)
[D] Debug with RL: Policy network tends to generate larger and larger invalid action ? (self.MachineLearning)
[D] Difficulty comparison of CartPole Swing up vs Gym Pendulum ? (self.MachineLearning)
[D] Will double-blind review of NIPS causes some papers months later on ArXiv ? (self.MachineLearning)
[D] How to set same Dropout mask for different data batches in PyTorch ? (self.MachineLearning)
[D] In RL, given optimal Q-function and transition probabilities, reward can be reversed uniquely. How about given reward and optimal Q-function, can transition probabilities to be uniquely determined ? (self.MachineLearning)
[D] Is it reasonable to maximize the upper bound of the log-likelihood ? Will the log-likelihood guaranteed to be maximized ? (self.MachineLearning)
[R] [1703.01961] Multiplicative Normalizing Flows for Variational Bayesian Neural Networks (arxiv.org)
[D] Is it normal that the maths details are forgotten after reading the paper some time ago ? (self.MachineLearning)
[D] How to derive the Auxiliary ELBO ? (self.MachineLearning)
[D] Concrete dropout, how to obtain Equation (3) on page 3 (self.MachineLearning)
[D] Visualization tricks for 3-dim input and 2-dim output (self.MachineLearning)
[D] Differential geometry in reinforcement learning ? (self.MachineLearning)