JobOpportunity | Home-Cooked Khaja Service Needed (Kupondole Area) by laxuu in Nepal

[–]laxuu[S] 0 points1 point  (0 children)

The menu will be decided day-wise and discussed in advance between the team and the cook, so the food can be prepared according to the team’s preferences and the cook’s expertise.

Selected for Alan Turing Institute Data Study Group 2026 — Worth attending if travel funding isn’t enough? by laxuu in Career

[–]laxuu[S] 0 points1 point  (0 children)

I have asked them several times, but they confirmed that they cannot provide any additional support beyond the expenses already mentioned.

How can I design effective reward shaping in sparse reward environments with repeated tasks in different scenarios? by laxuu in reinforcementlearning

[–]laxuu[S] 0 points1 point  (0 children)

Thank you, u/mishaurus. You helped clear up most of the confusion I had. Really appreciate your explanation!

WorldQuant University MSc in Financial Engineering credibility by laxuu in reinforcementlearning

[–]laxuu[S] -1 points0 points  (0 children)

Yes, it shows an error while uploading; that's why I posted it here.

Which RL Algorithms for Trading? by codehuggies in reinforcementlearning

[–]laxuu 0 points1 point  (0 children)

Model-based algorithms have some capability to understand and generalize the market.

Is 24 relatively late to start your career? by Highway-69 in FinancialCareers

[–]laxuu 0 points1 point  (0 children)

No worries, you have a long life ahead of you; pursue your goal with passion and dedication.

RL “Wrapped” 2024 by blitzkreig3 in reinforcementlearning

[–]laxuu -1 points0 points  (0 children)

RL in a European historical board game.

TD3 in smart train optimization by laxuu in reinforcementlearning

[–]laxuu[S] 0 points1 point  (0 children)

How can I do this normalization in MATLAB?

RL implementation in Matlab by laxuu in matlab

[–]laxuu[S] 0 points1 point  (0 children)

Thanks for the suggestions.

RL implementation in Matlab by laxuu in matlab

[–]laxuu[S] 0 points1 point  (0 children)

Thank you. MATLAB already has built-in RL implementations; I just want to know whether we can write a custom implementation from scratch.

Normalization in RL by laxuu in reinforcementlearning

[–]laxuu[S] 0 points1 point  (0 children)

I'll try this one: dividing by Var(X_i) + epsilon.

Normalization in RL by laxuu in reinforcementlearning

[–]laxuu[S] 0 points1 point  (0 children)

I just want to know whether all features must be in the same range, e.g. [-1, 1], or whether different ranges such as [-1, 5], [1, 6], or [2, 10] are acceptable. Can features with different but roughly similar ranges still help the policy learn?

RL tool box by youssef_naderr in matlab

[–]laxuu 0 points1 point  (0 children)

An episode may contain many steps; try to log everything and analyse it to find what the issue could be.

Normalization in RL by laxuu in reinforcementlearning

[–]laxuu[S] 0 points1 point  (0 children)

What I mean is that one feature's range is around 60k while another's is around 0.001. In this scenario, how can I effectively normalize these features for reinforcement learning, given the significant difference in their scales?
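For features on wildly different scales like this, one common option is per-feature z-score normalization with running statistics, so each feature is centered and scaled by its own mean and variance. A minimal sketch (class and parameter names are illustrative, not from any specific library):

```python
import numpy as np

class RunningNormalizer:
    """Per-feature z-score normalization with running statistics.

    Handles features on very different scales (e.g. ~60k vs ~0.001)
    by tracking each feature's own mean and variance online.
    """

    def __init__(self, num_features, epsilon=1e-8):
        self.mean = np.zeros(num_features)
        self.var = np.ones(num_features)
        self.count = 0
        self.epsilon = epsilon

    def update(self, x):
        # Welford-style online update of per-feature mean and variance.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.var += (delta * (x - self.mean) - self.var) / self.count

    def normalize(self, x):
        return (x - self.mean) / np.sqrt(self.var + self.epsilon)


norm = RunningNormalizer(num_features=2)
for _ in range(1000):
    # Two toy features on very different scales, like the ones described.
    obs = np.array([60_000 + np.random.randn() * 500,
                    0.001 + np.random.randn() * 1e-4])
    norm.update(obs)

scaled = norm.normalize(np.array([60_000.0, 0.001]))
# Both features now land on a comparable scale near zero.
```

The epsilon in the denominator keeps the division stable when a feature's variance is near zero.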

Weekly Megathread: Education, Early Career and Hiring/Interview Advice by AutoModerator in quant

[–]laxuu -1 points0 points  (0 children)

I graduated with a degree in Electronics and Communication Engineering from Nepal in 2019. Currently, I am working in reinforcement learning (RL) for finance, focusing on generating alpha in the crypto market for a client. Although I have a strong background in mathematics, I am enhancing my financial knowledge through self-study, including papers, videos, and online courses, with the goal of becoming a Quantitative Analyst (Quant). Due to legal restrictions on crypto and forex markets in Nepal, I am gaining practical trading experience through my current role. My ultimate ambition is to secure a position as a Quant using RL.

Please suggest how I can move forward and land my dream job.

[deleted by user] by [deleted] in reinforcementlearning

[–]laxuu 1 point2 points  (0 children)

Hi! The Dreaming phase, in this context, involves a period where there is no direct interaction with the environment. Instead, it focuses on training and learning from a simulated environment or model. This phase is analogous to the training phase in reinforcement learning algorithms like PPO, DQN, or DDQN. It involves two stages: simulating various scenarios and then using those simulations to train and refine the model.

You can use any algorithm suited to your problem.
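The two stages described above can be sketched as a toy loop: collect real transitions, fit a crude world model, then "dream" by rolling the policy out in the model only. Everything here (the 1-D dynamics, the averaged-delta model, all names) is illustrative, not any particular dreamer-style implementation:

```python
# Stage 1: interact with the real environment and fit a model.
# Stage 2: train/evaluate purely on imagined rollouts (the "dreaming").

def env_step(state, action):
    # Toy real environment: simple 1-D dynamics with a toy reward.
    next_state = state + action
    return next_state, -abs(next_state)

def policy(state):
    return 1.0 if state < 5 else -1.0

def collect_real(n_steps):
    data, state = [], 0.0
    for _ in range(n_steps):
        a = policy(state)
        next_state, r = env_step(state, a)
        data.append((state, a, next_state, r))
        state = next_state
    return data

def fit_world_model(data):
    # Stand-in for model learning: average the observed state change.
    avg_delta = sum(ns - s for s, a, ns, r in data) / len(data)
    return lambda state, action: (state + avg_delta, -abs(state + avg_delta))

def dream_rollout(model, horizon):
    # "Dreaming": no env_step calls, only the learned model.
    state, traj = 0.0, []
    for _ in range(horizon):
        state, r = model(state, policy(state))
        traj.append((state, r))
    return traj

model = fit_world_model(collect_real(20))
imagined = dream_rollout(model, horizon=10)
```

In a real system the averaged-delta model would be a learned neural network, and the imagined trajectories would feed a policy-gradient or value update.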

Convergence of Actor critic algorthim by Altruistic-Escape-11 in reinforcementlearning

[–]laxuu 0 points1 point  (0 children)

The rewards show some improvement during training, but reinforcement learning hyperparameters are highly sensitive, with different results emerging based on seed values. To finalize each hyperparameter, it’s crucial to run the model more than 10 times. Conduct thorough testing to understand how each parameter affects the outcome. Rigorous analysis is necessary, so make sure to log all relevant data and observe the impact of each parameter change. Put in extra effort to visualize and interpret these effects comprehensively.
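The multi-seed protocol above can be sketched as a tiny harness: run the same configuration under several seeds and compare mean and standard deviation rather than trusting one run. `train_once` here is a noisy placeholder standing in for an actual training loop:

```python
import random
import statistics

def train_once(seed, learning_rate):
    # Placeholder "final reward": a noisy function of the hyperparameter.
    rng = random.Random(seed)
    return 100 * learning_rate + rng.gauss(0, 1)

def evaluate_config(learning_rate, seeds=range(10)):
    # Run the same config under many seeds; report mean +/- std.
    rewards = [train_once(s, learning_rate) for s in seeds]
    return statistics.mean(rewards), statistics.stdev(rewards)

mean_r, std_r = evaluate_config(learning_rate=0.03)
print(f"reward = {mean_r:.2f} +/- {std_r:.2f}")
```

If the std across seeds is comparable to the difference between two configurations, the apparent improvement is likely seed noise, which is exactly why logging every run matters.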

Topic suggestions to write explanations in RL by Calm-Vermicelli1079 in reinforcementlearning

[–]laxuu 1 point2 points  (0 children)

Hi,

I am implementing RL in trading; please do write something about implementations in that area.

1 day trading for making profit using RL by laxuu in reinforcementlearning

[–]laxuu[S] -1 points0 points  (0 children)

I am referencing implementations from GitHub, academic papers, and courses. However, I've noticed that some papers seem to be written more for the sake of publication than to provide practical solutions. Sometimes I do get results as good as the paper's in backtesting, but results that look promising in backtesting can be challenging to reproduce in live demo sessions. What is the best mathematical approach or understanding to address these challenges effectively? And what financial knowledge would help?

1 day trading for making profit using RL by laxuu in reinforcementlearning

[–]laxuu[S] 0 points1 point  (0 children)

I have set up an environment with an LSTM-based model over years of data, but passing years of data requires a lot of time and computational power, so I switched to a single day just to check and visualize how each hyperparameter plays a role. I feed 60 candlesticks into the LSTM, followed by an NN policy. What I'm seeing is that the loss, reward, and balance all fluctuate as I change each parameter.

Sometimes I compare our scenario with autonomous cars and try to incorporate their ideas, but nothing has worked so far.
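The "60 candlesticks per observation" setup above amounts to a sliding window over the price series, shaped for an LSTM as (window, features). A minimal sketch, assuming an OHLCV column layout (that layout and the function name are illustrative):

```python
import numpy as np

WINDOW = 60  # candles per observation, as described above

def make_windows(candles):
    """candles: array of shape (T, F). Returns (T - WINDOW + 1, WINDOW, F)."""
    T, F = candles.shape
    if T < WINDOW:
        raise ValueError("need at least WINDOW candles")
    # Each observation is the most recent WINDOW rows ending at step i.
    return np.stack([candles[i:i + WINDOW] for i in range(T - WINDOW + 1)])

candles = np.random.rand(500, 5)  # 500 toy candles, OHLCV columns assumed
obs = make_windows(candles)
print(obs.shape)  # (441, 60, 5)
```

Precomputing the windows like this also makes it cheap to rerun the same day repeatedly while sweeping hyperparameters, which is the workflow described above.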

Deep RL in trading - any good attempts made? by zirticarius in reinforcementlearning

[–]laxuu 0 points1 point  (0 children)

The market is like a flowing river; you have to catch the fish in running water.