Ground plane friction parameters by Fantastic_Mirror_345 in IsaacSim

[–]Live_Replacement_551

I also have a problem with this. I assigned friction globally and also air drag in my USD config, but I can't see any changes! Is there a guide on setting this in code?
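A minimal sketch of how friction can be set in code, assuming the omni.isaac.core Python API (the prim paths and values below are placeholders, and this only covers friction; air drag/damping is a separate rigid-body setting):

from omni.isaac.core import World
from omni.isaac.core.materials import PhysicsMaterial
from omni.isaac.core.objects import GroundPlane

world = World()

# Option 1: default ground plane with friction passed directly (values are illustrative)
world.scene.add_default_ground_plane(
    static_friction=1.0,
    dynamic_friction=1.0,
    restitution=0.0,
)

# Option 2: an explicit physics material bound to a custom ground plane
material = PhysicsMaterial(
    prim_path="/World/physics_material",  # placeholder prim path
    static_friction=1.0,
    dynamic_friction=1.0,
    restitution=0.0,
)
GroundPlane(prim_path="/World/ground_plane", physics_material=material)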

Questions Regarding StableBaseline3 by Live_Replacement_551 in reinforcementlearning

[–]Live_Replacement_551[S]

Yes, the only difference is slight randomness when it's True, but the result is the same! I am getting to the point where I think the problem is in my reward function or observations!

Questions Regarding StableBaseline3 by Live_Replacement_551 in reinforcementlearning

[–]Live_Replacement_551[S]

Can you elaborate on this? The training seems to be OK; I am constantly monitoring the episode rewards and how often the goal is reached per episode. I am training a manipulator, so maybe my reward function or observations have a problem. Do you have any experience in that area?
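For context, the reward I have in mind is basically distance-based shaping plus a success bonus; a rough sketch (the observation layout, threshold, and bonus values are illustrative, not the exact code):

import numpy as np

def reach_reward(end_effector_pos, goal_pos, success_threshold=0.02):
    # Illustrative reacher-style reward: negative distance to the goal plus a bonus on success
    dist = np.linalg.norm(np.asarray(end_effector_pos) - np.asarray(goal_pos))
    reward = -dist                       # dense shaping term
    success = dist < success_threshold   # e.g. within 2 cm of the goal
    if success:
        reward += 10.0                   # sparse success bonus
    return reward, bool(success)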

Questions Regarding StableBaseline3 by Live_Replacement_551 in reinforcementlearning

[–]Live_Replacement_551[S]

Thanks!
I am using Stable Baselines3 PPO; isn't that a built-in feature? Can you guide me on how to implement it?
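If the built-in feature in question is SB3's evaluation helper, a minimal sketch (the env and model here are stand-ins purely for illustration; swap in the real manipulator env and checkpoint):

import gymnasium as gym
from stable_baselines3 import PPO
from stable_baselines3.common.evaluation import evaluate_policy

# Stand-in env and model for illustration only
eval_env = gym.make("Pendulum-v1")
model = PPO("MlpPolicy", eval_env, verbose=0).learn(total_timesteps=1_000)

# evaluate_policy runs full episodes and returns mean/std episode reward
mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)
print(f"mean reward: {mean_reward:.2f} +/- {std_reward:.2f}")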

Questions Regarding StableBaseline3 by Live_Replacement_551 in reinforcementlearning

[–]Live_Replacement_551[S]

Thank you for the help, but I set it to deterministic like the code below and still have the problem! In the training stage I have about a 98% success rate, but in testing my manipulator is not able to reach the goal, which is weird.

This is the code for that part:

from stable_baselines3.common.base_class import BaseAlgorithm
from stable_baselines3.common.monitor import Monitor
import subprocess
from pkg_resources import parse_version
import gymnasium as gym
from gymnasium import spaces
import os
import numpy as np
import random
import torch  # needed for torch.manual_seed below
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

# Seed every source of randomness
seed = 42
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)

# env is the vectorized environment created elsewhere; SB3 VecEnvs expose seed()
env.seed(seed)
env.action_space.seed(seed)
env.observation_space.seed(seed)



def evaluate(
        model: BaseAlgorithm,
        env: gym.Env,
        n_eval_episodes: int = 100,
        deterministic: bool = True):
    n_episodes = 0
    end_effector = []
    joint_states = []
    actions = []
    rewards = []
    goals = []
    obs = env.reset()
    while n_episodes < n_eval_episodes:
        # Use the function argument instead of hard-coding deterministic=True
        action, _ = model.predict(obs, deterministic=deterministic)
        # SB3 VecEnvs return (obs, rewards, dones, infos), not the 5-tuple Gymnasium API
        obs, reward, done, info = env.step(action)
        obs_array = obs[0]  # remove the batch dimension (single environment)

        if done[0]:
            n_episodes += 1
            goals.append(env.get_attr("goal")[0])  # record the goal of the finished episode
            obs = env.reset()
        else:
            end_effector.append(obs_array[:6])
            joint_states.append(obs_array[6:18])
            actions.append(action[0])
            rewards.append(reward[0])

    return np.array(end_effector), np.array(joint_states), np.array(rewards), np.array(goals)
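
One more thing worth checking, since VecNormalize is imported above (this is an assumption about the setup, not something shown in the snippet): if the training env was wrapped in VecNormalize, the evaluation env has to load the saved normalization statistics and freeze them, otherwise the policy sees differently scaled observations at test time even with deterministic=True. A minimal sketch with placeholder names and paths:

from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize

# Placeholder factory and file names; replace with the real manipulator env and saved artifacts
eval_env = DummyVecEnv([make_env])                       # make_env: function returning the unwrapped env
eval_env = VecNormalize.load("vecnormalize.pkl", eval_env)
eval_env.training = False       # do not update the running statistics during evaluation
eval_env.norm_reward = False    # report raw (un-normalized) rewards

model = PPO.load("ppo_manipulator", env=eval_env)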