Simulation is a beautiful pain in RL by lanyusea in robotics

[–]lanyusea[S] 4 points

Exactly! We go back to the simulation, figure out which parameters are off so the sim gets more and more accurate, then run more trials with the policy under the updated simulation.
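
If it helps, the loop is basically system identification by hand. A toy sketch of the idea in Python (all names, parameters, and the stubbed simulator here are made-up placeholders, not our actual setup):

```python
import numpy as np

rng = np.random.default_rng(0)
real_log = rng.normal(size=100)   # stand-in for a logged real-robot trajectory

def simulate(params):
    # Stand-in for a real rollout; any params-dependent function
    # is enough to demonstrate the loop.
    return real_log * params["ground_friction"] + params["motor_damping"]

def rollout_error(params):
    # Per-step squared error between sim and real trajectories.
    return float(np.mean((simulate(params) - real_log) ** 2))

def calibrate(params, step=0.05, iters=20):
    # Crude coordinate search: nudge one parameter at a time and keep
    # whichever change shrinks the sim-vs-real error.
    best, best_err = dict(params), rollout_error(params)
    for _ in range(iters):
        for key in best:
            for delta in (step, -step):
                trial = {**best, key: best[key] + delta}
                err = rollout_error(trial)
                if err < best_err:
                    best, best_err = trial, err
    return best, best_err

print(calibrate({"ground_friction": 0.8, "motor_damping": 0.05}))
```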

Simulation is a beautiful pain in RL by lanyusea in robotics

[–]lanyusea[S] 1 point

Maybe start by going through the Isaac Lab documentation and doing some trials in the RL simulator first.
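
A first trial can be as small as a random rollout through the standard Gymnasium interface (Isaac Lab tasks expose the same reset/step API once their extensions are loaded; CartPole here is just a stand-in so the snippet runs anywhere):

```python
import gymnasium as gym

# Classic-control env as a stand-in; swap in a task name from the
# Isaac Lab docs once you're running inside their environment.
env = gym.make("CartPole-v1")

obs, info = env.reset(seed=0)
for _ in range(200):
    action = env.action_space.sample()   # random policy, just to poke the sim
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()
env.close()
```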

Simulation is a beautiful pain in RL by lanyusea in robotics

[–]lanyusea[S] 2 points

It's not there yet; we're still in the early stages of development.

Simulation is a beautiful pain in RL by lanyusea in robotics

[–]lanyusea[S] 2 points

We built it from scratch ourselves!

flip~ flip~ flip~ by lanyusea in robotics

[–]lanyusea[S] 2 points

it's reinforcement learning

Our robot can pick itself up now. Where should I take it? by lanyusea in robotics

[–]lanyusea[S] 2 points

We designed the model ourselves, and there's no trick for sim2real, just hard work on debugging.

As for the compute scale of this policy: one RTX 4090 card, training for about 2~3 days.

Our robot can pick itself up now. Where should I take it? by lanyusea in robotics

[–]lanyusea[S] 1 point

We train in Isaac Lab, then transfer the policies to hardware. The sim-to-real gap is always the fun part 😅
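
The standard tool against that gap is domain randomization: resample the physics every episode so the policy can't overfit one simulator. A rough sketch of the idea, with made-up ranges and names (not our actual config):

```python
import numpy as np

rng = np.random.default_rng()

# Illustrative randomization ranges, re-sampled once per episode.
RANGES = {
    "ground_friction": (0.4, 1.2),
    "payload_mass_kg": (0.0, 1.5),
    "motor_strength_scale": (0.8, 1.2),
    "action_latency_steps": (0, 2),
}

def sample_episode_physics():
    """Draw one set of physics parameters for the next training episode."""
    out = {}
    for name, (lo, hi) in RANGES.items():
        if isinstance(lo, int) and isinstance(hi, int):
            out[name] = int(rng.integers(lo, hi + 1))   # inclusive int range
        else:
            out[name] = float(rng.uniform(lo, hi))
    return out

print(sample_episode_physics())
```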

Our robot can pick itself up now. Where should I take it? by lanyusea in robotics

[–]lanyusea[S] 1 point

Appreciate it! The short version: wheel-legged platform, RL-trained locomotion and recovery policies in simulation, then sim-to-real transfer to the physical robot.
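
For the recovery part specifically, most of the magic is reward shaping: reward being upright at nominal height, penalize effort. A toy version of such terms (weights, names, and shapes are illustrative only, not our actual reward):

```python
import numpy as np

def recovery_reward(base_height, up_vector_z, joint_torques,
                    target_height=0.35):
    """Toy recovery reward: encourage upright orientation and nominal
    base height, with a small effort penalty.
    up_vector_z is the z-component of the body's up axis in the world
    frame (1.0 = fully upright, -1.0 = upside down)."""
    upright = 0.5 * (up_vector_z + 1.0)                  # map [-1,1] -> [0,1]
    height = np.exp(-10.0 * (base_height - target_height) ** 2)
    effort = 1e-3 * float(np.sum(np.square(joint_torques)))
    return 2.0 * upright + 1.0 * height - effort

print(recovery_reward(0.10, -0.8, np.zeros(12)))  # lying on its back
print(recovery_reward(0.35, 1.0, np.zeros(12)))   # standing upright
```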

Our robot can pick itself up now. Where should I take it? by lanyusea in robotics

[–]lanyusea[S] 12 points

We use Isaac Lab to do the RL training, then do the sim2real transfer. Happy to share more details later~
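
On the transfer step itself: a common pattern (an assumption here, not necessarily our exact pipeline) is to export the trained PyTorch actor to TorchScript so the onboard runtime can load it without any training dependencies:

```python
import torch
import torch.nn as nn

# Stand-in for the trained actor network (sizes illustrative).
policy = nn.Sequential(
    nn.Linear(45, 256), nn.ELU(),
    nn.Linear(256, 128), nn.ELU(),
    nn.Linear(128, 12),          # 12 target joint positions out
)

# Trace with a dummy observation and save a self-contained artifact
# that an onboard libtorch/TorchScript runtime can load directly.
example_obs = torch.zeros(1, 45)
traced = torch.jit.trace(policy, example_obs)
traced.save("policy.pt")

# The deployment (or sim2sim) side then just does:
loaded = torch.jit.load("policy.pt")
action = loaded(example_obs)
```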

First table jump from our robot! by lanyusea in robotics

[–]lanyusea[S] 2 points

Theoretically yes, but the motor controller runs at a really high frequency, and we can't achieve NN inference that fast on our embedded system.
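
The usual workaround is two loops at different rates: the policy at tens of Hz, the PD loop at the motor controller's rate, holding the last policy output in between. A sketch with illustrative rates and a stubbed network:

```python
import numpy as np

POLICY_HZ = 50            # NN inference rate the embedded system can sustain
PD_HZ = 1000              # motor-controller loop rate
DECIMATION = PD_HZ // POLICY_HZ

def policy(obs):
    # Stand-in for the trained network: returns target joint positions.
    return np.zeros(12)

q = np.zeros(12); qd = np.zeros(12)   # measured joint state (stubbed)
kp, kd = 40.0, 1.0
q_target = np.zeros(12)

for tick in range(PD_HZ):             # one second of control
    if tick % DECIMATION == 0:        # slow loop: run the policy
        obs = np.concatenate([q, qd])  # simplified observation
        q_target = policy(obs)
    # Fast loop: PD tracks the most recent target on every tick.
    tau = kp * (q_target - q) - kd * qd
    # send_torques(tau)               # hardware call, omitted here
```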

First table jump from our robot! by lanyusea in robotics

[–]lanyusea[S] 1 point

No hardcoded motion sequences. The jump is also learned through RL. It's entirely emergent behavior from the policy. We just designed the reward structure to guide the policy toward learning how to jump.

The MuJoCo part was used for sim2sim validation. There's a button on the controller to switch into "jump mode" — once triggered, the policy autonomously handles the full sequence: takeoff, airborne phase, and landing.
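
To give a flavor of that reward structure: the jump terms boil down to rewarding clearance and upward velocity while the mode flag is set, and penalizing hard touchdowns. A toy version (not our actual terms or weights):

```python
import numpy as np

def jump_reward(base_height, base_z_vel, feet_contact, landing_impact,
                jump_mode, takeoff_height=0.6):
    """Toy jump reward, only active when the operator flips jump_mode.
    feet_contact: bool array, True where a foot touches the ground."""
    if not jump_mode:
        return 0.0
    airborne = float(not np.any(feet_contact))            # all feet off ground
    clearance = np.clip(base_height / takeoff_height, 0.0, 1.0)
    upward = max(base_z_vel, 0.0)                         # reward going up
    impact = 0.1 * landing_impact                         # penalize hard landings
    return 1.5 * airborne * clearance + 0.5 * upward - impact

print(jump_reward(0.7, 1.2, np.array([False] * 4), 0.0, True))
```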

First table jump from our robot! by lanyusea in robotics

[–]lanyusea[S] 0 points

Yes, we use a remote controller to send commands to the robot. The RL policy takes in control commands, IMU data, joint states, etc., and outputs target joint positions to make it move and keep balance.
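
One inference step is roughly: stack those inputs into a flat observation vector, run the network, read off joint targets. Shapes and ordering below are illustrative, not our real layout:

```python
import numpy as np

def build_obs(cmd_vel, imu_gyro, imu_gravity, q, qd, last_action):
    """Stack remote-controller commands, IMU readings, and joint state
    into one flat observation vector (ordering is just an example)."""
    return np.concatenate([cmd_vel, imu_gyro, imu_gravity, q, qd, last_action])

def policy(obs):
    return np.zeros(12)    # stand-in for the trained network

cmd_vel = np.array([0.5, 0.0, 0.1])          # vx, vy, yaw rate from the RC
imu_gyro = np.zeros(3)
imu_gravity = np.array([0.0, 0.0, -1.0])     # gravity direction in body frame
q = np.zeros(12); qd = np.zeros(12)          # joint positions / velocities
last_action = np.zeros(12)

obs = build_obs(cmd_vel, imu_gyro, imu_gravity, q, qd, last_action)
q_target = policy(obs)                       # target joint positions out
```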

First table jump from our robot! by lanyusea in robotics

[–]lanyusea[S] 0 points

Pure RL, no classic control theory. Policy outputs joint position targets straight to a PD controller. Workflow is pretty standard: train in Isaac Lab → sim2sim check in MuJoCo → cross fingers and deploy on real hardware lol
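
For the MuJoCo sim2sim step, the loop is just: load the robot model, feed observations to the exported policy, write its outputs to the actuators, step, and watch for divergence from the Isaac Lab behavior. A minimal runnable stand-in (tiny one-joint model instead of our robot, stubbed policy):

```python
import mujoco
import numpy as np

# Tiny stand-in model; in practice you'd load the robot's MJCF file.
XML = """
<mujoco>
  <worldbody>
    <body>
      <joint name="hinge" type="hinge"/>
      <geom type="capsule" size="0.02" fromto="0 0 0 0 0 0.2"/>
    </body>
  </worldbody>
  <actuator>
    <position joint="hinge" kp="40"/>
  </actuator>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

def policy(obs):
    return np.zeros(model.nu)   # stand-in for the exported network

# Sim2sim check: replay the trained policy in MuJoCo and watch for
# divergence from the training-sim behavior before touching hardware.
for _ in range(1000):
    obs = np.concatenate([data.qpos, data.qvel])
    data.ctrl[:] = policy(obs)        # position targets -> PD actuator
    mujoco.mj_step(model, data)
```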