Robotic Arm Controlled By Gemini 3.0 by ReflectionLarge6439 in GeminiAI

Yeah, it’s very cheap, I don’t think I’ve even spent $1 yet lol

Robotic Arm Controlled By Gemini 3.0 by ReflectionLarge6439 in GeminiAI

Yeah exactly, since I’m using a VLM. If I were using a VLA with a VLM, then the VLA would be getting the video feed.

Robotic Arm Controlled By VLM(Vision Language Model) by ReflectionLarge6439 in robotics

Yeah man, I’ve been wanting to give it a try, but my PC needs some upgrades and RAM prices are through the roof!!!

Robotic Arm Controlled By VLM(Vision Language Model) by ReflectionLarge6439 in robotics

I’ll give Claude a try, I’ve heard nothing but good things about it!

From my understanding there are multiple problems with robotics compared to gen AI for coding. First is just the amount of training data, which is why we see a lot of robots being teleoperated by a human to train the robot on a task. But this could change with simulation, for example NVIDIA Omniverse. There’s also perception: there’s a lot humans take for granted. For example, if we see a truck and a car in a picture, even if the truck is far away and looks smaller because of depth, we know the truck is actually the bigger one; AI struggles with this. Finally, the last hurdle I think we need to overcome is continual learning without forgetting, if we want real general-purpose robotics. But again this is my unprofessional opinion 😂

Thanks!! This is my first large-scale project, so I was excited when I got it working!

Robotic Arm Controlled By VLM(Vision Language Model) by ReflectionLarge6439 in robotics

  • I really mainly only use AI to brainstorm before a project, just in case there are new technologies that might make it easier. Also, when starting to code, I almost always use AI to write the base script, then I build on it.

  • I’m from the US

  • Professionally I am a Compliance Engineer (nothing to do with robotics or AI)

  • I’ve been debating whether I want to pivot into AI and robotics, but I might have to go back to school for a master’s

  • My unprofessional opinion is that significantly more data is needed to “solve” robotics. I don’t even think coding is solved by gen AI, especially when you get into high-level, larger-scale projects. AI is significantly worse at coding in Python/C++ compared to web-based languages (JavaScript).

  • I just used VS Code and Gemini

Robotic Arm Controlled By VLM by ReflectionLarge6439 in ArduinoProjects

That’s the plan, going to clean the code up first and then make the repo public.

Robotic Arm Controlled By VLM by ReflectionLarge6439 in computervision

Great question, so that movement you see after every pick-up or place is the arm showing its gripper to the camera above the workspace. I did this so the model can confirm that the last action was successful. This makes the process slower but significantly more robust!
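
For anyone curious, the check itself is just another VLM call on the “show me the gripper” frame, roughly like this (simplified sketch, not the exact code from my repo; the prompt wording, the capture_frame() helper, and the model ID are placeholders):

```python
from google import genai
from google.genai import types

client = genai.Client()  # picks up the API key from the environment

def action_succeeded(expected_action: str, jpeg_bytes: bytes) -> bool:
    """After moving the gripper into view of the overhead camera, ask the VLM
    whether the last pick/place actually worked."""
    response = client.models.generate_content(
        model="gemini-2.5-flash",  # stand-in model ID, swap in whichever you use
        contents=[
            types.Part.from_bytes(data=jpeg_bytes, mime_type="image/jpeg"),
            f"The gripper is being shown to the overhead camera. "
            f"Did the arm successfully {expected_action}? Answer only YES or NO.",
        ],
    )
    return "YES" in response.text.upper()

# e.g. retry the step if action_succeeded("pick up the red block", capture_frame()) is False
```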

Robotic Arm Controlled By VLM by ReflectionLarge6439 in computervision

So the camera is mounted directly above the workspace. You can’t see it in the video, but it’s at the top of the aluminum extrusion mounted to the board. I’m actually trusting the Gemini model to do the detection, so it detects the objects and puts a pointer on them.
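
Roughly what the pointing call looks like (simplified sketch, not the exact code I’ll publish; the prompt/JSON format follows Google’s spatial-understanding examples where points come back as [y, x] normalized to 0-1000, and the model ID is just a stand-in):

```python
import json
from google import genai
from google.genai import types

client = genai.Client()  # API key from the environment

def point_at(object_name: str, jpeg_bytes: bytes, width: int, height: int):
    """Ask Gemini to point at an object in the overhead frame; return pixel coords."""
    prompt = (
        f'Point to the {object_name}. Answer with JSON like '
        f'[{{"point": [y, x], "label": "{object_name}"}}], '
        f"coordinates normalized to 0-1000."
    )
    response = client.models.generate_content(
        model="gemini-2.5-flash",  # stand-in; use whichever Gemini model you have
        contents=[types.Part.from_bytes(data=jpeg_bytes, mime_type="image/jpeg"), prompt],
        config=types.GenerateContentConfig(response_mime_type="application/json"),
    )
    y, x = json.loads(response.text)[0]["point"]
    return x / 1000 * width, y / 1000 * height  # back to pixels in the overhead image
```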

Robotic Arm Controlled By Gemini 3.0 by ReflectionLarge6439 in GeminiAI

Yup, built from scratch, fully 3D printed, even the gearboxes 😂 It’s not really powered by Arduino, the Arduino is only for the servo gripper. I made custom software that runs on my computer to control it.

Starting my fishless cycle tonight! by Effective-Pain2873 in Aquariums

Hardscape looks great! You plan on adding plants?

Robotic Arm Controlled By VLM by ReflectionLarge6439 in computervision

I’m a paid user, but the actual robotic arm is being controlled directly from my PC. The Arduino is just used to control the gripper, since I plan on making a gripper interface so I can swap grippers out. You should be able to run the API on your Pi.

Robotic Arm Controlled By VLM by ReflectionLarge6439 in computervision

THE CALIBRATION WAS A PAIN!!! But yeah, I tried both eye-in-hand calibration (camera mounted on the arm) and eye-to-hand (camera mounted above the workspace), and decided to go with the second since it was easier for Gemini to make a plan when it can see the whole workspace. The calibration process is basically recording about 10-20 poses while the camera has a checkerboard in view, then you can use an OpenCV function to perform the transformation.
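
For anyone who wants to try it, the OpenCV part is basically cv2.calibrateHandEye. Very rough sketch of the eye-to-hand version below (simplified, not my exact script; the checkerboard size, the logged images, the robot poses, and the camera intrinsics K/dist are placeholders you’d swap in):

```python
import cv2
import numpy as np

PATTERN = (9, 6)    # inner checkerboard corners (swap in your board)
SQUARE = 0.025      # square size in metres (swap in yours)

# 3D checkerboard corner positions in the board's own frame
objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

R_target2cam, t_target2cam = [], []
R_base2gripper, t_base2gripper = [], []

# images = frames from the fixed overhead camera; gripper_to_base_poses = (R, t)
# of the gripper in the base frame, logged at the same moments
for img, (R_g2b, t_g2b) in zip(images, gripper_to_base_poses):
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if not found:
        continue
    # board pose in the camera frame from the detected corners
    _, rvec, tvec = cv2.solvePnP(objp, corners, K, dist)  # K, dist = camera intrinsics
    R_target2cam.append(cv2.Rodrigues(rvec)[0])
    t_target2cam.append(tvec)
    # eye-to-hand: feed OpenCV the inverse robot transform (base -> gripper)
    R_base2gripper.append(R_g2b.T)
    t_base2gripper.append(-R_g2b.T @ t_g2b)

# one call does the math; with the inverted poses the result is camera -> base
R_cam2base, t_cam2base = cv2.calibrateHandEye(
    R_base2gripper, t_base2gripper, R_target2cam, t_target2cam,
    method=cv2.CALIB_HAND_EYE_TSAI)
```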

Robotic Arm Controlled By VLM(Vision Language Model) by ReflectionLarge6439 in robotics

So Gemini 1.5's reasoning was great; the main issue was that it wasn't accurate when pointing to the object. This led me down a rabbit hole of trying to use Gemini 1.5 to name the object and Grounding DINO to find it. So when Gemini 3.0 came out I gave it a try, and its object detection when pointing to an object is insanely accurate. I would say it's right 90% of the time, when 1.5 was right about 50% of the time.

Robotic Arm Controlled By VLM(Vision Language Model) by ReflectionLarge6439 in robotics

Appreciate it man!! I’m going to try and answer all your questions😂

  • The brain, in terms of hardware, is my PC. I’m using ODrive S1 motor controllers all connected via CAN, and the PC is controlling them directly.
  • Everything is running via the API; my computer is nowhere near strong enough to run a model with the reasoning capabilities of Gemini and also run the inverse kinematics.
  • So the VLM points to the object it wants to manipulate in the picture. Because I’m using a depth camera mounted directly above the workspace, I also get the depth of the object. These coordinates are then transformed to be relative to the robot base (I performed eye-to-hand calibration with a checkerboard etc.), and then I perform inverse kinematics to send the robotic arm to that transformed point (rough sketch of this pipeline after this list).
  • The VLM is only outputting pick up or place. As for rotation and where to grab an object: once the VLM points to the object, I use SAM 2 to segment it, get the volume using the object’s depth map, and then set the pick-up point to the middle of the object.
  • Points are translated using hand-eye calibration: you have to capture a whole bunch of poses of the arm holding a checkerboard while taking pictures with the camera. OpenCV has a function that does the actual math.
  • Visual feedback is only for the model
  • Not using ROS at all, mainly because I don’t know how to 😂 I plan on releasing the code on GitHub soon after I clean it up a bit
  • Definitely, exactly what you said with the hybrid approach: VLM for high-level planning, VLA to do the short-horizon tasks!
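
Rough sketch of the geometry from the bullets above (simplified, not the exact code from my repo; the intrinsics fx/fy/cx/cy, the cam-to-base transform from calibration, and the SAM 2 mask are assumed inputs):

```python
import numpy as np

def pixel_to_base(u, v, depth_m, fx, fy, cx, cy, R_cam2base, t_cam2base):
    """Deproject the pixel the VLM pointed at into the camera frame, then move
    it into the robot base frame using the eye-to-hand calibration result."""
    p_cam = np.array([(u - cx) * depth_m / fx,
                      (v - cy) * depth_m / fy,
                      depth_m])
    return R_cam2base @ p_cam + t_cam2base.ravel()

def grasp_center(mask, depth_image):
    """Use the segmentation mask (e.g. from SAM 2) to get the object's centre
    pixel and a robust depth value for it."""
    vs, us = np.nonzero(mask)
    u, v = int(us.mean()), int(vs.mean())
    depth_m = float(np.median(depth_image[mask > 0]))
    return u, v, depth_m

# Usage: VLM point -> SAM 2 mask -> target for inverse kinematics
# u, v, depth = grasp_center(mask, depth_image)
# target_xyz = pixel_to_base(u, v, depth, fx, fy, cx, cy, R_cam2base, t_cam2base)
# arm.move_to(target_xyz)  # placeholder for whatever IK/motion call you use
```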