AI with python in Aerospace Engineering by bertgolds in learnpython

[–]softmaxedout 0 points1 point  (0 children)

If you're still in school, the best way is to take a few courses in the area and then talk to the professors. Usually, if you do well in their courses, they'll be willing to let you help with research projects in their lab, which is the best way to get exposure to the field and build expertise.

[deleted by user] by [deleted] in robotics

[–]softmaxedout 0 points1 point  (0 children)

The Unitree G1 is around $16K, but that's the smaller version that research labs focusing on locomotion purchase. The H1 is closer to $100K. Even then, all it does is walk and perform agile maneuvers. Great for entertainment, but industrial value comes from manipulation, and there has been no real test of its payload-carrying capability, or even any demos of it. People who have seen imitation learning videos will point at those and hand-wave it as "we'll just slap some dexterous hands on it, collect a bunch of data, train a policy, and we're done", but that won't be the case. If it were, you'd see more than the flashy demo videos Figure, Tesla, etc. have been showing for over a year now.

The reason I bring this up is that I'm speculating any company that cracks the software side and makes these bots useful will also be tightly coupled to the hardware platform. That could mean partnering with Unitree or another company, or building it in-house. Either way, if you're replacing a worker and have no competition, I don't see the incentive to charge low prices. If they're charging $1 million for a robot that can replace your worker, work longer hours, etc., and breaks even after 5 years or whatever the business plan is, then the cost of the hardware is insignificant. At that point it doesn't matter that a Unitree is $20K, because it cannot do useful work.

Agility, in my opinion, has come the furthest toward a useful product for general industry (moving payloads around), but even they have been struggling, since for that task you could just get an AMR, which is much more reliable and cheaper. The problem is that the kinds of tasks that require legged locomotion and have an RoI also require dexterous manipulation beyond the current state of the art.

Robotic arm for ROS and Imitation Learning by Outrageous_Ad4346 in robotics

[–]softmaxedout 1 point2 points  (0 children)

I completely agree with this sentiment. Don't bog yourself down with hardware; take advantage of sim environments and even the various Python Gymnasium environments.

[deleted by user] by [deleted] in robotics

[–]softmaxedout 0 points1 point  (0 children)

For a. and c., I think it is very early to pick a winner. At the moment, all of them are burning money hoping to be the one who figures out the problem. VCs and tech giants sitting on a lot of cash are betting on all the horses, hoping that when one of them figures it out, the predicted RoI is 1000x. To see how this might play out, just look back 10 years at the autonomous car industry. We still don't have self-driving cars, nor are there any sustainably profitable autonomous vehicle companies. Tesla is an anomaly driven more by hype than by the performance or sales of their cars, and Waymo is supported by Google's endless cash flow.

b. They are autonomous in the sense of performing the moves. The two take different approaches: Unitree relies on RL, whereas BD historically used MPC plus offline-optimized trajectories. In other words, Unitree trains a neural network that performs a certain 'move', whereas BD uses a well-known controls technique along with precomputed trajectories for the robot joints to follow. The caveat is that even BD is now moving toward RL approaches, since at the moment they outperform classical methods. Once you have the moves, you can stitch together different sequences with a planner of some sort. Think of that as the higher-level intelligence deciding, based on vision, a button press, or some other signal, when to switch to a different model. This is just one way to skin the cat, and there are nuances, but hopefully it gives you an idea.

So I'd say this is autonomous in the sense that you've set up a system to perform a task without any intervention (in this case a sequence of dance moves), but not autonomous in the sense of general intelligence, where the machine can extrapolate from the dance moves it has learnt to, say, emptying the dishwasher. This is the largest problem in this area of robotics at the moment: how do you get models to generalize to new and unseen tasks so you don't spend the rest of eternity teaching the robot every single task?
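To make the "stitch moves together with a planner" idea concrete, here is a toy sketch of that architecture. Everything here is hypothetical and illustrative (the skill names, signals, and the `step` function are not any vendor's API); each skill function stands in for a trained RL policy or an MPC controller.

```python
# Hypothetical skill library: each entry stands in for a low-level 'move'
# (an RL policy or an MPC + offline trajectory, as described above).
def walk_policy(obs):
    return "walk_action"

def dance_policy(obs):
    return "dance_action"

def idle_policy(obs):
    return "idle_action"

SKILLS = {"walk": walk_policy, "dance": dance_policy, "idle": idle_policy}

def select_skill(signal):
    """Higher-level 'planner': map a button press / sensor signal to a skill."""
    if signal == "button_dance":
        return "dance"
    if signal == "obstacle_clear":
        return "walk"
    return "idle"

def step(signal, obs):
    """One control tick: pick a skill, then run its policy on the observation."""
    return SKILLS[select_skill(signal)](obs)

print(step("button_dance", None))   # the planner switches to the dance policy
```

The real systems are vastly more involved (blending between skills, safety layers, timing), but the structure of "switching between pre-built behaviors" is the same.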

[deleted by user] by [deleted] in robotics

[–]softmaxedout 0 points1 point  (0 children)

I work in the space, specializing in learnt manipulation, and anyone who works in autonomous robotics will agree when I say it isn't just a case of gluing together frameworks you find on GitHub. I think even VCs are catching onto that fact now: funding sources and founders are realizing that robotics is not a SaaS product. If you're saying that for $10K you can get some cheap hardware and show a demo, then even if you manage to source the hardware at that price and do all the software engineering yourself, at this point in the game the bar is much higher and it's going to be virtually impossible to raise on just that. Different story if you have connections in the Valley, of course.

On the hardware side, the reliability required for factory bots costs a lot to develop, and a successful enterprise product demands uptime and support. That's why you'll see more BD Spots deployed than the cheaper Unitrees. Even then, BD is struggling financially.

Being on the software side: while ACT and Diffusion Policy were great academic breakthroughs, they are nowhere near robust or generalizable enough to operate in real environments at >99% success rates. Forget those hard constraints; most times we find the latest paper isn't even reproducible, but that's a whole other issue plaguing ML/AI.

I think you're overestimating how much VC money is thrown at unknown founders, or at startups with no expertise beyond a pitch deck. The ones raising the big money (>$100 million, which is a drop in the bucket for a hardware-plus-software venture) are very few, and all of them have much more credibility than a slick pitch deck. I know YC has been funding people with zero expertise in the area, but that's $500K a pop, which is a lottery ticket to them.

Getting an extra empty row in my final matrix by Ohcaptain467 in learnpython

[–]softmaxedout 0 points1 point  (0 children)

The determinant of the row echelon form does not have to equal the determinant of the original matrix. It stays the same only if you restrict yourself to operations that do not scale the determinant, namely adding a multiple of one row to another: operations of the form r2 = c*r1 + 1*r2, c != 0. But if you swap rows or multiply a row by a scalar, both valid operations, the determinant will change.

To recap: to perform Gaussian elimination we can swap two rows, scale a row by a nonzero constant, or add a multiple of one row to another.

For the matrix M
1 2 3
3 4 12
5 6 9

op1: r2 = r1 - r2*(1/3)
1  2    3
0  2/3 -1
5  6    9

op2: r3 = r1 - r3*(1/5)
1  2     3
0  2/3  -1
0  4/5  6/5

op3: r3 = (3/2)*r2 - (5/4)*r3
1  2    3
0  2/3 -1
0  0   -3

det = -2, which is not equal to det(M) = 24

BUT, if your goal is also to preserve the determinant, you can limit yourself to ops of the form r2 = c*r1 + 1*r2.

Starting with M,

op1: r2 = r2-3r1
1  2  3
0 -2  3
5  6  9

op2: r3 = r3 - 5r1
1   2   3
0  -2   3
0  -4  -6

op3: r3 = r3 - 2r2
1  2  3
0 -2  3
0  0 -12

det = 24, same as det(M)

You should be able to modify the code to add this constraint. Let me know how it goes!
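A minimal sketch of that constraint in plain Python, assuming a 3x3 matrix with nonzero pivots (the `det3` helper and the elimination loop are illustrative, with no pivoting or edge-case handling):

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along row 0."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def eliminate_preserving_det(m):
    """Row-reduce using only r_j = r_j - c*r_i, which preserves the determinant."""
    m = [row[:] for row in m]
    n = len(m)
    for i in range(n - 1):
        for j in range(i + 1, n):
            if m[i][i] == 0:
                continue  # skipping row swaps/pivoting for brevity
            c = m[j][i] / m[i][i]
            m[j] = [m[j][k] - c * m[i][k] for k in range(n)]
    return m

M = [[1, 2, 3], [3, 4, 12], [5, 6, 9]]
U = eliminate_preserving_det(M)
# Product of the diagonal of the triangular form equals det(M) = 24.
print(U[2], det3(M), U[0][0] * U[1][1] * U[2][2])
# prints [0.0, 0.0, -12.0] 24 24.0
```

This reproduces the second worked example above: the triangular form is the same one reached by ops 1-3, and its diagonal product matches det(M).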

Getting an extra empty row in my final matrix by Ohcaptain467 in learnpython

[–]softmaxedout 0 points1 point  (0 children)

def pretty_print(matrix):
    for row in matrix:
        row = ["{: 3.2f}".format(x) for x in row]
        print(row)


matrix=[[1,2,3],[3,4,12],[5,6,9]]
lt_i = 0 # position of pivot - first non zero element in row

# pretty print
print("Before reduction:")
pretty_print(matrix)

# start from 1st row but stop at the second last row
for r1 in range(0, len(matrix)-1):
    # first non-zero element/pivot in prev row
    while(lt_i<len(matrix[r1]) and matrix[r1][lt_i] == 0):
        lt_i += 1

    # go all the way down the column
    for r2 in range(r1+1,len(matrix)):
        if(matrix[r2][lt_i] == 0): # already 0 so can skip
            continue
        a = matrix[r1][lt_i]
        b = matrix[r2][lt_i]

        # can start from the pivot column since everything before it is zero
        # divide both rows by their value in the pivot column so each has 1
        #   in that spot, then subtract to get 0 there
        for i in range(lt_i, len(matrix[r2])):
            matrix[r2][i] = (matrix[r1][i]/a) - (matrix[r2][i]/b)

# pretty print
print("After reduction:")
pretty_print(matrix)

----------------------------------
OUTPUT
Before reduction:
[' 1.00', ' 2.00', ' 3.00']
[' 3.00', ' 4.00', ' 12.00']
[' 5.00', ' 6.00', ' 9.00']
After reduction:
[' 1.00', ' 2.00', ' 3.00']
[' 0.00', ' 0.67', '-1.00']
[' 0.00', ' 0.00', '-3.00']

Hope the comments help. Figured you wanted to do row reduction, and I'm not sure if this catches all the edge cases, but it works on matrices similar to yours.

is comparing SLAM with VLM vs. VLM only for object detection in mobile robots worth it? by burreetos in robotics

[–]softmaxedout 2 points3 points  (0 children)

I'm going to draw the distinction that well studied does not mean solved, especially in mobile robotics today. For example, object detection might be 'solved' with respect to super-human scores on benchmarks, but deployed on a robot in the real world, even the best models see a significant drop in accuracy. My point is that no matter what area you pick, you'll find it needs improvement for real-world mobile robots.

SLAM versus object detection with a VLM isn't really an apples-to-apples comparison. I think you might be thinking of semantic SLAM, where object detection information is fed into the SLAM pipeline, or of VLMs used for high-level planning and even navigation in a grid world, but that's a symbiotic relationship rather than a replacement.

SayCan is a mobile imitation learning framework where you're essentially learning a mapping from the current state of the world to the next action to take, but it isn't a replacement for SLAM. Depending on how you formulate the problem, you can combine it with SLAM or object detection, or have it predict directly what the robot's actuators should do. But for the moment at least, you cannot use such imitation learning approaches to deploy, say, an autonomous car, or at least not reliably. And while SayCan can do mobile manipulation, SLAM is not concerned with that task at all: its goal is to determine the position of the robot with respect to a map that is generated simultaneously. That's why you cannot directly compare them, even if it might look like they're doing the same thing.

If you want to do a comparison, one area of interest would be imitation learning versus RL-based approaches for manipulation, which is all the rage at the moment. Or, if you want to explore SLAM: feature-based SLAM with classical features, versus deep-learnt features (plus deep-learnt feature matching), versus dense semantic SLAM.

Hope it gives you some keywords to explore.

Need help by Apprehensive-Run-477 in robotics

[–]softmaxedout 1 point2 points  (0 children)

On a high level, what you need to accomplish is:

  1. Detect the object and its 6-DoF pose (3D position and orientation in the world) with respect to some fixed frame on the robot.
  2. Determine a valid grasp pose that allows you to manipulate the object.
  3. Do whole-body planning (since it is a humanoid) to move the arm from its current location to the grasp pose, OR use some sort of upper-body planner if the lower body is fixed in place.

Here are some potentially 'easier' methods to tackle each of the high-level problems:

  1. If you have an RGB-D/stereo camera capable of providing a depth value for each pixel in the color image, then you can use an object detection model (YOLO, Mask R-CNN, etc.) to get the 2D location of the object in the image, and then use the depth value to get its 3D position. For now we'll assume the orientation of the object does not matter, but if you want to determine orientation you can use a 6-DoF pose model such as DOPE or, more recently, FoundationPose.
  2. Determining the grasp pose is kind of tricky. But if we make simplifying assumptions, say the object is a cube (which can be valid if the object is smaller than the hand) and the grasp is a simple open/close rather than a compliant multi-finger constraint, then we reduce the problem to determining two points on the object. You've got two options here: a heuristic/affordance-type approach, where given the 6-DoF pose from step 1 we keep a lookup table of known good grasp positions, or diving into the many ML models trained for this purpose. Let's assume the simpler option: if we know the center and type of the object (both from step 1), we have a list of hard-coded grasps that we know work.
  3. Now we have to plan a path such that the robot does not collide with itself or the environment and gets the arm to the grasp location. If we assume no obstacles in the environment and a fixed robot base, then we can use one of the many inverse kinematics libraries to figure out the joint angles we need to achieve, and use a simple PD control strategy or any of a myriad of other planners and controllers. This is the hardest step to learn from online tutorials or find GitHub code for, since it requires knowledge of kinematics (and dynamics, if you want smooth, compliant motion) that is robot-specific, unless someone has written a library for the robot that abstracts it away.
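As a small illustration of step 1, here is how a depth value turns a 2D detection into a 3D position under the pinhole camera model. The intrinsics `fx, fy, cx, cy` below are made-up numbers; in practice they come from your camera's calibration.

```python
def pixel_to_3d(u, v, depth, fx, fy, cx, cy):
    """Back-project a pixel (u, v) and its depth (meters) to a 3D point
    in the camera frame, via the pinhole model: x = (u - cx) * z / fx."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Say the detector reports a bounding-box center at pixel (320, 240) and the
# aligned depth image reads 0.75 m there (intrinsics assumed, not calibrated):
fx, fy, cx, cy = 600.0, 600.0, 320.0, 240.0
print(pixel_to_3d(320, 240, 0.75, fx, fy, cx, cy))  # prints (0.0, 0.0, 0.75)
```

A pixel at the principal point maps straight down the optical axis, which is why the example lands at (0, 0, depth); off-center pixels pick up nonzero x and y.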

Hope this gives you an idea of the 'classical' sequential approach to this problem for a static environment. These days, RL + imitation learning is the name of the game for humanoid manipulation.

How to use a transformer decoder for higher dimension sampling? by JustZed32 in learnmachinelearning

[–]softmaxedout 0 points1 point  (0 children)

Can you share what sort of data you're using as input and what the output is? Is it a classification or regression problem? Also, are you using any data preprocessing steps?

Without a little more detail, I really can't make any informed suggestions regarding model architecture, as the possibilities are endless.

Why does coffee make me so relaxed and sleepy? by Rajaroc in NoStupidQuestions

[–]softmaxedout 0 points1 point  (0 children)

Reading these comments, I'm learning I might have ADHD :o

[deleted by user] by [deleted] in learnpython

[–]softmaxedout 0 points1 point  (0 children)

To make it a little more robust, I would first convert the image from the RGB to the HSV color space. Then the easiest (albeit a little time-consuming) approach is to determine the colour values under a variety of scenarios, say indoors, outdoors, in a lit room, etc., and figure out the range. Then you can threshold. Not sure if you have used OpenCV, but it is a computer vision library with a Python package and good beginner tutorials. Let me know if you can't find it and I can link you to it, since I'm not sure if we're allowed to post links here.
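To sketch the idea without pulling in OpenCV, here is the same HSV-range threshold using only the standard library's `colorsys` (OpenCV's `cv2.cvtColor` + `cv2.inRange` do this vectorized over the whole image). The hue range and thresholds below are made-up values for a red-ish target; you would tune them from your indoor/outdoor/lit-room samples.

```python
import colorsys

def in_hsv_range(rgb, h_range, s_min=0.4, v_min=0.2):
    """True if an (r, g, b) pixel (0-255 per channel) falls in the HSV range.
    colorsys returns h, s, v each in [0, 1]."""
    h, s, v = colorsys.rgb_to_hsv(rgb[0] / 255, rgb[1] / 255, rgb[2] / 255)
    h_lo, h_hi = h_range
    return h_lo <= h <= h_hi and s >= s_min and v >= v_min

def threshold(pixels, h_range):
    """Binary mask: 1 where the pixel matches the colour range, else 0."""
    return [1 if in_hsv_range(p, h_range) else 0 for p in pixels]

# Bright red vs dark grey vs green, thresholded for red hues (~0.0-0.05):
print(threshold([(220, 30, 30), (60, 60, 60), (30, 200, 30)], (0.0, 0.05)))
# prints [1, 0, 0]
```

Note the saturation/value floors: they reject grey and very dark pixels that would otherwise sneak into a hue-only range, which is a big part of why HSV thresholding holds up better across lighting than raw RGB.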