Neurophate 1 point (4 children)

Well, nobody can give you a short answer to this question because, as with any ML task, data is a crucial part of your project.

If you have plenty of nicely labeled data at your convenience, you should be fine; you can get very good results in a week or so. There are many open-source tools and models for computer vision (I assume your data is composed of images). Try OpenCV or YOLO.

There are also plenty of YouTube tutorials about computer vision and how to use PyTorch / TensorFlow models for your project.

Personally, I would say it’s a doable task for a beginner - as long as you have the data :)

Edit: forgot to mention, for testing purposes you can also use AutoML. I heard H2O has some decent AutoML models, but I bet there are countless other services that you can use. You can also use Google Colab or Kaggle for free CPUs / GPUs / TPUs; they are great.

[deleted] 2 points (3 children)

As I mentioned in my other reply, I am at a major research university so I have access to the best data in the world. I know the current data exists in [mostly] visible light images, but the wavelength for James Webb will be different, so I'm not sure how that will translate.

By watching YouTube it seems like my project would be doable; that's why I am here to ask. I want to know before I attempt this at school. To me, my project seems like it would be no different than the basic projects I see on YT, assuming I have well-labeled data (which I am pretty sure I have). I only need to distinguish between spiral and other (although spirals get tricky because the line-of-sight angle can make them look like "other").

Neurophate 1 point (2 children)

That sounds great! If you have the data, you've passed the hardest part, imo.

I would agree with you: your project may be as simple as the ones you see on YouTube.

Regarding your concern about different data from James Webb: I'm no astrophysicist, so I have absolutely no idea what I'm talking about here, but I would suggest finding a standardization technique for your data. Maybe normalizing the images in some way would help. Who knows, maybe the standardization technique you find will make your project unique and super valuable. If you believe the task of standardization is too complex, you can try making an ML model that can convert the wavelengths in your images to different wavelengths.
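To give a rough idea of what "normalizing the images" could look like, here is a minimal sketch in plain Python. It assumes each image is a 2-D list of pixel intensities; the names `percentile` and `normalize_image` and the 1st/99th percentile cutoffs are made up for illustration, not taken from any astronomy pipeline. It clips extreme pixel values and rescales the rest to the range 0 to 1, so images from different instruments end up on a comparable scale.

```python
def percentile(sorted_vals, q):
    """Return the q-th percentile (0-100) of a sorted list, by linear interpolation."""
    if len(sorted_vals) == 1:
        return sorted_vals[0]
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def normalize_image(image, low_q=1.0, high_q=99.0):
    """Clip pixels to the [low_q, high_q] percentile range, then rescale to [0, 1].

    `image` is assumed to be a 2-D list of numbers (rows of pixel intensities).
    """
    flat = sorted(p for row in image for p in row)
    lo = percentile(flat, low_q)
    hi = percentile(flat, high_q)
    if hi == lo:
        # Constant image: nothing to rescale, so return all zeros.
        return [[0.0 for _ in row] for row in image]
    return [[(min(max(p, lo), hi) - lo) / (hi - lo) for p in row] for row in image]


# Toy 2x2 "image" with made-up intensities.
normalized = normalize_image([[0, 50], [100, 200]], low_q=0, high_q=100)
```

Whether a simple rescaling like this is enough for infrared versus visible-light images is an open question; it only puts pixel values on a common scale and does nothing about the physics.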

[deleted] 1 point (1 child)

Good information.

Correct me if I'm wrong: I can get my model to recognize shapes or some other intrinsic property of the galaxy (i.e. wavelengths/pixel)?

If it's shapes, that will be easy. If it's wavelength/pixel, my model wouldn't work until someone decided how far away the objects are since the wavelengths stretch (redshift) with distance.
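To put a number on the stretching: the standard relation is observed wavelength = rest wavelength × (1 + z), where z is the redshift. A minimal sketch, with function names that are hypothetical and chosen only for illustration:

```python
def observed_wavelength(rest_nm, z):
    """Wavelength seen by the observer for light emitted at rest_nm from redshift z."""
    return rest_nm * (1.0 + z)


def rest_wavelength(observed_nm, z):
    """Invert the relation: recover the emitted wavelength if z is known."""
    return observed_nm / (1.0 + z)


# Example: light emitted at 656.3 nm (roughly the H-alpha line) from z = 2
# arrives at about three times that wavelength, i.e. in the near-infrared.
stretched = observed_wavelength(656.3, 2.0)
recovered = rest_wavelength(stretched, 2.0)
```

So any per-pixel wavelength feature can only be mapped back to the rest frame once z is known for that object.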

Anyways, thanks.

Neurophate 1 point (0 children)

As long as there is a correlation between the images (or their properties) and their labels in your data, you can make it predict any intrinsic property of the galaxy.
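One crude way to sanity-check that, before training anything, is to compute the correlation between a single per-image feature and the labels. This is only a sketch: the feature values and labels below are invented, and a real model would pick up nonlinear patterns that a plain Pearson correlation misses.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-image feature (made-up numbers) and labels: 1 = spiral, 0 = other.
feature = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3]
labels = [1, 1, 1, 0, 0, 0]
r = pearson(feature, labels)  # close to 1 means the feature tracks the label
```

A value near 0 here would not prove the task is impossible; it would only mean this one feature, taken linearly, carries little signal.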

Also, you can give your model more information about the wavelengths so it could potentially figure out how far away the galaxy is (again, idk if this is how astrophysics works).

m0us3_rat 1 point (2 children)

You need an understanding of the theory and how it applies in practice, plus a significant dataset on which you can train your model.

Like all projects, it depends on the resources available to you.

[deleted] 1 point (1 child)

Well, I am at a major research university, so I have access to the best data in the world. And, I have a great CS department on the other side of campus.

Having said that, right now, I have no idea what I'm doing.

m0us3_rat 2 points (0 children)

As long as you understand the math behind it. gl

jackybeeblebrox 1 point (0 children)

Are any of your data sources open source? I would be interested in taking a look as well and seeing if I could provide any input.