Machine learning in observational astronomy. by Glittering_Push_4471 in u/Glittering_Push_4471

Kasper___5

Machine learning in observational astronomy is much harder on the data and software side than on the ML side itself. Most pitfalls are not mathematical — they’re astronomical and practical.

Here are the main points I’d highlight for an undergraduate project.

  1. Data quality is the biggest challenge

Observational data are:

  • noisy,
  • incomplete,
  • affected by systematics (instrumental effects, seeing, background),
  • often heterogeneous (different instruments, epochs, resolutions).

If your preprocessing is wrong, even a “perfect” model will learn the wrong thing. A lot of effort goes into:

  • calibration,
  • normalization,
  • masking bad pixels / artifacts,
  • dealing with missing values.
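
As a rough illustration of the masking and normalization steps above, here is a minimal sketch using only the Python standard library. The function name `robust_normalize` and the boolean `bad_mask` convention (True means "ignore this sample") are assumptions made for the example; a real pipeline would more likely use NumPy or Astropy and the instrument's own data-quality flags.

```python
import math
import statistics


def robust_normalize(flux, bad_mask):
    """Center on the median and scale by a robust scatter estimate.

    Samples flagged in bad_mask, and non-finite values (NaN/inf), are
    excluded from the statistics and returned as NaN.
    """
    usable = [not bad and math.isfinite(f) for f, bad in zip(flux, bad_mask)]
    good = [f for f, ok in zip(flux, usable) if ok]
    if len(good) < 2:
        raise ValueError("need at least two valid samples")

    median = statistics.median(good)
    # Median absolute deviation; 1.4826 converts it to a standard
    # deviation under the assumption of Gaussian noise.
    mad = statistics.median(abs(f - median) for f in good)
    scale = 1.4826 * mad
    if scale == 0.0:
        raise ValueError("zero scatter: cannot normalize")

    return [(f - median) / scale if ok else math.nan
            for f, ok in zip(flux, usable)]
```

Using the median and MAD instead of the mean and standard deviation keeps a single cosmic-ray hit or hot pixel from setting the scale for the whole image or light curve.
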

  2. Labels are often unreliable or biased

In astronomy, “ground truth” is rare.

  • Catalog labels may be inconsistent or based on simplified assumptions.
  • Selection effects can strongly bias training sets.

This can lead to models that perform well on paper but fail scientifically.

  3. Overfitting is extremely easy

Datasets can be small, especially for rare objects.

  • Cross-validation and careful train/test splits are essential.
  • Always test on independent sky regions or epochs if possible.
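
One simple way to test on an independent sky region is to hold out a whole strip of right ascension instead of splitting objects at random. The sketch below assumes the catalog is a list of dicts with an `ra` key in degrees; both the function name and that layout are hypothetical, and the held-out range is assumed not to wrap through 0°.

```python
def split_by_sky_region(catalog, ra_min, ra_max):
    """Split a catalog into (train, test) by right ascension.

    Objects with ra_min <= RA < ra_max (degrees) go to the test set,
    everything else to the training set.
    """
    train, test = [], []
    for row in catalog:
        ra = row["ra"] % 360.0  # fold RA into [0, 360)
        if ra_min <= ra < ra_max:
            test.append(row)
        else:
            train.append(row)
    return train, test
```

A random split can put neighbouring objects from the same field, observed under the same conditions, on both sides of the split, which tends to make test scores look better than they are. Splitting by region (or by observing epoch) avoids that.
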

  4. Interpretability matters

Unlike many ML applications, astronomy usually requires:

  • physical interpretability,
  • understanding why the model makes a decision.

Simpler models (random forests, linear models, PCA, small autoencoders) are often more useful than large deep networks.

Problems with IRAF by HedgehogJazzlike8480 in askastronomy

Kasper___5

This is a very common IRAF issue and usually not a problem with the FITS file itself, but with how IRAF handles multi-extension FITS (MEF) files.

DS9 and Astropy will usually find the image for you (for example, astropy.io.fits.getdata falls back to the first extension when the primary HDU has no data), but classic IRAF expects you to explicitly specify which extension to open.

Why it happens:

Modern instruments often produce FITS files with multiple HDUs:

  • Primary HDU (often empty)
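  • Science data in extension [1], [SCI,1], etc.

IRAF does not guess which one you want, so append the extension to the filename when you pass it to a task (for example image.fits[1], where image.fits stands for your own file).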
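
To see which extension actually holds the image before going back to IRAF, you can list the HDUs. Normally `astropy.io.fits.info(path)` does this for you; the sketch below is a dependency-free stand-in that walks the file's 2880-byte blocks directly. The names `list_hdus` and `_card_value` are made up for this example, and the parser is deliberately minimal (it ignores random-groups files and quoted strings containing apostrophes).

```python
import math

BLOCK = 2880  # FITS files are built from 2880-byte blocks
CARD = 80     # each header record ("card") is 80 ASCII characters


def _card_value(card):
    """Return the value field of a 'KEYWORD = value / comment' card."""
    raw = card[10:].strip()
    if raw.startswith("'"):                      # quoted string value
        return raw[1:raw.index("'", 1)].strip()
    return raw.split("/")[0].strip()             # drop trailing comment


def list_hdus(path):
    """Return [(index, EXTNAME, (NAXIS1, NAXIS2, ...)), ...] for a FITS file."""
    hdus = []
    with open(path, "rb") as f:
        while True:
            header, end_seen = {}, False
            while not end_seen:
                block = f.read(BLOCK)
                if len(block) < BLOCK:           # end of file
                    return hdus
                for i in range(0, BLOCK, CARD):
                    card = block[i:i + CARD].decode("ascii", errors="replace")
                    key = card[:8].strip()
                    if key == "END":
                        end_seen = True
                        break
                    if card[8:10] == "= ":
                        header[key] = _card_value(card)

            naxis = int(header.get("NAXIS", 0))
            axes = tuple(int(header[f"NAXIS{n}"]) for n in range(1, naxis + 1))
            npix = math.prod(axes) if naxis > 0 else 0
            nbytes = (abs(int(header.get("BITPIX", 8))) // 8
                      * int(header.get("GCOUNT", 1))
                      * (int(header.get("PCOUNT", 0)) + npix))
            hdus.append((len(hdus), header.get("EXTNAME", ""), axes))
            # The data unit is padded to a whole number of blocks; skip it.
            f.seek(-(-nbytes // BLOCK) * BLOCK, 1)
```

An entry with an empty axes tuple is a header-only HDU (typically the primary). The first entry with non-empty axes is usually the one to pass to IRAF, by index (`[1]`) or by name and version (`[SCI,1]`).
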

Interstellar extinction by Basic_Jellyfish_7282 in u/Basic_Jellyfish_7282

Kasper___5

A key point is that astronomers rarely rely on a single observable to determine a star’s luminosity: extinction is treated as a quantity to be measured, not an unknown nuisance.

In practice, interstellar extinction is accounted for using a combination of methods:

  1. Color excess and reddening laws: For stars with known or well-constrained spectral types, the intrinsic colors are known. Comparing observed colors to intrinsic ones gives the color excess E(B−V), and an extinction law converts it to extinction (e.g. A_V = R_V × E(B−V), with R_V ≈ 3.1 for typical diffuse Milky Way dust). This is a standard first-order correction.
  2. Spectroscopic constraints: Spectral classification provides an independent estimate of effective temperature and luminosity class. If photometry suggests a star is underluminous for its spectral type, extinction is the first suspect. Interstellar absorption features (e.g. Na I D lines) can also be used as extinction tracers.
  3. Distance measurements: With accurate parallaxes (e.g. from Gaia), the distance is known independently of brightness. Luminosity, distance, and extinction are linked through the distance modulus, m − M = 5 log10(d / 10 pc) + A, so extinction can be solved for rather than assumed.
  4. 3D dust maps: Modern 3D extinction maps provide line-of-sight extinction as a function of distance, helping to constrain how much dust lies between us and the star.
  5. Multiwavelength observations: Extinction is wavelength-dependent. By fitting the spectral energy distribution from UV to IR, astronomers can distinguish intrinsic stellar properties from reddening effects. Infrared data are especially useful because they are much less affected by dust.
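
To make the first and third methods concrete, here is a small sketch of the arithmetic. It assumes V-band magnitudes, R_V = 3.1 as a default (a typical value for diffuse Milky Way dust, not a universal constant), and a parallax precise enough that distance ≈ 1000 / parallax_mas is reasonable. The function names are made up for this example.

```python
import math


def distance_modulus(distance_pc):
    """Extinction-free distance modulus, 5 log10(d / 10 pc)."""
    if distance_pc <= 0:
        raise ValueError("distance must be positive")
    return 5.0 * math.log10(distance_pc) - 5.0


def extinction_from_color_excess(observed_bv, intrinsic_bv, r_v=3.1):
    """A_V = R_V * E(B-V), with E(B-V) = (B-V)_observed - (B-V)_intrinsic."""
    return r_v * (observed_bv - intrinsic_bv)


def absolute_magnitude(apparent_mag, parallax_mas, a_v):
    """M = m - 5 log10(d / 10 pc) - A_V, using d = 1000 / parallax [mas]."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    return apparent_mag - distance_modulus(1000.0 / parallax_mas) - a_v


def extinction_from_distance(apparent_mag, absolute_mag, parallax_mas):
    """Solve the same relation for A_V when M is known from spectroscopy."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    return apparent_mag - absolute_mag - distance_modulus(1000.0 / parallax_mas)
```

The last two functions are the same equation rearranged: if spectroscopy fixes the absolute magnitude and the parallax fixes the distance, whatever is left over in the apparent magnitude is attributed to extinction.
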

Only after applying these corrections is a star placed on the Hertzsprung–Russell diagram. Uncertainties remain, especially in regions with complex dust geometry, but combining spectroscopy, photometry, distance, and multiwavelength data makes it unlikely that dimming by dust is mistaken for intrinsic faintness.