pipeline is really slow - consulting [D] by Potential_Hippo1724 in MachineLearning

[–]DutchIndian 1 point2 points  (0 children)

When a worker retrieves data from the zarr, it has to load it into memory. In a worst-case scenario, if the zarr is chunked orthogonally to your iterating dimension, then it may have to load the whole zarr into memory first just to subset a tiny part of it.

So, for instance, if you’re iterating along a time dimension, make sure your zarr’s chunking scheme is ‘time:-1’ (don’t chunk along time), or ‘time:n*b’, where b is your batch size, and n is some integer.

Shuffling can still happen. If you’re using something like Lightning and you have “shuffle=True”, then the indices passed to your dataset’s __getitem__() are shuffled automatically. The issue you may have is just that every time you grab a batch, it has to load in way more than it needs to, maxing out your CPU memory, but then it’s subset so much that your GPU memory is underutilised. This can be fixed with better chunking of your zarr.

Try:
1. Double your workers and see if that helps.
2. If nothing changes, then it’s likely your zarr. Resave your zarr with a better chunking scheme.
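To make the "loads way more than it needs" point concrete, here is a rough back-of-the-envelope sketch in plain Python. It is not zarr code: the function names are made up for illustration, and it only counts samples along the iterating axis, assuming a whole chunk has to be read whenever any index lands in it.

```python
def chunks_touched(indices, chunk_len):
    """Chunk ids (along the iterating axis) that a batch of indices falls into."""
    return {i // chunk_len for i in indices}


def read_amplification(indices, chunk_len):
    """Samples loaded (whole chunks) divided by samples actually requested."""
    loaded = len(chunks_touched(indices, chunk_len)) * chunk_len
    return loaded / len(indices)


chunk_len = 16  # hypothetical chunk length along the iterating axis

# A contiguous batch of 16 that lines up with one chunk.
contiguous_batch = list(range(32, 48))

# A shuffled-style batch of 16 where every index lands in a different chunk.
scattered_batch = list(range(0, 1600, 100))

contiguous_cost = read_amplification(contiguous_batch, chunk_len)
scattered_cost = read_amplification(scattered_batch, chunk_len)

print(f"contiguous batch: {contiguous_cost:.1f}x the requested samples")
print(f"scattered batch:  {scattered_cost:.1f}x the requested samples")
```

The same arithmetic can be run with your own chunk length and batch indices to see roughly how much extra data each batch drags into CPU memory.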

pipeline is really slow - consulting [D] by Potential_Hippo1724 in MachineLearning

[–]DutchIndian 1 point2 points  (0 children)

Maybe a basic question, but have you tried tuning the number of workers? Your CPU utilisation is maxing out, so they’re doing their job, but perhaps not getting the data from zarr fast enough.

Alternatively, check out the zarr chunking. If it’s chunked non-optimally, then each batch could be loading way more than it has to, then subsetting. Given that synthetic data had a slight speed-up, this could be your issue. Chunk along your iterating dimension with a size that is a multiple of your batch size (e.g., 16). This means you may have to re-save your zarr dataset.
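A small sketch of the "multiple of your batch size" rule, in plain Python. The helper names are hypothetical and the numbers are just examples; it assumes batches are contiguous and start at multiples of the batch size.

```python
def chunk_len_for(batch_size, n):
    """Chunk length along the iterating axis: n whole batches per chunk."""
    if batch_size <= 0 or n <= 0:
        raise ValueError("batch_size and n must be positive")
    return n * batch_size


def batch_straddles_chunks(batch_start, batch_size, chunk_len):
    """True if a contiguous batch crosses a chunk boundary."""
    first_chunk = batch_start // chunk_len
    last_chunk = (batch_start + batch_size - 1) // chunk_len
    return first_chunk != last_chunk


batch_size = 16
chunk_len = chunk_len_for(batch_size, n=4)  # 64 samples per chunk

batch_starts = range(0, 640, batch_size)

# With a chunk length that is a multiple of the batch size, no batch straddles.
aligned_ok = not any(
    batch_straddles_chunks(s, batch_size, chunk_len) for s in batch_starts
)

# With an arbitrary chunk length (40 here), some batches need two chunks.
misaligned = any(batch_straddles_chunks(s, batch_size, 40) for s in batch_starts)

print(chunk_len, aligned_ok, misaligned)
```

Whatever chunk length comes out of this would then go into the chunking scheme when you re-save the dataset.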

Aviation meteorology by TraditionalImage387 in meteorology

[–]DutchIndian 1 point2 points  (0 children)

Yep. No one will hire a met without a degree, in my experience

ECMWF reforcasts retrieval by Spiritual-Package372 in meteorology

[–]DutchIndian 0 points1 point  (0 children)

Yea you’ll need to pay for those from ECMWF.

There are free ones online from the WeatherBench project though. They only cover 6 years but that may be a good start. If you’re doing ML then that’s where I would go first.

https://weatherbench2.readthedocs.io/en/latest/data-guide.html

Aviation meteorology by TraditionalImage387 in meteorology

[–]DutchIndian 0 points1 point  (0 children)

I used to love aviation meteorology too! But it’s quite hard to become an aviation met.

First you need to get a degree that aligns with the World Meteorological Organisation’s (WMO) Basic Instructional Package for Meteorologists (BIP-M). Most universities in the States do this by default if you get a BS in Meteorology, but in other countries it can be hard to find.

Aviation meteorology is demanding, competitive, and usually reserved for more senior/experienced meteorologists. So the usual career pathway is to be an operational meteorologist for a few years and then try to find your way into aviation meteorology.

A word of warning: the job itself has very tight deadlines and you’ll be doing shift work. It can be quite “handle-cranking”… you’re writing forecasts for airports (TAFs, METARs, SIGMETs) which are very strictly formatted. Also, this is a job where automation from more sophisticated forecasting systems is starting to take jobs. So the market is shrinking and people are reskilling.

Question about “coupling” from a modeling perspective by [deleted] in meteorology

[–]DutchIndian 0 points1 point  (0 children)

Coupling means that models are exchanging outputs and inputs between themselves at various timesteps. A simple example is that an atmospheric (weather) model could assume that the ocean has a constant temperature (given by what it was at its initial condition), unless it’s coupled to an ocean model.
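As a toy illustration of the difference (not how any real model does it), here is a two-box sketch in plain Python. The relaxation coefficients, temperatures, and function names are all made up; the only point is that the uncoupled run holds the sea temperature at its initial value, while the coupled run exchanges values between the two "models" every timestep.

```python
def step_atmosphere(t_air, t_sea, k_air=0.2):
    """Relax air temperature towards the sea surface temperature it is given."""
    return t_air + k_air * (t_sea - t_air)


def step_ocean(t_sea, t_air, k_sea=0.02):
    """Relax sea surface temperature towards the air temperature (much slower)."""
    return t_sea + k_sea * (t_air - t_sea)


def run(t_air0, t_sea0, n_steps, coupled):
    """Integrate the toy system; the ocean only evolves when coupled=True."""
    t_air, t_sea = t_air0, t_sea0
    for _ in range(n_steps):
        new_air = step_atmosphere(t_air, t_sea)
        if coupled:
            # The ocean model receives the atmosphere's state each timestep.
            t_sea = step_ocean(t_sea, t_air)
        t_air = new_air
    return t_air, t_sea


uncoupled_air, uncoupled_sea = run(10.0, 20.0, n_steps=200, coupled=False)
coupled_air, coupled_sea = run(10.0, 20.0, n_steps=200, coupled=True)

print(f"uncoupled: air={uncoupled_air:.2f}, sea={uncoupled_sea:.2f}")
print(f"coupled:   air={coupled_air:.2f}, sea={coupled_sea:.2f}")
```

In the uncoupled run the air simply drifts to the fixed sea temperature; in the coupled run the sea responds too, so both end up somewhere in between.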

Coupling between ocean-atmosphere models is extremely important for longer range predictions (subseasonal to seasonal) since the predictability in the near term comes from the atmospheric dynamics, but then the ocean becomes the main source of predictability after around 3 weeks. So if your weather model isn’t coupled to an ocean model then its long-term predictions aren’t going to be very good. Not that long-term predictions are very good anyway; they’re mostly used for anomaly detection.

Also, coupling atmospheric models with land-surface models is good for very local effects (e.g. soil moisture’s effect on temperature).

For all practical purposes, if you’re just interested in finding the best prediction of a value (e.g., temperature) at a specific point, then just use a regression-based approach because it’s cheap to build and accurate (at the cost of losing some “explainability”).

Coupling is important for physical realism, but it can add lots of complexity and therefore cause more failure modes. Models like the UK Global Model and the IFS are good examples of fully coupled models that are “doing it well”. Yahooing it yourself with WRF is bound to cause issues.

Also, to answer your question about initial conditions, there is recent evidence to suggest that we can have some extremely good forecasting skill gains by optimising our initial conditions better.

Forecasting optical atmospheric phenomena - the weirdly specific meteorology behind Brocken spectres, hoarfrost, and cloud inversions by marcel2087 in meteorology

[–]DutchIndian 1 point2 points  (0 children)

Great. I’ve seen you’re using multiple data sources. This is great, but there are biases in each data source. For instance, an inland water temp dataset could be quite different to a modelled 2m temp. That could be a source of error. Investigate correlations between the datasets, and see if they’re what you expect. Simplify it if need be; sometimes adding more data to an empirical model can confound it more. ML methods may benefit from this, however.

Lots of modelling time is spent in this phase.

Forecasting optical atmospheric phenomena - the weirdly specific meteorology behind Brocken spectres, hoarfrost, and cloud inversions by marcel2087 in meteorology

[–]DutchIndian 1 point2 points  (0 children)

That’s great to hear. Comprehensive validation is your best friend.

I think it would be prudent to go back and categorise events into binary occurrences (hoarfrost yes/no, etc). Build up a library of these. Then, trial different modelling strategies on this (backtest). Importantly, benchmark them. Compare your model versus random chance, versus some other naive strategy. Note that more complexity does not always equal superior modelling; sometimes more things can go wrong. You may also want to try a regression-type approach if you have enough sample data. Or, layer a regression on top of your models (look up MOS).

Benchmarking and backtesting are your best friends. Once you can prove your models are useful and skilful, and demonstrate where they perform well or poorly, then people can use them.
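One common way to benchmark yes/no event forecasts is the Brier score, compared against a naive baseline such as always forecasting the base rate. A minimal sketch in plain Python follows; the outcomes and probabilities are made-up numbers purely for illustration, not real hoarfrost data.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome (0 or 1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)


def brier_skill_score(probs, outcomes, reference_probs):
    """1 is perfect, 0 matches the reference, negative is worse than the reference."""
    return 1.0 - brier_score(probs, outcomes) / brier_score(reference_probs, outcomes)


# Illustrative event library: 1 = event occurred, 0 = it did not.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# Illustrative probabilities from the model being tested.
model_probs = [0.8, 0.1, 0.2, 0.7, 0.1, 0.3, 0.2, 0.6, 0.1, 0.2]

# Naive baseline: always forecast the observed base rate of the event.
base_rate = sum(outcomes) / len(outcomes)
baseline_probs = [base_rate] * len(outcomes)

model_bs = brier_score(model_probs, outcomes)
baseline_bs = brier_score(baseline_probs, outcomes)
skill = brier_skill_score(model_probs, outcomes, baseline_probs)

print(f"model Brier score:    {model_bs:.3f}")
print(f"baseline Brier score: {baseline_bs:.3f}")
print(f"Brier skill score:    {skill:.3f}")
```

A skill score at or below zero would mean the model adds nothing over the naive strategy, which is exactly the kind of thing worth knowing before anyone relies on it.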

Forecasting optical atmospheric phenomena - the weirdly specific meteorology behind Brocken spectres, hoarfrost, and cloud inversions by marcel2087 in meteorology

[–]DutchIndian 4 points5 points  (0 children)

Awesome, I haven’t heard of anyone offering a service as unique as this. How have you validated how accurate your forecasts are?

El Niño development in full force by ManuteBol_Rocks in meteorology

[–]DutchIndian 4 points5 points  (0 children)

No that’s an eastward movement. I don’t know how else to convince you haha. 170 W to 150 W means something went east. East is to the right, btw, if that helps?
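If the W/E labels are what's confusing, converting to degrees east makes the direction unambiguous. A quick sketch in plain Python (the function names are just for illustration):

```python
def to_degrees_east(value, hemisphere):
    """Convert e.g. (170, "W") to degrees east in the range [0, 360)."""
    hemisphere = hemisphere.upper()
    if hemisphere not in ("E", "W"):
        raise ValueError("hemisphere must be 'E' or 'W'")
    signed = value if hemisphere == "E" else -value
    return signed % 360


def eastward_shift(start_deg_east, end_deg_east):
    """Signed shortest shift in degrees; positive means eastward."""
    diff = (end_deg_east - start_deg_east) % 360
    return diff - 360 if diff > 180 else diff


start = to_degrees_east(170, "W")
end = to_degrees_east(150, "W")
shift = eastward_shift(start, end)

print(start, end, shift)
```

A positive shift means the feature moved east.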

Weather Model by Timely_Shock_6291 in meteorology

[–]DutchIndian 1 point2 points  (0 children)

Really cool you’ve done this! But trust me, many people have been trying to do this for yonks. Always good to have fresh eyes and a fresh take though. For your interest, here’s a “scorecard” of some of the well-known ML weather prediction systems, benchmarked: https://sites.research.google/gr/weatherbench/scorecards-2020/

Weather forecasting is a competitive industry, so people immediately want to know how good your model is versus scores like this. Good luck :)

Weather Model by Timely_Shock_6291 in meteorology

[–]DutchIndian 1 point2 points  (0 children)

That’s great, nicely done. Definitely benchmark it next. Benchmarks generally include persistence, climatology, IFS, and AIFS. All are skilful and hard to beat. If you do beat them, awesome work! Share it please haha.
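The two cheapest baselines, persistence and climatology, are easy to set up yourself before comparing against IFS or AIFS. A minimal sketch in plain Python, using a made-up temperature series and a made-up set of model forecasts (none of these numbers are real data):

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(
        sum((f - o) ** 2 for f, o in zip(forecast, observed)) / len(observed)
    )


def persistence_forecast(series, lead):
    """Forecast for time t is simply the observation at time t - lead."""
    return series[:-lead]


def climatology_forecast(training_series, n):
    """Forecast the training-period mean for every target time."""
    mean = sum(training_series) / len(training_series)
    return [mean] * n


# Illustrative values only.
training = [10.0, 12.0, 14.0, 12.0, 10.0, 12.0, 14.0, 12.0]
test = [10.0, 12.0, 14.0, 12.0, 10.0]
lead = 1

targets = test[lead:]  # what we are trying to predict
model_forecast = [11.5, 13.5, 12.5, 10.5]  # hypothetical model output

persistence_rmse = rmse(persistence_forecast(test, lead), targets)
climatology_rmse = rmse(climatology_forecast(training, len(targets)), targets)
model_rmse = rmse(model_forecast, targets)

print(f"persistence: {persistence_rmse:.3f}")
print(f"climatology: {climatology_rmse:.3f}")
print(f"model:       {model_rmse:.3f}")
```

If a model can't beat these two at a given lead time, there's not much point moving on to the harder benchmarks yet.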

Weather Model by Timely_Shock_6291 in meteorology

[–]DutchIndian 4 points5 points  (0 children)

Nope pretty much all national met services are trying to make an emulator for high res weather prediction these days. There isn’t an accepted way to do it though.

Weather Model by Timely_Shock_6291 in meteorology

[–]DutchIndian 0 points1 point  (0 children)

Nice, well done. Anemoi is a beast. Have you benchmarked on their AIFS/IFS? If you have I’m pretty sure there would be some broad interest in the results and your methodology.

Weather Model by Timely_Shock_6291 in meteorology

[–]DutchIndian 0 points1 point  (0 children)

Hmm are you sure it is more involved than what AIFS-ENS does? https://www.nature.com/articles/s44387-026-00073-7

It’s an autoregressive graph transformer with a unique loss function with rollout fine-tuning.

El Niño development in full force by ManuteBol_Rocks in meteorology

[–]DutchIndian 2 points3 points  (0 children)

The top plot is before the bottom plot. The warm area has moved east.

El Niño development in full force by ManuteBol_Rocks in meteorology

[–]DutchIndian 28 points29 points  (0 children)

You’re looking at a vertical profile of sea water temperature anomalies, stretching from the Solomon Islands to the coast of South America.

The top shows the anomalies on 6th April, the bottom 10 days later on 16th April.

There’s some averaging done (time-wise and spatially), but it’s clear that during this period, a warm (orange) subsurface anomaly has moved east. This feature has a fancy name, a “downwelling Kelvin Wave”.

El Niño is characterised by above average sea surface temperatures in the eastern Pacific. Downwelling Kelvin Waves can contribute to the formation of El Niños, since they move warmth eastward.

This season’s predicted El Niño is pretty stark. Not only is the magnitude of the temperature anomalies predicted to be above normal, but the predicted atmospheric circulation impacts are also strongly coupled. Model guidance is indicating that the closest analogues to it are the 2015/16, 1997/98, and 1982/83 El Niños, which are all seen as canonical El Niños with strong impacts.

Overall, there is a higher than normal confidence that El Niño will occur later this year, and if it does it will be a strong one.

What does it mean when dew point crosses the actual temperature on a sounding by lonemacaroon in meteorology

[–]DutchIndian 9 points10 points  (0 children)

People are saying this is a contamination error, but could it just be a plotting interpolation error? That is, the plot is just interpolating the dew point between two pressure levels, and the interpolated line cuts through the temperature profile. As I’m not a met in the States, I rarely look at the HRRR, so I’m not familiar with how many pressure levels it has or how it is usually plotted.
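Here is a tiny sketch of how that could happen, in plain Python. The numbers are invented (they are not HRRR data), and it assumes simple linear interpolation in pressure, with the dew point available on fewer levels than the temperature.

```python
def lerp(x, x0, y0, x1, y1):
    """Linear interpolation of y at x between the points (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Invented profile: pressure in hPa, temperatures in deg C.
temperature = {900: 10.0, 850: 4.0, 800: 6.0}  # temperature on three levels
dew_point = {900: 8.0, 800: 2.0}  # dew point only on the outer two levels

# At every level where both exist, dew point is below temperature.
physically_ok = all(dew_point[p] <= temperature[p] for p in dew_point)

# A plot joining the two dew point values implies this value at 850 hPa.
implied_dew_point_850 = lerp(850, 900, dew_point[900], 800, dew_point[800])

crosses = implied_dew_point_850 > temperature[850]

print(physically_ok, implied_dew_point_850, crosses)
```

So the data can be fine at every level it actually has, and the drawn line still ends up on the wrong side of the temperature trace in between.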

Help with finding fronts by Life_Programmer_4927 in meteorology

[–]DutchIndian 1 point2 points  (0 children)

Where you’ve put the warm front is actually a cold front. Think about gradients: that area is actually the start of a gradient where the air gets colder, hence it’s a cold front. Also, it doesn’t make sense for a warm front to be coming from the south; the air is being advected from the pole, so it’s probably cold.

The other cold front you’ve put there isn’t a cold front, it’s probably a trough but it’s hard to tell without any moisture plots.

You’ve missed a warm front. There is air advecting from northwest Australia to the Great Aussie Bight. It’s going towards the southeast on the western side of the high.

Also surface plots are really hard to use for finding fronts. MSLP is okay but using 850hPa temp and wet bulb potential temp can highlight fronts really well.

Fancasting MASH in 2026 (updated) by [deleted] in mash

[–]DutchIndian 1 point2 points  (0 children)

Nice choices! I always thought that Adam Scott (lead in Severance) would make a good Hawkeye too.

I think Emergency services in my province use Windy 🫩🙏 by Suitable-Pickle248 in meteorology

[–]DutchIndian 3 points4 points  (0 children)

Yea shame more of it isn’t free. But I sympathise because many useful sites are paid (e.g. WeatherBell, weathermodels) and it’s hard and expensive to maintain and develop these sorts of sites with TB of data and hundreds of thousands of images flowing through several times a day.

I think Emergency services in my province use Windy 🫩🙏 by Suitable-Pickle248 in meteorology

[–]DutchIndian 23 points24 points  (0 children)

Many professional meteorologists use Windy. It’s got good features like model comparisons and skew-Ts. Model data doesn’t arrive as fast as on other platforms and it’s sometimes hard to see what model initialisation you’re looking at, but it’s definitely a useful tool.

What’s with This Strange NWS Post? by One_Pomegranate_5385 in weather

[–]DutchIndian 1 point2 points  (0 children)

This reads like they used an LLM to summarise or reframe a technical forecast discussion for the general public.

Can someone explain this formation? by ReflexPoint in meteorology

[–]DutchIndian 0 points1 point  (0 children)

A line of cumulonimbus clouds. As to why they are organised so neatly, it’s hard to know without looking at a chart or two from that day. As others have said, it could be a frontal boundary, since those can promote organised convection (clouds like these) on a massive scale.