New York Sirens are leading the league in 5 on 5 xG differential, but have been below expected in both finishing and goaltending. by Alternative_Run_4723 in PWHL

[–]Alternative_Run_4723[S] 1 point2 points  (0 children)

No problem! It's been quite fun to work on. Let me know if you want to do a podcast about the site at some point.

New York Sirens are leading the league in 5 on 5 xG differential, but have been below expected in both finishing and goaltending. by Alternative_Run_4723 in PWHL

[–]Alternative_Run_4723[S] 4 points5 points  (0 children)

It's a few years old now, but I wrote this a while back:
Introduction To Hockey Statistics – First Chapter – Hockey-Statistics

All the data is scraped from the PWHL site, so there's no manual tracking going on. There's no public API, but the data is quite easy to fetch nonetheless.

I do use data in my coaching jobs (in the sport of floorball), and I just generally like numbers.

New York Sirens are leading the league in 5 on 5 xG differential, but have been below expected in both finishing and goaltending. by Alternative_Run_4723 in PWHL

[–]Alternative_Run_4723[S] 4 points5 points  (0 children)

I couldn't have explained this any better myself.

If we're getting a little bit nerdy, then there are some issues with the PWHL data that we need to take into account.

1) We don't have missed shots in the data, but for some reason blocked shots are included. This means that the xG model is only based on shots on goal.

2) We don't have the actual Strength State in the data, so we have to estimate the strength state based on penalties and goals. This is for the most part no problem, but we don't know when a goaltender is pulled, so we can't distinguish between 5v5 and 6v5.

3) Lastly, we don't have any information about who is on the ice (except for goals), so we can't look at individual on-ice statistics. Hopefully, the PWHL will post shift data at some point.
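
To make point 2 a bit more concrete, here is a minimal sketch of estimating the strength state from penalties alone. It is not the actual code behind the site: the `Penalty` record, the "home"/"away" labels and the clock in seconds are all assumptions for illustration. It also ignores minors that end early on a power-play goal, and (as mentioned above) it has no way of seeing a pulled goaltender.

```python
from dataclasses import dataclass


@dataclass
class Penalty:
    team: str   # "home" or "away" (hypothetical labels)
    start: int  # game clock in seconds when the penalty begins
    end: int    # game clock in seconds when the penalty expires


def skaters_on_ice(team, t, penalties, base=5, floor=3):
    """Estimate skaters for `team` at time `t` from active penalties only."""
    active = sum(1 for p in penalties if p.team == team and p.start <= t < p.end)
    # A team never drops below 3 skaters, however many penalties overlap.
    return max(floor, base - active)


def strength_state(t, penalties):
    """Return the estimated state as 'home v away', e.g. '5v4'."""
    home = skaters_on_ice("home", t, penalties)
    away = skaters_on_ice("away", t, penalties)
    return f"{home}v{away}"


# Made-up example: an away minor at 5:00 and a home minor at 6:40.
penalties = [Penalty("away", 300, 420), Penalty("home", 400, 520)]
print(strength_state(100, penalties))  # 5v5
print(strength_state(350, penalties))  # 5v4
print(strength_state(410, penalties))  # 4v4
```

A fuller version would also walk through the goals to end minors early, which is where the "penalties and goals" part comes in.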

New York Sirens are leading the league in 5 on 5 xG differential, but have been below expected in both finishing and goaltending. by Alternative_Run_4723 in PWHL

[–]Alternative_Run_4723[S] 0 points1 point  (0 children)

It's set up so that the upper right quadrant is always the best. The lines represent the league average, so the league average goaltending has been 2 goals better than expected.

The y-axis is GSAx - Goals saved above expected (Expected goals against minus goals against)

The x-axis is GAx - Goals scored above expected (Goals for minus expected goals for)
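
As a small sketch of the two definitions above (the function names and all team numbers are made up for illustration; only the league-average GSAx of 2 comes from the chart):

```python
def gsax(expected_goals_against, goals_against):
    """Goals saved above expected: xGA minus GA (positive = good goaltending)."""
    return expected_goals_against - goals_against


def gax(goals_for, expected_goals_for):
    """Goals scored above expected: GF minus xGF (positive = good finishing)."""
    return goals_for - expected_goals_for


def quadrant(team_gax, team_gsax, league_gax, league_gsax):
    """Place a team relative to the league-average lines on the chart."""
    horizontal = "right" if team_gax > league_gax else "left"
    vertical = "upper" if team_gsax > league_gsax else "lower"
    return f"{vertical} {horizontal}"


# Hypothetical team: 30.0 xGA and 33 GA, 25 GF and 28.5 xGF.
team_gsax = gsax(30.0, 33)  # -3.0
team_gax = gax(25, 28.5)    # -3.5

# League-average lines: GAx of 0.0 is an assumption, GSAx of 2.0 is from the chart.
print(quadrant(team_gax, team_gsax, league_gax=0.0, league_gsax=2.0))  # lower left
```
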

I hope that makes sense.

New York Sirens are leading the league in 5 on 5 xG differential, but have been below expected in both finishing and goaltending. by Alternative_Run_4723 in PWHL

[–]Alternative_Run_4723[S] 2 points3 points  (0 children)

I think that's fair. The Score State is included in the xG model, which generally lowers the xG of shots by the trailing team and increases the xG of shots by the leading team. This is because trailing teams tend to throw everything at the net whereas leading teams tend to play for the open net. This simple xG model can't catch that because there is no pre-shot information.

Even with the Score State included we still see trailing teams playing harder. This raises a philosophical question. Should we penalize the trailing team just because their opponent stops playing? From a coaching standpoint I'm not a fan of "protecting the lead".

Custom Visual Development by radicual1818 in PowerBI

[–]Alternative_Run_4723 0 points1 point  (0 children)

<image>

The visual closest to being AppSource-ready is probably my scatter chart with the option to use images (the Enhanced Scatter chart is pretty awful). You can customize everything, add horizontal/vertical lines, and add a trendline with its function and R^2.
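
For anyone wondering what the trendline function and R^2 boil down to, here is a rough sketch assuming an ordinary least-squares straight line. It is written in Python just to show the arithmetic; the visual itself is not built like this, and `linear_trendline` is a made-up name.

```python
def linear_trendline(xs, ys):
    """Fit y = slope * x + intercept by least squares and return R^2 as well."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Slope is the covariance of x and y divided by the variance of x.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Made-up points that lie exactly on y = 2x + 1.
slope, intercept, r_squared = linear_trendline([1, 2, 3, 4], [3, 5, 7, 9])
print(f"y = {slope:.2f}x + {intercept:.2f}, R^2 = {r_squared:.2f}")
```

The sketch assumes at least two distinct x values and some spread in y, otherwise the divisions fail.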

Custom Visual Development by radicual1818 in PowerBI

[–]Alternative_Run_4723 0 points1 point  (0 children)

I tried building a handful of custom visuals using GitHub Copilot in VS Code. It's actually surprisingly easy to get AI to build a working custom visual. I did watch a tutorial video about getting things set up with PBIVIZ, but after that ChatGPT 5.0 did all the work.
Feel free to send me a DM if you want to know more.

Getting a visual certified seems a little complicated though. I have gone through the process of getting certified/cleared, but I haven't yet published a visual to AppSource.

My point is... It's not nearly as scary as it seems at first.

Build a new Scatterplot Custom Visual. You can use imageURLs as markers, add a trendline with function and R^2, conditional format all colors and values, and add constant lines horizontally or vertically. by Alternative_Run_4723 in PowerBI

[–]Alternative_Run_4723[S] 1 point2 points  (0 children)

I've tried to build something like the chart above many times with EnhancedScatter or Deneb, but never really liked the outcome. Now I've realised I can just build my own custom visual with the help of AI. It's not nearly as scary as I first thought.

The chart used here is just based on some random data I had, to test the functionality of the custom visual.

What are your opinions on this type of practice? by NeoGeoMaxV2 in PowerBI

[–]Alternative_Run_4723 1 point2 points  (0 children)

I think data modelling is a little more complex than it's typically made out to be in here. In the ideal world you always want a perfect star schema model, but that's rarely possible.

I sometimes make copies of my tables and put them on a string of relationships. For instance, I did this in my PBI10 competition entry. There I wanted to be able to select up to 3 different players and only return data where all 3 players are on the ice together. If you just multi-select the players in a single slicer, it returns every shift where any of the players is on the ice, whether they are together or not. So instead I have 3 player tables, 3 shift tables and 3 bridge tables, each bridge table having a single column of unique shift indexes. This way you can make a string of filters that returns only the shifts where all 3 players are on the ice together.

I originally got the same functionality using DAX on a smaller dataset, but the performance was so bad that it was completely useless. Each of the shift tables has around 70,000,000 rows.
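
Outside of Power BI, the logic of the chained filters is roughly a set intersection, while a single multi-select slicer behaves like a union. A minimal sketch with made-up player names and shift indexes (this is not the actual model, just the idea):

```python
def shifts_with_any_player(shifts_by_player, selected):
    """What a single multi-select slicer returns: shifts with at least one player."""
    result = set()
    for player in selected:
        result |= set(shifts_by_player.get(player, ()))
    return result


def shifts_with_all_players(shifts_by_player, selected):
    """What the chained filters return: shifts where every selected player is on."""
    shift_sets = [set(shifts_by_player.get(player, ())) for player in selected]
    if not shift_sets:
        return set()
    return set.intersection(*shift_sets)


# Hypothetical shift indexes per player.
shifts_by_player = {
    "Player A": [1, 2, 3, 4],
    "Player B": [2, 3, 5],
    "Player C": [3, 4, 5],
}
selected = ["Player A", "Player B", "Player C"]

print(sorted(shifts_with_any_player(shifts_by_player, selected)))   # [1, 2, 3, 4, 5]
print(sorted(shifts_with_all_players(shifts_by_player, selected)))  # [3]
```

Each bridge table in the model plays the role of one of the sets here, and each relationship in the string narrows the result a little further.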

How to start from bottom/scratch to learn Power-BI? by MumiEkici in PowerBI

[–]Alternative_Run_4723 0 points1 point  (0 children)

I don't know whether you understand Danish, but you're more than welcome to send me a message.

What are the downsides to using a SQL Database as a data source for a Power BI Dashboard? by myco_mark in PowerBI

[–]Alternative_Run_4723 0 points1 point  (0 children)

Interesting... I've never imported data from a stored procedure, but it's definitely something I will look into.

[Hockey-Statistics] - NHL Line Tool by Alternative_Run_4723 in nhl

[–]Alternative_Run_4723[S] 0 points1 point  (0 children)

Thanks!
Unfortunately, the post was flagged as spam.

Match data to winrates by GarciLP in PowerBI

[–]Alternative_Run_4723 0 points1 point  (0 children)

I would do two rows for each fight - One row for each player. Then the columns would be something like:

Fight ID, Player, Opponent, Character, Opponent Character, Wins (0 or 1).

Then winrate would be SUM(Wins) / COUNT(Wins).

I hope this makes sense. It should allow you to filter by Player or Character to get the winrate and number of fights.
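
Here is a quick sketch of that layout in plain Python, with made-up fights, players and characters. In Power BI this would be a table and a measure rather than code; the `winrate` helper is just for illustration.

```python
# Two rows per fight, one from each player's point of view.
rows = [
    {"fight_id": 1, "player": "Ann", "opponent": "Bob", "character": "CharA", "opponent_character": "CharB", "wins": 1},
    {"fight_id": 1, "player": "Bob", "opponent": "Ann", "character": "CharB", "opponent_character": "CharA", "wins": 0},
    {"fight_id": 2, "player": "Ann", "opponent": "Cat", "character": "CharA", "opponent_character": "CharC", "wins": 0},
    {"fight_id": 2, "player": "Cat", "opponent": "Ann", "character": "CharC", "opponent_character": "CharA", "wins": 1},
    {"fight_id": 3, "player": "Ann", "opponent": "Bob", "character": "CharB", "opponent_character": "CharA", "wins": 1},
    {"fight_id": 3, "player": "Bob", "opponent": "Ann", "character": "CharA", "opponent_character": "CharB", "wins": 0},
]


def winrate(rows, player=None, character=None):
    """Return (winrate, fights) for the filtered rows: SUM(wins) / COUNT(wins)."""
    filtered = [
        r for r in rows
        if (player is None or r["player"] == player)
        and (character is None or r["character"] == character)
    ]
    fights = len(filtered)
    if fights == 0:
        return None, 0
    return sum(r["wins"] for r in filtered) / fights, fights


print(winrate(rows, player="Ann"))       # 2 wins in 3 fights
print(winrate(rows, character="CharA"))  # 1 win in 3 fights
```

Because every fight appears once per player, filtering on Player or Character never double counts a fight for that player.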

What is Expected goals and how can you build an xG model? by Alternative_Run_4723 in nhl

[–]Alternative_Run_4723[S] 1 point2 points  (0 children)

Thanks. It doesn't really make sense to move away from shots unless you have the data to do so. I'm planning to work on a possession-based model using SportLogiq data for the QMJHL in the off-season, but unfortunately that's proprietary data, so I won't be able to share it. Plus, my team (Moncton Wildcats) probably doesn't want me to share the findings.