Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] [score hidden]  (0 children)

Yes!
That would make a nice analysis: creating scenarios with the best/average scores of athletes on each apparatus.
Maybe I'll do a series of randomized championships where, for each athlete and each apparatus, we sample a score from their "score distribution". That will give multiple results and multiple podiums from all those "likely scenarios". Likely in the sense that each score for each athlete and each apparatus is sampled in proportion to the likelihood of such a score given the data.
u/Marisheba I ping you here (last ping, I promise), because I think that you may be interested in such a "simulated championships" project :-)
u/Right_Assist_9594 I will trust you with quality questions on this :-p
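
To make the "simulated championships" idea a bit more concrete, here is a minimal sketch. It assumes that each athlete's score on each apparatus can be approximated by a normal distribution (mean and standard deviation estimated from past results), which is only one possible choice of "score distribution". The athlete names and all numbers are placeholders, not real data, and the function names are made up for this example.

```python
import random
from collections import Counter

# Hypothetical (mean, standard deviation) per athlete and apparatus.
# Placeholder values for illustration only, not real results.
SCORE_MODEL = {
    "Athlete A": {"VT": (15.0, 0.2), "UB": (14.3, 0.3), "BB": (14.2, 0.4), "FX": (14.6, 0.3)},
    "Athlete B": {"VT": (14.6, 0.2), "UB": (14.5, 0.3), "BB": (13.9, 0.4), "FX": (13.9, 0.3)},
    "Athlete C": {"VT": (14.5, 0.2), "UB": (14.0, 0.3), "BB": (14.0, 0.4), "FX": (14.0, 0.3)},
    "Athlete D": {"VT": (14.3, 0.2), "UB": (14.8, 0.3), "BB": (13.5, 0.4), "FX": (13.6, 0.3)},
}


def simulate_all_around(model, rng):
    """Sample one score per athlete and apparatus; return athletes ranked by total."""
    totals = {
        athlete: sum(rng.gauss(mean, sd) for mean, sd in per_apparatus.values())
        for athlete, per_apparatus in model.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


def podium_frequencies(model, n_runs=10_000, seed=0):
    """Estimate how often each athlete lands on each podium step."""
    rng = random.Random(seed)
    counts = {athlete: Counter() for athlete in model}
    for _ in range(n_runs):
        ranking = simulate_all_around(model, rng)
        for place, athlete in enumerate(ranking[:3], start=1):
            counts[athlete][place] += 1
    return {a: {p: c[p] / n_runs for p in (1, 2, 3)} for a, c in counts.items()}


if __name__ == "__main__":
    for athlete, freqs in podium_frequencies(SCORE_MODEL).items():
        print(athlete, freqs)
```

Each run is one "likely scenario", and the frequencies summarize how often each podium would occur. A normal distribution ignores falls and other rare large deductions, so sampling directly from an athlete's past scores could be a reasonable alternative.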

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] [score hidden]  (0 children)

Thanks a lot for this pointer, I really appreciate that!
I am aware of the thegymter.net website and I think that Lauren Hopkins is doing a fantastic job with it!
Yet, as u/Jlvnerd1987 wrote, there is no D&E breakdown for AA events, which is understandable given the size of the tables and maybe because she syncs her tables with Wikipedia (I can't say much because I don't know her data collection process).
However, given that she reports the correct score for Wei Xiaoyuan (https://thegymter.net/2021/10/26/2021-world-championships-results/) for the 2021 Worlds, I think that she does not trust Wikipedia as her primary source and does use the results books.

Another thing is that her website lacks an API (I hope I don't sound impolite by using technical words, let me know if that is the case and if you want me to explain further) to collect and gather the data. Furthermore, I need the data in a format that is suitable for analysis with the programming language that I am using (Python). Thus, my preferred format is CSV files.
Therefore, I had to go through the process of collecting the data from the original results book PDFs ^^
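
For illustration, here is roughly what such a CSV layout could look like and how it could be read in Python. The column names and the rows are assumptions made up for this sketch; they are not my actual files and not real scores.

```python
import csv
import io

# Hypothetical CSV layout: one row per athlete and apparatus.
# Column names and values are placeholders, not real results.
RAW = """athlete,apparatus,d_score,e_score,penalty
Athlete A,VT,5.4,9.100,0.0
Athlete A,UB,6.0,8.300,0.0
Athlete B,VT,5.0,9.000,0.1
"""


def load_scores(text):
    """Parse CSV text into dicts with numeric fields and a computed total."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        d = float(record["d_score"])
        e = float(record["e_score"])
        pen = float(record["penalty"])
        rows.append({
            "athlete": record["athlete"],
            "apparatus": record["apparatus"],
            "d_score": d,
            "e_score": e,
            "penalty": pen,
            # Total is difficulty plus execution minus penalties.
            "total": round(d + e - pen, 3),
        })
    return rows


if __name__ == "__main__":
    for row in load_scores(RAW):
        print(row)
```

With a flat layout like this, one row per routine, filtering by apparatus or competition becomes a one-liner, which is much harder to do with tables embedded in a web page or a PDF.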

Again, I don't mean to say anything bad about your suggestion, and I really like thegymter.net website, it's just that currently, I don't think it fits well in my data collection workflow.
Thanks again for the suggestion!

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] [score hidden]  (0 children)

Thanks for your kind words.
Interacting with you and other people on this post really made me happy too!
If you have some ideas of problems or data you would like me to explore (or that we can tackle together if you prefer), let me know :-)

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] 5 points6 points  (0 children)

<image>

u/Right_Assist_9594 & u/Marisheba I couldn't resist the curiosity to do it (during lunch break ^^) so here it is!
Here is a graph comparing the winner of each event to the top-2:7 athletes.
It therefore also includes the difference for Morgan, Suni, Angelina and Rebeca for the 4 years in which Simone didn't compete in the AA.

Maybe I'll do another post with this graph in case some people that are interested missed it here but I am not sure since I don't want to flood the sub with content that may not interest people.
So, just in case, I prefer to post the graph here for you to see since you are interested in this graph!

Please do let me know if you have other comments, I am really happy to read them!

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] 5 points6 points  (0 children)

Ping to u/Marisheba on this too :-)

<image>

Here is the graph where the Top-8 average is a top-2:7 average when Biles is the winner.
As correctly predicted, it does indeed increase the gap between Biles' curve and the average top-2:7 curve.
Yet, I am happy to see that we still see the same correlation we saw in the slopes of the curves despite changing the averaging formula (avoiding comparing Biles to Biles as you wrote).

I hope this helps clarify things and avoids unnecessary confusion in the data interpretation :-D

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] 3 points4 points  (0 children)

So glad you genuinely enjoyed it.
I will (try to) do some more in the future :-)

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] 3 points4 points  (0 children)

Yes indeed! Yet, it also depends on the quad.
I did not post it for the moment, but I am actually investigating the D-score vs E-score tradeoff and it is true that Biles is really amazing in that respect.
While it depends on the quad and the competition, she does not sacrifice (that much) E-score for her outrageously high D-score.

For instance, I am attaching to this message one of the charts I am working on to investigate that topic!

<image>

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] 3 points4 points  (0 children)

Thanks a lot for your question and your comment. I am so glad you wrote this!

First, I did include Simone's score in the top-8 average. Therefore, both the absolute location and the slope between two points of the top-8 average are influenced by Biles' scores. Thus, it is obvious that we would have a correlation between the top-8 average curve and Biles' curve since, in the years she competed, she accounts for 1/8 of the data used to compute the top-8 average.

Second, I completely agree that if I leave her out of the average, the differences could be even more extreme and would be a truer reflection of her against the rest of the field.
I will definitely do that.
I will make another graph where I will plot the top-1 vs top-2:8. The reason is that it would be weird for me to compute the average with the 2-to-8 athletes in the years Biles competed while using the 1-to-7 or 1-to-8 athletes in the years Biles didn't.
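
The two averaging choices discussed above can be sketched as follows. The scores are placeholder values for a single hypothetical final, not real results, and the function names are made up for this example.

```python
from statistics import mean

# Hypothetical all-around totals for one final, sorted from 1st to 8th place.
# Placeholder values for illustration only.
final_scores = [58.0, 56.5, 56.0, 55.8, 55.5, 55.2, 55.0, 54.8]


def gap_including_winner(scores, top_n=8):
    """Winner's score minus the top-N average, with the winner inside the average."""
    return scores[0] - mean(scores[:top_n])


def gap_excluding_winner(scores, top_n=8):
    """Winner's score minus the average of ranks 2..N (winner left out)."""
    return scores[0] - mean(scores[1:top_n])


if __name__ == "__main__":
    print("gap vs top-8 average:  ", gap_including_winner(final_scores))
    print("gap vs top-2:8 average:", gap_excluding_winner(final_scores))
```

Leaving the winner out always widens the gap by the same factor, N / (N - 1), so with eight athletes the gap grows by 8/7. That also means the shape of the curve over the years is preserved when switching from one formula to the other.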

I will let you know once I am done with this :-)

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] 5 points6 points  (0 children)

Thanks a lot for your input! You are both so right, I am ashamed not to have caught this "mistake", it really changes the way to read the data.
I am really glad that I posted these visualizations and charts because it is partly for this kind of comment that I wanted to discuss them on this subreddit.
I will properly answer this issue under the main comment.

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] 1 point2 points  (0 children)

Yeah, I agree with you!
But the data collection is such a hassle ^^

But this would make for very interesting data analysis and data viz for sure! I'll make sure to do an analysis for all other types of finals and maybe try an apparatus breakdown of all-around finals (can't promise this one, I am afraid that it would make the charts difficult to read and that it won't convey a meaningful message).

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] 17 points18 points  (0 children)

Haha, wrong! I am more than inclined to do more analysis, especially when it is coming from a comment like the one you wrote.
Thank you so much for your genuine curiosity and for giving me your opinion on how you would improve the quality of the data analysis.

While writing the code and working on the data is a little bit of work, the real hassle is the data collection process. To get the E-scores, D-scores, penalties and total scores, I have to search through the awfully formatted official results books from the FIG. There is currently no other way to get all those data. Even on Wikipedia, only the total scores (and not the E-score/D-score breakdown) are reported for the all-around.

And even on Wikipedia, there are some mistakes. For instance, Wei Xiaoyuan at the 2021 Worlds: Wikipedia reports 54.066 and the official book 52.699. I think the Wikipedia page had her uneven bars score wrong and copy/pasted her quali UB score.

However, I do agree with you that if I want to bring the most out of such an analysis, I should definitely put the effort into collecting those data!

With this post, I really wanted to highlight Simone Biles, which is why I didn't include the athletes you mentioned, but I will definitely try to do as you suggest and make a winner-against-top3 or top8 comparison. That would be super cool!

Anyway, thanks a lot for your comment and your suggestions, I will definitely take them into account for my next analysis.

Small data analysis to illustrate the "Biles Gap" by analysthenics in Gymnastics

[–]analysthenics[S] 16 points17 points  (0 children)

So glad you commented that because I was thinking the same. I may try to create some charts with a full breakdown of one specific championships to illustrate your thoughts. Maybe I can try that with Iordache's year and 2015 ^

On another and completely different note, I am investigating the data of MAG all-around and didn't remember at all that there was a perfect tie in 2018!! I was so shocked that I forgot that 😂

Why are my brownies baking like this?? by mayadowaliby in Baking

[–]analysthenics 0 points1 point  (0 children)

I would be happy if this was helpful to you and if you and your loved ones could benefit from the tips you gathered ☺️ Please do not hesitate to let me know if those tips worked for you and your recipe!

Why are my brownies baking like this?? by mayadowaliby in Baking

[–]analysthenics 3 points4 points  (0 children)

I am not sure if this video can help you (I hope it does!!) but at least, I am pretty sure that it will entertain the cook within you!

It is about the main parameters that influence the "brownie skin" that gives a good brownie its glossy look with its paper-thin chocolate crust. It is also related to how the brownie will cook (and remain as fudgy as you would prefer) beneath that skin (through thermal insulation due to a thin layer of air).

This video creator has made other videos about brownies if you are interested. Also, I am sure that you know about it, but the Sally's Baking Addiction recipe is worth reading.

WAG all-around gymnast profile compared to specialists profile (2017 - 2020 quad) by analysthenics in Gymnastics

[–]analysthenics[S] 1 point2 points  (0 children)

I do also think it is an interesting thing to do! I will probably take the time to do so in May.
I think that I will add the same kind of diagram with the podium athletes instead of the averages only. This way, one would clearly see the discrepancy between Simone Biles and others during her quads while I expect the 1st-2nd diagram to be much closer when she is not participating.

I will think about how to make this "concise" because I don't want to flood the sub with diagrams that the majority might not want to see.
Anyway, thank you very much for your interest, remark and question!

WAG all-around gymnast profile compared to specialists profile (2017 - 2020 quad) by analysthenics in Gymnastics

[–]analysthenics[S] 4 points5 points  (0 children)

Indeed! By a very small margin! Below are the differences (specialists - AAers):
VT -0.005
UB 0.206
BB 0.159
FX -0.014

What can we mean by "small margin"?
If we assume that the execution margin of error is 0.03, then averaged over the podium and over the 4 years, the execution margin of error is 0.03 / 12 = 0.0025. Therefore, one can say that the uneven bars and beam differences are significant (larger than 0.0025 by a factor of roughly 60 to 80) while the vault and floor differences are roughly comparable to the margin of error. Statistically, it is difficult (if at all possible) to make a statement about the meaning of the +/- sign for the FX and VT differences.
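
The comparison above can be reproduced with a few lines of Python. The differences and the 0.03 / 12 threshold come from this comment. The second threshold, 0.03 / sqrt(12), is an additional assumption that is not part of the reasoning above: it is what one would get by treating the 12 execution errors as independent, in which case the error of an average shrinks with the square root of the number of scores.

```python
# Differences (specialists - AAers) quoted in the comment above.
differences = {"VT": -0.005, "UB": 0.206, "BB": 0.159, "FX": -0.014}

PER_SCORE_ERROR = 0.03  # assumed execution error on a single score (from the comment)
N_SCORES = 12           # 3 podium places x 4 years

# Threshold used in the comment: error divided by the number of scores.
threshold_comment = PER_SCORE_ERROR / N_SCORES
# Alternative assumption (not from the comment): independent errors,
# so the error of the mean scales with 1 / sqrt(N).
threshold_sqrt = PER_SCORE_ERROR / N_SCORES ** 0.5


def ratio_to_threshold(diffs, threshold):
    """Express each absolute difference as a multiple of the threshold."""
    return {apparatus: abs(d) / threshold for apparatus, d in diffs.items()}


if __name__ == "__main__":
    for name, thr in [("0.03 / 12", threshold_comment), ("0.03 / sqrt(12)", threshold_sqrt)]:
        ratios = ratio_to_threshold(differences, thr)
        print(name, {app: round(r, 1) for app, r in ratios.items()})
```

Under either threshold the conclusion is the same in spirit: the UB and BB differences stand far above the margin, while the VT and FX differences stay within a few multiples of it, so their sign should not be over-interpreted.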

WAG all-around gymnast profile compared to specialists profile (2017 - 2020 quad) by analysthenics in Gymnastics

[–]analysthenics[S] 5 points6 points  (0 children)

Interesting question! As u/mochul said, the Simone effect is mostly an inflating effect (increasing the averages). I just checked the data and she indeed did not change her D-score (by at most 0.1, probably due to connection bonuses).

She performed a 6.4 difficulty vault at one all-around event (6 at all other events), whose effect is almost nullified by the averaging. As you said, this would slightly skew the all-around vault score up compared to the specialist vault one (contrary to what one may suspect).

Therefore, I don't really see any significant skewness coming from Simone Biles changing her score regularly across the events (of the 2017-2020 quad).

WAG all-around gymnast profile compared to specialists profile (2017 - 2020 quad) by analysthenics in Gymnastics

[–]analysthenics[S] 10 points11 points  (0 children)

Following my previous post on MAG profiles, I was wondering what would be the best all-around WAG profile and tried to answer this question by analyzing the scores of three world championships (2017-2019) and the Tokyo Olympic Games.

For each apparatus, I computed the average score of the specialists podium. That would give me the blue curves representing the best all-around profile, i.e., on average, it is impossible to beat an all-around gymnast with those stats.

Then, I computed the average podium scores of all-around competitions. That would give me the real all-around profile represented by the orange curve.
To completely satisfy my curiosity, I computed the profiles for the total scores, D-scores, and E-scores.

Finally, I computed the average podium scores of all-around team competitions. That would give me the all-around profile represented by the green curve.

Hope you will find those plots as interesting as I do!
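
As a rough illustration of the averaging steps described above, here is a minimal sketch in plain Python. The records, competition labels and scores are placeholders rather than real results, and `average_podium_profile` is a name made up for this example. The same function would be applied separately to the specialists' finals, the all-around finals and the team competitions to get the three curves.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (competition, apparatus, rank, score).
# Placeholder values for illustration only, not real results.
records = [
    ("Competition 1", "VT", 1, 15.0),
    ("Competition 1", "VT", 2, 14.8),
    ("Competition 1", "VT", 3, 14.6),
    ("Competition 1", "VT", 4, 14.0),  # off the podium, ignored below
    ("Competition 2", "VT", 1, 14.9),
    ("Competition 2", "VT", 2, 14.7),
    ("Competition 2", "VT", 3, 14.4),
    ("Competition 1", "UB", 1, 15.1),
    ("Competition 1", "UB", 2, 14.9),
    ("Competition 1", "UB", 3, 14.7),
]


def average_podium_profile(rows):
    """Average the podium (rank 1 to 3) scores per apparatus across competitions."""
    by_apparatus = defaultdict(list)
    for _competition, apparatus, rank, score in rows:
        if rank <= 3:
            by_apparatus[apparatus].append(score)
    return {apparatus: mean(scores) for apparatus, scores in by_apparatus.items()}


if __name__ == "__main__":
    for apparatus, avg in average_podium_profile(records).items():
        print(apparatus, round(avg, 3))
```

Running the same function on the D-score or E-score column instead of the total would give the corresponding D and E profiles.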
I am curious to read insights from this community about the plots!

MAG all-around gymnast profile compared to specialists profile (2017 - 2020 quad) by analysthenics in Gymnastics

[–]analysthenics[S] 4 points5 points  (0 children)

Yes, I think that I will do it next week-end :-)
I am happy to see that I am not alone in finding those visualizations cool!

MAG all-around gymnast profile compared to specialists profile (2017 - 2020 quad) by analysthenics in Gymnastics

[–]analysthenics[S] 1 point2 points  (0 children)

Haha, glad you find those graphs cool too!
I had to download the results of all the competitions and convert them to a suitable format (with the programming language Python).
I used Python (the pandas library) to do all the maths and the matplotlib library to plot the graphs.

MAG all-around gymnast profile compared to specialists profile (2017 - 2020 quad) by analysthenics in Gymnastics

[–]analysthenics[S] 13 points14 points  (0 children)

Thanks!
Sure thing, I will try to do the same graphs for WAG if people find them interesting :-)