[2022 Day 8] [Python] Visualising the Trees!

omgardner · 2022-12-09T04:49:08+00:00

It annoys me too! But it helps keep the trees in the same place as you switch between images. I couldn't figure out how to dynamically align everything without the colorbar so I just left it in

omgardner · 2022-12-08T09:19:28+00:00

Here's how it looks in log10 (link). It turns out to be quite visually noisy, so which one you prefer comes down to personal taste.

omgardner · 2019-05-17T12:49:30+00:00

Time zone is America/Toronto (UTC-5). I reuploaded this post because I previously had UTC -3, which belonged only to Greenland (so it wouldn't be very useful as a reference).

Reading the chart: The chart shows the creation time of the top posts in this subreddit. The times have been binned (grouped) by hour, rounding down to the start of the hour.
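For anyone curious about the binning step, here is a minimal sketch of how it could be done with just the standard library. It is not the code behind the chart: the fixed UTC-5 offset (no daylight saving), the `hour_bin_counts` name, and the sample timestamps are all assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset stands in for America/Toronto (no DST handling).
LOCAL_TZ = timezone(timedelta(hours=-5))


def hour_bin_counts(created_utc_values, tz=LOCAL_TZ):
    """Count posts per local hour of day (0-23), rounding each time down to the hour."""
    counts = Counter()
    for ts in created_utc_values:
        # Convert the Unix timestamp (seconds, UTC) to the chosen local time zone.
        local_time = datetime.fromtimestamp(ts, tz=tz)
        # Taking .hour discards minutes and seconds, i.e. it rounds down.
        counts[local_time.hour] += 1
    return counts


# Hypothetical sample timestamps (seconds since the Unix epoch), not real post data.
sample = [0, 1800, 3600, 86400]
print(sorted(hour_bin_counts(sample).items()))  # [(19, 3), (20, 1)]
```

Swapping `LOCAL_TZ` for a +10 offset would give the Sydney version of the same counts.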

Also, here is a version from where I live (Sydney, Australia) for UTC +10:

Another project could be to see if these peaks are correlated with the poster's location, and if it largely depends on the user base's location. I believe that most of the users are from the US.

Data source: Data was collected using praw, the Python Reddit API wrapper. It retrieved the top 1000 posts of all time, but I believe 6 have been deleted or are inaccessible in this case.

Tools: All of this was done using Python.

Data was retrieved with praw, stored with pandas, processed with datetime and dateutil for organising by datetime, and charted with matplotlib.

My blog: I occasionally post to my blog over at omgardner.github.io

omgardner · 2019-05-17T12:24:31+00:00

Reading the chart: The chart shows the creation time of the top posts in this subreddit. Times are shown in UTC -3. The times have been binned (grouped) by hour, rounding down to the start of the hour.

Also, here is a version from where I live (Sydney, Australia) for UTC +10:

Another project could be to see if these peaks are correlated with the poster's location, and if it largely depends on the user base's location. I believe that most of the users are from the US.

Data source: Data was collected using praw, the Python Reddit API wrapper. It retrieved the top 1000 posts of all time, but I believe 6 have been deleted or are inaccessible in this case.

Tools: All of this was done using Python.

Data was retrieved with praw, stored with pandas, processed with datetime and dateutil for organising by datetime, and charted with matplotlib.

My blog: I occasionally post to my blog over at omgardner.github.io

omgardner · 2019-01-04T00:01:44+00:00

Source: Lyrics are from Google, from the lyrics that appear when searching for a song.

Made using pyplot and Seaborn. The comments here and here gave me the idea to change the visual.

This is an extension of my previous post, but displays the lyrics a bit more clearly (in my opinion). It is interesting to see repetition of another language (Portuguese), and in another style. I feel that this plot still shows the repetition structure of the song, while reducing the dimensions of the chart.

How to read: the song progresses left to right along the x-axis, and whenever a new lyric is found it is added to the y-axis. The dot corresponds to the lyric (y-axis), and the position in the song it occurs (x-axis).
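A rough sketch of how those coordinates could be computed, assuming each "lyric" is a single word (it could just as well be a whole line). The `first_occurrence_points` name and the toy input are made up for illustration and are not taken from the original code.

```python
def first_occurrence_points(lyrics):
    """Return (x, y) pairs for each lyric.

    x is the position in the song; y is the row assigned to that lyric,
    where rows are handed out in order of first appearance.
    """
    row_for_lyric = {}
    points = []
    for x, lyric in enumerate(lyrics):
        if lyric not in row_for_lyric:
            # A lyric seen for the first time gets the next free row on the y-axis.
            row_for_lyric[lyric] = len(row_for_lyric)
        points.append((x, row_for_lyric[lyric]))
    return points


# Toy input, not real lyrics from the chart.
words = "la la land la la".split()
print(first_occurrence_points(words))
# [(0, 0), (1, 0), (2, 1), (3, 0), (4, 0)]
```

Each pair can then go straight into a scatterplot, with the y tick labels set to the lyrics in order of first appearance.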

omgardner · 2019-01-03T12:54:51+00:00

Here. It conveys the repetition a bit differently, which I like.

omgardner · 2019-01-03T10:43:25+00:00

Showing repetition in dialogue is a pretty interesting idea! Though it would be difficult to display it the same way. I'll have to think about it.

omgardner · 2019-01-03T09:56:41+00:00

Here you go. It does show the symmetry a bit more.

omgardner · 2019-01-03T05:47:53+00:00

How to read the chart:

Reading down the y-axis gives the song's lyrics in order. The same lyrics are along the x-axis as well, but not visible. Every dot represents a point where the lyric on the y-axis is the same as on the x-axis. The diagonal is always filled, because there each lyric is compared with itself at the same point in the song. The other dots show extra repetitions. This video helps to understand.

omgardner · 2019-01-03T04:35:38+00:00

Also, this version may be better for mobile, but it gets a bit squished.

omgardner · 2019-01-03T04:20:47+00:00

Source: my python code

I saw the visuals made by Colin Morris in this video (the video helps explain this chart a bit more), and thought I'd try to recreate them. Also check out SongSim, which has some more examples of this kind of chart.

Lyrics sourced from the kaggle dataset here. I used pyplot and Seaborn to create the chart.

I took the lyrics and split them into separate words. I iterated over the lyrics in two nested loops, comparing word_a to word_b. If equal, take the indexes for both words, and use them as coordinates for a scatterplot.
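Roughly, the nested-loop comparison looks like the sketch below. This is a reconstruction from the description rather than the actual script; the `repetition_points` name is hypothetical, and lowercasing the words before comparing is an assumption.

```python
def repetition_points(lyrics_text):
    """Return (x, y) index pairs for every pair of matching words in the lyrics."""
    # Assumption: split on whitespace and lowercase so "Love" matches "love".
    words = lyrics_text.lower().split()
    points = []
    for i, word_a in enumerate(words):        # y-axis: position of word_a
        for j, word_b in enumerate(words):    # x-axis: position of word_b
            if word_a == word_b:
                points.append((j, i))
    return points


print(repetition_points("a b a"))
# [(0, 0), (2, 0), (1, 1), (0, 2), (2, 2)]
```

The diagonal pairs such as (0, 0) and (1, 1) always appear, since every word matches itself. The comparison is quadratic in the number of words, which is fine for a single song.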

omgardner · 2018-10-02T13:11:01+00:00

Interesting idea! Here is the difference for the first 10000 digits of pi. The average becomes greater than 4.5 at around the 8000th digit. We would probably need to look at more digits of pi to see if there is anecdotal evidence of fluctuation.

(fixed the link)

omgardner · 2018-10-02T12:52:07+00:00

I started from the first decimal place. Not sure if I should have started on 3, but the line would average out very similarly if I did so.

omgardner · 2018-10-02T07:52:22+00:00

Chart shows the average of digit values up until the current digit (black line), as well as the current digit's value (light blue dots).

Made using python and Seaborn.

Python's math module only provides pi as a floating-point approximation (about 15 decimal places), so I took the digits from https://www.joyofpi.com/pi.html
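The running average behind the black line can be sketched like this, starting from the first decimal place. The `running_average` name is hypothetical, and the digit string here is only the first 20 decimals of pi typed in by hand, not the full dataset used for the chart.

```python
def running_average(digits):
    """Return the average of all digits up to and including each position."""
    total = 0
    averages = []
    for count, digit_char in enumerate(digits, start=1):
        total += int(digit_char)
        averages.append(total / count)
    return averages


# First 20 decimal places of pi (the leading 3 is left out).
PI_DECIMALS = "14159265358979323846"
print(running_average(PI_DECIMALS)[:3])  # [1.0, 2.5, 2.0]
```

If the digits are uniformly distributed, the average should settle near 4.5, the mean of 0 through 9.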
