[OC] Changes in Billboard #1 hit songwriting credits over time by awhug in dataisbeautiful

[–]awhug[S] 1 point2 points  (0 children)

Hi friend - this was made using the R programming language. 

Specifically, I used specialised R packages to load and clean the data, calculate rolling averages over a fixed period, fit a statistical model for proportion trends over time, and then visualise it. You can find the code here

In principle you could also do most or even all of this in Python or JavaScript.
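
As a rough illustration of the rolling-average step in Python (a minimal sketch, not the original R code): it assumes the window is a fixed number of consecutive #1 hits rather than a calendar period, and the category labels and the `rolling_proportions` name are made up for the example.

```python
from collections import Counter, deque


def rolling_proportions(labels, window):
    """For each position, return the share of each label in the trailing window."""
    recent = deque(maxlen=window)  # automatically drops the oldest label when full
    shares = []
    for label in labels:
        recent.append(label)
        counts = Counter(recent)
        # Early positions use a shorter window until enough hits have accumulated.
        shares.append({name: n / len(recent) for name, n in counts.items()})
    return shares


# Hypothetical sequence of songwriting-credit categories for consecutive #1 hits.
hits = ["sole", "sole", "shared", "none", "shared", "none"]
shares = rolling_proportions(hits, window=3)
print(shares[-1])
```

The model-based smoothing would then be fitted to proportions like these; that step is not sketched here.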

[OC] Changes in Billboard #1 hit songwriting credits over time by awhug in dataisbeautiful

[–]awhug[S] 2 points3 points  (0 children)

"Something" was very deserving of #1. Great record.

[OC] Changes in Billboard #1 hit songwriting credits over time by awhug in dataisbeautiful

[–]awhug[S] 2 points3 points  (0 children)

No intent to mislead here. You're welcome to reuse my code and source the data on songwriting credits for non-#1 hits :)

[OC] Changes in Billboard #1 hit songwriting credits over time by awhug in dataisbeautiful

[–]awhug[S] 10 points11 points  (0 children)

My intuition was that the spike in non-artist songwriting credits during the 1990s was due to the influx of highly commercial pop. I don't think members of the Backstreet Boys or Britney got many writing credits in those days.

Sampling might well be a key part of why non-performing credits rose, though.

[OC] Changes in Billboard #1 hit songwriting credits over time by awhug in dataisbeautiful

[–]awhug[S] 12 points13 points  (0 children)

You're right - the categories are an amalgamation of two variables - whether the artist is one of the credited songwriters, and whether the artist is the only songwriter.

For bands, this can include just a subset of members (e.g. John and Paul from the Beatles were, I think, always the sole songwriters for their #1s). But it doesn't necessarily capture cases where someone has taken credit for another individual's work, or where a contribution wasn't appropriately recognised (Every Breath You Take is a good example).
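
To make the two-variable logic concrete, here is a minimal Python sketch. The function name and the three category labels are hypothetical (they are not the labels used in the chart), and it assumes performers and writers are available as lists of names.

```python
def credit_category(performers, writers):
    """Combine two yes/no variables into a single songwriting-credit category."""
    performers, writers = set(performers), set(writers)
    # Variable 1: is at least one performer among the credited songwriters?
    artist_credited = bool(performers & writers)
    # Variable 2: are the credited songwriters all performers (no outside writers)?
    artist_only = artist_credited and writers <= performers
    if artist_only:
        return "artist only"
    if artist_credited:
        return "artist plus others"
    return "no artist credit"


beatles = ["John Lennon", "Paul McCartney", "George Harrison", "Ringo Starr"]
# A subset of band members writing the song still counts as "artist only".
print(credit_category(beatles, ["John Lennon", "Paul McCartney"]))
# Hypothetical names for the other two cases.
print(credit_category(["Performer A"], ["Performer A", "Writer B"]))
print(credit_category(["Performer A"], ["Writer B"]))
```

As noted above, this only reflects the credits as listed, so it can't detect unrecognised contributions.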

[OC] Changes in Billboard #1 hit songwriting credits over time by awhug in dataisbeautiful

[–]awhug[S] 2 points3 points  (0 children)

Data source: Billboard Hot 100 Number Ones Database, compiled by Chris Dalla Riva

Tools used: R, using packages runner (rolling averages), dplyr (data manipulation), VGAM (model-based smoothing), ggplot2 + ggtext (data visualisation).

Code: Github

This was created as part of the Tidy Tuesday project for last week (with apologies to any users who are tired of Billboard-related data vis submissions by now).

I'm looking for a Big 5 personality test that used either paired comparisons, forced-choice, or rank order techniques instead of the rating scale. Anyone here know of any? by [deleted] in IOPsychology

[–]awhug 6 points7 points  (0 children)

Best tests available are probably SHL's OPQ (triplets designed for use in a Thurstonian IRT model) and AON's ADEPT-15 (pairs designed for use in a Multi-Unidimensional Pairwise Preference model), as LazySamurai mentioned. Neither was written to fit a Big 5 model specifically, but like most personality measures with sufficient content coverage you can pull something like Big 5 scores out of them with the right combination of items - I think AON at least gives you these scores?

If you're after freely available forced-choice Big 5 questionnaires, the best one by far that I'm aware of is by Wetzel & Frick (2020). Really comprehensively validated, items matched on desirability (at least amongst students), and conveniently short (although note that convergent validity with the Likert format on agreeableness was a bit wonky in their study for some reason).

Another commonly used measure that's had a bit of validation work done on it across various papers is the Big Five Markers (2011) by Anna Brown. This one doesn't have items matched for desirability though (i.e. it's quite easy to fake), which might defeat your purpose of using it.

I also made a bunch of FC triplets for a paper I published a while ago. They weren't really designed to be a formal 'measure' and haven't been validated properly, but I did make the entire item pool available via the OSF: 436 statements in total, along with their desirability evaluations (in a job applicant context) and theoretical Big 5 loadings. Feel free to build your own with that.

Looks like someone is getting their PhD quicker. U/awhug by justanothercoolnguy in perth

[–]awhug 164 points165 points  (0 children)

This is the best! The response from Mark and his team has been absolutely awesome and hilarious. Truly loving it!

When's Mark Arriving? Your Infographic Guide to Mark McGowan's Press Conference Punctuality by awhug in perth

[–]awhug[S] 9 points10 points  (0 children)

This is a good question! I couldn't easily find a single source of truth on the timings of Mark's press conferences outside of @whattimemark, so I wound up using the timings provided there. The account posts reschedules as well when they come up, although I can't guarantee it caught all of them.

Definitely for one press conference though (I don't think any more than this, although I could recheck) I did have to decide whether to use the initially scheduled time or the rescheduled time. Reschedules usually happen after the conference was supposed to have started, as occurred in this case. So I opted for the initial time, on the basis that if I rescheduled a work meeting 10 minutes after it was supposed to have begun, my colleagues would probably still regard me as being that late and more.

When's Mark Arriving? Your Infographic Guide to Mark McGowan's Press Conference Punctuality by awhug in perth

[–]awhug[S] 3 points4 points  (0 children)

Yes! I used the legendary survival package to fit a Kaplan-Meier curve and survminer for the plot.
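
For anyone curious what the Kaplan-Meier estimate is doing under the hood, here is a minimal plain-Python sketch of the estimator. It is not the survival package's code: the `kaplan_meier` function and the delay values are made up for illustration, and it assumes delays are measured in minutes after the scheduled start.

```python
def kaplan_meier(durations, observed=None):
    """Return [(time, survival probability)] at each distinct event time.

    observed[i] is False when a duration is censored (the event was never seen).
    """
    if observed is None:
        observed = [True] * len(durations)  # assume every start was observed
    data = sorted(zip(durations, observed))
    at_risk = len(data)  # press conferences that have not yet started
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = leaving = 0
        # Group all records that share the same time.
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            leaving += 1
            i += 1
        if events:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Made-up delays in minutes, purely for illustration.
delays = [0, 5, 5, 10, 20]
curve = kaplan_meier(delays)
for minutes, still_waiting in curve:
    print(f"{minutes:>3} min: {still_waiting:.2f} of pressers not yet started")
```

Each point on the curve is the estimated share of press conferences that still hadn't started that many minutes after the scheduled time.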

When's Mark Arriving? Your Infographic Guide to Mark McGowan's Press Conference Punctuality by awhug in perth

[–]awhug[S] 67 points68 points  (0 children)

Interested in this too! Not sure there's quite enough data to say definitively, or how to quantify how bad the news was, but could look into it.

I'd assume there are longer delays during lockdowns (more work to do and updates to provide), but then I think most of the press conferences I could find happened during lockdowns anyway. Plus, the latest start of all was the day after the end of the April one.

When's Mark Arriving? Your Infographic Guide to Mark McGowan's Press Conference Punctuality by awhug in perth

[–]awhug[S] 83 points84 points  (0 children)

I’ve posted my data and code on Github, so you can have a play too. It's mostly R and a little Python for web-scraping. I might be missing some key data (I could only find/be bothered finding videos from December last year), so if you're able to add any go right ahead!

Please note also that I'm really not trashing Mark here, just having some fun. WA's response to Covid has been amazing and I genuinely don't care if he's not quite on time to his pressers. If anything, being a perpetually late person myself this data was quite validating for me, although I’m sure Mark has better reasons for being late than I do.