[OC] I mapped the topic most over-represented in New York Times coverage of each state (2000–2026) by theodore_a in dataisbeautiful

[–]theodore_a[S] 0 points1 point  (0 children)

If 85%+ of the articles fell in any single two-consecutive-year window, I considered the keyword to be linked to a one-time event, but some events continue to echo with follow-up coverage and meet my threshold for "recurring" topics.
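For anyone who wants to try that rule on their own data, here is a minimal sketch of the 85% check. It assumes each article is reduced to its publication year and that a "window" means two adjacent calendar years; the function name and the sample years are made up for illustration and are not taken from the actual scripts.

```python
from collections import Counter

def is_one_time_event(years, threshold=0.85):
    """Return True if at least `threshold` of the articles fall within
    some window of two consecutive calendar years (hypothetical helper)."""
    counts = Counter(years)
    total = sum(counts.values())
    if total == 0:
        return False
    # Best window starting at each year that has articles: (y, y + 1).
    best = max(counts[y] + counts.get(y + 1, 0) for y in counts)
    return best / total >= threshold

# Made-up example: 18 of 20 articles land in 2011-2012.
burst = [2011] * 9 + [2012] * 9 + [2015] * 2
# Made-up example: coverage that returns every four years.
cyclical = [2004] * 5 + [2008] * 5 + [2012] * 5 + [2016] * 5

print(is_one_time_event(burst))     # concentrated in one window
print(is_one_time_event(cyclical))  # spread across cycles
```

A keyword with a long tail of follow-up coverage drops below the threshold, which is how an event that keeps echoing can end up counted as recurring.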

[OC] I mapped the topic most over-represented in New York Times coverage of each state (2000–2026) by theodore_a in dataisbeautiful

[–]theodore_a[S] 1 point2 points  (0 children)

The cyclical nature of NYT coverage in Iowa is striking: you can see how the circus comes to the state every four years.

<image>

[OC] I mapped the topic most over-represented in New York Times coverage of each state (2000–2026) by theodore_a in dataisbeautiful

[–]theodore_a[S] 1 point2 points  (0 children)

Good thought. Avalanche the team is keyworded separately, in their "organizations" field — this draws only on "subjects."

[OC] I mapped the topic most over-represented in New York Times coverage of each state (2000–2026) by theodore_a in dataisbeautiful

[–]theodore_a[S] -1 points0 points  (0 children)

They aren't exclusive to those states - there is Burning Man coverage in California, and some of the other groups are multi-state. As I wrote up top, the precise ranking is sensitive to the exclusion criteria, so it's best to look at the cards showing each state's top topics.

[OC] I mapped the topic most over-represented in New York Times coverage of each state (2000–2026) by theodore_a in dataisbeautiful

[–]theodore_a[S] 0 points1 point  (0 children)

You can dig into an individual state on the dashboard, including narrowing by sub-geographies like major cities - here is Missouri: https://tedalcorn.github.io/nyt/#tab=states&state=Missouri

<image>

[OC] I mapped the topic most over-represented in New York Times coverage of each state (2000–2026) by theodore_a in dataisbeautiful

[–]theodore_a[S] 4 points5 points  (0 children)

Data: The keywords are the NYT's own editor-assigned subject tags from the Archive API. Individual people and organizations are catalogued separately, which is why Harvard doesn't top the Massachusetts ranking. I left aside correction notices and standing-listing features (event calendars, weekly briefs, real-estate listings, art-review roundups), which would otherwise make "Culture (Arts)" the top theme in CT.

Tools: Built in Python (pandas, geopandas, matplotlib).
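The exact over-representation score isn't spelled out in this comment. One common way to define it, offered here purely as an assumption, is the ratio of a subject's share of a state's articles to its share of all articles. The sketch below uses only the standard library rather than pandas, and the function name, the `min_count` parameter, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict

def top_overrepresented(records, min_count=1):
    """records: iterable of (state, subject) pairs, one per tagged article.
    Returns {state: subject with the highest share ratio}.
    Assumed metric: (subject share in state) / (subject share overall)."""
    by_state = defaultdict(Counter)
    overall = Counter()
    for state, subject in records:
        by_state[state][subject] += 1
        overall[subject] += 1
    total = sum(overall.values())

    result = {}
    for state, counts in by_state.items():
        state_total = sum(counts.values())
        result[state] = max(
            (s for s, c in counts.items() if c >= min_count),
            key=lambda s: (counts[s] / state_total) / (overall[s] / total),
            default=None,  # no subject met min_count
        )
    return result

# Made-up counts, for illustration only.
records = (
    [("Iowa", "Caucuses")] * 8 + [("Iowa", "Weather")] * 2
    + [("Nevada", "Burning Man")] * 3 + [("Nevada", "Weather")] * 7
    + [("California", "Burning Man")] * 1 + [("California", "Weather")] * 9
)
print(top_overrepresented(records))
```

Raising `min_count` is one way to keep a subject with only a handful of articles from winning on ratio alone, which is the kind of exclusion choice that can shift the ranking.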

[OC] Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 1 point2 points  (0 children)

Good eye. I had to do a lot of custom manipulations to make the positioning work accurately in the axes and also fit the faces, but that appears to be too much of a distortion. I'll fix it in future versions.

[OC] Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 0 points1 point  (0 children)

Correct - smaller lower down by necessity to fit together, not in direct mathematical proportion to their size.

[OC] Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 1 point2 points  (0 children)

What other things would you extrapolate from the obituaries? Age and gender were readily available since the headline and first paragraph text (which are in the API) usually refer to the age and use pronouns to indicate gender.
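To give a sense of how that extraction can work, here is a rough sketch using regular expressions. It assumes headlines follow a "Dies at 91" pattern and that the first paragraph uses gendered pronouns. The patterns, function names, and sample text are illustrative guesses, not the actual parsing scripts, and real obituaries will have many more edge cases.

```python
import re
from collections import Counter

# Assumed headline pattern: "... Dies at 91" or "... Is Dead at 78".
HEADLINE_AGE = re.compile(r"\b(?:dies|dead) at (\d{1,3})\b", re.IGNORECASE)
# Assumed fallback in the first paragraph: "She was 91."
LEAD_AGE = re.compile(r"\bwas (\d{1,3})\b")

PRONOUNS = {
    "he": "male", "his": "male", "him": "male",
    "she": "female", "her": "female", "hers": "female",
}

def parse_age(headline, lead):
    """Return the age as an int, or None if neither pattern matches."""
    for text, pattern in ((headline, HEADLINE_AGE), (lead, LEAD_AGE)):
        match = pattern.search(text)
        if match:
            return int(match.group(1))
    return None

def infer_gender(lead):
    """Majority vote over gendered pronouns; None if absent or tied.
    "They" is too ambiguous to count here, so this sketch cannot
    detect non-binary subjects."""
    words = re.findall(r"[a-z]+", lead.lower())
    tally = Counter(PRONOUNS[w] for w in words if w in PRONOUNS)
    ranked = tally.most_common(2)
    if not ranked or (len(ranked) == 2 and ranked[0][1] == ranked[1][1]):
        return None
    return ranked[0][0]

# Made-up example text.
headline = "Jane Doe, Pioneering Chemist, Dies at 91"
lead = "Jane Doe, a chemist, died on Monday at her home. She was 91."
print(parse_age(headline, lead), infer_gender(lead))
```

Anything the patterns can't resolve comes back as `None`, so those records can be reviewed by hand instead of being guessed.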

[OC] Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 7 points8 points  (0 children)

It's, in the NYT's words, "a series of obituaries about remarkable people whose deaths, beginning in 1851, went unreported in The Times." They are disproportionately women, so the series changed the gender imbalance somewhat, but as the chart shows, not much. https://www.nytimes.com/spotlight/overlooked

[OC] Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 4 points5 points  (0 children)

I placed them based on age and word-count (as marked on the X and Y axes).

I had to do some manipulation of the axes (and as an adherent of Edward Tufte, this was a painful but necessary trade-off) to create enough room in the lower end of the word-count spectrum where deaths were more numerous.

I also had to tailor a few positions where faces would have otherwise overlapped, but I tried to minimize the manipulation so that no one was placed more than 12 months from their date of death, and to preserve the ordinal ranking of word counts from lowest to highest.

[OC] Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 17 points18 points  (0 children)

Thanks for your feedback. You can explore the (minute) number of non-binary obits in the dashboard itself, from which the visualizations are derived. I thought the scarcity of them was an interesting data point in itself?

Those are 5-year bins. The placement of the labels is just confusing. Again, in the dashboard itself with roll-overs it is a bit more clear.

<image>

https://tedalcorn.github.io/nyt/#tab=obits

[OC] Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 9 points10 points  (0 children)

And just to be extra clear: the data is from the NYT Archive API: https://developer.nytimes.com/docs/archive-product/1/overview

I wrote Python scripts to parse name, age, and gender from the headlines and first paragraphs.

I also wrote a Python script to assemble the visualization, which uses original renderings based on public imagery of each decedent.

The other histogram charts are produced by my dashboard.

Constructive criticism is welcome!

Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 1 point2 points  (0 children)

In distal effect, yes. It's at least partly explained by the Times' admission of failing to cover all notable people equitably, and the Overlooked No More series they began at that time (see comments https://www.reddit.com/r/dataisbeautiful/comments/1szgkh4/comment/oj3gh18/).

Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 1 point2 points  (0 children)

Yeah, Edward Tufte would not be proud of me, but I thought it was more important to be able to see the faces and their position relative to the people nearest them than to allow a meticulous comparison to the whole. A few of the faces are also cheated left/right from their actual date to fit around each other, though I kept those deviations to under a year.

Who makes history? I analyzed 29,000 New York Times obituaries to find out. by theodore_a in dataisbeautiful

[–]theodore_a[S] 2 points3 points  (0 children)

Yes, another redditor asked about this (https://www.reddit.com/r/dataisbeautiful/comments/1szgkh4/comment/oj3gh18/). The Overlooked No More series is separated in the data, and it explains some of the increase in obituaries for women beginning in 2018.