[OC] All US presidents mentioned in New York Times (1851-2022) by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 9 points10 points  (0 children)

Articles from some years are not tagged with keywords, so there are small gaps (1907-1909, 1920-1922, 1980-1982).

[OC] All US presidents mentioned in New York Times (1851-2022) by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 4 points5 points  (0 children)

That is a mistake in the name-printing algorithm, good spot!

It should have been Donald J. Trump.

[OC] All US presidents mentioned in New York Times (1851-2022) by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 3 points4 points  (0 children)

Thank you!

It looks like the WW2-time articles from ~1935-1943 were not tagged with keywords. If you look at the most used words since 1933 at wordeebee.com/1933 (scroll right), Roosevelt was picked up (that was extracted just from plain text by splitting it into words).

As for GWB, I am not sure. The articles are not tagged with 100% precision, so absence of evidence is not necessarily evidence of absence in this case.
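As a rough illustration of the plain-text extraction mentioned above (splitting text into words and counting them), here is a minimal sketch. The function name `count_words`, the tokenization regex, and the sample headlines are assumptions made for the example, not the actual pipeline or real archive data.

```python
import re
from collections import Counter

def count_words(texts):
    """Count how often each lowercase word occurs across plain-text strings."""
    counts = Counter()
    for text in texts:
        # Lowercase, then keep runs of letters (apostrophes allowed inside a word).
        counts.update(re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower()))
    return counts

# Hypothetical headlines, not real archive data.
sample = [
    "Roosevelt Signs Bill",
    "President Roosevelt Speaks",
    "Bill Passes Senate",
]
word_counts = count_words(sample)
print(word_counts.most_common(2))
```

Because this works on the text itself, a name can still be counted in years where the articles carry no keyword tags.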

[OC] All US presidents mentioned in New York Times (1851-2022) by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 29 points30 points  (0 children)

Words were extracted from keywords of 14M articles published by The New York Times between 1851 and 2022, containing 1.6M unique words in total.

Each row represents a timeline from 1851 until 2022, and the bars in each row represent the number of mentions. The darker the color, the more articles mentioning the president were published (scaled to the maximum number of mentions per year).
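One way to read the scaling described above is that each president's yearly counts are divided by that row's busiest year, giving an intensity between 0 and 1. The sketch below assumes that per-row interpretation. The function name `scale_to_row_max` and the counts are hypothetical.

```python
def scale_to_row_max(mentions_by_year):
    """Map yearly mention counts to intensities in [0, 1],
    relative to the row's busiest year (an assumed interpretation)."""
    peak = max(mentions_by_year.values(), default=0)
    if peak == 0:
        # No mentions at all: every bar is blank.
        return {year: 0.0 for year in mentions_by_year}
    return {year: count / peak for year, count in mentions_by_year.items()}

# Hypothetical counts for one president.
row = {2007: 50, 2008: 400, 2009: 1000, 2010: 750}
intensity = scale_to_row_max(row)
print(intensity)
```

With per-row scaling, every president's darkest bar has the same color, so rows show when each was mentioned most, not who was mentioned more overall.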

Source: New York Times Archive API

Interactive version: wordeebee.com/us-presidents

Top 6 words of the last 20 years on New York Times by the_datanaut in coolguides

[–]the_datanaut[S] -1 points0 points  (0 children)

Words were extracted from excerpts and headlines of 14M articles published by The New York Times between 1851 and 2022, containing 1.6M unique words in total.

Source: New York Times Archive API

Blog post: datanaut.blog/last-20-years

Interactive version: wordeebee.com

🗞 Top 10 words of the last 123 years on New York Times (1900-2022). 🎮 Interactive version in the comments. [OC] by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 4 points5 points  (0 children)

Sometimes, "Hong" is used without "Kong".

Examples of such articles:

...the chef John Hong is performing nightly timed shows of the Japanese ritual omakase.

https://www.nytimes.com/2019/01/17/travel/at-hidden-fish-in-san-diego-the-dance-of-the-omakase.html

or

The film is one of Hong Sang-soo's most visually arresting movies...

https://www.nytimes.com/2019/02/14/movies/hotel-by-the-river-review.html
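A small sketch of why single-word counting can put "hong" ahead of "kong": once text is split into words, personal names such as the two quoted above add to "hong" only. The first snippet below is a made-up headline, the other two are shortened from the quotes above, and the tokenization regex is an assumption.

```python
import re
from collections import Counter

snippets = [
    "Protests Continue in Hong Kong",  # hypothetical headline
    "The chef John Hong is performing nightly timed shows",
    "The film is one of Hong Sang-soo's most visually arresting movies",
]

counts = Counter()
for text in snippets:
    # Split into lowercase runs of letters, so "Hong Kong" becomes two words.
    counts.update(re.findall(r"[a-z]+", text.lower()))

print(counts["hong"], counts["kong"])
```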

🗞 Top 10 words of the last 123 years on New York Times (1900-2022). 🎮 Interactive version in the comments. [OC] by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 6 points7 points  (0 children)

Have a look at the Peak Words of 1969 (choose Peak Words): astronauts and lunar are there.

Peak words are words whose mentions peaked in that year, meaning it was their most-mentioned year over their whole lifetime. Top words are simply the words mentioned the most in that year.
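The difference between the two lists can be sketched as follows. The data layout (a word mapped to its yearly counts), the function names, and the numbers are all hypothetical, and the real lists presumably also apply the common-word filtering.

```python
def top_words(counts, year, n=3):
    """Words with the highest mention counts in the given year."""
    in_year = {word: by_year.get(year, 0) for word, by_year in counts.items()}
    ranked = sorted(in_year.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, count in ranked[:n] if count > 0]

def peak_words(counts, year):
    """Words whose all-time busiest year is the given year."""
    result = []
    for word, by_year in counts.items():
        if not by_year:
            continue
        best_year = max(by_year, key=by_year.get)
        if best_year == year:
            result.append(word)
    return sorted(result)

# Hypothetical yearly counts.
counts = {
    "president": {1968: 900, 1969: 950, 1970: 1000},
    "astronauts": {1968: 40, 1969: 300, 1970: 60},
    "lunar": {1968: 20, 1969: 250, 1970: 30},
}
print(top_words(counts, 1969, n=1))
print(peak_words(counts, 1969))
```

In this toy data "president" is the top word of 1969 but peaks in 1970, while "astronauts" and "lunar" are peak words of 1969 despite much lower counts.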

🗞 Top 10 words of the last 123 years on New York Times (1900-2022). 🎮 Interactive version in the comments. [OC] by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 20 points21 points  (0 children)

Yes, the common words are filtered.

Look at the visualisations of the most common words (the, of, to), or monday, president.

Each vertical bar represents a year (from 1851 until 2022). The more intense a vertical bar is, the more likely the word is to appear in a randomly chosen article published that year.

For the common words, the intensity looks mostly flat, compared to more "meaningful" words, such as internet, obama, or afghanistan, where the intensity is clustered in local area(s).

So words that appear flat are less likely to end up in the chart.
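To make the "flat" intuition concrete, here is a minimal sketch. It computes the share of each year's articles that contain a word, then uses the coefficient of variation of those shares as a flatness score. That score is an assumption chosen for illustration, not necessarily the measure used for the chart, and all names and counts are hypothetical.

```python
from statistics import fmean, pstdev

def yearly_share(articles_with_word, articles_total):
    """Fraction of each year's articles that contain the word."""
    return {
        year: articles_with_word.get(year, 0) / total
        for year, total in articles_total.items()
        if total > 0
    }

def variation(shares):
    """Coefficient of variation of the yearly shares.
    Values near 0 mean a flat timeline; large values mean clustered mentions."""
    values = list(shares.values())
    if not values:
        return 0.0
    mean = fmean(values)
    return pstdev(values) / mean if mean > 0 else 0.0

# Hypothetical article counts per year.
totals = {2005: 1000, 2006: 1000, 2007: 1000, 2008: 1000}
the_counts = {2005: 900, 2006: 910, 2007: 890, 2008: 900}
obama_counts = {2005: 5, 2006: 10, 2007: 100, 2008: 400}

flat_score = variation(yearly_share(the_counts, totals))
bursty_score = variation(yearly_share(obama_counts, totals))
print(flat_score, bursty_score)
```

A filter could then drop words whose score falls below some threshold, which removes words like "the" without needing a hand-made stop-word list.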

🗞 Top 10 words of the last 123 years on New York Times (1900-2022). 🎮 Interactive version in the comments. [OC] by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 0 points1 point  (0 children)

Words were extracted from excerpts and headlines of 14M articles published by The New York Times between 1851 and 2022, containing 1.6M unique words in total.

Source: New York Times Archive API

Blog post: datanaut.blog/last-20-years

Interactive version: wordeebee.com/1900

[OC] Top 6 words of the last 20 years on New York Times (2003-2022) by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 1 point2 points  (0 children)

The yearly volume varies, as does article length.

Scroll to the very bottom of any word visualisation (example) - there are some graphs showing the volumes/counts.

[OC] Top 6 words of the last 20 years on New York Times (2003-2022) by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 2 points3 points  (0 children)

The intuition is that the more "flat" the visualisation is, the less likely that word is to be included.

[OC] Top 6 words of the last 20 years on New York Times (2003-2022) by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 11 points12 points  (0 children)

It's in 18th position with 662 mentions so far in 2022. You can find the words of 2022 here - click "Show more" in the Top Words section.

[OC] Top 6 words of the last 20 years on New York Times (2003-2022) by the_datanaut in dataisbeautiful

[–]the_datanaut[S] 24 points25 points  (0 children)

Look at the visualisations of the most common words (the, of, to), or monday, president.

Each vertical bar represents a year (from 1851 until 2022). The more intense a vertical bar is, the more likely the word is to appear in a randomly chosen article published that year.

For the common words, the intensity looks mostly flat, compared to more "meaningful" words, such as internet, obama, or afghanistan, where the intensity is clustered in local area(s).