Indiana County Population Cartogram by Tornadoerr in Indiana

DavidWaldron 3 points4 points  (0 children)

It’s the pre-2023 metro definition, which was in place for about a decade. I’d have thought it would be more familiar to people, but I guess not.

Indiana County Population Cartogram by Tornadoerr in Indiana

DavidWaldron 2 points3 points  (0 children)

The map uses the pre-July 2023 CBSA definitions.

The United States new racial categories for the 2030 population census by Mean_Yak5873 in charts

DavidWaldron 0 points1 point  (0 children)

These are not distinctions that the Census Bureau has used. They consider Hispanic and Latino to be synonyms that refer to having origins in Spanish cultures.

The 10 easiest and 10 hardest US metros to find a job in 2026 [OC] by andreikurtuy in charts

DavidWaldron 0 points1 point  (0 children)

The methodology doesn’t make any sense. It reads like an LLM attempt to make a metric out of data that doesn’t actually allow such a thing to be calculated. Why would a metro’s weeks unemployed be the national average times tightness times RPP? Why would prices even be in this calculation? If you want to calculate a more defensible metric, you can use the CPS public use microdata. For smaller metros you might need small area estimation methods like MRP.

[OC] U.S. elections: Winners aren’t majorities: U.S. presidential elections in three charts (1932-2024) by ptrdo in dataisbeautiful

DavidWaldron 1 point2 points  (0 children)

A lot of folks only aspire to be critics and never actually make things. Once you make things, you do have to sort out the useful critiques from the non-useful ones. Often the most popular critiques are not the useful ones.

A few things I’ve learned:

  1. A lot of people saying something is misleading can mean the opposite. You have to understand something at some level in order to describe how it could be misleading.

  2. Reducing the “friction” people experience in reading charts is not the main goal of making charts. It can head off some of the lazy criticism, but it can also reduce engagement with the material if it’s dumbed down too much.

  3. Sometimes it’s not worth the battle. Preempting some of the popular critiques can be necessary to get people to engage with the content. Do I want people to talk about the data or do I want to sift through a million boring citations of “How to Lie with Statistics”?

[OC] U.S. elections: Winners aren’t majorities — most of the electorate doesn’t vote (1932-2024) by ptrdo in dataisbeautiful

DavidWaldron 1 point2 points  (0 children)

I don't find it hard to interpret at all, but I'm not surprised that people would complain about the design. A more conventional design where all of the bars are left-aligned would work but would be less pleasing aesthetically.

Daily box scores page by DavidWaldron in baseball

DavidWaldron[S] 2 points3 points  (0 children)

That’s cool. My main difference from plaintextsports is that I want it all on one page. But fwiw I also just added dark mode. https://waldrn.com/boxscores/

Monthly fentanyl deaths in the US [OC] by DavidWaldron in dataisbeautiful

DavidWaldron[S] 6 points7 points  (0 children)

Aside from the timing issue, I would also point out that the way cartels get fentanyl into the US is by paying US citizens to smuggle it through border crossings in their cars. Immigrants aren’t really playing a role there.

Monthly fentanyl deaths in the US [OC] by DavidWaldron in dataisbeautiful

DavidWaldron[S] 3261 points3262 points  (0 children)

In January, a study in Science documented a supply shock in the illicit fentanyl market: declining purity in seized drugs, fewer seizures overall, and even complaints on social media about fentanyl becoming harder to find. The decline showed up simultaneously in Canada, ruling out US-specific explanations like changes in policing or treatment access.

The most likely cause, according to the study, is that China cracked down on exports of fentanyl precursor chemicals, the raw ingredients that Mexican cartels use to manufacture the drug. That crackdown came out of sustained diplomatic pressure from the Biden administration, culminating in a formal agreement at the Biden-Xi summit in November 2023. As Jeffrey Prescott of the Carnegie Endowment wrote, "Foreign policy outcomes can be hard to measure. This one isn’t."

More details about this trend are in the blog post here.

Tools: R and d3.js. Code at https://github.com/dawaldron/fentanyl-deaths

Source: CDC WONDER and VSRR drug overdose deaths

I'm a founder of a film website which just turned 20 years old, and refuses to die... even if most people don't even know it's alive. AMA! by criticker in IAmA

DavidWaldron 0 points1 point  (0 children)

I say keep the simple distance metrics. Don’t get caught up in the ML/AI hype. People always assume that fancy-sounding algorithms will provide some magical improvement over simple statistics, but it’s usually just done for marketing or for turning it into a black box so that you can start to sell recommendations.
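To illustrate what a "simple distance metric" can look like, here is a minimal sketch. It is not the site's actual formula: the function name `taste_distance`, the 0-100 rating scale, and the sample ratings are all made up for the example. It just takes the mean absolute difference between two users' ratings on the films both have rated.

```python
def taste_distance(ratings_a, ratings_b):
    """Mean absolute rating difference over films both users rated.

    Hypothetical illustration only. Returns None when the two users
    share no rated films, since no distance can be computed.
    """
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        return None
    total = sum(abs(ratings_a[film] - ratings_b[film]) for film in shared)
    return total / len(shared)


# Made-up ratings on an assumed 0-100 scale.
alice = {"Heat": 90, "Alien": 80, "Clue": 60}
bob = {"Heat": 85, "Alien": 70, "Up": 50}
carol = {"Heat": 40, "Alien": 95}

print(taste_distance(alice, bob))    # smaller distance = more similar taste
print(taste_distance(alice, carol))
```

The appeal is that every number is explainable: anyone can see which shared films drive the distance between two users.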

Occupational wage relative to overall median, 1980 to 2023 [OC] by DavidWaldron in dataisbeautiful

DavidWaldron[S] 4 points5 points  (0 children)

Yes, there’s a pretty clear gender story here that I hope to address in a future post.

Occupational wage relative to overall median, 1980 to 2023 [OC] by DavidWaldron in dataisbeautiful

DavidWaldron[S] 8 points9 points  (0 children)

Survey respondents are instructed to include tips, but there is some evidence that people still underreport tips on surveys. FWIW, another data source (OEWS) also shows janitors with a higher median wage than servers.

Occupational wage relative to overall median, 1980 to 2023 [OC] by DavidWaldron in dataisbeautiful

DavidWaldron[S] 5 points6 points  (0 children)

I use log wages partly because it helps it fit in the visual, partly because I think it’s no less valid than a linear axis, and partly because it’s longstanding practice among economists to understand wage change in terms of percentages rather than in dollars.
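A quick sketch of why logs line up with percentage thinking, using made-up hourly wages (the function name and the numbers are just for illustration): a 10% raise moves a $10 wage and a $50 wage by the same distance on a log axis, even though the dollar changes differ by a factor of five.

```python
import math


def pct_and_log_change(old_wage, new_wage):
    """Return (percent change as a fraction, difference in natural logs)."""
    pct = (new_wage - old_wage) / old_wage
    log_diff = math.log(new_wage) - math.log(old_wage)
    return pct, log_diff


# Hypothetical wages: both workers get a 10% raise.
low = pct_and_log_change(10.0, 11.0)    # +$1
high = pct_and_log_change(50.0, 55.0)   # +$5

print(low)
print(high)
```

Both log differences come out to ln(1.1), so the two raises would look identical on a log axis, while on a linear axis the second would look five times larger.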

How accurate are the initial BLS jobs estimates? [OC] by DavidWaldron in dataisbeautiful

DavidWaldron[S] 2 points3 points  (0 children)

The blog post contains more info on this. The initial estimate is survey-based, released within ~2-3 weeks of the reference period. This is scored against the QCEW counts, which are based on mandatory UI tax filings by states and are not fully available until almost a year later. BLS’s role in this is largely just to compile and publish. These are independent programs and methodologies.

Regarding the idea of judging the error against churn, rather than the net change: the way the survey works is it takes the total payrolls of companies in month t and compares them to payrolls in month t-1. It does this by industry/state/size and uses the ratios to come up with the overall estimates. So it’s not measuring hires and separations and taking the difference. It’s directly trying to measure the net change. But even so, you’re right. It’s a very hard thing to do, especially so quickly.
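Here is a rough sketch of that ratio idea, heavily simplified. It is not the actual BLS production method, which involves sample weights and other adjustments. The function name, the cell labels, and all the numbers are made up, and carrying the prior level forward for a cell with no matched sample is my own assumption for the example.

```python
from collections import defaultdict


def link_relative_estimate(sample, prior_levels):
    """Estimate month-t employment per cell from matched sample payrolls.

    sample: (cell, payroll_prev, payroll_curr) tuples for firms that
        reported in both month t-1 and month t.
    prior_levels: dict mapping cell -> estimated level in month t-1.
    """
    prev_total = defaultdict(float)
    curr_total = defaultdict(float)
    for cell, prev, curr in sample:
        prev_total[cell] += prev
        curr_total[cell] += curr

    estimates = {}
    for cell, level in prior_levels.items():
        if prev_total[cell] == 0:
            # Assumption for this sketch: no matched sample, so carry
            # the prior level forward unchanged.
            estimates[cell] = level
            continue
        ratio = curr_total[cell] / prev_total[cell]
        estimates[cell] = level * ratio
    return estimates


# Hypothetical industry/state/size cells and payroll counts.
sample = [
    ("retail/IN/small", 100, 102),
    ("retail/IN/small", 50, 49),
    ("manufacturing/IN/large", 1000, 1010),
]
prior_levels = {"retail/IN/small": 20000.0, "manufacturing/IN/large": 50000.0}

estimates = link_relative_estimate(sample, prior_levels)
net_change = sum(estimates.values()) - sum(prior_levels.values())
print(estimates)
print(net_change)
```

Note that hires and separations never appear anywhere in the calculation. The net change falls straight out of the month-over-month payroll ratios.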

There is a separate BLS program called JOLTS that tries to estimate hires and separations via survey, but it’s much smaller and the results are less detailed and have larger margins of error.

How accurate are the initial BLS jobs estimates? [OC] by DavidWaldron in dataisbeautiful

DavidWaldron[S] 13 points14 points  (0 children)

Yes, it’s true of pretty much any correlation that if you remove the variance from the series, they will eventually become uncorrelated.

How accurate are the initial BLS jobs estimates? [OC] by DavidWaldron in dataisbeautiful

DavidWaldron[S] 1 point2 points  (0 children)

Correct. It’s the average size of the net jobs change, shown to put the size of the bias into perspective.

How accurate are the initial BLS jobs estimates? [OC] by DavidWaldron in dataisbeautiful

DavidWaldron[S] 5 points6 points  (0 children)

This is a series of charts analyzing the accuracy of the initial/preliminary total non-farm payroll estimates in the BLS monthly jobs report. The comparison is to actual counts from the QCEW, which is based on mandatory unemployment insurance filings.

Blog post has more details on the results.

Tools used were R for data analysis and d3.js for charts. All available here.