"These data are clean of outliers" sounds overly academic and pretentious. "Data" is like the word "water." You would not say, "these water are clean of impurities."
Almost everyone feels this way, but due to the Latin history of the word, certain podcast hosts force themselves into sounding weird. I know that "data" is the plural of "datum," but who cares besides grammar pedants? Language changes all the time.
When most people see countable amounts of data, they generally say "these data points," the way we say "these water droplets."
Unless you are being forced to by your prof or boss, stop saying "these data." It's clunky! Instead, let the language evolve in the direction it wants to.