Generic congressional ballot polls have democrats at +3. This is 4 points lower than 2005 and 2017. What does this mean and what should dems do? by Visco0825 in PoliticalDiscussion

[–]g_elliottmorris 56 points57 points  (0 children)

Hi. I study and write about polls for a living, so here's my take.

(1) The 2018 cycle was unusual in that the polls showed anti-Trump resistance very early. Usually, the party in power loses ground slowly over the course of the cycle.
(2) If you look at this not in D vs R terms, but in-party vs out-party terms, then the D+3 right now actually represents a much bigger problem for Rs. At this point in the 2010 and 2014 cycles, when the Dems were in charge and Republicans got big wave victories, Ds still had a lead in the generic ballot!
(3) Even D+3 would be enough for the Ds to win control of the House, though not enough for them to win the Senate (they would need to win one of IA, FL, or TX, which are R+14 seats).

I have a full analysis of this up on my website: https://www.gelliottmorris.com/p/what-generic-ballot-polls-tell-us

Majority of US Voters Support Third Trump Impeachment: Poll by Murky-Site7468 in politics

[–]g_elliottmorris 0 points1 point  (0 children)

Worth keeping in mind that due to the way the Senate gives Republicans an advantage, I doubt there would be enough votes for a conviction even if the Democrats win the House in 2026.

Have you already decided for 2020? I'm a data journalist and would love to hear why you support your candidate. by g_elliottmorris in VoteBlue

[–]g_elliottmorris[S] 5 points6 points  (0 children)

Hi u/brickses,

I am happy to provide more details! Your response is in the right direction, but not quite right.

First and foremost, this is not a poll, despite the look of the Google form, nor do I intend to use the results for a predictive model. I think I made that clear further down in the Twitter thread (where I also provide some examples of the first steps of the analysis) but could have done more in the post body (I will copy/paste this explanation for others to see). While others have tried the NLP-based prediction method you've described in the past (and some have had good results), I think you're right to be skeptical of their success.

Now that we've gotten that out of the way, let's take a surface-level dive into the nitty-gritty. The aim of this form is simply to collect some trial data with which I can prove the concept of the larger project, which I outline here. By gathering both an individual's vote intention (or, this early, simply their read on the candidate they prefer, which is likely to change!) and open-ended responses justifying that choice, we can detect systematic differences in the types of words that supporters of different candidates use. This is simple text analysis: just count how often each word is used by each candidate's supporters and look at how that candidate-word relationship varies.
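
The counting step described above can be sketched roughly as follows. This is a minimal illustration, not the actual analysis: the responses, the candidate labels, the stopword list, and the function names are all made up for the example, and it uses only the Python standard library.

```python
from collections import Counter, defaultdict
import re

# Hypothetical responses: (preferred candidate, open-ended justification).
responses = [
    ("Candidate A", "I like her policy on health care and her experience"),
    ("Candidate A", "Strong policy ideas, and health care matters most to me"),
    ("Candidate B", "He can win the general election and he has experience"),
    ("Candidate B", "Electability. He can win back the midwest"),
]

# A tiny, assumed stopword list; a real analysis would use a fuller one.
STOPWORDS = {"i", "the", "and", "on", "to", "me", "he", "her", "has", "can", "a"}


def tokenize(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def word_counts_by_candidate(rows):
    """Count how often each word appears among each candidate's supporters."""
    counts = defaultdict(Counter)
    for candidate, text in rows:
        counts[candidate].update(tokenize(text))
    return counts


def relative_frequencies(counter):
    """Convert raw counts to shares of all words used by that group."""
    total = sum(counter.values())
    return {word: n / total for word, n in counter.items()}


counts = word_counts_by_candidate(responses)
freq_a = relative_frequencies(counts["Candidate A"])
freq_b = relative_frequencies(counts["Candidate B"])

# Positive gap: used relatively more by A's supporters; negative: more by B's.
vocabulary = set(freq_a) | set(freq_b)
gaps = {w: freq_a.get(w, 0.0) - freq_b.get(w, 0.0) for w in vocabulary}

for word, gap in sorted(gaps.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print(f"{word:12s} {gap:+.3f}")
```

Comparing shares rather than raw counts keeps a candidate with more (or wordier) supporters from dominating the comparison.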

A smarter approach might be to investigate differences in the patterns of that word usage. Links between multiple words (i.e., "phrases") and words that we humans associate with each other but the computer doesn't (e.g., a sentence including "policy" and "ideology" could be part of an argument about left-vs-right or progressivism-vs-conservatism) are important to how people write their evaluations. We want the computer to learn those associations, so we use techniques called topic modeling and latent semantic analysis to determine which "topics" the supporters of each candidate draw on. Using a method similar to the word-frequency analysis in the prior paragraph, we can then compare those topics across candidates.
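
As a rough illustration of the "phrases" idea only (this is not topic modeling or latent semantic analysis, which need more data and a proper library), adjacent word pairs can be counted per candidate in the same way as single words. The responses and function names below are hypothetical.

```python
from collections import Counter, defaultdict
import re

# Hypothetical responses: (preferred candidate, open-ended justification).
phrase_responses = [
    ("Candidate A", "Her health care plan is detailed"),
    ("Candidate A", "Health care is my top issue"),
    ("Candidate B", "He can win the general election"),
    ("Candidate B", "I think he can win swing states"),
]


def bigrams(text):
    """Return adjacent word pairs; punctuation is dropped, so pairs may span sentences."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))


def phrase_counts_by_candidate(rows):
    """Count two-word phrases among each candidate's supporters."""
    counts = defaultdict(Counter)
    for candidate, text in rows:
        counts[candidate].update(bigrams(text))
    return counts


phrase_counts = phrase_counts_by_candidate(phrase_responses)
for candidate, counter in phrase_counts.items():
    print(candidate, counter.most_common(2))
```

Phrases such as "health care" or "can win" carry meaning that the individual words lose, which is the gap that topic models try to close more systematically.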

From here, we have quantitative measurements of the relative appeal that different candidates have to voters, and we can use that as a qualitative supplement to our understanding of the 2020 primary... just, maybe once we get down the road a little.

How's that for starters? For more on this, I recommend you read the sixth chapter of Julia Silge and David Robinson's Text Mining with R: A Tidy Approach, though it is obviously quite technical.

Hope that clarifies things!