Many people were scratching their heads Wednesday morning (or late Tuesday night), when we realized that South Carolina Democrats had nominated a literal "Some Dude" - an unemployed veteran living with his father - who somehow managed to front the $10,000 needed to run.
South Carolina State Senator Robert Ford weighed in on the matter later on Wednesday, remarking:
"No white folks have an 'e' on the end of Green. The blacks after they left the plantation couldn't spell, and they threw an 'e' on the end."
(If you're wondering about the title, Gadsby is a 260-page novel that contains no instances of the letter E.)
Both Greene and Vic Rawl were relative unknowns, so we'll assume voters had no knowledge of either candidate. Given the campaigning by both candidates (or lack thereof), I think this is a relatively tenable assumption.
So, let's start at the county level - what's the relationship between the percent of non-white registered voters and the percentage Greene received?
Here are two maps, with the non-white voter percentage on the left and Greene's percentage on the right.
Is there a relationship? Maybe - hard to tell. Tom Schaller goes into this in more depth than I do.
However, thanks to the relatively good South Carolina State Election Commission website, we can go further to the precinct level. The geographic data for mapping precincts simply isn't available, but we can still look at the numbers. (Sidenote: Absentees and provisionals can't be attributed to a specific precinct and are tossed from here on out.)
Here's a scatterplot of the non-white RV percentage against the percentage that Greene received on Tuesday, with a simple regression line through it. Below that is the Stata output from a simple regression taking the non-white RV% as the independent variable.
The regression tells us two things:
- For every 1-point increase in the non-white percentage of RVs, Greene's percentage can be expected to increase by 0.22 points.
- For a hypothetical precinct with 100% white RVs, Greene's expected percentage should be (!!) 51.6%.
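To make those two bullets concrete, here is a small Python sketch that plugs the rounded coefficients reported above (intercept 51.6, slope 0.22) into the fitted line. The function name is made up for illustration, and the result is only a point prediction, not a claim about any real precinct.

```python
# Rounded coefficients as reported in the post's percentage regression.
INTERCEPT = 51.6  # expected Greene % where 0% of RVs are non-white
SLOPE = 0.22      # points of Greene share per point of non-white RV share


def predicted_greene_pct(nonwhite_rv_pct: float) -> float:
    """Point prediction from the fitted line; input and output are on a 0-100 scale."""
    if not 0.0 <= nonwhite_rv_pct <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    return INTERCEPT + SLOPE * nonwhite_rv_pct


if __name__ == "__main__":
    for pct in (0, 50, 100):
        print(f"{pct:>3}% non-white RVs -> {predicted_greene_pct(pct):.1f}% for Greene")
```

Even at the far end of the line (100% non-white RVs), the prediction is only 22 points higher than at the near end.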
But is the relationship there? Hard to say - it is statistically significant, but the R-squared is a measly 0.1425, meaning the other 85.75% of variance in Greene's percentage is explained by something else.
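For anyone who wants to see where an R-squared comes from, here is a minimal pure-Python sketch of a one-variable least-squares fit. The post's numbers came from Stata, not from this code, and the toy data below is invented purely so the function has something to run on.

```python
def simple_ols(xs, ys):
    """Fit y = intercept + slope * x by least squares; return (slope, intercept, r_squared)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_res / syy


if __name__ == "__main__":
    # Invented toy data, NOT the South Carolina precinct numbers.
    toy_nonwhite_pct = [5, 20, 35, 50, 65, 80]
    toy_greene_pct = [48, 60, 51, 66, 58, 71]
    slope, intercept, r2 = simple_ols(toy_nonwhite_pct, toy_greene_pct)
    print(f"slope={slope:.3f} intercept={intercept:.1f} R^2={r2:.3f}")
    # The arithmetic behind the figure quoted in the post:
    print(f"share of variance not explained: {1 - 0.1425:.4f}")
```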
Statistics disclaimer: Go ahead and skewer me for using a linear regression. (What else was I going to do?) I know the standard errors here are going to be far from trustworthy, and the estimators far from efficient - that's a picture-perfect example of heteroskedasticity if I've ever seen one...
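One common response to this kind of heteroskedasticity, and not something this post actually did, is weighted least squares: weight each precinct by the number of votes cast so that tiny, noisy precincts count for less. Here is a rough pure-Python sketch. The function name, the choice of weights, and the toy numbers are all assumptions for illustration.

```python
def weighted_line_fit(xs, ys, weights):
    """Weighted least-squares line; returns (slope, intercept)."""
    if not (len(xs) == len(ys) == len(weights)) or len(xs) < 2:
        raise ValueError("need three equal-length sequences with at least 2 points")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total_w = sum(weights)
    if total_w <= 0:
        raise ValueError("weights must not all be zero")
    # Weighted means replace the ordinary means of plain OLS.
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total_w
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total_w
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    if sxx == 0:
        raise ValueError("x must vary among the weighted points")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented toy precincts, NOT real data. The last one has only 2 votes cast
# and a 100% Greene share, like the dots along the top edge of the scatterplot.
toy_nonwhite_pct = [10, 30, 50, 70, 90]
toy_greene_pct = [50, 62, 58, 70, 100]
toy_votes_cast = [400, 350, 500, 300, 2]

unweighted = weighted_line_fit(toy_nonwhite_pct, toy_greene_pct, [1] * 5)
weighted = weighted_line_fit(toy_nonwhite_pct, toy_greene_pct, toy_votes_cast)

if __name__ == "__main__":
    print(f"unweighted slope: {unweighted[0]:.3f}")
    print(f"weighted slope:   {weighted[0]:.3f}")
```

In this toy example the two-vote precinct pulls the unweighted line up sharply, while the weighted fit mostly ignores it.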
I'm hesitant to rely solely on percentages though - there were plenty of precincts with few RVs and where few votes were cast (as you can tell by the 100% Greene precincts floating along the top edge). We can also consider this in terms of numbers: the number of non-white RVs and the number of votes for Greene in a given precinct.
Now, the regression tells a few things again:
- For every additional non-white RV, Greene's vote count can be expected to go up 0.09. (Keep this in context of 24% voter turnout between both primaries!) This effect, again, is statistically significant, and very much so.
- For a hypothetical precinct with no non-white RVs, Greene would be expected to receive 7.8 votes.
- 62.6% of variance in Greene's vote totals by precinct can be explained by the number of non-white RVs.
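The count-based line can be read the same way. Here is a short Python sketch using the rounded coefficients reported above (7.8 votes at zero non-white RVs, plus 0.09 votes per additional non-white RV). The function name and the example precinct sizes are hypothetical.

```python
# Rounded coefficients as reported in the post's count regression.
BASE_VOTES = 7.8            # expected Greene votes with no non-white RVs
VOTES_PER_NONWHITE_RV = 0.09


def predicted_greene_votes(nonwhite_rvs: int) -> float:
    """Point prediction of Greene's vote count in a precinct."""
    if nonwhite_rvs < 0:
        raise ValueError("count of registered voters cannot be negative")
    return BASE_VOTES + VOTES_PER_NONWHITE_RV * nonwhite_rvs


if __name__ == "__main__":
    # Hypothetical precinct sizes, chosen only to show the scale of the effect.
    for rvs in (0, 500, 1000):
        print(f"{rvs:>5} non-white RVs -> about {predicted_greene_votes(rvs):.1f} Greene votes")
```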
So again, is the relationship there? I think the second method presents a stronger case for the "E"-phenomenon than the first. But that said, is this instance of identity politics any more extraordinary than other instances? Does this have to do with voters having very little information (paging Scott Lee Cohen)? The second analysis, I might add, is also confounded in part by varying turnout across precincts...
Robert Ford may be on to something, but it's all hard to say. (Lastly - if you haven't realized the difficulty in writing with no Es, this post, excluding Stata outputs, contains 438 of them.)