Introduction

About Exploratory Analysis

Data analysis is an iterative process. A tab deck such as this is typically the first real step in the process, after (and often during) cleaning. The goal of an exploratory data analysis is not to come up with conclusions, but rather to come up with more questions. In this stage we don’t set out to “prove” anything; we study the data and use it for hypothesis generation. This report doesn’t formally constitute an analysis. These are the raw tabs we would use as reference while continuing the analysis and digging deeper. EDA gives you the “What” in data analysis: it establishes facts. But it does not answer the all-important question of “Why?” After looking through this report, you should be asking yourself “Why?” over and over. You will use those questions to generate hypotheses to explain what you see in these tabs. It’s the job of a further, more refined analysis to answer those questions.

This document is not about making value judgements[1]; it is about establishing facts and asking questions. Even though the subject matter is highly political, as analysts we must approach the work with a scientific mindset. Our personal political views have no bearing on our approach to this analysis. As we review the tabs, we make observations about the findings, but only interpret and speculate to the extent that we can raise questions to explore further.

About ANES

The American National Election Studies is a research survey conducted since 1948 to study public opinion and voting behavior in US presidential elections. As a joint project between Stanford and the University of Michigan, funded by the National Science Foundation, the ANES is the most comprehensive and methodologically rigorous survey of political opinion available today. In stark contrast to the slew of polls one encounters daily in various news media, the ANES is designed and conducted to extremely high standards and with a budget that allows for the collection of high-quality data. The skepticism one may justifiably hold regarding the polling industry is simply not applicable to the ANES.

Although a full critique of the polling industry is rather tangential to the goals of this analysis, a brief outline of the methodological concerns involved is necessary in order to quickly defuse the argument that ANES data can be dismissed due to an association with polls (“all polls are wrong, therefore the ANES data is wrong”). Most public opinion surveys operate with severe constraints of both time and budget that limit their validity. To fulfill their primary objective of providing timely survey data for immediate publication, polls predominantly rely on telephone surveys conducted with Random Digit Dialing (RDD). Historically, this has been the closest way of cheaply approximating a probability sample. However, RDD response rates have plummeted to the low single digits[2] and are continually declining, while the proportion of the population reachable using this method is also declining. With a budget of only a few dollars per completed interview, it is simply not possible to provide a quality representative sampling of the US population.

This is not a blanket condemnation of an entire industry. I have worked with many people in the polling industry, as well as its sibling industry of market research in which I personally spent many years, and it is full of smart people artfully solving complex problems. The quality of polling varies greatly in this market and there are certainly many respectable research firms that invest in the rigor and science of their survey methodology. Ultimately, however, I would argue that the time and budget constraints imposed by the market are incongruent with the goal of a representative sample.

The dismal performance of the election polls of 2016, and sadly again in 2020, is not surprising given these constraints. Despite the fact that a lot of good work is being done to improve the predictive utility of the polls, notably among aggregators like FiveThirtyEight, the GIGO (garbage in, garbage out) principle retains its rule, even at scale. An aggregate of a hundred polls, each conducted hastily with a meager budget and a shaky methodology, will still suffer from the same stench as if there were only one. This is not a problem that can be solved by developing fancier models using multilevel regression and poststratification (MRP).
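
To make the aggregation argument concrete, here is a minimal simulation sketch. Every number in it is hypothetical (the true share, the size of the shared bias, the number of polls and their sample sizes); none of it comes from real polls or from the ANES. It only illustrates that averaging many polls shrinks random noise but leaves a bias that all of them share untouched.

```r
# Hypothetical numbers, for illustration only (not real poll or ANES data)
set.seed(42)
true_share  <- 0.50   # assumed true support for a candidate
shared_bias <- 0.03   # assumed bias common to every poll
n_polls     <- 100
n_per_poll  <- 800

# Each poll draws respondents from a population that looks like
# true_share + shared_bias because of the flaw they all share
poll_estimates <- rbinom(n_polls, size = n_per_poll,
                         prob = true_share + shared_bias) / n_per_poll

single_poll_sd  <- sd(poll_estimates)          # noise in any one poll
aggregate_est   <- mean(poll_estimates)        # the poll average
remaining_error <- aggregate_est - true_share  # error left after averaging

round(c(single_poll_sd  = single_poll_sd,
        aggregate       = aggregate_est,
        remaining_error = remaining_error), 4)
```

Under these assumptions the remaining error of the average stays close to the shared bias, no matter how many polls are added.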

While the media organizations reporting these polls would certainly like for them to be accurate, they have little incentive to pay for a proper survey[3]. There is no argument to be made that viewership or readership would be increased by investing in an expensive methodology when all their competitors are using the same cheap polls. There is no downside to using the cheap polls because everyone is doing it and, after all, this is all guesswork, right? The media will never do retrospectives on why their polls were wrong because by the time this comes to light the news cycle is on to something completely different. The unfortunate consequence of this race to the bottom is that now nobody trusts survey research, even when it’s done right.

My point is that the ANES is not subject to these flaws. With a multimillion-dollar annual budget, a trained staff, principal investigators who are leaders in their fields, constant refinement of their survey methodology and questionnaire design, and an organizational history dating back to 1948, the ANES is not in the same league as public opinion polls. I emphasize this because the distrust in polls could be used as a shield to level immediate criticism at the ANES data by anyone who may find results that disagree with their ideological bent. They could appeal to the now established distrust of polls to cast doubt on anyone who uses ANES data in their research, again: “all polls are wrong, therefore the ANES data is wrong”. This prologue is simply a preemptive warning that such an argument will not fly here. You do not get to dismiss these data as “fake news” just because you find something you dislike.

The ANES is not perfect (no survey can be), but its design is more than sufficient to buttress it against these dismissive criticisms of traditional public opinion polls. One hypothesis for the pronounced bias of political polls in both 2016 and 2020 is partisan nonresponse: essentially, that nonresponse is highly correlated along partisan lines. With a 1% response rate on an opt-in population, correlated nonresponse is clearly a massive problem. I would not expect to generalize from a sample given such conditions. Now, to what extent does this affect the ANES, with a 40% response rate drawn from a probability sample with nearly full coverage of the population? Of course it can’t be zero, but it is highly unlikely that there is a segment of the population with unique political views that could not be captured by this survey.
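
As a rough illustration of how partisan nonresponse distorts a sample, the sketch below works through the arithmetic with made-up response rates. The population split and both response rates are assumptions chosen only to show the mechanism; they are not estimates for any real poll or for the ANES.

```r
# Hypothetical figures chosen only to illustrate the mechanism
pop_share <- c(D = 0.50, R = 0.50)     # assumed true split of the population
resp_rate <- c(D = 0.012, R = 0.008)   # assumed response rates, averaging 1%

# Share of each group among the people who actually respond
respondents    <- pop_share * resp_rate
observed_share <- respondents / sum(respondents)
round(observed_share, 3)
# D = 0.6, R = 0.4: an evenly split population shows up as 60/40
```

The distortion depends on the ratio of the two response rates, so a gap of a fraction of a percentage point is enough to do damage when the overall rate is around 1%.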

About this report

I was motivated to do this analysis for two reasons. First, I have been searching for a rich, clean, freely available, and substantively interesting dataset to use in developing a series of tutorials on How to Analyze Data. I will be using the ANES data in these tutorials as a way of teaching the process of data analysis. Equally important, I came to this project out of sheer desperate frustration with the state of the political climate during the 2020 election season. As with most of us, my peers and media choices largely reflect my own views, and I have been struggling with ways to escape the filter bubble. In trying to understand how US politics has evolved into its present hyper-partisan divide, I have been trying to find high-quality datasets that can shed light on these political differences. For all those times I have found it inconceivable that people on the opposite side of the political spectrum have come to the conclusions they have, I finally have a dataset that can help explain why they think as they do. I believe the ANES can greatly contribute to this understanding. It is my hope that well-meaning people on all sides of the political fence will turn to resources like the ANES to learn why others think as they do.

I am not a political scientist. My training was in sociology and I have been a data scientist for 25 years[4], so I know my way around a set of data. Approaching large complex surveys for the first time is daunting. Since this will be a common task for anyone who wishes to leverage the ANES in their own analysis, I hope this document will help reduce the laborious time required to become acquainted with the data.

There are 1,381 variables in the current version of the ANES 2020 dataset and, despite the length of this report, I am just barely scratching the surface of what there is to learn here. I hope that some of you will be able to reuse this code and apply it to your own research topics in this highly valuable dataset. While exploring the tabs, I highly encourage you to download the User Guide as well as the actual survey questionnaire for reference. This way you will have the full context for how the questions were asked and how the available response categories were worded.

My approach to each variable is to begin with an unrecoded frequency table using the appropriate weighting variable, which in this report is primarily the full-sample pre-election weight variable V200010a. I am not including unweighted tabulations because these are expected to be released with the documentation in the final release of the dataset. The primary crosstab variable I am considering in this analysis is the respondent’s expressed vote, intended vote, or preference for either the Democratic or Republican Presidential candidate, based on the variable V201075x. This crosstab is also weighted, but I remove missing values from the table (Don’t know, Refused, Inapplicable, etc.). Following this I include a further demographic breakdown by voting intent along gender, age, race, and education. The demographic breaks are hard-coded in the demotable function, which is available in the file anes_functions.R and also reproduced in the appendix. I’ve tried to build some modularity into this function and it should be fairly straightforward to change the demographic breaks or add additional ones if needed in your analysis. I include a few visualizations using ggplot where I felt it would communicate better than a table. There are still many more variables that would benefit from plotting that I have not had time to develop. I may add to this report as time permits.
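
The two tabulation steps described above (a weighted frequency of every code, then percentages with the missing codes removed) can be sketched in base R on toy data. The data frame, its column names, and the use of negative codes for missing responses are assumptions for illustration; this is not the tabtemplate1 or demotable code used in the report, which relies on dplyr’s count with a wt argument.

```r
# Toy data standing in for one survey item; values and weights are made up.
# Assumption: negative codes mark missing responses (e.g. Refused, Don't know).
toy <- data.frame(
  answer = c(1, 2, 1, -9, 2, 1, -8, 1),
  weight = c(0.8, 1.2, 1.0, 0.9, 1.1, 0.7, 1.3, 1.0)
)

# Step 1: weighted frequency of every code, missing codes included.
# A weighted count is simply the sum of the weights in each category.
all_codes <- xtabs(weight ~ answer, data = toy)

# Step 2: drop the missing codes, then turn weighted counts into shares
valid     <- subset(toy, answer > 0)
valid_tab <- xtabs(weight ~ answer, data = valid)
valid_pct <- prop.table(valid_tab)

all_codes
round(100 * valid_pct, 1)
```

Dropping the missing codes before computing percentages changes the denominator, which is why the crosstab percentages in this report do not line up directly with the unrecoded frequency tables.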

In the demographic table I report the point estimate and the 95% confidence interval in parentheses below it. The confidence intervals are calculated using srvyr’s survey_mean function which relies on the svyciprop function from the survey package. Since the ANES is not a simple random sample, variance calculations must take account of the survey design to generate accurate results. This is well documented in the ANES User Guide and Codebook in the section on “Data Analysis, Weights, and Variance Estimation”. While the documentation uses Stata in most of the discussion, the same methods are available in R using survey or srvyr. A good introduction to the topic is Survey Data Analysis with R in the Statistical Consulting pages at UCLA.

I’ve included confidence intervals to assist in gauging the statistical significance of the demographic breaks. This is especially important because in some of the segments the sample sizes do get very thin. These should be used as helpers but not as formal stat tests. If confidence intervals do not overlap, then the difference is statistically significant, but the converse is not necessarily true. A difference may be significant even if there is some overlap in the confidence intervals[5].
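
A small worked example shows why overlapping intervals do not rule out a significant difference. The proportions and sample sizes are hypothetical, and the sketch assumes simple random sampling with a normal approximation, so it ignores the design effects that the survey and srvyr packages account for in this report.

```r
# Hypothetical proportions and sample sizes; simple random sampling assumed
p1 <- 0.50; n1 <- 1000
p2 <- 0.56; n2 <- 1000
z  <- qnorm(0.975)

# Separate 95% confidence intervals for each proportion
se1 <- sqrt(p1 * (1 - p1) / n1)
se2 <- sqrt(p2 * (1 - p2) / n2)
ci1 <- p1 + c(-1, 1) * z * se1
ci2 <- p2 + c(-1, 1) * z * se2

intervals_overlap <- ci1[2] > ci2[1]

# The test of the difference uses the standard error of the difference,
# which is smaller than the sum of the two separate standard errors
se_diff <- sqrt(se1^2 + se2^2)
z_stat  <- (p2 - p1) / se_diff
p_value <- 2 * pnorm(-abs(z_stat))

intervals_overlap   # TRUE: the two intervals overlap
p_value < 0.05      # TRUE: yet the difference is significant at the 5% level
```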

This report is generated using RMarkdown with knitr, Pandoc, and the kableExtra package. I make heavy use of the tidyverse. The haven and labelled packages enable R to easily read Stata files along with metadata such as value labels. Procedures that require calculating standard errors (confidence intervals, t-tests, chi-squared tests, regression, etc.) must use either the survey or srvyr packages.

The source repository for this report is available here. All code is freely usable under the terms of the MIT license; however, the analysis is my own and I ask that you do not reproduce it without contacting me first. Please also contact me if you find any errors, obtain different results when replicating this analysis, or if you have questions about the methods I am using, or suggestions for improvement.

Lastly, if I have committed crimes of omission, it is due merely to a shortage of time, not to any conscious effort to portray the results with any particular spin. There is so much material here I cannot possibly comment on each variable. Although I may add to it over time as I develop my tutorials, if anyone wants to contribute please open a pull request and I will consider adding to the document with a proper citation. Do stick to the ethos of an EDA: no value judgements, merely a summary of the facts and the questions they raise that may benefit from further analysis and spark hypothesis generation.

Now, let the fun begin!

Environment setup

options(scipen=9)

library(tidyverse)
library(haven)
library(labelled)
library(kableExtra)
library(survey)
library(srvyr)
library(scales)

anesfile <- "~/data/nov3/ANES/anes_timeseries_2020_stata_20210324.dta"

# See appendix for complete source code
source("anes_functions.R", local = knitr::knit_global())
source("anes_recodes.R", local = knitr::knit_global())

Dataset Validation

In this section we perform a few checks on the dataset to ensure that counts match what is expected. The current version of the dataset is a preliminary release, and the ANES team is continuing to validate the data, add additional summary variables, and refine the survey weights. As a result, these values may change as additional versions of the dataset are released.

Dataset version and unweighted counts

anes <- read_dta(file = anesfile)
anes <- anes %>% all_recodes()

t1 <- anes %>% count(version)
quicktable(t1, colnames = (c("Version", "N")), title = "Dataset version")
Dataset version
Version N
ANES2020TimeSeries_20210324 8,280

Sample mode

Compare these counts to the reported counts on the ANES study page.

t1 <- anes %>% count(sample_mode)
t2 <- anes %>%
  count(sample_mode, post_complete) %>%
  pivot_wider(names_from = post_complete, values_from = n)

k1 <- quicktable(t1, colnames = c("Sample mode", "N"), align = "lc")
k2 <- quicktable(t2, colnames = c("Sample mode", names(t2)[2:ncol(t2)]), align = "lcc")
twintables(k1, k2)
Sample mode N
ANES 2016-2020 Panel 2,839
Fresh Cross-Sectional Sample 5,441
Sample mode Complete Incomplete
ANES 2016-2020 Panel 169 2,670
Fresh Cross-Sectional Sample 658 4,783

Weights and Survey Design

Here we construct the full-sample pre-election and post-election survey design objects with their weights. This report is focused on the pre-election survey, so we will use the anespre design object for any calculation involving standard errors. The construction of the survey object is slightly different when using the srvyr package. As an example, see the demotable function in the appendix.

# survey design objects
anespre   <- svydesign(id=~V200010c, strata=~V200010d, weights=~V200010a, data=anes, nest=TRUE)
anespost  <- svydesign(id=~V200010c, strata=~V200010d, weights=~V200010b, data=anes, nest=TRUE)

t1 <- anes %>% count(sample_mode, wt = V200010a)
t2 <- anes %>% count(sample_mode, wt = V200010b)

k1 <- quicktable(t1, colnames = c("Sample mode", "N"), title = "Pre-election weighted count")
k2 <- quicktable(t2, colnames = c("Sample mode", "N"), title = "Post-election weighted count")
twintables(k1, k2)
Pre-election weighted count
Sample mode N
ANES 2016-2020 Panel 2,808
Fresh Cross-Sectional Sample 5,472
Post-election weighted count
Sample mode N
ANES 2016-2020 Panel 2,665
Fresh Cross-Sectional Sample 4,788

Confirm that the survey design objects are also weighting correctly.

t1 <- as_tibble(svytable(~sample_mode, design=anespre))
t2 <- as_tibble(svytable(~sample_mode, design=anespost))

k1 <- quicktable(t1, colnames = c("Sample mode", "N"), title = "Pre-election weighted count")
k2 <- quicktable(t2, colnames = c("Sample mode", "N"), title = "Post-election weighted count")
twintables(k1, k2)
Pre-election weighted count
Sample mode N
ANES 2016-2020 Panel 2,808
Fresh Cross-Sectional Sample 5,472
Post-election weighted count
Sample mode N
ANES 2016-2020 Panel 2,665
Fresh Cross-Sectional Sample 4,788
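
As a quick arithmetic cross-check, the weighted counts can be compared with the unweighted counts reported earlier. The sketch below only re-uses numbers already printed in the tables of this report; the vector names are my own.

```r
# Counts copied from the tables in this report
unweighted_n  <- c(panel = 2839, fresh = 5441)
pre_weighted  <- c(panel = 2808, fresh = 5472)
post_weighted <- c(panel = 2665, fresh = 4788)

# In these tables the pre-election weighted counts sum to the sample size
sum(unweighted_n)    # 8280
sum(pre_weighted)    # 8280

# The post-election weighted counts sum to a smaller total
sum(post_weighted)   # 7453

# Weighting shifts the mix between the two samples slightly
round(100 * prop.table(unweighted_n), 1)
round(100 * prop.table(pre_weighted), 1)
```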

Pre-election vote/intent/preference

This constructed variable is based on a series of questions asked to determine whether the respondent will likely vote for Biden or for Trump in the 2020 election. I will use this as the primary crosstab variable throughout the remainder of this report. This will enable us in additional analysis to formulate hypotheses about the differences between likely Biden voters and likely Trump voters.

Why not skip directly to the post-election survey and use the respondent’s actual vote in the election? We could—and should—replicate all these crosstabs based on the actual post-election vote. I decided on pre-election intent for a few reasons. First, the pre-election dataset is a massive enough source of variables that we will not lack for interesting topics to study. Working with the pre-election data is also easier as a first-timer to this dataset because we don’t need to worry about the effect of post-election nonresponse. Although the pre to post completion rate is quite high, as you can confirm in the Sample mode section above, there may be some issues with correlated nonresponse and I’d rather not deal with that in this first EDA. Finally, I would also make the case that a respondent’s expressed voting intent in the pre-election will be very highly correlated with their actual vote. So, this should be a durable representation of the two sides of the 2020 electorate.

One area of interest for future work is to examine how much of the variability in voting intent can be explained purely by demographics. Note that the ANES does not provide any geographic data, not even a simple urban/suburban/rural distinction. Still, it would be interesting to see how well we could predict voting intent based solely on age, gender, ethnicity, and education.

t1 <- anes %>% count(V201075x) %>% mutate(p = n / sum(n)) %>% as_factor()
t2 <- anes %>% count(V201075x, wt = V200010a) %>% mutate(p = n / sum(n)) %>% as_factor()
t3 <- t1 %>% left_join(t2, by = "V201075x")
t3$p.x <- percent(t3$p.x, accuracy = 0.01)
t3$p.y <- percent(t3$p.y, accuracy = 0.01)

quicktable(t3,
  colnames = c("", "Unweighted N", "Unweighted %", "Weighted N", "Weighted %"),
  title = "Vote/intent/preference",
  align = "lrrrr"
)
Vote/intent/preference
Unweighted N Unweighted % Weighted N Weighted %
-Inapplicable 523 6.32% 598 7.22%
Democratic candidate selected (vote) 267 3.22% 277 3.35%
Republican candidate selected (vote) 118 1.43% 111 1.35%
Other candidate selected (vote) 9 0.11% 8 0.09%
Democratic candidate selected (intent to vote) 3,759 45.40% 3,704 44.74%
Republican candidate selected (intent to vote) 3,016 36.43% 2,937 35.48%
Other candidate selected (intent to vote) 363 4.38% 350 4.23%
Democratic candidate selected (preference) 84 1.01% 103 1.24%
Republican candidate selected (preference) 135 1.63% 185 2.24%
Other candidate selected (preference) 6 0.07% 6 0.07%

Components

Examine the relative contribution from each of the components of pre-election vote/intent/preference for both D and R. Since early voting was possible while the pre-election survey was ongoing, some of the respondents had already voted by the time of their interview. The vast majority of this variable is made up of those who haven’t yet voted but express a clear intent for whom they will vote. Another small group did not express an intent, but did express a preference for one of the candidates in a follow-up question.

t1 <- anes %>% count(preintent, vip, wt = V200010a) %>%
  filter(!is.na(preintent) & !is.na(vip)) %>%
  group_by(preintent) %>%
  mutate(p = n / sum(n)) %>%
  select(preintent, vip, n, p)

t2 <- t1 %>%
  pivot_wider(id_cols = vip, names_from = preintent, values_from = c(n, p)) %>%
  mutate(p_D = percent(p_D, accuracy = 0.1), p_R = percent(p_R, accuracy = 0.1))

# kable(t2,
#   col.names = c("Component", "D Count", "R Count", "D %", "R %"),
#   format.args = list(big.mark = ","), digits = 0, align = "lrrrr"
# ) %>%
#   kable_classic(full_width = FALSE, html_font = "Verdana")

quicktable(t2,
  colnames = c("Component", "D Count", "R Count", "D %", "R %"),
  title = "Vote/intent/preference",
  align = "lrrrr"
)
Vote/intent/preference
Component D Count R Count D % R %
Vote 277 111 6.8% 3.4%
Intent 3,704 2,937 90.7% 90.8%
Preference 103 185 2.5% 5.7%
ggplot(t1, aes(x = preintent, y = p, fill = vip)) +
  # p is already a within-group proportion, so plot it as given
  geom_col(position = "fill") +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(
    title = "Components of Pre-election Vote/intent/preference",
    subtitle = "among likely Democratic vs Republican voters",
    x = "", y = "", fill = ""
  ) +
  theme(
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5)
  )
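
The percentages in the components table can be reproduced directly from its weighted counts. This is a minimal base R sketch using the counts printed above; the matrix layout and names are my own and are not part of the report’s code.

```r
# Weighted counts copied from the components table in this report
counts <- matrix(
  c(277, 3704, 103,    # D: vote, intent, preference
    111, 2937, 185),   # R: vote, intent, preference
  nrow = 3,
  dimnames = list(component = c("Vote", "Intent", "Preference"),
                  side      = c("D", "R"))
)

# Column percentages: each component as a share of its side's total
shares <- prop.table(counts, margin = 2)
round(100 * shares, 1)
```

The rounded shares match the D % and R % columns of the table (for example 277 / 4,084 is 6.8% and 185 / 3,233 is 5.7%).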

Let’s do a chi-squared test to see if the difference in the distributions is significant. As of the preliminary release, the differences are statistically significant. Thus, we could consider limiting the analysis to the “intent” category, excluding the early voters as well as those with a preference but not an intention to vote. The “intent” category is by far the largest on both the D and R sides. At this time I don’t see this as a good enough reason to throw away cases. You should always be skeptical of any analysis that removes legitimate observations!

Given the empirical results of the 2020 election that Biden voters were more likely to have voted by mail and to have voted early, removing this group from the analysis may bias the results by excluding a dedicated segment of Biden voters. I would argue that this variable is an adequate operationalization for distinguishing between likely Biden and likely Trump voters. Still, let’s keep the subtlety involved in this variable in mind throughout the analysis. It may become necessary later to disentangle the three components, or even rely on another measure altogether, such as the actual post-election vote.

For brevity, I may refer to the groups as “Biden voters” vs “Trump voters” or even just “Ds” vs “Rs” instead of the much wordier technical definition: “eligible voters who are either registered to vote or intend to register to vote prior to the election, who have either already voted for, intend to vote for, or express a preference to vote for Biden or Trump”.

anespre <- svydesign(id=~V200010c, strata=~V200010d, weights=~V200010a, data=anes, nest=TRUE)
svychisq(~vip + preintent, design=subset(anespre, !is.na(preintent) & !is.na(vip)))
## 
##  Pearson's X^2: Rao & Scott adjustment
## 
## data:  svychisq(~vip + preintent, design = subset(anespre, !is.na(preintent) &     !is.na(vip)))
## F = 21.155, ndf = 1.97, ddf = 262.01, p-value = 0.00000000379

Likability of candidates

Now we enter the actual substantive questions in the survey that we will crosstab with our voting intent variable.

Is there anything the respondent likes about the Democratic Presidential candidate? Note that fully 17% of likely Biden voters don’t actually like anything about Biden! Wow, this is certainly one of those “Why?” moments I mentioned above. In the demographics, we see that older Ds, whites, and college graduates are more likely to like something about Biden, while there is noticeable hesitance among Non-whites and non-college graduates.

anes %>% tabtemplate1(V201106, "Like anything about the Democratic candidate")
Like anything about the Democratic candidate
Count Pct
-Refused 10 0.12%
-Don’t know 2 0.02%
Yes 3,994 48.24%
No 4,274 51.62%
Like anything about the Democratic candidate
D Voters R Voters
Yes 82.8% 12.1%
No 17.2% 87.9%
Like anything about the Democratic candidate
Yes No
Vote Intent
D 82.84% (81.06%, 84.48%) 17.16% (15.52%, 18.94%)
R 12.10% (10.62%, 13.74%) 87.90% (86.26%, 89.38%)
Gender
D - Male 83.58% (81.10%, 85.82%) 16.42% (14.18%, 18.90%)
D - Female 82.24% (79.76%, 84.48%) 17.76% (15.52%, 20.24%)
R - Male 12.84% (10.62%, 15.44%) 87.16% (84.56%, 89.38%)
R - Female 11.26% (9.30%, 13.54%) 88.74% (86.46%, 90.70%)
Age and Gender
D - Male - 18 to 34 80.92% (75.26%, 85.54%) 19.08% (14.46%, 24.74%)
D - Male - 35 to 54 79.42% (74.28%, 83.74%) 20.58% (16.26%, 25.72%)
D - Male - 55+ 90.52% (87.24%, 93.02%) 9.48% (6.98%, 12.76%)
D - Female - 18 to 34 72.90% (66.90%, 78.16%) 27.10% (21.84%, 33.10%)
D - Female - 35 to 54 80.70% (76.36%, 84.40%) 19.30% (15.60%, 23.64%)
D - Female - 55+ 89.64% (86.10%, 92.36%) 10.36% (7.64%, 13.90%)
R - Male - 18 to 34 16.12% (10.64%, 23.66%) 83.88% (76.34%, 89.36%)
R - Male - 35 to 54 12.34% (8.98%, 16.72%) 87.66% (83.28%, 91.02%)
R - Male - 55+ 12.30% (9.02%, 16.56%) 87.70% (83.44%, 90.98%)
R - Female - 18 to 34 10.78% (6.68%, 16.92%) 89.22% (83.08%, 93.32%)
R - Female - 35 to 54 12.32% (9.48%, 15.86%) 87.68% (84.14%, 90.52%)
R - Female - 55+ 9.56% (6.92%, 13.06%) 90.44% (86.94%, 93.08%)
Race
D - White 87.22% (85.24%, 88.96%) 12.78% (11.04%, 14.76%)
D - Non-white 77.26% (73.88%, 80.32%) 22.74% (19.68%, 26.12%)
R - White 11.20% (9.62%, 13.00%) 88.80% (87.00%, 90.38%)
R - Non-white 16.04% (12.28%, 20.68%) 83.96% (79.32%, 87.72%)
Education
D - College Grad 87.30% (85.00%, 89.30%) 12.70% (10.70%, 15.00%)
D - Not college grad 79.30% (76.64%, 81.72%) 20.70% (18.28%, 23.36%)
R - College Grad 16.16% (13.12%, 19.74%) 83.84% (80.26%, 86.88%)
R - Not college grad 10.56% (9.04%, 12.28%) 89.44% (87.72%, 90.96%)

Is there anything the respondent likes about the Republican Presidential candidate? We can see that likely Trump voters more often report liking something about Trump (91%) than likely Biden voters report liking something about Biden (83%). This holds across gender, with older Rs even more favorable toward Trump than younger Rs.

The proportion of likely Trump voters who find something they like in Biden (12%) is roughly the same as the proportion of likely Biden voters who like something about Trump (11%).

anes %>% tabtemplate1(V201110, "Like anything about the Republican candidate")
Like anything about the Republican candidate
Count Pct
-Refused 11 0.14%
-Don’t know 3 0.04%
Yes 3,692 44.58%
No 4,574 55.24%
Like anything about the Republican candidate
D Voters R Voters
Yes 11.4% 91.3%
No 88.6% 8.7%
Like anything about the Republican candidate
Yes No
Vote Intent
D 11.38% (9.98%, 12.94%) 88.62% (87.06%, 90.02%)
R 91.26% (89.72%, 92.58%) 8.74% (7.42%, 10.28%)
Gender
D - Male 12.98% (10.98%, 15.30%) 87.02% (84.70%, 89.02%)
D - Female 10.16% (8.58%, 11.98%) 89.84% (88.02%, 91.42%)
R - Male 91.64% (89.68%, 93.26%) 8.36% (6.74%, 10.32%)
R - Female 90.82% (88.56%, 92.66%) 9.18% (7.34%, 11.44%)
Age and Gender
D - Male - 18 to 34 13.22% (9.50%, 18.10%) 86.78% (81.90%, 90.50%)
D - Male - 35 to 54 15.40% (11.62%, 20.14%) 84.60% (79.86%, 88.38%)
D - Male - 55+ 10.88% (8.20%, 14.30%) 89.12% (85.70%, 91.80%)
D - Female - 18 to 34 11.86% (8.70%, 15.98%) 88.14% (84.02%, 91.30%)
D - Female - 35 to 54 10.08% (7.80%, 12.92%) 89.92% (87.08%, 92.20%)
D - Female - 55+ 9.10% (6.64%, 12.32%) 90.90% (87.68%, 93.36%)
R - Male - 18 to 34 85.54% (78.90%, 90.34%) 14.46% (9.66%, 21.10%)
R - Male - 35 to 54 90.94% (87.68%, 93.38%) 9.06% (6.62%, 12.32%)
R - Male - 55+ 95.22% (92.98%, 96.78%) 4.78% (3.22%, 7.02%)
R - Female - 18 to 34 82.26% (74.30%, 88.14%) 17.74% (11.86%, 25.70%)
R - Female - 35 to 54 90.82% (86.48%, 93.86%) 9.18% (6.14%, 13.52%)
R - Female - 55+ 94.94% (92.96%, 96.38%) 5.06% (3.62%, 7.04%)
Race
D - White 11.28% (9.40%, 13.50%) 88.72% (86.50%, 90.60%)
D - Non-white 11.64% (9.62%, 14.06%) 88.36% (85.94%, 90.38%)
R - White 92.30% (90.66%, 93.66%) 7.70% (6.34%, 9.34%)
R - Non-white 86.94% (81.76%, 90.82%) 13.06% (9.18%, 18.24%)
Education
D - College Grad 12.06% (10.18%, 14.20%) 87.94% (85.80%, 89.82%)
D - Not college grad 10.62% (8.86%, 12.68%) 89.38% (87.32%, 91.14%)
R - College Grad 93.20% (91.16%, 94.80%) 6.80% (5.20%, 8.84%)
R - Not college grad 90.32% (88.30%, 92.02%) 9.68% (7.98%, 11.70%)

Dislikability of candidates

Is there anything the respondent dislikes about the candidates?

20% of likely Biden voters find something they dislike about Biden, while 25% of likely Trump voters find something they dislike about Trump. I would be very curious to see how this question played out in prior elections; however, merging the 2020 dataset with the ANES cumulative data file will take quite a bit of work.

Among likely Biden voters, the younger ones, whites, and college graduates more often find something they dislike about Biden. Among likely Trump voters, it is also the younger ones and college graduates who more often find something they dislike about Trump, but Non-whites much more so than whites.

anes %>% tabtemplate1(V201108, "Dislike anything about the Democratic candidate")
Dislike anything about the Democratic candidate
Count Pct
-Refused 15 0.18%
-Don’t know 2 0.02%
Yes 3,894 47.02%
No 4,369 52.78%
Dislike anything about the Democratic candidate
D Voters R Voters
Yes 20.5% 81.8%
No 79.5% 18.2%
Dislike anything about the Democratic candidate
Yes No
Vote Intent
D 20.52% (18.60%, 22.56%) 79.48% (77.44%, 81.40%)
R 81.84% (80.06%, 83.50%) 18.16% (16.50%, 19.94%)
Gender
D - Male 23.00% (20.20%, 26.08%) 77.00% (73.92%, 79.80%)
D - Female 18.46% (16.08%, 21.08%) 81.54% (78.92%, 83.92%)
R - Male 83.80% (81.54%, 85.82%) 16.20% (14.18%, 18.46%)
R - Female 79.76% (76.88%, 82.36%) 20.24% (17.64%, 23.12%)
Age and Gender
D - Male - 18 to 34 28.06% (22.38%, 34.56%) 71.94% (65.44%, 77.62%)
D - Male - 35 to 54 25.14% (20.48%, 30.48%) 74.86% (69.52%, 79.52%)
D - Male - 55+ 17.34% (14.04%, 21.20%) 82.66% (78.80%, 85.96%)
D - Female - 18 to 34 25.98% (21.18%, 31.40%) 74.02% (68.60%, 78.82%)
D - Female - 35 to 54 19.86% (16.14%, 24.18%) 80.14% (75.82%, 83.86%)
D - Female - 55+ 12.50% (9.98%, 15.52%) 87.50% (84.48%, 90.02%)
R - Male - 18 to 34 81.98% (76.36%, 86.50%) 18.02% (13.50%, 23.64%)
R - Male - 35 to 54 83.46% (79.62%, 86.68%) 16.54% (13.32%, 20.38%)
R - Male - 55+ 85.88% (82.02%, 89.00%) 14.12% (11.00%, 17.98%)
R - Female - 18 to 34 67.80% (59.24%, 75.30%) 32.20% (24.70%, 40.76%)
R - Female - 35 to 54 76.46% (71.22%, 81.00%) 23.54% (19.00%, 28.78%)
R - Female - 55+ 87.42% (84.44%, 89.90%) 12.58% (10.10%, 15.56%)
Race
D - White 23.00% (20.60%, 25.58%) 77.00% (74.42%, 79.40%)
D - Non-white 17.54% (14.96%, 20.48%) 82.46% (79.52%, 85.04%)
R - White 83.98% (82.16%, 85.64%) 16.02% (14.36%, 17.84%)
R - Non-white 72.64% (66.84%, 77.76%) 27.36% (22.24%, 33.16%)
Education
D - College Grad 25.64% (22.90%, 28.58%) 74.36% (71.42%, 77.10%)
D - Not college grad 16.52% (14.28%, 19.04%) 83.48% (80.96%, 85.72%)
R - College Grad 86.64% (83.68%, 89.12%) 13.36% (10.88%, 16.32%)
R - Not college grad 79.82% (77.42%, 82.04%) 20.18% (17.96%, 22.58%)
anes %>% tabtemplate1(V201112, "Dislike anything about the Republican candidate")
Dislike anything about the Republican candidate
Count Pct
-Refused 17 0.20%
Yes 5,078 61.32%
No 3,185 38.46%
Dislike anything about the Republican candidate
D Voters R Voters
Yes 90.1% 25.2%
No 9.9% 74.8%
Dislike anything about the Republican candidate
Yes No
Vote Intent
D 90.06%
(88.64%, 91.32%)
9.94%
(8.68%, 11.36%)
R 25.22%
(23.34%, 27.20%)
74.78%
(72.80%, 76.66%)
Gender
D - Male 89.72%
(87.62%, 91.50%)
10.28%
(8.50%, 12.38%)
D - Female 90.30%
(88.56%, 91.80%)
9.70%
(8.20%, 11.44%)
R - Male 26.26%
(23.66%, 29.04%)
73.74%
(70.96%, 76.34%)
R - Female 24.20%
(21.50%, 27.14%)
75.80%
(72.86%, 78.50%)
Age and Gender
D - Male - 18 to 34 90.12%
(85.00%, 93.62%)
9.88%
(6.38%, 15.00%)
D - Male - 35 to 54 88.14%
(83.76%, 91.46%)
11.86%
(8.54%, 16.24%)
D - Male - 55+ 90.56%
(87.56%, 92.90%)
9.44%
(7.10%, 12.44%)