Record of American Democracy Project Data Frequently Asked Questions

The following are some frequently asked questions about the Record of American Democracy Project data. Most of this information, and much more, is in the full codebook on our documentation page.

## General Questions

### What is the Record of American Democracy?

Our Record of American Democracy (ROAD) data include election returns, socioeconomic summaries, and demographic measures of the American public at unusually low levels of geographic aggregation. The NSF-supported ROAD project covers every state in the country from 1984 through 1990 (including some off-year elections). One collection of data sets includes every election at and above State House, along with party registration and other variables, in each state for the roughly 170,000 precincts nationwide (about 60 times the number of counties). Another collection has added to these (roughly 30-40) political variables an additional 3,725 variables merged from the 1990 U.S. Census for 47,327 aggregate units (about 15 times the number of counties), each about the size of one or more cities or towns. These units completely tile the U.S. landmass. This collection also includes geographic boundary files so users can easily draw maps with these data.

### How do I cite the data or codebook?

Gary King; Bradley Palmquist; Greg Adams; Micah Altman; Kenneth Benoit; Claudine Gay; Jeffrey B. Lewis; Russ Mayer; and Eric Reinhardt. 1997. "The Record of American Democracy, 1984-1990," Harvard University, Cambridge, MA [producer], Ann Arbor, MI: ICPSR [distributor].

### Where can I get the data from this project?

Data for the project is also catalogued and available through ICPSR. (Every member institution that requests the data will receive it free of charge. Additional CDs, and CDs for non-member institutions, are available for a fee.)

### Where can I get similar data for other years or other countries?

To our knowledge, no exact counterpart to the ROAD project exists elsewhere, nor are we collecting more data.

Data for more recent years for California is available from IGS (see below). Precinct-level election data for some other years may be available for individual states -- check with the local State Data Center.

We are aware of two NSF-funded data collections that have some similarities to ROAD: the Federal Elections Project and the Southern Politics Project, both available from David Lublin's data site. For more information, we suggest that you contact the investigators of those projects directly.

More generally, a wealth of voting data is available through the IQSS Dataverse Network.

### Where can I get more detailed geographic boundary files, or boundaries for other areas?

The U.S. Bureau of the Census makes boundary files available for free or at low cost. Geolytics supplies similar coverages that are enhanced and that have been converted to a number of popular formats. Note that these sources supply boundaries for census units such as VTDs, congressional districts, counties and tracts; we are unaware of a source of precinct boundaries.

## Data, File Formats and Such

### How do I get started using the data?

Please see the quick start.

### Why won't {Stata,SPSS,SAS} read my file?

• Our original files are supplied in SPSS portable file format, which is readable by SPSS (from the File menu, or using the "import" command). SAS can also read them if you use the SPSS engine (LIBNAME libref SPSS <'filename'>).
If your software cannot read SPSS portable files, use our "subset" function to retrieve the data in another format.
• If you created a subset, check the subset options to confirm the format, since several are available.
• Some statistics programs have difficulty with large numbers of variables and/or records. Excel and Stata are known to have such problems. If your statistics program fails on large subsets, please try again using an industrial-strength statistics package like SPSS or SAS.
• You may also need to add the appropriate 3-letter extension {dta,por,xpt} to make the statistics program recognize the file.
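
As a small illustration of the last point, the sketch below appends the expected extension to a downloaded file name when it is missing. The format labels (`stata`, `spss_portable`, `sas_transport`) and the function name are hypothetical choices for this example; only the three extensions come from the list above.

```python
from pathlib import Path

# Extensions listed in the FAQ. The keys are hypothetical labels for this sketch.
EXTENSIONS = {
    "stata": ".dta",          # Stata data file
    "spss_portable": ".por",  # SPSS portable file
    "sas_transport": ".xpt",  # SAS transport file
}


def with_extension(filename: str, fmt: str) -> str:
    """Return filename with the extension for fmt appended if it is missing."""
    wanted = EXTENSIONS[fmt]
    # Compare case-insensitively so "SUBSET.DTA" is left alone.
    if Path(filename).suffix.lower() == wanted:
        return filename
    return filename + wanted


# Usage: a subset saved without an extension, destined for Stata.
print(with_extension("subset", "stata"))
```

The helper only computes the new name; renaming the file on disk (for example with `Path.rename`) is left to the caller.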
### Why won't ArcView display my shape file?

ArcView requires a trio of {.shp,.shx,.dbf} files for each state dataset. These should have the same 8-letter name, and should be in the same directory.
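
One quick way to check for an incomplete trio is sketched below. The function name is an assumption made for this example, and the check only looks for the three companion extensions under a shared base name; it does not validate the contents of the files or the length of the name.

```python
from pathlib import Path

REQUIRED = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(directory):
    """Map each base name in directory to the required extensions it lacks."""
    found = {}
    for path in Path(directory).iterdir():
        ext = path.suffix.lower()
        if path.is_file() and ext in REQUIRED:
            found.setdefault(path.stem, set()).add(ext)
    # Keep only base names for which at least one companion file is absent.
    return {
        stem: sorted(set(REQUIRED) - exts)
        for stem, exts in found.items()
        if set(REQUIRED) - exts
    }


# Usage: report incomplete datasets in the current directory.
for name, missing in missing_shapefile_parts(".").items():
    print(f"{name}: missing {', '.join(missing)}")
```

Base names are compared exactly, so on a case-sensitive file system files that differ only in capitalization would be reported as separate, incomplete datasets.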

The ".dbf" format that ArcView uses for tables is limited to this number. You can import tables into ArcView as comma-separated-values, which avoids this limit, but ArcView will treat these tables as read-only.

### What are the scale and projection of the ArcView coverages?

All of the maps are NAD 1983, unprojected lat/lon in decimal degrees. All are at a scale of 1:100,000.

## Data Idiosyncrasies

### What are the different data files for, and which do I use?

Most people will use the MCDgroup data files, which combine census data and political data, and the boundary files, which supplement the MCDgroup files by providing geographic boundaries for them. For more information on these and other files, check the data overview section of the documentation.

### What is an MCDgroup?

The smallest geographical unit that combines census tracts and voting precincts without splitting either. For more information, see the "Units of Analysis" section of the documentation.

### How do I make sense of FIPS and CENSUS codes?

Most of the geographic codes used in the ROAD project are CENSUS codes, not FIPS codes.

The census code appendix (attached below) contains a comprehensive list of codes. The original can be obtained from the U.S. Dept. of Census Tiger 95 Documentation.

### How do I merge ROAD data with data collected at other levels of geographic aggregation?

Merging depends on finding a common level of aggregation for both datasets. This, in turn, depends on the level of aggregation at which your data was collected:

• States or Counties -- MCDgroups fit cleanly into states and counties, so one can aggregate the MCDgroup data to the state or county level and merge the resulting aggregate data.
• MCDs, CCDs, Tracts, Blocks -- These fit into MCDgroups, so one can aggregate the MCD/CCD/tract/block data to the MCDgroup level and merge the results.
• Legislative and Congressional Districts -- In general, the MCDgroup data does not align well with district lines. The precinct-level data contains district identifiers and can be aggregated on this basis.
• VTDs -- In theory, it is possible to match VTDs and precinct-level data (roughly, and for most states) using the Census 1990 Public Law (P.L.) 94-171 Data (ICPSR 9516). If the SAC10 field indicates a VTD record, the ANPSADI field contains the list of precincts. The format of this list is, as far as we can tell, undocumented and varies from state to state, but it often makes sense after study.
• Other Geographies -- Merging with other geographies, such as VTDs, towns, school districts, and newer/older census geographies, is likely to be a painstaking, approximate, and judgment-laden project. Please let us know of successful projects.
- At minimum, you will want to examine the key and pkey files, which show the matching among counties, MCDgroups, MCDs and precincts.
- You may also find the Geographic Correlation Engine at the University of Missouri Data Center quite useful for examining the overlap among various census geographies.
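
The county case can be sketched roughly as follows: sum the MCDgroup counts within each county, then join the totals to a county-level table. This is only an illustration under stated assumptions. The field names (`state`, `county`, `dem_votes`, `rep_votes`, `median_income`) and the sample values are made up and are not ROAD variable names or real figures, and both datasets are assumed to use the same coding scheme for states and counties (recall that ROAD mostly uses CENSUS codes rather than FIPS codes).

```python
from collections import defaultdict


def aggregate_to_county(mcdgroup_rows, count_fields):
    """Sum count variables over all MCDgroups sharing a (state, county) key."""
    totals = defaultdict(lambda: dict.fromkeys(count_fields, 0))
    for row in mcdgroup_rows:
        key = (row["state"], row["county"])
        for field in count_fields:
            value = row[field]
            if value is not None:  # skip missing values instead of adding them
                totals[key][field] += value
    return dict(totals)


def merge_on_county(county_totals, other_rows):
    """Attach the aggregated totals to county-level rows from another source."""
    merged = []
    for row in other_rows:
        key = (row["state"], row["county"])
        if key in county_totals:  # inner join: keep only counties found in both
            merged.append({**row, **county_totals[key]})
    return merged


# Usage with made-up values and hypothetical field names.
mcdgroups = [
    {"state": "25", "county": "001", "dem_votes": 10, "rep_votes": 5},
    {"state": "25", "county": "001", "dem_votes": None, "rep_votes": 7},
    {"state": "25", "county": "003", "dem_votes": 1, "rep_votes": 2},
]
other = [{"state": "25", "county": "003", "median_income": 100}]
totals = aggregate_to_county(mcdgroups, ["dem_votes", "rep_votes"])
print(merge_on_county(totals, other))
```

Summation is appropriate only for count variables such as votes or population; percentages, medians, and similar measures need to be recomputed from their underlying counts after aggregation.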

### Are there any idiosyncrasies in the data?

Yes. The most significant are:

• Missing Data Codes. Missing values are set to system missing in SPSS. However, in a number of cases it appears that missing precincts were miscoded as zero. Be aware that in some precincts it is not possible to ascertain definitively whether a code of 0 indicates no votes or a missing value. In most cases, this has little consequence when the data are aggregated into MCDgroups. However, before using the data one should consult the table of potentially miscoded precincts (attached below).

(Most cases with mixed 0s and missing data codes are "artificial precincts" added to the data set to represent absentee ballots, or split actual precincts, i.e., those which contain voters for more than one district.)
• High Levels of Turnout. This has been observed in a few states and years, and is probably a corollary of the missing data codes. Rarely, a significant number of registration entries were coded as 0 rather than missing, while the vote variables were not missing. When these are aggregated to MCDgroups, turnout can exceed 100% of registration. If the precinct-level data shows 0 registration and positive turnout, suspect this as the cause.
• Matching Precincts to MCDgroups. Great care was taken to match precincts to MCDgroups, and the process is explained in detail in the full documentation. Still, some precincts could not be matched, or were questionable. All exceptions are detailed in the codebook and the individual exceptions files (contained in the full documentation).
• Matching Precincts to Other Units. Precinct-level data contains codes that reference other geographic levels. These were supplied in the original data files, and no attempt has been made to validate data aggregated to units other than MCDgroups.
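
The turnout symptom described above can be screened for with a check along these lines. It is a minimal sketch, assuming the precinct records have been loaded as dictionaries; the field names `registered` and `total_votes` are hypothetical placeholders, not ROAD variable names, and missing values are assumed to be represented as `None`.

```python
def flag_suspect_registration(precinct_rows,
                              reg_field="registered",
                              vote_field="total_votes"):
    """Return precinct rows with zero registration but a positive vote count."""
    suspects = []
    for row in precinct_rows:
        reg = row.get(reg_field)
        votes = row.get(vote_field)
        # A registration of exactly 0 alongside positive votes suggests the
        # registration entry was really missing and was miscoded as 0.
        if reg == 0 and votes is not None and votes > 0:
            suspects.append(row)
    return suspects


# Usage with made-up precinct records.
precincts = [
    {"precinct": "A", "registered": 0, "total_votes": 120},
    {"precinct": "B", "registered": 300, "total_votes": 150},
    {"precinct": "C", "registered": None, "total_votes": 50},
]
for row in flag_suspect_registration(precincts):
    print(row["precinct"])
```

Rows flagged this way are candidates for review against the table of potentially miscoded precincts, not confirmed errors.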

### What's the deal with California?

The deal is that the California data has been incorporated directly from the IGS (Institute for Governmental Studies) Data Archive at U.C. Berkeley. Hence, California is an exceptional state: the variables are somewhat different, and the aggregation level is finer than in other states. We were able to incorporate data only from '92; however, IGS has data for other years as well. (See Details on the California Block Group Merge in our documentation.)

### I've noticed something odd, do you know about ...