Data Enhancement

The goal of data processing and enhancement is to make data usable by researchers who access them after they are deposited in a repository. In the social sciences, archives add value to data by making them easier to use for secondary analysis. Archival practices vary widely, often depending on the condition of the data to be archived and the goals of a particular repository or discipline.

The specific steps depend on the unique characteristics of each dataset, but in general, ICPSR data processors always perform the following procedures:

They may also do the following:

  • Recode variables to address confidentiality concerns

  • Check for undocumented or out-of-range codes

  • Add question text to variables

  • Create variable labels

  • Create value labels

  • Identify and address foreign-language characters

  • Adjust format widths

  • Optimize file size

  • Standardize missing values

  • Check for consistency and skip patterns

  • Create an online analysis version with question text

  • Add variables to the Social Science Variables Database

  • Gather citations to related publications for the Bibliography of Data-Related Literature
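
Three of the checks listed above (standardizing missing values, checking for undocumented or out-of-range codes, and checking skip patterns) can be illustrated with a short script. The following is a minimal sketch only, not ICPSR's actual tooling: the variable names (EMPLOYED, HOURS), the codebook, the raw missing-value codes, and the skip rule are all hypothetical and chosen purely for illustration.

```python
# Hypothetical codebook: variable name -> set of documented valid codes.
CODEBOOK = {
    "EMPLOYED": {1, 2},            # 1 = yes, 2 = no (assumed coding)
    "HOURS": set(range(0, 169)),   # weekly hours; asked only if EMPLOYED == 1
}

# Hypothetical raw missing-value codes, all mapped to one standard marker.
RAW_MISSING_CODES = {-9, -8, -1}
STANDARD_MISSING = None


def standardize_missing(record):
    """Return a copy of the record with raw missing codes replaced."""
    return {
        var: (STANDARD_MISSING if value in RAW_MISSING_CODES else value)
        for var, value in record.items()
    }


def undocumented_codes(records, codebook):
    """Return {variable: sorted values that do not appear in the codebook}."""
    found = {}
    for record in records:
        for var, value in record.items():
            if value is STANDARD_MISSING or var not in codebook:
                continue
            if value not in codebook[var]:
                found.setdefault(var, set()).add(value)
    return {var: sorted(values) for var, values in found.items()}


def skip_violations(records):
    """Return indices of records that answer HOURS although EMPLOYED != 1."""
    return [
        index
        for index, record in enumerate(records)
        if record.get("EMPLOYED") != 1
        and record.get("HOURS") is not STANDARD_MISSING
    ]


# Made-up example records, not real survey data.
raw_records = [
    {"EMPLOYED": 1, "HOURS": 40},
    {"EMPLOYED": 2, "HOURS": -9},   # legitimately skipped
    {"EMPLOYED": 2, "HOURS": 20},   # answered despite the skip rule
    {"EMPLOYED": 3, "HOURS": -8},   # 3 is not a documented code
]

clean_records = [standardize_missing(r) for r in raw_records]
print(undocumented_codes(clean_records, CODEBOOK))
print(skip_violations(clean_records))
```

Missing values are standardized first so that the later checks do not mistake a raw missing code for an undocumented value. In practice, flagged records would be reviewed against the original documentation rather than corrected automatically.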

ICPSR Curation Levels

ICPSR developed three curation levels to specify the set of curation activities to be performed on a given dataset. ICPSR's Curation Levels document describes the work performed at each level of curation and provides a side-by-side comparison of approximate time to release, work performed on the data, and what is included in the study release.