Data cleaning | Datice.is

Checklist for cleaning of data files

Have all direct personal identifiers been removed from the data file (names of individuals, social security numbers, e-mail addresses, etc.)? Does the data include any indirect identifiers, and if so, how much personal disclosure risk is there due to such identifiers?
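As a rough illustration of the first question, the sketch below scans a small dataset for columns that look like direct identifiers. The column names in SUSPECT_NAMES, the e-mail pattern, and the sample rows are all assumptions made for this example; a check like this only supports a manual review and does not assess indirect identifiers.

```python
import re

# Hypothetical list of column names that usually hold direct identifiers.
SUSPECT_NAMES = {"name", "email", "e-mail", "ssn", "phone", "address"}

# Simple (not exhaustive) pattern for e-mail addresses hidden in free text.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def flag_identifier_columns(rows):
    """Return the sorted names of columns that look like direct identifiers."""
    flagged = set()
    for row in rows:
        for column, value in row.items():
            if column.lower() in SUSPECT_NAMES:
                flagged.add(column)
            elif isinstance(value, str) and EMAIL_RE.search(value):
                flagged.add(column)
    return sorted(flagged)


# Made-up sample data for illustration only.
rows = [
    {"id": 1, "name": "A. Example", "comment": "reach me at a@example.org", "age": 34},
    {"id": 2, "name": "B. Example", "comment": "", "age": 51},
]

print(flag_identifier_columns(rows))  # ['comment', 'name']
```

Columns flagged this way still need a human decision: remove the variable, or edit the offending free-text entries.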
Unnecessary variables
Have all unnecessary or inappropriate variables been removed from the version of the data file that is to be published?
Clear and descriptive variable names
Have all variables been given clear and descriptive names (e.g., Q1, Q2, etc.)? Do all variables have clear and descriptive labels (e.g., the question asked, or a short description of its content)? Have appropriate value labels been specified for each variable (e.g., 1 = Never, 2 = Sometimes, 3 = Often, 4 = Always, 99 = Missing)?
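One way to check the value labels is to compare the codes that actually occur in a variable with the codes that have a label. The sketch below uses the example coding from the checklist; the function name and the sample responses are hypothetical.

```python
# Value labels taken from the example coding in the checklist.
VALUE_LABELS = {1: "Never", 2: "Sometimes", 3: "Often", 4: "Always", 99: "Missing"}


def unlabelled_values(values, labels):
    """Return the sorted codes that occur in the data but have no label."""
    return sorted(set(values) - set(labels))


# Made-up responses; the code 5 is not defined in VALUE_LABELS.
responses = [1, 2, 2, 4, 99, 5, 3]

print(unlabelled_values(responses, VALUE_LABELS))  # [5]
```

An unlabelled code may be a typing error or a category that was left out of the documentation, so each one should be traced back to the source.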
Spelling and typing errors
Are there any spelling or typing errors, for example in the variable labels or value labels? Are there any in free-text (string) variables?
Missing values
Have missing values been coded in an appropriate manner?
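The sketch below shows why the coding of missing values matters: if a numeric missing code is left in place, it is treated as a real answer and distorts summary statistics. The missing code 99 is borrowed from the example coding in this checklist; the data and function names are made up.

```python
# Assumption: 99 is the only code used for missing answers in this variable.
MISSING_CODES = {99}


def recode_missing(values, missing_codes):
    """Replace missing-value codes with None so they cannot be used as numbers."""
    return [None if v in missing_codes else v for v in values]


def mean_ignoring_missing(values):
    """Mean of the valid (non-None) values, or None if there are none."""
    valid = [v for v in values if v is not None]
    return sum(valid) / len(valid) if valid else None


raw = [1, 2, 99, 4, 3]

naive_mean = sum(raw) / len(raw)  # treats 99 as a real answer
clean_mean = mean_ignoring_missing(recode_missing(raw, MISSING_CODES))

print(naive_mean)  # 21.8
print(clean_mean)  # 2.5
```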
Order of variables
Are the variables in the data file presented in a logical order? Sometimes it may be useful to group variables together based on their content or focus, especially when a dataset contains many variables.
Credibility
Are there any implausibly high or low values in the data (e.g., one individual's salary figure has one zero too many)? Is there any repetition in the data that does not make sense (e.g., some participants entered twice)?
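Both credibility checks can be partly automated. The sketch below lists participant IDs that occur more than once and flags values that lie far from the median. The field names, the sample records, and the "factor of 5" rule are assumptions chosen for illustration; the rule is a crude heuristic, not a standard outlier test.

```python
from collections import Counter
from statistics import median


def duplicated_ids(records, key="participant_id"):
    """Return the sorted IDs that appear in more than one record."""
    counts = Counter(record[key] for record in records)
    return sorted(k for k, n in counts.items() if n > 1)


def implausible_values(records, field, factor=5):
    """Return records whose value is more than `factor` times above or below the median."""
    centre = median(record[field] for record in records)
    return [
        record
        for record in records
        if record[field] > centre * factor or record[field] < centre / factor
    ]


# Made-up records: participant 3 is entered twice, participant 2 has an extra zero.
records = [
    {"participant_id": 1, "salary": 420000},
    {"participant_id": 2, "salary": 4700000},
    {"participant_id": 3, "salary": 480000},
    {"participant_id": 3, "salary": 480000},
    {"participant_id": 4, "salary": 510000},
]

print(duplicated_ids(records))                # [3]
print(implausible_values(records, "salary"))  # [{'participant_id': 2, 'salary': 4700000}]
```

Flagged records are candidates for review only; an unusually high salary may be genuine, so the original source should be consulted before anything is changed.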
Weighting of data
Is the data weighted (is there a "weight variable" in the dataset)? Does the weight variable have a descriptive label (e.g., stating the basis on which the data was weighted)?
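To illustrate why the weight variable needs to be documented, the sketch below compares an unweighted and a weighted mean of the same answers. The weights and answers are invented for the example, and the function is a minimal sketch rather than a full survey-weighting routine.

```python
def weighted_mean(values, weights):
    """Weighted mean; raises ValueError on mismatched lengths or non-positive weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Made-up data: 1 = agrees, 0 = does not agree, with one weight per respondent.
agrees = [1, 0, 1, 1]
weights = [0.5, 1.5, 1.0, 1.0]

print(sum(agrees) / len(agrees))       # 0.75 (unweighted)
print(weighted_mean(agrees, weights))  # 0.625 (weighted)
```

A user who does not know that a weight variable exists, or what it adjusts for, cannot tell which of the two figures is the intended estimate.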