Data Cleaning Using MS Excel
Last time we briefly discussed data collection methods. Once we have collected the data, we need to edit it so that it is clean for the next step, analysis. We shall assume that after collection, our data was entered into MS Excel. Editing is the process of examining or scrutinizing data in order to identify any errors, mistakes or omissions. Under this process, we focus on the completeness of the data (no missing values), the distribution of the data, and extreme values or outliers in the data.

Some of the errors that we may have to deal with under this process are sampling errors, non-sampling errors, biased errors, non-biased errors, and positive or negative errors. Sampling errors occur because we observe a sample rather than the whole population, and their size depends on the sampling method used during the collection of data. A sampling error is the difference between the estimate of a value as obtained from the sample and the actual value. Non-sampling errors arise from sources other than sampling itself, such as measurement or data entry mistakes, and can also take place if randomization was not used during sample selection. Biased errors...
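To make the three editing checks (completeness, distribution, outliers) and the sampling error definition concrete, here is a minimal sketch in Python using only the standard library. It is an illustration rather than the Excel procedure itself. The `ages` column, the function names, and the 1.5 × IQR rule for flagging outliers are all assumptions made for this example; the values are invented and `None` stands for an empty cell.

```python
import statistics

# Hypothetical column of ages as it might be read from a spreadsheet.
# None stands for an empty cell; the values are made up for illustration.
ages = [23, 25, None, 27, 24, 26, 95, None, 25, 24]


def count_missing(values):
    """Completeness check: count the empty cells in a column."""
    return sum(1 for v in values if v is None)


def iqr_outliers(values):
    """Flag values lying more than 1.5 * IQR outside the quartiles.

    This is one common rule of thumb (an assumption here), not the
    only way to define an outlier.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [v for v in present if v < low or v > high]


def sampling_error(sample_estimate, actual_value):
    """Difference between the sample estimate and the actual value."""
    return sample_estimate - actual_value


present = [v for v in ages if v is not None]
print("missing cells:", count_missing(ages))
# Comparing mean and median gives a rough first look at the distribution.
print("mean:", statistics.mean(present), "median:", statistics.median(present))
print("possible outliers:", iqr_outliers(ages))
```

A mean that sits well above the median, as it does for this made-up column, suggests that a few large values are pulling the distribution to the right, which is why such values are worth inspecting before analysis.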