Thursday, June 18, 2009

Cleaning your data: A free excel spreadsheet with VBA code to screen your data

I have previously written a post on ensuring that the historical data you use is of reasonable quality. The quality of your data is potentially worsened if your dataset is pieced together from multiple sources, which is not surprising, given that I have spotted a few firms selling data that is older than them. To help myself clean or scrub my data, I created an excel spreadsheet to automatically screen for potential data errors. It is also available free to you through this blog.

1) While it is definitely far from perfect, it is unprotected which means you are free to edit the VBA code for your own purposes.

2) Without additional VBA coding, it can read most indicative data types, i.e. data with or without the volume, high, lows and open prices.

3) The output is calculated using excel functions and formulas – not hard coded. Hence, you can edit the formulas without further coding to customize your results.

4) It does not amend your data. It is up to you to filter them and decide for yourself whether to remove the data points that the spreadsheet flags out.

Screen Shot 1:

Screen Shot 2:

There are two ways you can use the file. You can paste your data into the file, click onto the “Try” button, select where you left your data and a new spreadsheet will be generated with the output. Alternatively, you can open the Visual Basic Editor (Alt+F11) and copy the entire code over to your spreadsheet.

There are only 3 main restrictions that I can think of which will require you to edit the VBA code. Your data must be arranged in rows, not columns. And the first and second columns must contain the day and time information respectively. Finally, real tick-by-tick and quote-by-quote data is inconsistent in frequency due to its nature. Hence, such data will show an especially high error rate in my spreadsheet, unless you change the formula.

If you find this post or tool useful, you can help me by promoting my blog to your friends or sharing an article or tool that you find useful or providing feedback on this tool to finance4traders@gmail.com

Download

Alt site: http://sites.google.com/site/finance4traders/

Notes: You need to enable macro to let it work. Please be informed that I am not liable for any damage or losses out of this spreadsheet and no warranty is provided. Ironically, do not worry about viruses. I sent this file to VirusTotal to be scanned by more than 20 antivirus software engine before I uploaded it.

Related Articles: What is good quality historical data?


No comments:

Post a Comment