Last month, we gave you 5 tips for cleaning your data so that your analytics run as smoothly as possible. Today, I’m going to add 5 more tips to help you get your data into great shape.
- Checking your spelling. It’s hard to spot errors, especially when you are working with mind-numbingly huge amounts of information. While a good spellcheck program will still miss errors such as sound-alike words or typos that create a correctly spelled (yet wrong) word, it will catch most spelling errors in your data. It only takes a few minutes to run and can save you time and frustration later on. You can also ensure things like company names, product names, and other corporate data are always spelled the same way by adding them to a custom spellcheck dictionary.
- Merging and splitting columns. When data comes from several sources, the information in a given column often needs to be split into multiple columns or merged into a single one. That can be problematic when the data doesn’t split or merge cleanly. If merging and splitting operations are part of your data process, check the results every time to make sure no values end up truncated, misplaced, or dropped.
- Standardizing fields with input filters. Using standard formats for postal codes, phone numbers, and other data will greatly enhance the accuracy of your reporting, but making sure everyone knows how to enter the data in the appropriate format can be a little trickier. By using input filters to force a data field to be entered a particular way, you don’t need to worry about whether or not the staff manning your data entry points got the message.
- Removing spaces and nonprinting characters. Spacing is a pesky problem, and not just when data is being copied and pasted from one source to another. Copying often carries with it unexpected extra spaces, soft returns, or other nonprinting characters that need to be weeded out of your data. But what about spaces in other places? Mr. James VanEeden and Mr. James Van Eeden may seem like the same person, but are they really? They aren’t – at least as far as your data is concerned, and that can distort your results.
- Having someone else do it. Struggling to get the best results? Using a third-party provider to clean your data for you can be a great solution, too.
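For the spelling tip, a custom dictionary can be approximated in code with a list of approved names and a fuzzy match against it. This is only a rough sketch using Python's standard `difflib` module. The company names and the `suggest_canonical` helper are made up for illustration, and the 0.8 cutoff is an assumption you would want to tune on your own data.

```python
import difflib

# Hypothetical "custom dictionary" of approved spellings.
CANONICAL_NAMES = ["Acme Corporation", "Globex", "Initech"]


def suggest_canonical(value, cutoff=0.8):
    """Return the closest approved spelling, or None if nothing is close enough."""
    matches = difflib.get_close_matches(value, CANONICAL_NAMES, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# A typo close to an approved name gets a suggestion; an unknown name does not.
typo_suggestion = suggest_canonical("Acme Corporaton")
unknown_suggestion = suggest_canonical("Umbrella")
```

Treat the result as a suggestion for a person to confirm rather than an automatic fix, since two genuinely different names can also look alike.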
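For the merging and splitting tip, one way to keep a sharp eye on the process is to have the split step refuse anything that doesn't divide cleanly and set it aside for review. The sketch below assumes a single "full name" column that should become exactly two columns. The function names and sample values are hypothetical.

```python
def split_full_name(full_name):
    """Split 'First Last' into (first, last); return None if it doesn't split cleanly."""
    parts = full_name.split()
    if len(parts) != 2:
        return None
    return parts[0], parts[1]


def merge_name(first, last):
    """Merge two name columns back into one, without stray spaces."""
    return f"{first} {last}".strip()


rows = ["Ada Lovelace", "James Van Eeden", "Cher"]  # illustrative sample values

clean, needs_review = [], []
for value in rows:
    result = split_full_name(value)
    if result is None:
        needs_review.append(value)  # don't guess; let a person decide
    else:
        clean.append(result)
```

Here only "Ada Lovelace" splits cleanly. The other two values land in `needs_review` instead of being forced into the wrong columns.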
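An input filter, as described in the standardizing tip, can be as simple as a function that either returns a value in the one accepted format or rejects it. The sketch below assumes 10-digit North American phone numbers and US ZIP codes purely as examples. Your own formats will differ, and the function names are hypothetical.

```python
import re

# Assumed format: US ZIP code, 5 digits with an optional 4-digit extension.
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def normalize_phone(raw):
    """Return a phone number as NNN-NNN-NNNN, or raise ValueError."""
    digits = re.sub(r"\D", "", raw)  # drop everything except digits
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip a leading country code
    if len(digits) != 10:
        raise ValueError(f"expected 10 digits, got {raw!r}")
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


def is_valid_zip(raw):
    """True if the whole value matches the assumed ZIP format."""
    return ZIP_PATTERN.fullmatch(raw.strip()) is not None


formatted = normalize_phone("(555) 123-4567")
```

Rejecting bad values at entry time means the person typing sees the problem right away, instead of you discovering it in a report later.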
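The spaces and nonprinting characters tip covers two separate jobs: cleaning up what came along with a copy and paste, and spotting records that differ only by spacing. The sketch below handles both with Python's standard library. The helper names are hypothetical, and dropping every Unicode "Other" category character (control, format, private use, unassigned) is an assumption that may be too aggressive for some data.

```python
import unicodedata


def clean_text(value):
    """Collapse all whitespace to single spaces and drop nonprinting characters."""
    kept = []
    for ch in value:
        if ch.isspace():
            kept.append(" ")  # tabs, newlines, non-breaking spaces, etc.
        elif unicodedata.category(ch).startswith("C"):
            continue  # control and format characters such as zero-width spaces
        else:
            kept.append(ch)
    return " ".join("".join(kept).split())


def match_key(value):
    """Build a comparison key that ignores spacing and letter case."""
    return "".join(clean_text(value).split()).casefold()


cleaned = clean_text("  Mr.\u00a0James  Van\tEeden\u200b\n")
same_key = match_key("Mr. James VanEeden") == match_key("Mr. James Van Eeden")
```

A matching key only tells you that two records are candidates for being the same person. It's safer to flag them for review than to merge them automatically.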