Whenever information is gathered, “dirty” data can crop up: extra spaces, misspellings, city names written in ALL CAPS in one data source and Title Case in another, multiple spellings for the same item, …
Many years ago (I won’t tell you how many), my high school computer science teacher first introduced me to the concept of GIGO. It stands for Garbage In, Garbage Out, and it means your results depend entirely on the quality of the data you start with.
Last month, we gave you 5 tips for cleaning your data to make sure your data analytics run as smoothly as possible. Today, I’m going to add 5 more tips to help you get your data into great shape.
Dirty data – it’s not your friend. No, I’m not talking about things that don’t belong in the workplace! I’m talking about data sources that have become riddled with misspellings, extra spaces, inconsistent capitalization, and three different ways to write the same address.
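To make the extra-spaces and ALL CAPS vs. Title Case problems concrete, here is a minimal Python sketch of one way to tidy them up. The function name `clean_text` and the sample city list are made up for illustration, and forcing everything to Title Case is just one assumed convention. It will not fix misspellings, and it can mangle names with apostrophes or unusual capitalization.

```python
import re


def clean_text(value: str) -> str:
    """Trim the ends, collapse runs of whitespace, and convert to Title Case.

    Hypothetical helper for illustration only; Title Case is an assumed
    target convention, not a universal rule.
    """
    collapsed = re.sub(r"\s+", " ", value.strip())
    return collapsed.title()


# Made-up sample: the same two cities written four different ways.
raw_cities = ["  NEW   YORK", "new york ", "New York", "SAN  francisco"]

cleaned = [clean_text(city) for city in raw_cities]

# After cleaning, the duplicates collapse into one entry per city.
unique_cities = sorted(set(cleaned))
print(unique_cities)
```

Once the spacing and case are consistent, the four raw entries reduce to two distinct cities, which is the kind of result you want before you start counting or grouping anything.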