For data to be useful, it must first be purged of its problems.
Iteration
Data usually has to be cleaned iteratively: resolving one problem often reveals further problems that were hidden beneath it.
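As a rough illustration of this layering, the following Python sketch cleans a single column in three passes. The column contents, the set of missing-value markers, and the plausible age range are all assumptions made up for the example. Each pass exposes a problem that could not be checked before the previous one was resolved.

```python
# Hypothetical raw column; the values are invented for illustration.
raw_ages = ["34", " 27", "n/a", "41", "-1", "270", "", "forty"]

# Assumed set of strings that stand for "no value" in this dataset.
MISSING_MARKERS = {"", "n/a"}


def normalise_missing(values):
    """Pass 1: strip whitespace and map missing-value markers to None."""
    result = []
    for value in values:
        value = value.strip()
        result.append(None if value in MISSING_MARKERS else value)
    return result


def parse_ints(values):
    """Pass 2: convert the remaining strings to int.

    Entries that are not numbers (such as "forty") only become visible
    now that the known missing markers are out of the way; they are
    set to None here.
    """
    result = []
    for value in values:
        if value is None:
            result.append(None)
            continue
        try:
            result.append(int(value))
        except ValueError:
            result.append(None)
    return result


def drop_implausible(values, low=0, high=120):
    """Pass 3: a range check is only possible once the values are numeric.

    The bounds are an assumption for this example.
    """
    return [v if v is not None and low <= v <= high else None for v in values]


after_pass_1 = normalise_missing(raw_ages)
after_pass_2 = parse_ints(after_pass_1)
cleaned = drop_implausible(after_pass_2)

print(after_pass_1)
print(after_pass_2)
print(cleaned)
```

Printing the intermediate lists shows how the implausible ages -1 and 270 only surface after the second pass. In practice, one would probably inspect the data again after each pass before deciding on the next step.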
Misc
In 1996, Lakshmanan, Sadri, and Subramanian proposed SchemaSQL, an extension to SQL that makes it possible to operate on messy datasets.
Raman and Hellerstein provide a framework for cleaning datasets, «Potter's Wheel» (2001).
Kandel, Paepcke, Hellerstein, and Heer developed an interactive tool with a friendly user interface that automatically generates code to clean data (2011).