This project is based on a fictional dataset generated by Robert Dempsey. You should absolutely read the blog post he wrote to describe his project.
We are going to use the visual preparation recipes in DSS to clean and restructure a list of contacts so that we can actually make use of it.
We want to:
- clean the names of the contacts and separate first names and last names
- clean phone numbers
- clean postal addresses.
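To give a rough idea of what these three goals mean in practice, here is a minimal sketch in plain Python. It is not the DSS implementation: the project itself uses visual preparation recipes, and the function names (`split_name`, `clean_phone`, `clean_address`) and the cleaning rules below are assumptions made for illustration only.

```python
import re


def split_name(full_name):
    """Return (first_name, last_name) from a raw contact name."""
    # Collapse repeated whitespace and normalise capitalisation.
    cleaned = " ".join(full_name.split()).title()
    # Assumption: the first token is the first name, the rest is the last name.
    first, _, last = cleaned.partition(" ")
    return first, last


def clean_phone(raw_phone):
    """Keep only the digits of a phone number."""
    return re.sub(r"\D", "", raw_phone)


def clean_address(raw_address):
    """Collapse whitespace and drop stray leading/trailing commas and spaces."""
    return " ".join(raw_address.split()).strip(" ,")


print(split_name("  john   SMITH "))
print(clean_phone("(555) 123-4567"))
print(clean_address("  12   Main St ,"))
```

Real contact data needs more care than this (names such as "McDonald", international phone prefixes, multi-line addresses), which is what the successive recipes in the project deal with.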
Explore This Sample Project
Take a look at the data pipeline (the flow) to see the successive cleaning recipes.
Note that although we separated the steps into several recipes to keep this project readable, you can put them all in a single recipe to improve performance.
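As a loose analogy for why grouping steps can help, the plain-Python sketch below compares running each step as its own pass, which materialises an intermediate result every time, with applying all the steps to each row in a single pass. This only illustrates the general idea and says nothing about how DSS actually executes recipes; the step functions and sample row are hypothetical.

```python
import re


def trim_name(row):
    # Collapse repeated whitespace in the name.
    return {**row, "name": " ".join(row["name"].split())}


def title_name(row):
    # Normalise capitalisation of the name.
    return {**row, "name": row["name"].title()}


def digits_phone(row):
    # Keep only the digits of the phone number.
    return {**row, "phone": re.sub(r"\D", "", row["phone"])}


def run_separately(rows, steps):
    """One pass per step: builds an intermediate list after every step."""
    for step in steps:
        rows = [step(row) for row in rows]
    return rows


def run_combined(rows, steps):
    """Single pass: every step is applied to a row before moving on."""
    result = []
    for row in rows:
        for step in steps:
            row = step(row)
        result.append(row)
    return result


steps = [trim_name, title_name, digits_phone]
rows = [{"name": "  ada  lovelace ", "phone": "(555) 010-9999"}]

# Both approaches produce the same cleaned rows.
print(run_separately(rows, steps) == run_combined(rows, steps))
```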
Look at the recipes to see precisely how to clean the addresses.