#PreppinData

I just finished week 2 of the  Preppin Data challenge and wanted to walk through my approach. One of the things I love about Tableau and Tableau Prep is that there are a number of different ways to get at the same result.

This week Carl & Jonathan gave us a file that had a big header, names that needed to be cleaned, and metrics that needed to be moved to columns. The output needed to be 6 columns and 14 rows.

After setting my connection to the file the first thing I did was check the Use Data Interpreter box. This helper removed the unnecessary header at the top of the file.

PreppinWeek2_A
check the data interpreter box to remove the unnecessary header

Whenever I built something in Tableau Prep I like to always add a clean step after my connection to get a sense of what is in the data. When I did this I noticed that my city field had a value called “city”. I knew from looking at the initial file that this was a secondary header so I right clicked on the value of city and selected exclude.

PreppinWeek2_B
remove the secondary header row

At this point I also added an aggregate to see how many rows were in my data set. I like to add these as I build out a flow to get a sense of how my record counts change as I build out different steps.

I added another clean step and I did this because I like to partition out my changes when I build something new (I’m quirky). I could have done these all in the first step. In this step I grouped the various city names by pronunciation This took care of all but two values. I edited the group and manually added “nodonL” to London and “3d!nburgh” to Edinburgh. In this step I also created the new header field which combined the metric and the measure and then removed those fields as they were no longer needed.

PreppinWeek2_C
used pronunciation group by with a manual add, created new field and removed unnecessary fields

The next step was to move the values from the rows to columns. This is done in a pivot step. Most of the people I help with Prep think Pivot = Pivot table and are confused when they add that step. Pivot will reshape your data.  My pivoted field is my new field that I created in the prior step and my field to aggregate is the value field.

PreppinWeek2_D
this moves the data from rows to columns

At this point I also added an aggregate step to make sure I had 14 rows as the instructions called for. This is the full view of my flow.

PreppinWeek2

Thanks for reading and happy preppin!