#PreppinData

I just finished week 2 of the  Preppin Data challenge and wanted to walk through my approach. One of the things I love about Tableau and Tableau Prep is that there are a number of different ways to get at the same result.

This week Carl & Jonathan gave us a file that had a big header, names that needed to be cleaned, and metrics that needed to be moved to columns. The output needed to be 6 columns and 14 rows.

After setting my connection to the file the first thing I did was check the Use Data Interpreter box. This helper removed the unnecessary header at the top of the file.

PreppinWeek2_A
check the data interpreter box to remove the unnecessary header

Whenever I built something in Tableau Prep I like to always add a clean step after my connection to get a sense of what is in the data. When I did this I noticed that my city field had a value called “city”. I knew from looking at the initial file that this was a secondary header so I right clicked on the value of city and selected exclude.

PreppinWeek2_B
remove the secondary header row

At this point I also added an aggregate to see how many rows were in my data set. I like to add these as I build out a flow to get a sense of how my record counts change as I build out different steps.

I added another clean step and I did this because I like to partition out my changes when I build something new (I’m quirky). I could have done these all in the first step. In this step I grouped the various city names by pronunciation This took care of all but two values. I edited the group and manually added “nodonL” to London and “3d!nburgh” to Edinburgh. In this step I also created the new header field which combined the metric and the measure and then removed those fields as they were no longer needed.

PreppinWeek2_C
used pronunciation group by with a manual add, created new field and removed unnecessary fields

The next step was to move the values from the rows to columns. This is done in a pivot step. Most of the people I help with Prep think Pivot = Pivot table and are confused when they add that step. Pivot will reshape your data.  My pivoted field is my new field that I created in the prior step and my field to aggregate is the value field.

PreppinWeek2_D
this moves the data from rows to columns

At this point I also added an aggregate step to make sure I had 14 rows as the instructions called for. This is the full view of my flow.

PreppinWeek2

Thanks for reading and happy preppin!

Published by

Kate Brown

golfer, yogi, history major data loving person

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s