The topic for the first #SportsVizSunday of 2020 is personal health data. I took some leeway with the topic and looked at my golf handicap index and scores. I normally walk the golf course and golf impacts my mental health (sometimes positive and sometimes negative). There were a few times this year where I thought about buying a boat.
For #SportsVizSunday, I wanted to look at where my index fell in relation to other women who keep a handicap and highlight the scores that count towards my current index. As with most public work I do, I like to keep it simple. I spend a lot of time during the week working on dashboards so in my free time I tend to keep it light and simple.
The 2019 season was a bit all over the place for me. I struggled with my irons for the last two seasons and that definitely impacted my score. While that aspect was off the rest of my game was in good shape and that helped me get my handicap index down to an 18.4.
I play most of my golf at two different courses and wanted to see what my score and differentials looked like at those two courses. I felt like I played better at Furnace Brook because I hit my fairway woods and hybrid more than I hit my irons. The data backed that up. I scored better (based on differential) at Furnace Brook than at William J Devine.
In 2020 I’m going to track more of my golf stats and visualize them to see where I can get better. I know where I struggle with my game, but, seeing the data makes it a bit more real.
This is the initial view of the data. Based on the naming convention of the Item my first thought was to split off the number portion of the field and use that as a way to create a hierarchy.
I used the custom split tool to parse the field off a space and took the first field. I trimmed any extra spaces and renamed this field Item Id. These are the calculations:
I then created by hierarchy levels taking the left X number of characters from the new item id field. My thought was I would use these to get the totals & subtotals. These are the fields I created for the hierarchy:
I also created a new field with the indented item names:
Format Item Name
IF LEN([Item ID]) = 2 THEN [Item]
ELSEIF LEN([Item ID]) = 3 THEN SPACE(5) + [Item]
ELSEIF LEN([Item ID]) = 5 THEN SPACE(10) + [Item]
After I had my levels I then created two aggregates to get the totals and subtotals. The first one sums the profit by my new top level field and the second one sums the profit by the new second level field. I joined both of these aggregates back to the prior step where the top level in to totals aggregate = the item id & where the second level = the second level.
The last step is the clean up step. I this step I have 11 changes to the joined data.
remove the duplicated fields from the joins.
merge the profit field from the initial step with the profits from the aggregates
rename the merged profit fields to profit
created a calculated field to get the length of the item id field to sort my rows correctly
renamed the Format Item Name field to item name
removed any remaining unnecessary fields
This was a great challenge to kick off the 2020 #PreppinData series. I love the formatting idea from Ryan and have a few ideas of how I can implement both Ryan’s table and this PreppinData challenge in my day to day work.
If anyone is interested in getting a copy of my flow please let me know. I am more than happy to share my approach.
There are usually a number of different ways to get to the same result in Tableau & Tableau Prep. Week 9 of #PreppinData is another example of this.
For this edition of #PreppinData we looked at Chin & Beard Suds Co’s Twitter complaints. We were given a list of complaints and asked to:
Remove Chin & Beard Suds Co Twitter handle
Split the tweets up in to individual words
Pivot the words so we get just one column of each word used in the tweets
Remove the 250 most common words in the English language (sourced from here for you: http://www.anglik.net/english250.htm)
Output a list of words used alongside the original tweet so we know the context for how the word was used.
Here’s the flow I created
The first clean step splits the text on the space character. The next step pivots all of those splits back together in one column. In this pivot I used a wildcard pivot where the field name contained Tweet –
After consolidating the splits, I did a few clean steps to get the two sources ready for joining. Anytime that I join on text I always make sure to trim all the extra spaces and make the text fields either upper or lower case. I think this is a good habit to be in for text matching. I also excluded the company Twitter handle and any null records in this step.
Now it is time to join them together. The first join I did was an outer join between the tweets and the list of the 250 words. In the step after the join I kept only the records from the 250 list that were null (I used the rank field).
The other way this join can be done is with the left unmatched only join type. When you use this join type all you need to do is remove the two fields from the 250 list.
Initially I didn’t think of the other join type and found a way to get to the final result. Going back and looking at the joins the second option is probably the better way to go. There isn’t a wrong and right way to do it just different approaches.
I just finished week 2 of the Preppin Data challenge and wanted to walk through my approach. One of the things I love about Tableau and Tableau Prep is that there are a number of different ways to get at the same result.
This week Carl & Jonathan gave us a file that had a big header, names that needed to be cleaned, and metrics that needed to be moved to columns. The output needed to be 6 columns and 14 rows.
After setting my connection to the file the first thing I did was check the Use Data Interpreter box. This helper removed the unnecessary header at the top of the file.
Whenever I built something in Tableau Prep I like to always add a clean step after my connection to get a sense of what is in the data. When I did this I noticed that my city field had a value called “city”. I knew from looking at the initial file that this was a secondary header so I right clicked on the value of city and selected exclude.
At this point I also added an aggregate to see how many rows were in my data set. I like to add these as I build out a flow to get a sense of how my record counts change as I build out different steps.
I added another clean step and I did this because I like to partition out my changes when I build something new (I’m quirky). I could have done these all in the first step. In this step I grouped the various city names by pronunciation This took care of all but two values. I edited the group and manually added “nodonL” to London and “3d!nburgh” to Edinburgh. In this step I also created the new header field which combined the metric and the measure and then removed those fields as they were no longer needed.
The next step was to move the values from the rows to columns. This is done in a pivot step. Most of the people I help with Prep think Pivot = Pivot table and are confused when they add that step. Pivot will reshape your data. My pivoted field is my new field that I created in the prior step and my field to aggregate is the value field.
At this point I also added an aggregate step to make sure I had 14 rows as the instructions called for. This is the full view of my flow.
I think creating blog posts/videos are a great way to learn something so I decided to create a series of videos on Tableau Prep. The first one has been published is how to create a union. Hope you enjoy it!