HOW TO PREP TO SQL – SUB-QUERIES

In this installment of the How to Prep to SQL series I’m going to review how you would translate a Prep flow into SQL using a sub-query. A sub-query is a nested query within a SQL statement. All of the examples below use Tableau Prep version 2020.2.3 and MS SQL Server and utilize the Tableau Superstore data set.

In this example I want to return all of the orders from Superstore where the discount is greater than the average discount for all the orders.

There are two different ways I would do this in Prep.

  • Connect to the Superstore data set
  • Insert a clean step and keep only the order ID & discount fields
  • Add an aggregate step and calculate the average discount
  • Join the aggregate and the first clean step together where the discount from the clean step is greater than the average discount

The final flow will look like this

The second approach in Prep is to use a LOD (level of detail calculation) in the first step. The calculation is {FIXED : AVG([Discount])}. Then create another calculation to evaluate if the discount is greater than the average discount [Discount] > [Avg Discount] and then right click on True and select keep only.

The final flow will look like this:

To do this in SQL, first, I am going to write a select statement to return the order id & discount fields from my orders table in the Superstore database:

SELECT
O.[Order ID],
O.[Discount]
FROM Superstore.dbo.Orders$ O

To get the average discount I am going to use a sub-query within the select statement. The sub-query is a select statement that will calculate the average discount from the orders table

SELECT
O.[Order ID],
O.[Discount],
(SELECT AVG([Discount]) FROM Superstore.dbo.Orders$) AS [AVG DISCOUNT] –this is the sub-query
FROM Superstore.dbo.Orders$ O

To limit the results to just the orders that are over the average discount we need to add a where clause that limits the orders to just those where the discount was above the average discount. We’ll do this by saying the discount is greater than the sub-query that calculates the average discount. (in the first prep example this was done through the join. in the second prep example this was done by creating the boolean field and then right clicking on true and selecting keep only)

The query is:

SELECT
O.[Order ID],
O.[Discount],
(SELECT AVG([Discount]) FROM Superstore.dbo.Orders$) AS [AVG DISCOUNT]
FROM Superstore.dbo.Orders$ O
WHERE O.[Discount] > (SELECT AVG([Discount]) FROM Superstore.dbo.Orders$)

PrepSQLSelect3

This is one example of how you could use Prep to help you understand sub-queries. If you are interested in another example or have any other questions on how to learn SQL by using Prep please let me know!

HOW TO PREP to SQL – DATES

This is the second post in a series on how you can use Tableau Prep to learn SQL. In this post I’ll go over how to create custom dates in Prep & SQL. This may also be handy for people who know SQL and aren’t as familiar with the Tableau date functions. The SQL and Prep examples are all based on the Tableau Superstore data set. The SQL is created for MS SQL Server, if you are using a different flavor of SQL the syntax may be different.

The beauty of Prep is that there is point and click functionality that makes creating date dimensions easy. Need to change a timestamp to a date? Click on the field type icon in the column and change it to a date. Need to change a date to a year number? Click column that has your date field and click on the … Click on the menu that says Convert Dates and you’ll see these options. Note if you want to maintain the date field duplicate the field and do the transformation on the new duplicated field.

this is an image of a convert date menu in Tableau Prep. The options are date & time, year number, quarter number, month number, week number, and day of the month
build in convert date options in Tableau Prep

 

If you’re wondering how to create these date dimensions in SQL, the table below will have a snippet of code. The full query is listed below the table.

Date Type (field type returned)TableauSQL
Timestamp to Date (Date)DATE([Order Date]) or click on the field type and select dateCAST(O.[Order Date] AS DATE) AS ORDER_DATE
Day of Month (Number)DAY([Order Date])DAY(O.[Order Date]) AS DAY_NUM
Day of Year (Number)DATEPART(‘dayofyear’,[Order Date])DATENAME(dy,O.[Order Date]) AS DAY_OF_YR_NUM
Epoch Date (Number)DATEDIFF(‘second’,#1970-01-01#,[Order Date])DATEDIFF(s, ‘1970-01-01 00:00:00’, O.[Order Date]) AS EPOCH_DATE
Day of Week Name (Text)DATENAME(‘weekday’,[Order Date])DATENAME(WEEKDAY,O.[Order Date])
Day of Week (Number)DATEPART(‘weekday’,[Order Date])DATEPART(dw,O.[Order Date])
Week Start (Date)DATE(DATETRUNC(‘week’,[Order Date]))DATEADD(dd, -(DATEPART(dw, O.[Order Date])-1), O.[Order Date]) AS WEEK_START_DATE
Month (Number)MONTH([Order Date]) or use convert date menuMONTH(O.[Order Date]) AS MONTH_NUM
Month Name (Text)DATENAME(‘month’,[Order Date])DATENAME(MONTH,O.[Order Date]) AS MONTH_NAME
Last Day of Month (Date)DATE(DATEADD(‘day’,-1,
DATEADD(‘month’, 1, DATETRUNC(‘month’,[Order Date]))
))
EOMONTH(O.[Order Date]) AS LAST_DAY_OF_MONTH
First Day of Month (Date)DATE(DATETRUNC(‘month’,[Order Date]))DATEADD(DAY, 1, EOMONTH(O.[Order Date], -1)) AS FIRST_DAY_OF_MONTH
Max Month in Data Set (Date){FIXED : MAX(DATE(DATETRUNC(‘month’,[Order Date])))}(SELECT MAX(DATEADD(DAY, 1, EOMONTH(O.[Order Date], -1))) FROM Superstore.dbo.Orders$ O) MAX_MONTH
Min Month in Date Set (Date){FIXED : MIN(DATE(DATETRUNC(‘month’,[Order Date])))}(SELECT MIN(DATEADD(DAY, 1, EOMONTH(O.[Order Date], -1))) FROM Superstore.dbo.Orders$ O) MIN_MONTH
Today’s Month (Date)DATE(DATETRUNC(‘month’, TODAY()))DATEADD(DAY, 1, EOMONTH(GETDATE(), -1))
Year (Number)YEAR([Order Date]) or use convert date menuYEAR(O.[Order Date]) AS YEAR_NUM
Today’s Year (Number)DATEPART(‘year’,TODAY())YEAR(GETDATE()) AS TODAY_YEAR
Current Date Time (Timestamp)NOW()GETDATE() AS TODAY_DATETIME
Quarter Number (Number)DATEPART(‘quarter’, [Order Date])DATEPART(qq,O.[Order Date]) AS QUARTER_NUM
Quarter Start (Date)DATETRUNC(‘quarter’,[Order Date])CAST(DATEADD(qq, DATEDIFF(qq, 0, O.[Order Date]), 0) AS DATE) AS QUARTER_START_DATE
Quarter End Date (Date)DATEADD(‘day’,-1,DATEADD(‘quarter’,1,DATETRUNC(‘quarter’,[Order Date])))CAST(DATEADD(d, -1, DATEADD(q, DATEDIFF(q, 0, O.[Order Date]) +1, 0)) AS DATE) AS QUARTER_END_DATE

This is the query in SQL:

SELECT
O.[Order ID],
CAST(O.[Order Date] AS DATE) AS ORDER_DATE,
DAY(O.[Order Date]) AS DAY_NUM,
DATENAME(dy,O.[Order Date]) AS DAY_OF_YR_NUM,
DATEDIFF(s, ‘1970-01-01 00:00:00’, O.[Order Date]) AS EPOCH_DATE,
DATENAME(WEEKDAY,O.[Order Date]) WEEKDAY_NAME,
DATEPART(dw,O.[Order Date]) AS WEEKDAY_NUM,
DATEADD(dd, -(DATEPART(dw, O.[Order Date])-1), O.[Order Date]) AS WEEK_START_DATE,
MONTH(O.[Order Date]) AS MONTH_NUM,
DATENAME(MONTH,O.[Order Date]) AS MONTH_NAME,
EOMONTH(O.[Order Date]) AS LAST_DAY_OF_MONTH,
DATEADD(DAY, 1, EOMONTH(O.[Order Date], -1)) AS FIRST_DAY_OF_MONTH,
(SELECT MAX(DATEADD(DAY, 1, EOMONTH(O.[Order Date], -1))) FROM Superstore.dbo.Orders$ O) MAX_MONTH,
(SELECT MIN(DATEADD(DAY, 1, EOMONTH(O.[Order Date], -1))) FROM Superstore.dbo.Orders$ O) MIN_MONTH,
DATEADD(DAY, 1, EOMONTH(GETDATE(), -1)) AS TODAY_MONTH,
YEAR(O.[Order Date]) AS YEAR_NUM,
YEAR(GETDATE()) AS TODAY_YEAR,
GETDATE() AS TODAY_DATETIME,
CAST(GETDATE() AS DATE) AS TODAY_DATE,
DATEPART(qq,O.[Order Date]) AS QUARTER_NUM,
CAST(DATEADD(qq, DATEDIFF(qq, 0, O.[Order Date]), 0) AS DATE) AS QUARTER_START_DATE,
CAST(DATEADD(d, -1, DATEADD(q, DATEDIFF(q, 0, O.[Order Date]) +1, 0)) AS DATE) AS QUARTER_END_DATE,
FROM Superstore.dbo.Orders$ O

 

Hope that you found this helpful.