Adventures in DateParse
One of the things I was most excited for with the launch of Tableau 8.1 was the new DATEPARSE function. I just knew that someday this magical function would make my life easier. Today was that day!
I’m getting ready to build a visualization based on the playlist of my favorite local radio station, KEXP. They have recorded every song they have played since 2001. It’s pretty amazing. I scraped the playlist for all the songs they played in 2013 (over 300k!) using my favorite web scraping tool, import.io.
From the playlist table scraper I built, time was included in the table, but not date. However, the date for each row can be derived from the source page URL, which always contains the date and hour. Here’s what the data looks like:
I thought about how I would try to extract the date from that string in Excel, some combination of MID, DATE, and other functions) but then I realized that this is the PERFECT use case for using DATEPARSE. To start, I needed to extract just the date part of the URL string. I did that using the MID string function. I haven’t used string or date calculations very much, so I did them as two separate calculations at first, just so I could check that it was working as I expected to. The MID calculation takes the string that you specify as the first argument, in this case my source URL, and takes out a chunk of text starting on the character number that you specify in the second argument. There is an optional third argument, in which you specify how many characters you want to select, but in this case the length of characters varies, so by leaving this third argument out, I’m basically saying “Grab everything to the right of the 30th character.”
I put this calculation on the rows shelf just to do a sanity check and make sure the strings came out as I expected them to.
Next is the fun part! DATEPARSE is super easy to use and incredibly flexible. The first argument, you specify what the format is. In this case, it’s “yyyy/MM/dd/hha”, where y is year, M is month (not to be confused with m which is minute!), d is day, h is hour, and a specifies “period” or AM vs PM. You can find a reference for unicode date symbols here. The great thing about DATEPARSE is that even though my months are only one digit and my days can be one or two digits, it still totally understood what I was doing. Here’s what the calculation looked like:
And here is another sanity check with the date string pulled from the URL on the left and the actual parsed date on the right!
Yay! It works! Since I know that it works now, I made it more efficient by combining these two calculations into one.
And now I can do all kinds of fun stuff with the dates! In conclusion, DATEPARSE freaking rules and doing all your datetime formulas in Excel is totally outdated.