I've not been posting anywhere near as often as I would like, but in my defence I've have been rather busy trying to keep clients happy and make some progress on some new projects. So to get things rolling again here, I thought I'd write the first of a series of posts on dates. Don't worry, not the relationship kind, I'm not building a speed dating QlikView app, I tried but couldn't get hold of any data.
Handling date and timestamp data in QlikView seems to be one of the most common things people have difficulties with. It need not be difficult though and once you understand how dates are stored and a few simple facts about QlikView's inbuilt date and time functions you should have no problem at all. An important thing to note is that how QlikView stores dates is very similar to how most popular databases store them also and so the explanation I'm going to give here can be applied to some other technologies also.
The first thing you need to know about dates in QlikView is that they are really stored as a dual value. This value contains both a string (such as "01/01/2012") representing the date and number (such as "40909") defining the number of days since the epoch date. Dates prior to this base date are simply stored as a negative number of days. A common misconception is that the epoch date is 1st January 1900. What the epoch date is doesn't really matter to us when working with dates but for those that want to know, try creating a text object in QlikView and add the following expression:
=num(makedate(1900,1,1))
The answer you get back should be 2 and from that we can see that the QlikView epoch date is in fact the same as the Oracle epoch date, 30th December 1899.
What does matter to us when working with dates is being able to calculate simple things such as the number of days between 2 dates. Thankfully this is simple because of the way QlikView stores dates, we can simply subtract one date from another to get our answer. QlikView is intelligent enough to know to use the number part of the stored date and the answer will give us the number of days between the two dates.
Timestamps are handled exactly the same as dates with one key difference, rather than being stored as an integer they include a decimal indicating the time as a fraction of a day. For example, today's date (13th July 2012) would be stored as an integer of 41103 and so midday today (12:00pm) would be 41103.5 (midday being 0.5 of a day). 5:15pm today would be 41103.71875 and so on. Just like date, timestamps are stored as a dual value.
A common thing to want to do is to create a data from a timestamp, effectively removing the time. At first glance this might seem simple as follows:
date(MyTimestampField)
On the surface this appears to have worked correctly returning a date such as "13/07/2012". However, the date() function in QlikView is only a formatting function and thus affects only the string within the dual value stored. For example, midday today would be stored as both an numerical value of 41103.5 and a string "13/07/2012 12:00:00". Using the date function would not change the number but would change the string as expected to "13/07/2012". The problem with this is that when displaying values, QlikView will use the string if one exists, but when comparing values, QlikView will take the numerical value as priority. Because of this, two timestamps of different times on the same day may look the same when displayed, but will not be considered the same when QlikView is asked to compare them.
There is a simple solution to this problem, in fact there are 2 solutions. The first is to use the daystart() function which sets the time part of the timestamp (the decimal) to midnight (0). There is one issue with using the daystart() function and that is that it sets the string within the dual value as if it was a timestamp such as "13/07/2012 00:00:00". This can be solved by using the date() function to change the string formatting to just the date like this:
date(daystart(MyTimestampField))
The alternative method to convert a timestamp to a date is a little more old school but effectively does exactly the same thing. Because a timestamp is stored as a number with a decimal, we can use the floor() function to remove the decimal and thus remove the time. However, like using the daystart() function, the floor function outputs a number without a string representation of the date and so we must again us the date function to format it correctly in to a dual value like this:
date(floor(MyTimestampField))
Which method you choose to use is completely up to you, there is no noticeable efficiency benefit of one method over the other.
In the next part we'll look at what to do if your database isn't storing proper dates and how to use the date interpretation functions date#, timestamp#, makedate and maketime to solve this problem.