Tuesday, 4 December 2012

The QlikView dual() Function

As I touched upon in the first part of my series of posts on Dates in QlikView, one of the first things developers learn about QlikView is that there are no traditional data types. What they some times don't learn is that instead QlikView stores field data as dual values containing both a string, which is used for displaying to the user, and a numerical value used for sorting and calculations.

Let's consider a simple example of a field containing calendar months, where as well as the month name, QlikView also stores the number of the month.

NumberText
1Jan
2Feb
3Mar
......

Where possible, QlikView works out the text and numeric parts automatically for many data types when loading the data. All inbuilt QlikView functions also return a dual value and when passed a dual value they automatically use either the text or numeric part depending on what type of input they are expecting. Because of this automatic behaviour, many people are totally unaware that this is how QlikView stores field data. It is however a very powerful feature which adds to the speed and efficiency of QlikView. As a developer, being able to control this behaviour and define the dual values when QlikView can't work them out automatically would also be very powerful and thankfully this is indeed possible using the dual() function.

The dual() function allows us to specify the individual text and number parts to be combined into a single dual value:

dual(MyText, MyNumber) AS MyField
As I mentioned above, QlikView automatically works out which part of the dual value to use depending upon what you are asking it to do and so we normally don't need to give it a second thought. But again we can control this behaviour using the text() and num() functions should we need override and return one specific part:

= text(MyField)
= num(MyField)

There is one important point to note about QlikView automatically assigning dual values when loading data, which confused me for a while when I first came across it and I know has caught out many others. If I load 2 records into the same field for which the text part is different but the underlying number is the same, QlikView will assign the first of the text values it comes across to all following records where the numeric value is the same. So source data that looks like this:

NumberText
4Open
5Complete
5Closed

would be converted and stored in QlikView like this:

NumberText
4Open
5Complete
5Complete

As you can see, the last text value of "Closed" is replaced with the first value QlikView reads in for the same numeric value of 5, which happens in this example to be "Complete".

Once aware that this is how QlikView behaves you can use this to your advantage however. Because it prioritises the first text value it comes across, if you control which value this is, all other values with a matching underlying numeric value will be assigned to that first value and will result in a form of data cleansing. I'll leave it up to you to experiment and find ways to leverage this.

Because of it's simplicity and a lack of understanding on how QlikView stores data, the dual() function isn't used very often and so I thought it would be good to look at some practical examples of how it can be used and the benefits it gives. Hopefully these examples will make you consider how you could used it within your own applications.

Calendar Quarters

In the next part of my series of posts on Dates in QlikView, we'll look at separating a date into it's useful component parts, one of which is calendar or fiscal quarters. When using fiscal quarters particularly, it can be confusing to the user which calendar months fall within a given quarter and so we can use the dual function to give a useful text description to the user as well as retaining the underlying quarter number to use when performing calculations. One way to do this would be to use an inline mapping table like this:

QuarterMap:
MAPPING LOAD * INLINE [
    QuarterNum, QuarterDesc
    1, Apr - Jun
    2, Jul - Sep
    3, Oct - Dec
    4, Jan - Mar
]
;

Calendar:
LOAD
    Date,
    Month,
    Year,
    dual(applymap('QuarterMap', Quarter), Quarter) AS Quarter
RESIDENT MyData;

Assuming our underlying data table has a Quarter field containing the number of the quarter, we can use the applymap() function to look up the quarter number within the mapping table and return a more meaningful description. The dual() function then combines this description with the quarter number to make a single dual value field called Quarter. When displaying the field to the users, QlikView will use the text descriptions which are a lot more meaningful. When asked to perform a calculation using the field or when sorting the quarters, QlikView uses the underlying numeric value.

Shipping Statuses

This technique for using dual() was coined "Dual Flags" by Rob Wunderlich and is both simple and powerful. The basic principal is that rather than simply using 1 or 0 as a flag, you use dual() to create a dual value flag containing a more meaningful description also. For example, let's imagine we have a set of shipping data. Each record relates to an order that has either been shipped or not. Many would be tempted to create a simple flag to indicate the shipping status like this:

if(NOT isnull(ShipDate), 1, 0) AS Shipped
Instead we can use the dual() function as follows:

if(NOT isnull(ShipDate), dual("Shipped", 1), dual("Not Shipped", 0)) AS Shipped
This creates a dual value flag which opens up many more options for how we can use it. We can add the field in a list box which will display the more logical "Shipped" rather than a simple 1. All of the following would now also be possible and valid to use within expressions and/ or the script:

if(Shipped, ....
if(Shipped=1, ....
if(Shipped="Shipped", ....
count({$<Shipping={"Shipped"}>} ShipID)
sum(Shipped)

An interesting point to note here is that we can use sum(Shipped) rather than the less efficient count({$<Shipped={"Shipped"}>}ID) or even just count(Shipped). A sum is a simpler computational calculation than a count and so when working with very large data sets, the speed improvement of using sum() can be considerable.

CRM Statuses

I load data from a CRM system and each opportunity has a status such as "Lead", "In Progress", "Qualified", etc. Like the example of Quarters above, I can use a mapping table and applymap() to give each status an underlying number as well as the text description as follows:

StatusMap:
MAPPING LOAD * INLINE [
     OpStatus, OpStatusID
     Failed, 0
     Lead, 1
     In Progress, 2
     Qualified, 3
     Won, 4
     Closed, 5
]
;

Opportunities:
LOAD
     OpID,
     OpDate,
     dual(OpStatus, applymap('StatusMap', OpStatus)) AS OpStatus
RESIDENT MyData;

As before, when displaying OpStatus to the users we see the meaningful text description. But we can also use some interesting and powerful calculations because of the underlying number. For example we can count all active opportunities using the following simple set analysis:

count({$<OpStatus={">=Lead <=Won"}>} OpID)

I hope these examples show just some of the uses for the dual() function and that you might consider the benefits of using it in your QlikView applications going forward.

5 comments:

  1. awesome. Thanks for the tips and well written examples. Really helps to see it spelled out like this.

    ReplyDelete
  2. Very deeply explained.. i was not aware of this magical behavior of dual function..
    Thanks for such a nice article

    ReplyDelete
  3. excellent article. crisp and clear, thanks.

    ReplyDelete