Wednesday 28 March 2012

Creating Empty Tables

I've had a couple of conversations over the last week or so on the subject of building scripts to be as robust as possible. It's a pretty large subject though and would be a huge post, so instead I've decided to pick out one topic in particular that came up and leave the rest for another day.

You'll have guessed from the title of this post that the topic I've chosen is around how to create an empty table in QlikView, and more importantly, why the hell would I want to anyway.

Let's consider a scenario, I have a number of spreadsheets containing budget data, they all have a similar structure and each spreadsheet contains a single month's budget. The file names all use the same naming convention starting "Budget" followed by the year and month for which they hold data (eg. "Budget 2012 Jan.xls").

My first thought might be to simply use a wild card load in QlikView like this:

MyBudgets:
LOAD
    *
FROM [Budget*.xls] (biff, embedded labels, table is Budget$);


There us one problem with this though, the keen eyed will notice I said that the files have a similar structure which would suggest they aren't all exactly the same. Because of this a wildcard load would fail. Even if it did work, without the same structure the files wouldn't be concatenated into the same table anyway.  

Alternatively I could choose load the files in one-by-one as I know their file names, concatenating them into the same table to use in my app. But this would mean that if I add more budget files in the future, I have to modify the script to add the new tables and this would create a high maintenance app.

Instead I'd like to loop through all the available files and load them into the same table which is simple enough to do in QlikView like this:

FOR EACH File IN filelist('Budgets\Budget*.xls')
    CONCATENATE(MyBudgets)
    LOAD
        *
    FROM [$(File)] (biff, embedded labels, table is Budget$);
NEXT


There is a problem with this code though, the table MyBudgets doesn't yet exist and thus, for the first file it tries to load, the CONCATENATION statement will fail. As with everything in QlikView there is more than one solution to this problem. One approach is to check within the loop if the table exists and if not then create the table instead. This is a perfectly valid method but doubles the amount of code I need to write. My preferred is to create an empty table called MyBudgets first so that the loop can simply concatenate all files to it.

Creating an empty table in QlikView is really simple using an inline load statement. Going back to my scenario, I know that the budget files all contain the field BudgetID and so I can create an empty MyBudgets table containing that field and then concatenate all my files to it like this:

MyBudgets:
LOAD * INLINE [BudgetID];

FOR EACH File IN filelist('Budgets\Budget*.xls')
    CONCATENATE(MyBudgets)
    LOAD
        *
    FROM [$(File)] (biff, embedded labels, table is Budget$);
NEXT


Problem solved! You can use this trick even if you don't know the name of a field. Simply create the empty table with a dummy field name such as "MyDummyField" and then drop the field after you've concatenated the tables to it.

3 comments:

  1. Nice post.

    I usually use this code:
    MyBudgets:
    load 1 as BudgetID autogenerate 0;


    ReplyDelete
    Replies
    1. There are always 10 ways to do everything in Qlik. I would imagine that using autoload could in theory be slightly slower but since we are talking about milliseconds it would be insignificant.

      Delete