Wednesday 29 February 2012

QlikView on Amazon Kindle

I spotted a photo floating around the web earlier of what looked to be QlikView running on an Amazon Kindle (the original e-ink model, not the new touch-screen Kindle Fire), so of course I had to have a go myself. I grabbed my Kindle and fired up the web browser, which Amazon describe as "Experimental". Obviously this wouldn't be the richest QlikView experience ever, but would it be usable at all?




I first hit the QlikView demo site, which seemed to load fine. I chose an app and fired it up.


Apart from a lot of the text appearing surrounded by black, it was just about readable and everything appeared to be there. Now for the real test: could I make a selection?


Yup, after a slightly slower than normal refresh, the selection seemed to be applied and the charts displayed the updated results.

I've yet to find a way of right-clicking or clicking and dragging within charts, so functionality is limited and it wouldn't be my first choice of mobile device, but at a push it would be usable with nothing else to hand.

Sunday 26 February 2012

QlikView Collaborative Sessions Video

Whilst trawling YouTube I found this very cool video demonstrating collaborative sessions in QlikView.


For those that haven't upgraded yet, collaborative sessions are a new feature in version 11 which allow more than one user to share a single session. Imagine a WebEx where any user can take control.

Thursday 23 February 2012

QVSource Review

I finally got some time this week to sit down and play with QVSource, produced by the guys at Industrial CodeBox. It's something I've been meaning to get round to for a while, but other things kept getting in the way. Below is a quick run-through of my first play with it and what I thought.

What is QVSource?

On its website (qvsource.com), it is described as follows:
"QVSource makes QlikView the best connected Business Intelligence platform available and enables a new range of QlikView applications to be built which mashup and provide insight into data from a wide range web APIs and other data sources not natively supported."
In simple terms, it allows you to access a growing list of data sources which would normally take some custom development to integrate into QlikView. At the time of writing, it can pull data from 32 APIs including Facebook, LinkedIn, Twitter, YouTube, Google Analytics, even OData and POP3 email accounts. The list keeps growing too.

How does it work?

The basic principle is simple: QVSource plays middleman between QlikView and the APIs. QlikView can natively make web calls to return data, so QVSource runs a small web server which QlikView can call; QVSource then handles communication with the relevant API and returns the data to QV.
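
In script terms, that means the load script simply requests a URL from the local QVSource web server. Here is a minimal sketch of the idea (the port, path and format specifier are all hypothetical; as we'll see below, QVSource generates the real statement for you):

Tweets:
LOAD *
FROM [http://localhost:5555/QVSource/TwitterConnector/?table=Search]
(XmlSimple, Table is [data/row]); // hypothetical URL and format spec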

Getting it working

I was provided with a download link where I could obtain two things: the software and a license key file. The software comes as a zip file and, on unpacking it, I discovered no installation appeared to be necessary. OK, time to read the documentation maybe? Don't be silly, I'm an impatient techie; I just want to play, and besides, I figured trying to do it blind would be a good test of how easy it is to use. For those who do wish to read up first, there is a Getting Started Guide explaining in detail the things you need to know to get off the ground.

I fired up qvsource.exe and was presented with an initial window like this:


First things first: how to apply the trial license I had been given. Clicking on the License tab presented a screen showing that I had no license installed, with a button to the right labelled Apply License. Clicking this popped up a window with which to find and apply the license key file (.lic) I downloaded earlier. Once applied, everything looked set for me to crack on with getting some data into QV.


Moving back to the Connectors tab I decided to play with Twitter first. I selected the Twitter Connector and hit the Configure button to the right of the window. This popped up a larger window with what, at first glance, looked to be rather a lot of settings.


No worries: a helpful link towards the top right labelled Help Page For This Connector took me to the relevant Wiki page (so much for not reading the help) and it didn't take me long to work out that my next step was to hit the Authenticate button. This wasn't initially obvious, as the button appears to the right of an input field, so I assumed I needed to enter a Token before clicking it. Maybe a slight interface tweak for the future, but no biggie. Clicking Authenticate popped up a new window asking me to enter my Twitter username and password.


After entering them and hitting Authorise App, I was presented with a PIN code. The box at the bottom left asked me to enter the PIN before closing, but it had already been entered for me.


Closing this window took me back to the Twitter Connector window and the Token and Token Secret fields were now populated.


I populated the Search Term, Users Lookup and Username fields in the lower half of the window with total guesses and then clicked on the Connector Data tab at the very top. At this point things started to look very promising and the screen showed a list of tables of data available to me from the Twitter API.


I chose the Search table as it was top of the list, and after QVSource did its thing I was presented with a table of data showing tweets matching the search term QlikView, as entered in the configuration screen. But what next? How to get this data into QV? Then I noticed the QlikView Load Script tab.


This tab offered me a ready-made LOAD statement to cut and paste into QV. Leaving QVSource running in the background so it could work its magic, I quickly pasted the generated LOAD statement into the script of a new QVW and hit Reload. It was as simple as that: I had Twitter API data in QlikView. I quickly built a couple of objects in QV to prove all was as expected.


I performed this test with QV10 but have since also tried it with QV11, and there is no reason it shouldn't work with older versions of QlikView too. I also played with some of the other tables of data available through the Twitter API, with equal success.

Pros

Where do I start? The obvious benefit is in being able to pull data from sources which in the past would have proved difficult. QVSource pays for itself in the time saved doing this alone.

As you can see above, it's pretty simple to use too. Anyone with a basic understanding of QlikView should be able to get things up and running in a matter of minutes. Without reading much documentation I had it pulling data from Twitter in under 10 minutes.

If you do get stuck, the documentation is second to none, with a Wiki dedicated to QVSource covering the entire product, including how to get the individual connectors working. You may well need it: despite how simple QVSource is to use, some of the web APIs can be a little tricky to get running if you are not familiar with them. Following the Wiki solved any trouble I had, and I've since also pulled data from Facebook, LinkedIn, Google Analytics and a mailbox, all without too much trouble at all.

Cons

Well, not a lot really; I had to scrape the bottom of the barrel to find these. Remember, though, that this is a very new product, so I'm sure the development work Industrial CodeBox are doing will continue to improve it going forward.

The first thing to note is that dealing with large volumes of data will be very slow. This isn't a limitation of QVSource, however, but rather of the speed with which data is returned from the APIs. Bear in mind that the data will also be pulled over your internet connection, which again could prove slow. Most APIs limit the amount of data they will return anyway.

I found dealing with multiple instances of the same connector to be a little tricky too. For some connectors, such as the MailBox Connector, the authentication details must be entered into QVSource rather than in the QV script. Because of this, you would need a separate instance of QVSource running for each mailbox you wish to read data from. This isn't too hard to achieve: simply copy the QVSource folder repeatedly and configure each instance to run on a different port (a setting on the Advanced tab). Moving the authentication details into the request URL in the QV script would make this work with a single instance of QVSource, and this has already been done for some connectors, such as the Facebook Personal Connector. I'm sure it's something they'll get round to implementing for all connectors soon, and I'll drop the guys an email to make sure I'm not overlooking an obvious solution.

Also, the QVSource application needs to be running in the background for QlikView to refresh the data. This is fine when using QV Desktop, but when running in a server environment QVSource needs to be able to recover if the server is rebooted and, more importantly, continue to run if the user logs out. There is, however, a relatively simple workaround to this issue using the Windows Scheduler, as you can read here. Maybe the future could bring a version which runs as a service with a web control panel, as this would bring it in line with the QVS architecture.

Conclusion

The power of being able to analyse social media and web traffic data is unquestionable for marketing companies and departments (possibly even for social recruiters). QVSource coupled with the power and ease of QlikView makes doing just that a very quick and simple task.

Would I recommend it? Without a doubt!

Wednesday 22 February 2012

SET vs. LET

I got into a conversation with someone today who was struggling to understand the difference between SET and LET. It wasn't the first time I've had this conversation, so I thought I'd explain it here too in case any QV newbies are also struggling with it.

SET is taken literally: whatever is on the right of the equals sign is written into the variable as text. So writing:

SET vMyVariable = 5*2;

Would result in vMyVariable containing "5*2".

LET is used when you want QlikView to evaluate what is on the right of the equals sign and write the answer into the variable. So writing:

LET vMyVariable = 5*2;

Would result in vMyVariable containing "10".
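
A handy rule of thumb: use SET for literal text such as file paths, and LET whenever the right-hand side needs evaluating. For example (the variable names are my own):

SET vQvdPath = ..\QVDs\;        // stored literally; expanded later as $(vQvdPath)
LET vReloadStart = num(now());  // evaluated now; stores a number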

Simples!

Checking if a File Exists

A question that comes up fairly regularly from customers, and also on the QlikCommunity, is how do you check whether a file exists before loading it? As with everything in QV, there is more than one way to do this. Below I'll demonstrate the three methods most commonly used in QV, although there are without doubt more.

Personally, I would use method 1 if dealing with a single QVD file and method 2 for all other files or when dealing with wildcards in file names. I would avoid method 3; although it is a completely valid method, in my opinion it belongs within a much more complex custom error handling routine, which is outside the scope of this post.

Method 1 - qvdCreateTime()

This method uses an inbuilt function within QV called qvdCreateTime(), which returns the date and time the QVD was created. If the QVD doesn't exist, qvdCreateTime() returns NULL, and thus we can use it to check whether a QVD exists. Obviously this method is limited to QVD files only. It also cannot handle wildcards.

IF (NOT isNull(qvdCreateTime('myqvdfile.qvd'))) THEN
    LOAD * FROM myqvdfile.qvd (qvd);
END IF


Method 2 - For Each File in Filelist()

This method uses the FileList function within a FOR EACH loop; the function can only be used inside such a loop. It has the benefit that wildcards can be used within the file name and path, allowing you to load multiple files that match a pattern by looping through them. Assuming the field names are all identical, the files will be concatenated into a single table. Unlike method 1 above, this method can be used with any file type. Note that the loop variable holds each matching file name, so the LOAD should reference it:

FOR each File in FileList('myqvdfile.qvd')
     LOAD * FROM [$(File)] (qvd);
NEXT File


Method 3 - ErrorMode & ScriptError

This method isn't strictly checking whether a file exists, but rather changing the way QlikView reacts to errors when it tries to load a file it can't find. To do this we set the ErrorMode variable to 0, which stops QV from halting script execution when an error occurs. We then attempt the load and afterwards check the ScriptError variable to see whether an error occurred. A ScriptError value of 1 means no error occurred; a value of 8 denotes that the file does not exist. We can replace the TRACE statements below with the relevant actions to take. Finally we set ErrorMode back to its default value of 1 and continue with the rest of the script.

SET ErrorMode = 0;

LOAD * FROM myqvdfile.qvd (qvd);

IF ScriptError = 8 THEN
    TRACE File not found;
ELSEIF ScriptError > 1 THEN
    TRACE A different error occurred;
ELSE
    TRACE File loaded;
END IF

SET ErrorMode = 1;

Tuesday 21 February 2012

QlikTech Release Pricing

In a very bold move, QlikTech today released their full price list publicly on their website.

http://www.qlikview.com/us/explore/pricing

This certainly sets them apart from most software vendors especially those in the BI space. Will it have any effect though? Time will tell.

Friday 17 February 2012

Compression in QV

I was talking to someone this morning who mentioned how amazed they were at the compression QlikView managed to achieve. This is a common misconception about QlikView, so after explaining it to him, I thought I'd repeat my explanation here too.

FACT: QlikView does not use compression for QVDs!

So why then is a QVD so much smaller than a CSV containing the same table of data, and why is a QVW so much smaller than several QVDs concatenated together into a single table? The answer lies in understanding how QlikView stores data, and the key to this is understanding some basic facts about QlikView's Associative In-Memory Technology. This is what QlikTech have to say about it:
"QlikView uses an associative in-memory technology to allow users to analyze and process data very quickly. Unique entries are only stored once in-memory: everything else are pointers to the parent data. That’s why QlikView is faster and stores more data in memory than traditional cubes."
So from this you can easily deduce that the reason QlikView appears to compress data so much is that it stores only unique values, thus removing repetition in the data. Think about it for a second: if I have a million rows of data but a field contains only 10 unique values, then QlikView stores those 10 values only, not a million. This applies equally to the RAM QlikView needs to hold the data and to the hard disk space needed to store the QVW and QVD files.

You can even use this fact to your advantage when dealing with large volumes of data in QlikView. Breaking down a field into several fields can create more repetition and thus reduce the resources QV needs to hold the data, both in RAM and on disk. Email addresses are prime candidates for this technique, as the domain part will likely not be unique to one entry. Take this example:

EmailAddress
bob.jones@mycompanydomain.com
jane.smith@mycompanydomain.com
paul.hill@mycompanydomain.com

QlikView will consider each of these values unique and thus will store the "@mycompanydomain.com" part repeatedly. If you split the email addresses at the "@", the domain part becomes the same for all three records and QV will store it only once. You can split the email address using the following lines within a load statement:

....
left(Email,index(Email,'@')-1) AS EmailName,
right(Email, len(Email)-index(Email, '@')) AS EmailDomain
....
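
Alternatively, QlikView's SubField() function achieves exactly the same split with slightly more readable syntax:

subfield(Email, '@', 1) AS EmailName,
subfield(Email, '@', 2) AS EmailDomain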


When you want to display the email address in a chart, you only have to concatenate the two fields together, remembering to replace the '@' in the middle, like this:

EmailName & '@' & EmailDomain


The same technique can be used for postal addresses, phone numbers, part codes and any other patterned, repetitive data.

QlikTech Q4 and 2011 Results Announced

QlikTech announced their Q4 and full 2011 results after close of trading last night. The highlights are as follows:
  • Total revenue of $108.1 million increases 33% compared to fourth quarter of 2010
  • License revenue of $75.7 million increases 34% compared to fourth quarter of 2010
  • Full Year 2011 total revenue of $320.6 million increases 42% compared to 2010
  • Full Year 2011 license revenue of $204.4 million increases 41% compared to 2010
That's some impressive growth! You can read the full press release here:

http://www.streetinsider.com/Press+Releases/QlikTech+Announces+Fourth+Quarter+and+Full+Year+2011+Financial+Results/7189508.html

Wednesday 15 February 2012

Excluding Zeros in Set Analysis

I've been reading a lot of historic posts on the QlikCommunity (yes, I need to get out more) and came across an interesting question: how to exclude zeros when performing an average calculation in an expression. The offered solution is a noteworthy snippet:

avg({< Value -= {0} >} Value)

This is a perfect solution, but it got me thinking about the difference between "-=" and "=-". The former takes the current selections in the Value field and then removes the zeros. The latter ignores the current selections in the Value field and removes zeros from all possible values. To avoid the confusion you could use the following syntax instead:

avg({$-< Value = {0} >} Value)

This method would give the same answer but be careful if you try to add other selections to the set modifier as you would remove more data along with the zeros.
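
For reference, here are the two exclusion styles side by side (the comments summarise the behaviour described above):

avg({< Value -= {0} >} Value)  // current selections in Value, minus the zeros
avg({< Value =- {0} >} Value)  // all values except zero, ignoring selections in Value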

Monday 13 February 2012

Qualifying Field Names

The QUALIFY statement prefixes the specified field(s) with their table name, separated by a full stop (period for our American friends), and applies only to fields loaded after the QUALIFY statement. More than one field can be named in the same QUALIFY statement, as follows:

QUALIFY Name, StartDate, EndDate;

QUALIFY can be very useful for ensuring field names are unique, thus avoiding unintentional relationships between tables. For example, too many times have I come across a field called "Name" in multiple tables of a database which, when loaded into QlikView, results in data model issues such as customers being connected to employees by their name.

As you might expect, QUALIFY has a sister statement, UNQUALIFY, which is used to exempt fields from a previous QUALIFY statement. But why would you need this rather than simply removing the field name from the QUALIFY statement? The key lies in the ability to use wildcards in a QUALIFY statement, like this:

QUALIFY *;
UNQUALIFY EmployeeID, OfficeID;


The downside to QUALIFY is that it can result in very long, and sometimes meaningless, field names, so I do not use it in production solutions, preferring to rename fields that cause problems to something more meaningful and unique. I have, however, found one really powerful use for QUALIFY. When performing "Seeing Is Believing" (SIB) events for prospective clients, or when prototyping a new application, time is at a premium and speed is everything. SIBs can be high-pressure environments, and you are normally loading tables from a source you're not familiar with, relying on the client to tell you how the data is structured. In this scenario, being able to ensure all field names are unique, except those you explicitly specify as key fields, can save a lot of time and embarrassment. The fastest method I've found is to QUALIFY all fields and then UNQUALIFY all fields which start with a prefix such as "Key_". You can do this as follows:

QUALIFY *;
UNQUALIFY "Key_*";

Employees:
LOAD
   EmployeeID AS Key_EmployeeID,
   Name,
   OfficeID AS Key_OfficeID,
   StartDate,
   EndDate
FROM employees.qvd (qvd);
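
After this load, the table contains Key_EmployeeID, Employees.Name, Key_OfficeID, Employees.StartDate and Employees.EndDate; only the two key fields keep their plain names, so only they can form associations with other tables.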


Note the double quotes in the UNQUALIFY statement, as these are important. I hope this little technique saves you as many headaches as it has saved me over the years I've worked with QlikView.

Sunday 12 February 2012

QlikView TRACE Statement

TRACE is a little-used but very useful statement in QlikView script. It simply outputs a string to both the script execution progress window and the script log file. The syntax is very simple:

TRACE 'Loading the source tables';

The real benefit of TRACE is being able to write custom place markers or comments into the application script log file. This can help when trying to track down where a script error is occurring or if implementing custom error handling. The latter is an entire topic in itself though and one I might cover in the future.
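
TRACE also expands dollar-sign variables, which makes it handy for stamping the log with dynamic values. A quick example (the variable name is my own, and it assumes a table called Transactions has already been loaded):

LET vRows = NoOfRows('Transactions');
TRACE Transactions table loaded with $(vRows) rows;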

Empty a QVD

After my post last week about the power of binary loads, I was asked how to empty a QVD and why you would do this rather than deleting the file.

Let's tackle the how first. The following subroutine will empty a QVD of all records but leave the table structure definition within the QVD. We'll cover why this table definition is important in a second.

SUB EmptyQVD (Path)
   //Check file exists
   IF (NOT isNull(qvdCreateTime('$(Path)'))) THEN
      //Check if QVD is already empty
      IF (qvdNoOfRecords('$(Path)') > 0) THEN   
         //Load the structure from the existing QVD
         //Where clause returns no records
         _TableStructure:
         LOAD
             *
         FROM '$(Path)' (qvd)
         WHERE 1<>1;

         //Store table definition back to QVD
         STORE _TableStructure INTO '$(Path)' (qvd);
         DROP TABLE _TableStructure;
      END IF
   END IF
END SUB
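
Calling the subroutine is then a one-liner (the file name is illustrative):

CALL EmptyQVD('transactions.qvd');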


This code first checks that the QVD file exists. If it does, it then checks whether the QVD is already empty (no point in doing unnecessary work). Assuming it isn't, it performs a load from the QVD using the WHERE clause "1<>1", which always evaluates to false, so the resulting table has the same fields as the QVD but no records. This empty table is then stored back to the QVD and the temporary table is dropped.

So why empty the QVD and not just use a macro or external batch file to delete it? Well, there are two main reasons:
  1. Depending on which method you used, you would need either to raise the module security to "System Access" or to allow the script permission to "Execute External Programs".
  2. Any QVW which loads the QVD will fail to reload if the file does not exist, unless you check that it exists first.
Checking the file exists is simple enough (we do so in the code above), but if you use the QVD in many places you may forget to perform the check. By saving an empty QVD with the correct field structure, any QVW which loads from it will still reload successfully even if you forget; the resulting table will simply contain no data.

Thursday 9 February 2012

Gartner Magic Quadrant 2012

Gartner have released their Magic Quadrant for Business Intelligence Platforms report for 2012, and QlikView is sitting pretty amongst the leaders.

Source: Gartner (Feb. 2012)

You can read the full report at http://www.gartner.com/technology/reprints.do?id=1-1982NPD&ct=120208&st=sb

QlikView Binary Loads

Anyone who has worked with QlikView for long will likely have used at least one binary load. For those who haven't, a binary load allows one QVW to inherit the entire data model, data included, from another. Because the data doesn't need to be reprocessed, binary loads are really quick, simply copying the 1s and 0s from disk to RAM.

A binary load must be the very first statement on the first tab of your script and you can only perform one binary load per script; that's all there is to know about them. The syntax couldn't be simpler either:

BINARY report\myreport.qvw;

There are basically two main uses for binary loads. The most common is to share a data model between two QVWs. This can be as simple as copying the model from an existing QVW when adding a new reporting app to an existing solution. Whilst this is a perfectly valid technique, it assumes you want the exact same data model in both files. But this doesn't have to be the case: there is nothing to stop you dropping tables from the model or loading additional tables once you've performed a binary load. The benefits are straightforward: it saves development time and speeds up the reload process by avoiding having to load a complete new model from the data source or QVD.

Fig.1: Basic use of binary load
The other main use of a binary load is within advanced incremental build scenarios. When performing an incremental build, loading the modified data from the data source may be very quick, but if you have large volumes of history data, loading it from QVD can still take a long time. A binary load can solve this problem, shaving precious minutes and even hours off reload times. The technique can be tricky to understand at first, but once you have your head around it, implementing it is relatively simple. As with any incremental build, be sure to consider whether managing deletions is necessary, which can make things a little trickier to implement. The best way to understand the process is as follows:
  1. Start with a standard incremental build process using a QVD to store the transaction table and a reporting interface which loads in the QVD.
  2. A new QVW is created which runs each night or even each weekend. This loads in the contents of the QVD and adds it to any history it already holds. Once complete, it empties the QVD. The incremental process creating the QVD therefore means the QVD stores only records added since the last reload of this new QVW.
  3. The reporting interface is modified to first binary load the new QVW to get the history data and then concatenate the contents of the QVD (see the sketch below).
Fig.2: Binary as part of an incremental build
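
In script terms, step 3 boils down to something like this (the file and table names are illustrative, and it assumes the inherited model contains a transaction table called Transactions):

BINARY history.qvw;

CONCATENATE (Transactions)
LOAD * FROM transactions.qvd (qvd);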
Hopefully you can see how a binary load can be a powerful tool in your QlikView arsenal. Using a combination of these two techniques, I have in the past managed to reduce a customer's reload time from 18 hours to well under an hour. That alone can mean the difference between a successful and a failed implementation.

Tuesday 7 February 2012

QV and Sharepoint

The guys over at Industrial CodeBox have been busy again, this time adding the ability to read SharePoint 2010 lists to their OData Connector. You can find full details here:

http://www.qvsource.com/wiki/oData-Connector.ashx

Sunday 5 February 2012

QlikView Power Tools

I spent some time last week playing with the QlikView PowerTools, which I hadn't had time to test properly since they were released in November last year. For those yet to come across them, they are a series of tools for more advanced QlikView Server administrators.

You can download the QlikView PowerTools from the QlikView Community site.

Currently the tools included are as follows:
SharedFileViewer
Allows you to view bookmarks, server objects and other data stored in QlikView Server .shared files. It also allows you to repair and defrag large .shared files. 

QlikView Server Agent
Allows you to manage the QlikView Server services more efficiently.

QMS API Client
Allows you to use the new QMS API functions in QlikView Server without writing any code.

QV User Manager
A QlikView Server command-line tool allowing you to add and manage user CALs.

QvsDetector
Scans the network looking for other QlikView Servers and presents some basic information about them.

Reload Schedule Migration Tool
A tool for migrating version 9 QVPR XML databases to versions 10 and 11.

Server Object Handler
Allows administration of server objects.

Server Object Handler Batch
A command-line version of the Server Object Handler above.

XmlDbViewer
Lets you view and edit the QlikView Publisher Repository. 

Saturday 4 February 2012

Mobile QlikView

QlikTech have certainly spent a bit on their latest marketing drive into mobile BI.


This got me thinking: I wonder what percentage of customers are actively using QlikView on a mobile device? I've been consulting for over four and a half years now and have worked with too many customers to remember, but I could easily count the ones who have even a single person regularly using an iPad or similar to access their solution.

Is this the future???

Incremental Builds in the Real World

The term "incremental build" (also known as "delta load") gets used a lot in the QlikView world, often as a magic solution to reducing reload times and minimising the load on underlying systems and IT infrastructure. The theory of how to implement them is, on the surface, pretty simple but as with many things in the QV world, theory and reality can be two very different things.

The first question you should ask yourself is "are incremental builds even possible?"

You might think answering that question is simple: the source table in question has a unique record identifier and a last modified date or timestamp, so the answer is of course yes. Wrong! You need to ask another question, and forgetting it is a trap I've not only seen many people fall into but am guilty of falling into more than once myself.

The important next question before laying finger to keyboard should be "are the last modified date or timestamps accurate and reliable?"

At this point you might be thinking, why wouldn't they be accurate? The short answer is that people make mistakes. I have literally lost count of the number of customers I've worked with who, after running incremental builds within their QlikView apps for some time, have discovered that the answers they are getting have drifted from the numbers produced by their other reporting.

The most common cause lies with the accuracy of the last modified fields: a stored procedure or batch process is periodically run which updates records but doesn't update the last modified fields. I've even come across commercial software which updates the last modified fields sometimes and not others. Working out what is changing data without updating the last modified fields can be tricky. The fastest way is often to talk to any system administrators and DBAs who might have an idea of what other processes update records. A rogue stored procedure might be simple to sort; a tactful word with the DBA could be all that's needed. Bugs in commercial software are not so simple to identify and solve though; even if a fix can be obtained, it may be many months before it can be implemented in the live environment.

If a fix can't be found, unfortunately there is no simple workaround in QlikView, but there are a few things you can do to help. The simplest is to perform a full reload overnight or after the culprit process has run.

Another important question to consider is "can records be deleted from the datasource?"

This can again cause the version of the data held by QlikView to drift from that in the source system. This is however something which can be addressed in QlikView during the incremental build process. Managing deletions is a whole topic on its own though and one I'll explain another day.

Thinking back to the first question we asked: what if there aren't last modified dates in the data source tables? That doesn't necessarily mean the end of the incremental world. Ideally, a DBA should be able to add a last modified field to the table, updated by a trigger each time a record is modified, allowing you to perform a true incremental build. If this is not possible, there are still techniques and tricks that can be used to speed up a rebuild, giving you what is sometimes known as a "pseudo-incremental build". Again, this is a whole topic in itself and a subject for another post.

So to sum up, when thinking of implementing an incremental build process in your QlikView solution, ask the following questions before writing a single line of script (a sketch of the basic pattern follows the list):
  • Do the source system fact tables have last modified date or timestamps and unique record identifiers?
  • Are the last modified fields reliable?
  • Can records be deleted from the source system tables?
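
If you can answer yes, yes and no to those questions, the textbook pattern is then relatively short. Here is a minimal sketch (all table, field and file names are illustrative, and the date handling in the WHERE clause will depend on your database):

// Find the high-water mark from the existing QVD
_MaxDate:
LOAD max(ModifiedDate) AS MaxModifiedDate
FROM transactions.qvd (qvd);
LET vLastReload = peek('MaxModifiedDate', 0, '_MaxDate');
DROP TABLE _MaxDate;

// Pull only new or changed rows from the source
Transactions:
SQL SELECT OrderID, ModifiedDate, Amount
FROM Orders
WHERE ModifiedDate >= '$(vLastReload)';

// Append the unchanged history, skipping rows just reloaded
CONCATENATE (Transactions)
LOAD OrderID, ModifiedDate, Amount
FROM transactions.qvd (qvd)
WHERE NOT Exists(OrderID);

// Save the combined table back for next time
STORE Transactions INTO transactions.qvd (qvd);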