You are here

QlikView: Super boost your concatenate load statements

One of the most common practices we all use is concatenation of tables (usually our facts) inside our load script. One of the givens are that these fact tables don't always have exactly the same columns, but rather only a few overlapping core columns.

Here is a small sample of a 3 facts with concatenate loads:

FACTS:
LOAD
       //Common Fields
       'TRANSACTIONS' as SourceData,
      
[Company Code],
      
[Department Code],
      
FinYearPeriod,
      
//Keys in TRANSACTIONS
       [Account Code],
      
//Facts
       [Opening Balance],
      
Movement,
      
Budget
FROM TRANSACTIONS.qvd (qvd);


CONCATENATE (FACTS)
LOAD
       //Common Fields
       'EMPLOYEES' as SourceData,
      
[Company Code],
      
[Department Code],
      
FinYearPeriod,
      
//Keys in EMPLOYEES
       EmployeeID,
      
TranType,
      
//Facts
       Entitlement,
      
AccruedThisPeriod,
      
Taken
FROM EMPLOYEE_MASTER.qvd (qvd);

Concatenate(FACTS)
LOAD
       //Common Fields
       'SALES' as SourceData,
      
[Company Code],
      
[Department Code],
      
FinYearPeriod,
      
//Keys in SALES
       [Stock Number],  
      
[Sales Type Code],
      
//Facts
       [Sale],
      
[Discount],
      
[Cost]
FROM SALES.qvd (qvd);

 

This seems normal enough, but where can we improve the speed? The answer lies in the behaviour of Qlikview with each concatenate statement. With each new LOAD Qlikview checks which columns are common and how many new columns there are.
Then Qlikview first creates the new columns on the existing table (FACTS in this case). Then it has to assign NULL values to all those rows and only then starts adding the new data underneath.

Bring on the speed already. I hear you.

I have found two ways to help with this scenario:
1. The first approach is purely for easy maintenance and speed boost
2. The second approach gives you the speed boost and offers the benefit of doing basic Type initialisation of columns

I have tried both approaches and currently use number 1. In my case the model reload time went from 20 minutes down to 5 minutes. A huge time save of 15 minutes.

1. Speed boost only

Do an inline load before all your LOAD statements containing all the columns from all 3 tables. Then concatenate all you tables onto this new table.
The benefit here is you can generate a table without any dummy data records, so no code needed later to remove the initialisation lines. Remember to not include any [] brackets you used in you field names as they can't be used inside the INLINE load.

This means all the columns will exist from the start and Qlikview will create the empty values from the start.

Remember these are all going to be on one line in your load script. They are just wrapping here because of page width.

FACTS:
LOAD * INLINE [
       SourceData,Company Code,Department Code,FinYearPeriod,Account Code,Opening Balance,Movement,Budget,EmployeeID,TranType,Entitlement,AccruedThisPeriod,Taken,Stock Number,Sales Type Code,Sale,Discount,Cost
]
;


CONCATENATE (FACTS)
LOAD
       //Fields Here
FROM TRANSACTIONS.qvd (qvd);

CONCATENATE (FACTS)
LOAD
       //Fields Here
FROM
FROM EMPLOYEE_MASTER.qvd (qvd);

Concatenate(FACTS)
LOAD
       //Fields Here
FROM SALES.qvd (qvd);

 

2. Speed and Flexibility

Do an AutoGenerate(1) load before all your LOAD statements containing all the columns from all 3 tables. Then concatenate all you tables onto this new table. After this you can keep the dummy row and ignore it or you can drop it with a resident load and where clause.

One of the benefits here is that you can do basic Type initialisation here.

FACTS:
load
       //Common Fields
       'PREP' as SourceData,
      
null() as [Company Code],
      
null() as [Department Code],
       0     
as FinYearPeriod,
      
//Keys in TRANSACTIONS
       ''     as [Account Code],
      
//Facts
       0.00   as [Opening Balance],
       0.00  
as Movement,
       0.00  
as Budget,
      
//Keys in EMPLOYEES
       null() as EmployeeID,
      
null() as TranType,
      
//Facts
       0      as Entitlement,
       0.00  
as AccruedThisPeriod,
       0.00  
as Taken,
      
//Keys in SALES
       null() as [Stock Number],  
      
null() as [Sales Type Code],
      
//Facts
       0.00   as [Sale],
       0.00  
as [Discount],
       0.00  
as [Cost]
AutoGenerate(1);

 

CONCATENATE (FACTS)
LOAD
       //Fields Here
FROM TRANSACTIONS.qvd (qvd);

CONCATENATE (FACTS)
LOAD
       //Fields Here
FROM
FROM EMPLOYEE_MASTER.qvd (qvd);

Concatenate(FACTS)
LOAD
       //Fields Here
FROM SALES.qvd (qvd);

That's it. Just a LOAD statement before your concatenations, but it works wonders on the performance for the existing loads. Hope you found this helpful.