During your script development process, the GROUP BY clause is often (probably for every single solution you build) needed to summarise data.
You are also, more often than not, required to rename fields from your source to more appropriate field names.
We then end up with syntax that looks something like this:
Although the above code will produce the correct results, it will take 10 times longer to load than it has to.
You need to break the process up into three steps:
- Rename all fields while creating resident table;
- Store the resident table to a QVD; (Don’t forget to drop the resident table);
- Perform the GROUP BY by loading from the QVD that was created in step 2 (which already has the appropriate field names);
Instinct will tell you to skip step 2 and load step 3 out of the resident table, but, although there will be fewer lines of code, your result will be 25% slower.
The resulting code will look like this: