Since I’ve started the series of articles dedicated to Data Warehouse Best Practices, we've been talking a lot about focusing on reusable components like Partitioning Mechanism, Implementing a Business Rule Engine or a Generic Import Mechanism. And since the most reusable component is the code itself, I thought it's a good idea to prepare an article dedicated to the Coding Guidelines to be used for Data Warehouse solutions based on Microsoft SQL Server.
We will start up discussing about SQL Server Developments in the Part 1 of the article, continue in the Part 2 with SSIS Developments and Source Control, and end up with Development Strategy for a Business Intelligence department.
SQL SERVER DEVELOPMENT
First things first. There are quite a few SQL Server Coding Guidelines out there, and one of the best I’ve come around is the SQLServerGuidelines, written by Pinal Dave. I highly suggest you to read it, since it contains a lot of valuable information for a developer.
Beside these excellent guidelines, which are applicable to any SQL Server development, I’ve tried to add some of my own, based on my experience so far, and more specific to Data Warehouse developments based on Microsoft SQL Server.
1. Normalize database structure in the 3rd normalized form, at least for Actual Data Warehouse database. This approach allows greater flexibility when needed to regroup data at Data Mart level. For the Data Mart level, the key to obtaining better performance is de-normalization, since it greatly reduces number of joins to be performed, especially when using a ROLAP Presentation Layer.
2. Avoid using INSERT INTO / SELECT * in query’s. Instead, use in the SELECT statement specific column names. In this way we avoid any unintended impact caused by adding new column in the source table.
3. Avoid using INSERT statement without a column list, since could cause significant problems when new columns are added or the order of the existing columns is changed in the table.
More...