Together we've discovered in Custom Row Versioning - Part 1 what were the main challenges we were facing in order to address the very hot topic of the concurrent data access. Now we will focus on the technical solution that we've implemented for SQL Server 2008 R2 Database Engine. However, as you will see, the solution is pretty straight forward and a similar approach could be applied under different database engines.
What I've realized while working on this data concurrency problem is that the final solution will involve a Row Versioning solution that would have to work with Partition Switch approach. This meant that we should be able to handle any INSERT/UPDATE/DELETE operations that would be reflected in the final table, as a result of the table Partition Switch, without:
- being blocked by an ongoing user query on that FACT / DIMENSION table
- blocking any new user reporting launched while the refresh operation was in progress or completing
The final Custom Row Versioning mechanism will therefore include 3 main components:
1. A Row Versioning Solution, which will ensure a unique Version Key (ex. PUBLISHED_VERSION_KEY=1) assigned to each run of the Custom Row Versioning mechanism. This involves the following main tables:
a. CRV_PUBLISHER - setup table containing the list of all parameterized data publishers. Where the publisher concept corresponds to each table for which we want to track the changes occurred during a refresh operation. Contains the following fields:
i. PUBLISHER_KEY (integer, auto-increment);
ii. CRV_TABLE_NAME (varchar - ex. CT_FACT_SALES);
iii. FINAL_TABLE_NAME (varchar - ex. FACT_SALES)
b. CRV_STATUS - setup table, containing the possible status values for a published version. Contains the following fields:
i. STATUS_KEY (integer, auto-increment);
ii. STATUS (varchar - ex. FINISHED; ONGOING; FAILED; DISABLED)
c. CRV_PUBLISHED_VERSION - transaction table, with one line added for each run of the Custom Row Versioning mechanism for a given publisher. Contains the following fields:
i. PUBLISHED_VERSION_KEY(integer, auto-increment);
ii. PUBLISHER_KEY (integer, FK);
iii. STATUS_KEY (integer, FK)
2. A Custom Change Tracking Mechanism, which will record in a separated CT table (ex. CT_FACT_SALES) all differences between the TMP table (TMP_FACT_SALES), and the FINAL table (FACT_SALES):
· In order to do this, we've adapted an existing custom build Data Compare component, which was inserting one line in the CT table for each INSERT/UPDATE/DELETE it detected as a result of the compare between the TMP and the FINAL table, for the list of partitions that needed to be compared. All these records will be reported on the Version Key generated by the Row Versioning Mechanism, and we will mark for each record the type of operation detected (I/U/D).
· The Custom Change Tracking Mechanism it's also automatically generating the CT table, using a Table Cloning mechanism, generating an identical structure with the FINAL table, plus the additional columns: PUBLISHED_VERSION_KEY;CT_OP_TYPE (I/U/D)
· In this way we will know exactly which changes have been determined at each run of the Custom Row Versioning Mechanism
· Initially, when the Custom Change Tracking Mechanism starts running for a given publisher, it will insert into CRV_PUBLISHED_VERSION one record with STATUS='ONGOING'. In this way, this version will be invisible to the associated CRV SQL View. Only when the Custom Change Tracking Mechanism finishes running, the STATUS of the current PUBLISHED_VERSION_KEY will be changed to 'FINISHED', which doesn't generate blockages since STATUS='ONGOING' is ignored by the CRV View
3. A CRV SQL View (ex. CRV_VW_FACT_SALES), designed to expose the data from both the FINAL and the CT table in a way that doesn't generate blockages during reporting and ensures data consistency (data currently loaded by the CRV mechanism is invisible to the user). Basically, the role of this view is to display the latest data to the final user, with one PK being returned just once, regardless of the number of times that record was modified by different published versions. This view will replace any previous usage of the final FACT table at Presentation Layer level (ex. Business Objects).
The view includes the following logic:
1. Step 1: We build a Common Table Expression (CTE) containing all PUBLISHED_VERSION_KEY's to be taken into account, for the selected publisher (ex. FACT_SALES), having STATUS='FINISHED'. This will contain an actual snapshot of all versions that need to be taken into account when the view is running.
2. Step 2: We build 2 SELECT statements, returning 2 result sets that will be merged using UNION ALL, based on the following logic:
a. First SELECT will return ALL records from the FINAL table, which don't have a correspondence in the CT table, for the versions recorded in the CTE, based on a join made on the tables PK.
b. Second SELECT will return the latest version (MAX(PUBLISHED_VERSION_KEY)) of the record from the CT table (ex. CT_FACT_SALES), for each individual PK and for the versions recorded in the CTE. All records are returned, with the exception of the records for which the latest modification is a delete operation (CT_OP_TYPE<>'D')
The final result will be a view that will look-up like this:
CREATE VIEW VW_CRV_FACT_SALES AS
/*Step 1: CTE creation*/
WITH FINISHED_PUBLISHED_VERSIONS
AS
(
SELECT PUBLISHED_VERSION_KEY
FROM CRV_PUBLISHED_VERSIONS CPV
INNER JOIN CRV_PUBLISHERS CP ON CPV.PUBLISHER_KEY=CP.PUBLISHER_KEY
INNER JOIN CRV_STATUS CS ON CPV.STATUS_KEY = CS.STATUS_KEY
WHERE CP.FINAL_TABLE_NAME = 'FACT_SALES'
AND CS.STATUS='FINISHED'
);
/*Step 2: Proceed with main SELECT build*/
/*2.1: Records found exclusively into Main table*/
SELECT STORE_KEY, CUSTOMER_KEY, INVOICE_ID, INVOICE_LINE, INVOICE_DATE, QUANTITY, AMOUNT_EURO
FROM FACT_SALES FS
WHERE NOT EXISTS
(
SELECT TOP 1 TRANSACTION_SALES_KEY
FROM CT_FACT_SALES CFS
INNER JOIN FINISHED_PUBLISHED_VERSIONS FPV ON CFS.PUBLISHED_VERSION_KEY = FPV.PUBLISHED_VERSION_KEY
WHERE CFS.TRANSACTION_SALES_KEY = FS.TRANSACTION_SALES_KEY
/*If the table is partitioned, than adding the Partitioning Key in the join will provide a significant performance boost/*
AND CFS.PARTITION_ID = FS.PARTITION_ID
)
UNION ALL
SELECT STORE_KEY, CUSTOMER_KEY, INVOICE_ID, INVOICE_LINE, INVOICE_DATE, QUANTITY, AMOUNT_EURO
FROM CT_FACT_SALES CFS
INNER JOIN
(
SELECT MAX(PUBLISHED_VERSION_KEY) AS PUBLISHED_VERSION_KEY, TRANSACTION_SALES_KEY
FROM CT_FACT_SALES CFS
INNER JOIN FINISHED_PUBLISHED_VERSIONS FPV ON CFS.PUBLISHED_VERSION_KEY = FPV.PUBLISHED_VERSION_KEY
GROUP BY TRANSACTION_SALES_KEY
) SQRY ON CFS.PUBLISHED_VERSION_KEY = SQRY.PUBLISHED_VERSION_KEY
AND CFS.TRANSACTION_SALES_KEY = SQRY.TRANSACTION_SALES_KEY
WHERE CFS.CT_OP_TYPE<>'D'
Now that we've clarified the technical solution, let’s talk a bit about the advantages and disadvantages of this Custom Row Versioning solution.
Advantages:
· We finally have a solution for all those escalating blockages at database level. That, most importantly, works with Partition Switching!
· Works correctly with any Reporting Solution build on top, regardless if its ROLAP, MOLAP, HOLAP or In Memory OLAP
Disadvantages:
What? Are there disadvantages? Isn't this the magic cure for all diseases affecting your Data Warehouse??? Sadly, no, and here are some reasons for this:
· Adding an extra layer (the CRV view) on top of your Fact Table will never be as fast as using the Fact table directly in your presentation layer application. The performance hit will vary, depending on the complexity of your query, but it will always be there
· If you are already experiencing performance problems with your existing ROLAP reporting solution, which aren't caused by database blockages, it likely that these performance problems will get worst by adding a CRV view. Remember that the Custom Row Versioning mechanism has been designed to address database blockages, and NOT query performance problems. Other options, like using ColumnStore indexes, should be considered in order to improve the performance, especially while considering they work fine with Partition Switching.
· I would always recommend using a Custom Row Versioning approach only as a last resort for database blockages, due to above reasons. Instead, you should explore first alternative solutions for limiting the blockages occurring at database level. I'll give you some examples of the alternative solutions we've implemented:
1. Instead of proceeding directly with Partition Switch at the end of our ETL Process (ALTER TABLE SWITCH PARTITION), we've now implemented a check to launch the command only if the table is NOT in use. If the table is in use, we will enter a waiting queue where we are waiting for a specified number of seconds before checking if the table is not in use. Therefore, by launching the Partition Switch only when it makes sense to, we get rid of the Attempt Exclusive Lock at table level, which was causing a lot of blockages
2. One colleague of mine come up with the brilliant idea of having a Generic MERGE mechanism, as an alternative to Partition Switch, for refreshing the data from the Final table, based on the one from the TMP table. The generic Stored Procedure receives as parameters the Source Table (ex. TMP_FACT_SALES), Destination Table (ex. FACT_SALES) and the Partition List and generates a generic MERGE statement that can be used for data refresh. Very useful alternative for refreshing Dimension and smaller Fact tables.
Basically, what we've ended up with is a Data Refresh Framework that can use several alternative mechanisms for refreshing the data:
· Partition Switching - the preferred refresh solution when it can be done
· Generic Merge - parameterized as an alternative refresh solution for a given table, when Partition Switching cannot be done and when TMP table record count is under a specific threshold
· Custom Row Versioning - used as last resort, when Partition Switching cannot be done and Generic Merge is NOT a viable option, either from the performance point of view or it still generates dangerous blockages which are affecting the performance of other queries
In practice, we are using log tables integrated into each of the Data Refresh Framework components in order to trace for how many times we've entered a waiting pattern before being able to perform Partition Switch. Based on this information, we take the decision to activate the remaining refresh options as the Generic Merge or Custom Row Versioning. What is important to understand is that, each time we are able to perform Partition Switching or Generic Merge, the STATUS for ALL existing Published Versions associated to the Final Table will be changed to DISABLED, and the CT table will be truncated (if it's not already in use). In this way, the users will be able to access the latest version of the data.
At the end of this article, I underline 2 major things:
· The importance of stepping out of your comfort zone. Initially we've had a Data Mart design that was dependent of a certain reporting solution, in our case the Microsoft Analysis Services backbone. Now, after years of evolution, we've ended up with a Data Mart design that's independent of the reporting solution used, regardless if it's MOLAP, HOLAP, ROLAP or In Memory OLAP. We've also ended up with a clean Atomic Level implementation, that was hunting us for years. And this is what is called evolution.
· The importance of teamwork. Great ideas and best solutions to complex problems appear when having separate individuals looking at the same problem from different angles. And nothing can beat this.
And now, back to you: Do you use a Row Versioning solution in order to address data concurrency issues? Is it a database engine feature, like SQL Server Snapshot isolation, or is it a custom solution?
Cheers,
Nicolae