Who needs a Data Warehouse Administrative Dashboard?

by Nicolae Guse 11. December 2011 22:31


Keeping a Data Warehouse running perfectly is not an easy job. It’s a mix of hardware and software challenges to address, spiced with a lot of data verifications to make sure that the system is not only up and running, but also delivers the expected level of performance and data quality.

Many of the hardware and software challenges are related to keeping the system operational no matter what. Let’s think about some of the scenarios a Data Warehouse has to handle on a daily basis, and why it is so hard to have it running perfectly:

1.       Because the whole system is usually very complex and integrates data from multiple sources. If one of these data sources, often spread over a wide geographical area, has a problem, you will have a problem at Data Warehouse level. And there are a lot of things that can go wrong here, from power and hardware failures and networking outages to software problems and crashed billing processes. None of them is under the control of the BI Department, but any of them will have an impact on the Data Warehouse when the affected system is used as an active Data Source.

2.       Going back to the Data Warehouse, it can also be affected by the same problems which affect the source applications. And this is why the whole Data Warehouse system must be built with redundancy in mind:

a.       Implement Network Aliases for each main component of the Data Warehouse. These aliases will point to either the Main or the Secondary server, and will allow the applications built on top of them to continue operating even if one of the servers goes down. You should define such aliases for:

·         The servers which are hosting the Data Warehouse databases

·         The ETL servers

·         The Presentation Layer servers

All processes should take place on the Main server for each component. A Secondary server should always wait on standby, synchronized with the Main one by using features like Mirroring, Replication and Log Shipping. If something goes wrong with the Main server and it can no longer be contacted by a Witness server, then the Network Alias will automatically be re-routed to the Secondary server for the affected component (see the sketches after this list).

3.       Hardware resources are slowly overwhelmed by increasing volumes of data. You should be able both to predict and to trace the evolution of such trends before they turn into a problem (the second sketch after this list shows one simple way to capture them). And this is where monitoring and verifications come into the picture.

4.       And the list can continue....
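Since points 2 and 3 are the ones most easily automated, here are two minimal T-SQL sketches, assuming a SQL Server based Data Warehouse: the first checks that the mirrored databases behind the Network Aliases are still synchronized and that the Witness is reachable, the second takes a daily snapshot of database sizes so that growth trends can be traced. The DB_SIZE_HISTORY table is a hypothetical helper, not a system object.

```sql
-- Sketch 1: verify that mirrored databases are synchronized and the Witness
-- is connected, so that an automatic failover of the Network Alias is possible.
SELECT  DB_NAME(dm.database_id)         AS database_name,
        dm.mirroring_role_desc,         -- PRINCIPAL / MIRROR
        dm.mirroring_state_desc,        -- expected: SYNCHRONIZED
        dm.mirroring_witness_state_desc -- expected: CONNECTED
FROM    sys.database_mirroring AS dm
WHERE   dm.mirroring_guid IS NOT NULL
  AND ( dm.mirroring_state_desc <> 'SYNCHRONIZED'
     OR dm.mirroring_witness_state_desc <> 'CONNECTED' );

-- Sketch 2: capture a daily snapshot of database sizes, so that the growth
-- trend from point 3 can be analyzed later. DB_SIZE_HISTORY is hypothetical.
INSERT INTO dbo.DB_SIZE_HISTORY (snapshot_date, database_name, size_mb)
SELECT  CAST(GETDATE() AS date),
        DB_NAME(database_id),
        SUM(CAST(size AS bigint)) * 8 / 1024   -- size is stored in 8 KB pages
FROM    sys.master_files
GROUP BY database_id;
```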

 

Daily Verifications (mental) Challenges

Besides these hardware and software challenges, which are hopefully addressed by the Data Warehouse Technical Architecture and a whole army of various administrators, you should also have a process of Daily Verifications, which ensures that all Data Warehouse internal processes have run without a problem and all the data has been happily refreshed. This process usually involves:

1.       Verifying the status of all SQL jobs which are part of the Daily Data Warehouse process (a query sketch for this check follows this list)

2.       Verifying the standard and custom logs available at each Data Warehouse layer (ETL)

3.       Running a lot of customized SQL scripts in order to check for standard data consistency issues, based on both foreseen scenarios and problems that occurred in the past

4.       Running standard reports which make sure that you have the same data in the Data Warehouse and in the Presentation Layer. For example, you need to check that the amounts in the FACT_SALES table are identical with the ones from the corresponding OLAP SALES cube

5.       Analyzing the duration of each job and determining whether there is any abnormal behavior, like sudden increases or decreases in duration

6.       Recording the time when each Presentation Layer component, like the OLAP cubes, has been processed and is available for browsing, and cross-checking this information against the established SLA

7.       Creating the Daily Verification Report based on all this information and publishing it in something like SharePoint or any other application used for information sharing. This is where users connect when they need to know whether the latest data is available.
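As a minimal sketch of how points 1 and 5 can be automated on SQL Server, the query below reads the Agent job history for the jobs belonging to the daily process; the 'DWH%' naming filter is only an assumption about how those jobs are named in your environment.

```sql
-- Minimal sketch: outcome and duration of the daily Data Warehouse jobs,
-- read from the SQL Server Agent history. The 'DWH%' filter is an assumption.
SELECT  j.name                                          AS job_name,
        msdb.dbo.agent_datetime(h.run_date, h.run_time) AS run_started,
        CASE h.run_status
             WHEN 0 THEN 'Failed'
             WHEN 1 THEN 'Succeeded'
             WHEN 3 THEN 'Canceled'
             ELSE 'Other'
        END                                             AS outcome,
        -- run_duration is stored as HHMMSS; convert it to seconds
        (h.run_duration / 10000) * 3600
      + ((h.run_duration / 100) % 100) * 60
      +  (h.run_duration % 100)                         AS duration_seconds
FROM    msdb.dbo.sysjobs       AS j
JOIN    msdb.dbo.sysjobhistory AS h ON h.job_id = j.job_id
WHERE   h.step_id = 0          -- the job outcome row, not the individual steps
  AND   j.name LIKE 'DWH%'
  AND   h.run_date >= CONVERT(int, CONVERT(char(8), DATEADD(day, -1, GETDATE()), 112))
ORDER BY run_started DESC;
```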

Now, any way you put it, such a Daily Verification process is a time-consuming one. And this is without taking into account the SQL jobs which refresh various Data Marts and run frequently during the working day. Besides being time-consuming, I’ll dare to say it is pretty boring and prone to human error. Depending on your company’s organization, you could have either a Data Warehouse Administrator or a BI Developer doing this activity. As a process, Data Warehouse verifications are a MUST but, for the poor soul who has to perform them, the work is anything but glamorous.

The kinds of problems which tend to occur are related to the repetitiveness of the operations. They will take an hour or more out of your day, with minimal or no added value to you as a professional. It's very easy to say, as a manager, that you are paid to do this job, but, in practice, it's very hard to keep up the motivation and attention of the people responsible for the operation. Far too often a professional will fail to notice a trend on a certain job or process, and not because he does a crappy job, but because he's solely focused on the current day's verification. He wants to finish this repetitive process as soon as possible, because there are other projects he has to work on, with pressing deadlines.

This type of approach is a slow but sure way to get into trouble, because trends which haven't been detected will sooner or later come to the surface, and then all hell will break loose. SLAs will be breached and people will get very focused on trying to solve the problem ASAP, only to find out that there may not be a quick fix. Then managers will start asking how this could have happened, since this is exactly why we have Data Warehouse Verifications in the first place. I think you know the drill.

Overall, I’d say it's usually a bad idea to have a man do a machine's job, and this is exactly such a scenario.

 

What’s this Data Warehouse Administrative Dashboard all about?

And this is where the whole idea of the Data Warehouse Administrative Dashboard comes into the picture. The whole point of this Dashboard is to provide a graphical way to verify the health of the Data Warehouse processes. Rather than having an administrator or a developer waste an hour or more every day, you would have an automated process perform all those verifications and expose to the person in charge just the conclusions, in the form of a graphical Dashboard. This means using Business Intelligence tools to manage the main administrative tasks on the Data Warehouse. And, if you think about it, it's kind of ironic for the Business Intelligence department not to use Business Intelligence tools for its own Data Warehouse administrative tasks.

The main purpose of such a Data Warehouse Administrative Dashboard would be to show, in a friendly and easy-to-understand manner, the following types of information:

1.       Analysis on a time dimension, in order to determine trends for a specified time interval:

a. A selection of dates (ex. Days from the current week)

b. Different weeks

c.  Month, Quarter, Half-Year and Year level

2.       We are mainly interested in capturing the trends for:

a. All jobs that exist on a given list of monitored BI servers:

i. Total duration 

1. Job Steps

2. ETL packages (ex. SSIS):

a. Containers 

b. Tasks

c. Cube Process commands

ii. Errors:

1. Total Number of errors for each monitored component from above list

2. Categories of errors

3. Each individual error detailed

3.    The main sources of information will be:

a. Custom Audit table

b. SQL Server Job History  

4.    The aim of this Dashboard will be to determine in a graphical manner:

a.    The evolution for a given monitored element (ex. The DAILY process on the DWH server)

b.    Where the spikes appeared and what their causes are

c.     Based on a parameterizable Base Line, established for each monitored element, we should be able to see, through color coding, any significant deviation from this value (ex. Green for the Base Line, then Yellow, Orange and Red). For each color we should have a range of percentage deviations from the Base Line defined (see the sketch after this list).

d.    What the sources of problems are, and to be able to drill down on them to the most detailed level
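To make the color coding from point 4c concrete, here is a minimal sketch; FACT_JOB_EXECUTION and BASE_LINE are hypothetical tables, and the 10 / 25 / 50 percent thresholds simply stand in for the parameterizable ranges.

```sql
-- Minimal sketch of the Base Line color coding from point 4c.
-- FACT_JOB_EXECUTION and BASE_LINE are hypothetical tables; the thresholds
-- below are placeholders for the parameterizable percentage ranges.
SELECT  f.element_name,
        f.run_date,
        f.duration_seconds,
        b.baseline_seconds,
        100.0 * (f.duration_seconds - b.baseline_seconds)
              / b.baseline_seconds                           AS deviation_pct,
        CASE
            WHEN f.duration_seconds <= b.baseline_seconds * 1.10 THEN 'Green'
            WHEN f.duration_seconds <= b.baseline_seconds * 1.25 THEN 'Yellow'
            WHEN f.duration_seconds <= b.baseline_seconds * 1.50 THEN 'Orange'
            ELSE 'Red'
        END                                                  AS color_code
FROM    FACT_JOB_EXECUTION AS f
JOIN    BASE_LINE          AS b ON b.element_name = f.element_name;
```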

As a technical approach, you should treat this like any standard BI project; therefore it should include:

1.    Data Warehouse developments:

a.    The use of the Staging Area, to import all necessary data from the source logging systems

b.    The use of the Work Layer, where data is processed and the concept of monitored indicators is introduced

c.     The use of the Actual Data Warehouse layer, to hold the fact table (a possible layout is sketched after this list)

d.    Optionally, the use of a physical Data Mart, which would contain the final fact tables and the dimensions

2.    Presentation Layer:

a.    Backend

i.  Microsoft Analysis Services cube

ii. Any other OLAP server you may have in your company

b.    Frontend

i.  Microsoft Excel

ii. Any other OLAP Browsing tool you may have in your company
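For illustration only, here is one possible shape for the fact table mentioned in point 1c; every table and column name below is an assumption, not a prescribed design.

```sql
-- Minimal sketch of a possible fact table for the monitoring Data Warehouse.
-- All names are illustrative; adapt them to your own modelling conventions.
CREATE TABLE FACT_JOB_EXECUTION
(
    execution_id      bigint IDENTITY(1,1) NOT NULL PRIMARY KEY,
    server_key        int         NOT NULL,  -- monitored BI server (DWH, ETL, OLAP...)
    job_key           int         NOT NULL,  -- SQL job, SSIS package or cube process command
    step_key          int         NULL,      -- job step / package task, when applicable
    date_key          int         NOT NULL,  -- run date, YYYYMMDD
    run_started       datetime    NOT NULL,
    duration_seconds  int         NOT NULL,
    error_count       int         NOT NULL DEFAULT (0),
    outcome           varchar(20) NOT NULL   -- Succeeded / Failed / Canceled
);
```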


Advantages

1.       Detection of trends for the monitored jobs

2.       Early detection and warning, starting from a pre-defined list, for SQL Jobs / Steps which didn't run

3.       Detection of the share each step / package has in each monitored job. This will allow us to determine which area of the DWH we should focus improvements on, in order to get the best ROI.

4.       Providing management with accessible reports, in order to trace the DWH health status and compliance with the established SLA

5.       You could extend it in order to provide information regarding the quality of the developments:

a.       Objects that conform / do not conform to a pre-defined naming convention, for each object type

b.      Missing indexes, based on a pre-defined indexing strategy (ex. each table should have a PK defined; a query sketch for this check follows the list)

6.       Freeing up programming resources by reducing the time spent on repetitive tasks, which will allow them to be more involved in projects where they can really bring added value

7.       Having this degree of automation allows you to be proactive, rather than reactive, about the problems which can occur at Data Warehouse level. By detecting trends early, you will have the necessary time to prepare an action plan

8.       Business Intelligence Departments are always driven by Business Projects; therefore Technical interventions outside these projects are usually a tough pill to swallow, especially while working on a tight schedule. Using such an easy-to-understand, graphical tool is a great help for the Technical Team when justifying to Management the need to take action based on a detected trend.

9.       Provides you with one of the best ways to identify performance problems at Data Warehouse level and to plan future technical evolutions in order to address them

10.   You’ll find that the sky is the limit when using such a tool. You’ll always discover new information which can be captured in order to better understand the health of your Data Warehouse. You’ll also realize that it now becomes possible to combine this information with the logs gathered by the DBA and System Administrators teams, in order to trace what really happened on the server when a certain problem occurred.

11.   The possibility to monitor multiple servers simultaneously. This way, the health of servers like DEV, which are never included in the Daily Verifications, can be monitored based on exactly the same parameters as the PROD environment.
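As one example of the quality checks from point 5, the sketch below lists the user tables that are missing a primary key; it relies only on the standard SQL Server catalog views.

```sql
-- Minimal sketch of the indexing-strategy check from point 5b:
-- list the user tables that do not have a primary key defined.
SELECT  SCHEMA_NAME(t.schema_id) AS schema_name,
        t.name                   AS table_name
FROM    sys.tables AS t
WHERE   OBJECTPROPERTY(t.object_id, 'TableHasPrimaryKey') = 0
ORDER BY schema_name, table_name;
```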

In the end, I would say that having such a solution is a MUST for any Data Warehouse, in order to prevent the fire from starting rather than working frantically to extinguish it.

And now, back to you: How do you tackle the Data Warehouse Verifications in your company? Have you already implemented an automated solution for such operations?

Tags:

Data Warehouse Best Practices

