Is My BI Fat? Database Partitioning Can Help!
Your data warehouse started so perfect! It was a lean mean analytical machine. Then reality set in. The users fed it all the data they could get their hands on. When it got fat and slow, they refused to put it on a diet because they need “everything!” Data archival just isn’t in the cards and never seems to be with any data warehouse these days.
What’s a BI technologist to do? Enter database partitioning. Before we get into the details though, a few questions need to be answered:
- Does my data warehouse have tables that contain hundreds of millions of records?
- Are the queries accessing identifiable subsets of the data?
- Are the subsets of rough equal size and are there enough of them?
- If your data changes, will it stay in the same subset?
If you said yes to all of these questions, you might be a candidate for partitioning.
There are many ways to partition data and the strategy is often case specific. For our example we’ll assume that the best way to partition the data is to group the data by sales region. Imagine a company with five sales regions and billions of orders spread relatively evenly across all sales regions.
Why would I partition this way? Primarily, sales data is generally analyzed by region and then by time and other attributes. Additionally, the sales regions are static and rarely change. When the company grows larger, new sales regions can be added or we can re-partition by region and additional attributes.
What’s the overhead of partitioning?
- Licensing. Partitioning is often considered an advanced or enterprise type feature. This means that standard edition SQL Server will not support this feature and you will have to upgrade.
- These partitions don’t maintain themselves. Initial setup and ongoing care and feeding will be required.
- Partitioned tables will consume more disk space. This is a tradeoff for increased performance.
- Experienced resources that can design and implement the solution from a business and technical perspective will also be needed.
What to do?
- First and foremost, profile the data and how it is used in the business
- What parts of the information flow are experiencing the most pain? Some of you will experience painful data integration whereas others will see issues at the client interface layer. The symptoms vary by case.
- Design the partitioning strategy
- Assess how future growth and change will affect your partition strategy
- On a smaller scale, validate your strategy before committing too much time or resources.
You may want to consider bringing in experience consultants for guidance, design validation or actual implementation to ensure your project is a success.
Additional Business Intelligence Posts
If you enjoyed this post from Chad, you may be interested in reading a few of his past posts on Business Intelligence:
It’s The Data Quality
My Search For The Business Intelligence Chupacabra
Do Tableau And MDS Make Strange Bedfellows?
If They Only Had Tableau
Technology Meetups Make A Difference
What’s the R in ROI (for BI)?
Microsoft BI: Catching the Deadliest Dashboard
Tech Smart vs. Business Smart: Which Hat Does Your IT Consultant Wear?
Recommended Business Intelligence Posts
You might also be interested in reading some additional posts from SWC’s Business Intelligence experts:
How to Fast Track Business Intelligence
OAuth 2.0 – Google API Business Intelligence Implementation
Ask SWC: What’s New with MDS In SQL Server 2012?
Can’t afford BI? Try the BI Analytics Tools in Everyday Software
How to Break Business Intelligence Users’ Excel Addiction
An Agile Approach to Business Intelligence
No Biggie: Data As A No Brainer
Marry your database. Date your dashboard
Ask SWC: What is the New R Feature for Tableau 8.1?
Ask SWC: What’s So Great About Tableau?