The Executive Guide to Data Warehousing

A well-designed and structured data infrastructure can lead to savings of 10-30% of revenue.

The DAMA Data Management Body of Knowledge estimates that organizations spend 10-30% of revenue on handling data quality issues.

While every company strives to be ‘data-driven’, evidence-based decisions and policies are only as good as the data they are based on.

Two fixable drivers of this cost are:

  • Data analysts losing valuable time to cleaning data, at a high opportunity cost.
  • Further costs resulting from poor decisions made on the back of weak analysis.

In this article, we'll show you how to build a data warehouse that is based on four key principles:

  1. Speed 
  2. Flexibility
  3. Centralization
  4. Accuracy

Your new system will let you unlock reliable insights routinely and efficiently, helping your business grow faster.

Speed

Facebook, Netflix, Google, Amazon and many other high-growth companies all follow the “10,000 experiment rule”. This rule states that the number of insights you derive through experimentation and analytics will generally turn out to be the core driver of growth. 

If you want to grow and learn at speed, you need to analyze and adapt at speed. There are two obstacles here: 

  1. Difficulty making the right early, data-driven decisions as analysis is lagging behind
  2. Analytics teams not having the capacity to keep up with product changes, because ad hoc analyses are slow to run

To unlock analytical speed, we believe your data warehouse should be built from reusable components and structured in three layers:

  1. Raw data - ‘Bronze layer.’ This collects and standardizes your data sources, for example, raw subscription data from payment providers.
  2. Business logic - ‘Silver layer.’ This contains your core business logic, for example, how you recognize monthly recurring revenue.
  3. Reporting - ‘Gold layer.’ This contains tables used in analyses and reporting, for example, recurring revenue split by geography and business group.

The ‘Gold’ layer is what you end up seeing in your dashboards.

However, the speed at which these insights are generated depends on having built solid analytical foundations in the ‘Bronze’ and ‘Silver’ layers.
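
As a rough illustration, the three layers can be sketched as three small, reusable Python functions. This is a minimal sketch, not a production design: the raw field names (`id`, `status`, `amount_cents`, `interval`, `country`) and the revenue rule (annual plans spread evenly over 12 months) are assumptions made for the example, and in practice each layer would usually be a table or model in your warehouse.

```python
from collections import defaultdict

# Hypothetical raw export from a payment provider (field names are assumptions).
RAW_SUBSCRIPTIONS = [
    {"id": "sub_1", "status": "active", "amount_cents": "12000", "interval": "year", "country": "US"},
    {"id": "sub_2", "status": "active", "amount_cents": "2500", "interval": "month", "country": "GB"},
    {"id": "sub_3", "status": "cancelled", "amount_cents": "2500", "interval": "month", "country": "US"},
]


def bronze_subscriptions(raw_rows):
    """Bronze: standardize column names and types; no business rules yet."""
    return [
        {
            "subscription_id": row["id"],
            "is_active": row["status"] == "active",
            "amount": int(row["amount_cents"]) / 100,
            "billing_interval": row["interval"],
            "country": row["country"],
        }
        for row in raw_rows
    ]


def silver_mrr(bronze_rows):
    """Silver: one shared definition of monthly recurring revenue (MRR).

    Assumed rule: only active subscriptions count, and annual plans
    are spread evenly over 12 months.
    """
    months_per_interval = {"month": 1, "year": 12}
    return [
        {
            "subscription_id": row["subscription_id"],
            "country": row["country"],
            "mrr": row["amount"] / months_per_interval[row["billing_interval"]],
        }
        for row in bronze_rows
        if row["is_active"]
    ]


def gold_mrr_by_country(silver_rows):
    """Gold: a reporting table, here MRR split by geography."""
    totals = defaultdict(float)
    for row in silver_rows:
        totals[row["country"]] += row["mrr"]
    return dict(totals)


report = gold_mrr_by_country(silver_mrr(bronze_subscriptions(RAW_SUBSCRIPTIONS)))
print(report)
```

A new ‘Gold’ report, such as MRR by business group, would reuse `silver_mrr` unchanged, which is where the speed comes from.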

Flexibility

As your business grows, the tools you need to keep growing will change. You don’t want your reporting and analytics infrastructure to stop you from making the changes that unlock further growth.

However, upgrades are often stressful. If all of your reporting pulls data directly from a specific tool, changing that tool can trigger a large and complex analytics project to rebuild all your existing reporting. Mammoth’s approach is to create a ‘Bronze’ model that replicates the raw data but removes the tool dependency.

Let’s say you collect revenue on a monthly basis and you are looking to change your payment provider from Chargebee to Stripe. We would create a ‘Bronze model’ that pulls ‘active subscriptions’. When you change to Stripe, you just need to map the Stripe columns to the columns created when you made ‘active subscriptions’ for Chargebee. 

If you had 100 pieces of analysis referencing this table, you would only have to change this mapping in one place and the rest of your downstream models would be unaffected. If you didn’t have this abstraction, all 100 pieces of analysis would need to change to pull Stripe data instead of Chargebee. This has been a big headache for data leaders all over the world.
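
The mapping idea above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the provider column names (`cb_subscription_id`, `stripe_sub_id`, and so on) are hypothetical placeholders, not the real Chargebee or Stripe export schemas, and `active_subscriptions` and `total_monthly_revenue` are names invented for the example.

```python
# Each mapping goes from the tool-agnostic Bronze column name to the
# provider-specific column name. All provider column names are hypothetical.
CHARGEBEE_COLUMNS = {
    "subscription_id": "cb_subscription_id",
    "customer_id": "cb_customer_id",
    "monthly_amount": "cb_monthly_amount",
}
STRIPE_COLUMNS = {
    "subscription_id": "stripe_sub_id",
    "customer_id": "stripe_customer",
    "monthly_amount": "stripe_amount_per_month",
}


def active_subscriptions(raw_rows, column_map):
    """Bronze model: rename provider-specific columns to tool-agnostic ones."""
    return [
        {bronze_name: row[source_name] for bronze_name, source_name in column_map.items()}
        for row in raw_rows
    ]


def total_monthly_revenue(bronze_rows):
    """Downstream analysis: depends only on the Bronze column names."""
    return sum(row["monthly_amount"] for row in bronze_rows)


# The same two subscriptions, as each provider might export them.
chargebee_rows = [
    {"cb_subscription_id": "s1", "cb_customer_id": "c1", "cb_monthly_amount": 50},
    {"cb_subscription_id": "s2", "cb_customer_id": "c2", "cb_monthly_amount": 30},
]
stripe_rows = [
    {"stripe_sub_id": "s1", "stripe_customer": "c1", "stripe_amount_per_month": 50},
    {"stripe_sub_id": "s2", "stripe_customer": "c2", "stripe_amount_per_month": 30},
]

before_migration = total_monthly_revenue(active_subscriptions(chargebee_rows, CHARGEBEE_COLUMNS))
after_migration = total_monthly_revenue(active_subscriptions(stripe_rows, STRIPE_COLUMNS))
print(before_migration, after_migration)
```

Switching provider means swapping `CHARGEBEE_COLUMNS` for `STRIPE_COLUMNS` in one place; `total_monthly_revenue`, and any other downstream analysis, does not change.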

With our approach, you can unblock future tool changes, and your analytics stack will never be an impediment to business growth.

Centralization

Ask five people in your company to define revenue and it's likely that you will receive six answers! No executive is happy about this situation, as it casts doubt on the credibility of the company’s analysis. Their focus inevitably shifts to why these numbers are different rather than what the analysis actually says. 

Analysts hate it too. They want to be producing high-quality reports, and not spending their time figuring out why the numbers aren’t consistent.

Solving this problem is beneficial for all parties involved. But it does require a set of well-documented and tested tables that contain organization-wide metrics, such as revenue calculations, users, sales calculations, and categorizations. 

At Mammoth, we perform this task in the ‘Silver layer’.

There are two real benefits of centralization:

  1. Your core business logic is centralized and used by all departments. If separate teams each have their own queries or models that calculate key metrics, this leads not only to inconsistencies but also to duplicated work building and maintaining the models.
  2. It creates multiple org-wide owners for key business metrics, reducing key-analyst risk. Most organizations have multiple business-critical reports built on queries that only one or two analysts understand. If these analysts leave, the queries become ownerless and nobody updates them for fear of breaking them.

Accuracy

The leading driver of data costs is the cost of making bad decisions based on poor data. This is most pronounced in the marketing department. We find that consistent marketing efficiency gains of 15-30% are possible when implementing best-in-class marketing analytics. Please refer to our marketing-specific data warehouse blog post for information on how to realize this. 

Successful decision-making calls for accurate, reliable data in the format that each department uses for its analysis. The data used in decisions is often in the ‘Gold’ layer and therefore relies on the accuracy level achieved in earlier layers. 

The layered approach boosts report accuracy in a number of ways: 

  1. Analysts defining new reports build on the ‘Silver layer’ rather than starting from scratch, which leads to quicker analysis and a lower chance of errors.
  2. Any data inconsistencies in core business metrics are identified quickly, because those metrics are used across the organization.
  3. Because more people use the lower layers, there is a stronger business case for investing in robust model testing and checking.
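
To make ‘model testing and checking’ concrete, here is a minimal Python sketch of three common checks on a ‘Silver’ table: uniqueness, no missing values, and no negative amounts. The table layout (`subscription_id`, `mrr`) and the choice of checks are assumptions for illustration; dedicated data-testing tools offer the same kinds of checks in a more robust form.

```python
def check_unique(rows, column):
    """Report every value that appears more than once in the column."""
    values = [row[column] for row in rows]
    duplicates = sorted({value for value in values if values.count(value) > 1})
    return [f"{column}: duplicate value {value!r}" for value in duplicates]


def check_not_null(rows, column):
    """Report every row where the column is missing or None."""
    return [
        f"{column}: null in row {index}"
        for index, row in enumerate(rows)
        if row.get(column) is None
    ]


def check_non_negative(rows, column):
    """Report every row where the column holds a negative number."""
    return [
        f"{column}: negative value in row {index}"
        for index, row in enumerate(rows)
        if row.get(column) is not None and row[column] < 0
    ]


def run_checks(rows):
    """Run all checks on a hypothetical Silver MRR table and collect failures."""
    failures = []
    failures += check_unique(rows, "subscription_id")
    failures += check_not_null(rows, "mrr")
    failures += check_non_negative(rows, "mrr")
    return failures


# A deliberately flawed sample table: one duplicate ID, one null, one negative.
silver_rows = [
    {"subscription_id": "s1", "mrr": 10.0},
    {"subscription_id": "s1", "mrr": None},
    {"subscription_id": "s2", "mrr": -5.0},
]

failures = run_checks(silver_rows)
for failure in failures:
    print(failure)
```

Because every ‘Gold’ report reads from the same ‘Silver’ table, a check added here protects all of them at once.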

Conclusion

In conclusion, a well-designed and structured data infrastructure can lead to savings of 10-30% of revenue.

By adopting a layered approach, starting from the raw data at the ‘Bronze layer’ to the core business logic at the ‘Silver layer’, and finally, the reporting at the ‘Gold layer’, organizations can improve analytical speed, increase flexibility, centralize metrics, and ensure data accuracy.

This approach can help organizations make better-informed decisions and unlock further growth without being hindered by their reporting and analytics infrastructure.

Data Team as a Service is Mammoth Growth’s product. We provide a turnkey team that has every element of a high-end, enterprise-grade analytics department, at a fraction of the cost.  

You gain access to industry-leading experts in the field and can instantly spin up your BI project without the risk, overhead, and delays of building out a full-time team.

Schedule a Discovery Call: email info@mammothgrowth.com

