BI Technology

How to Revitalize Your BI with In-memory Analytics

As humans, we have very flexible thought processes.  We ask questions that lead to answers.  If we don’t like the answer, we ask a different question.  Unfortunately, traditional Business Intelligence (BI) methods do not share this flexibility with us humans.
 
Throughout my career, I have had the opportunity to experience BI both as a consumer, when I was an analyst, and as a provider, when I was a BI administrator.  These roles were both fascinating and frustrating.  As an analyst, I often felt that my hands were tied; I was limited to working with restrictive data models that gave me access to only a narrow slice of the available information.  I needed a way to perform analysis across more business dimensions than the dozen-or-so that were provided.  As a BI administrator, I was drowning in requests to provide more and more data, so the analysts could answer more and more questions.  Answers would lead to more questions, which would lead to more requests, which would lead to more answers.  Rinse and repeat.  Again and again.
 
It was, and still is for many, a never-ending cycle with the goal of obtaining more business insight.  It’s not that the data wasn’t there; it was just largely inaccessible to those who needed it most.  The BI solutions we worked with were bulky, imposed rigid requirements on how users could interact with the data, and were considered more of an IT solution than a business one.  Thus, dimensional analysis was a limited, somewhat painful process for analysts and administrators alike.
 
My goal for this post is to provide you with a closer look at traditional BI methods, discuss what many consider to be their limitations, and present an alternative method that leverages modern technologies and provides the BI practitioner with far greater analytic agility.

Background

Effective dimensional analysis is an interactive process by which analysts can gain greater insight through the exploration of data across different business measures, attributes, and dimensions.
 
Traditionally, dimensional analysis has been conducted by leveraging online analytical processing, or OLAP, models.  These models contain various data elements which can be categorized as either business attributes (such as sales region, products, and dates) or measures (such as sales quantity and revenue).  The attributes and measures included in the model must be carefully defined to accommodate the anticipated business requirements for analysis.
 
Other important considerations when defining an OLAP model are 1) defining a drill path for the attributes; and 2) choosing how to aggregate the measures.  A drill path is a logical sequence by which a business analyst can start with summary level information, and then “drill” to a more detailed level.  For example, starting with sales figures at the “Sales Region” level, drilling to a more detailed “Sales State” view, and finally to the individual “Sales Representative” level.  At each of these levels, the analyst can view measures such as “Sales Quantities” or “Revenue”, which have been aggregated to that level of the drill path.
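
To make the drill path concrete, here is a minimal sketch (in Python with pandas, using made-up data and column names rather than an actual OLAP engine) of the same measures rolled up at each level of the Sales Region → Sales State → Sales Representative hierarchy:

```python
import pandas as pd

# Hypothetical sales fact table; column names and values are illustrative only.
sales = pd.DataFrame({
    "sales_region": ["East", "East", "East", "West", "West"],
    "sales_state":  ["NY",   "NY",   "MA",   "CA",   "CA"],
    "sales_rep":    ["Ann",  "Bob",  "Cam",  "Dee",  "Eli"],
    "quantity":     [10, 5, 8, 12, 7],
    "revenue":      [1000.0, 500.0, 800.0, 1200.0, 700.0],
})

# Summary level: measures aggregated by region.
by_region = sales.groupby("sales_region")[["quantity", "revenue"]].sum()

# Drill one level down: region -> state.
by_state = sales.groupby(["sales_region", "sales_state"])[["quantity", "revenue"]].sum()

# Drill to the most detailed level: the individual representative.
by_rep = sales.groupby(["sales_region", "sales_state", "sales_rep"])[["quantity", "revenue"]].sum()

print(by_region, by_state, by_rep, sep="\n\n")
```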
 
OLAP solutions have generally been adopted in one of two ways:

  1. Multidimensional Online Analytical Processing (MOLAP).  Commonly referred to as “cubes”, MOLAP solutions involve extracting data elements from one or more operational sources and building a separate, pre-aggregated data structure, which serves as the source for analysis.
  2. Relational Online Analytical Processing (ROLAP).  This method involves defining an OLAP model in which the data elements are mapped back to the original data sources.  No separate data store is created in a ROLAP model; requests are sent directly to the underlying sources at query time.

Organizations must then decide which model (or perhaps both) best fits their requirements.  MOLAP solutions are best suited when requirements involve smaller data volumes, a finite set of attributes and measures, and fast response times.  ROLAP models provide for analysis against larger data volumes, with a broad range of attributes and dimensions; however, response times are typically much slower (as compared to MOLAP).  To provide an acceptable level of performance using ROLAP, summary tables containing pre-aggregated data are often leveraged, as are materialized views.
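
As a rough sketch of the trade-off (illustrated here with Python and pandas rather than any particular OLAP engine, and with made-up tables and columns), a MOLAP-style approach materializes a pre-aggregated structure up front, while a ROLAP-style approach aggregates the detailed source data at request time:

```python
import pandas as pd

# Detailed source data (stands in for the operational system).
orders = pd.DataFrame({
    "region":  ["East", "East", "West", "West", "West"],
    "product": ["A", "B", "A", "A", "B"],
    "month":   ["Jan", "Jan", "Jan", "Feb", "Feb"],
    "revenue": [100.0, 250.0, 175.0, 300.0, 125.0],
})

# MOLAP-style: extract and pre-aggregate into a separate, cube-like structure.
# Queries against it are fast, but only the attributes built into it are available.
cube = orders.groupby(["region", "month"])["revenue"].sum().unstack("month")

# ROLAP-style: no separate store; each question is answered by aggregating
# the source rows on demand, which is more flexible but typically slower.
west_by_product = orders[orders["region"] == "West"].groupby("product")["revenue"].sum()

print(cube)
print(west_by_product)
```

The cube answers its built-in questions quickly, but the product-level question can only be answered by going back to the detail rows.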

So, what’s the problem with the traditional OLAP methods?
 
Simply put, these models are highly inflexible, do not support an iterative analytic process, and require significant administrative overhead.
 
In the previous section, I described the need to pre-determine which data elements to include in the model, pre-define the drill paths, and pre-aggregate the measures.  This would be fine if every question that was ever going to be asked were known ahead of time.  Guess what?  That is never the case.  All of this pre-determining, pre-defining, and pre-aggregating prevents you from gaining true business insight, because once the measures have been summarized, the detail needed to answer a new question is no longer there.  What if I want to view a business measure by some attribute that’s not in the model?  Or, what if I didn’t want to drill from “Sales Region” to “Sales State”, but instead wanted to drill directly into “Products”?  And, what if instead of only viewing my “Total Revenue” by month, I wanted to quickly see my “Average Revenue” by month?  Well, back to IT we go to redefine the model!

Exploring new dimensions with extreme analytic agility

An iterative process is one in which a series of operations is repeated until a desired goal, target, or result is achieved.  As described above, traditional OLAP methods do not support an iterative analytic process.  You can have speed with MOLAP, or analyze large volumes of data with ROLAP, but not both.  And in either case, rigid data models often prevent you from obtaining answers to the questions that arose from asking other questions.  Remember, as humans, we have very flexible thought processes!  There is an undeniable need for solutions that break through this inflexible barrier and provide analysts with the speed, agility, and flexibility required to gain true business insights.
 
Technological advances in the areas of in-memory storage and computing have provided a means by which we can now achieve truly flexible, iterative analytics.  Cutting-edge analytics solutions allow organizations to analyze terabytes of data, across all business dimensions and measures, while delivering extreme performance at scale.
 
It has been said that memory is the new disk.  By loading, or caching, your data into RAM, analytic performance improves by orders of magnitude over traditional, disk-based solutions.  This is critical because iterative analysis demands very fast response times; delays lead to decreased user adoption and a lack of insight.
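
As a simple illustration of the pattern (a sketch in Python with pandas; the file name and column names here are hypothetical), the disk cost is paid once when the data is cached in RAM, and every subsequent question is answered from the in-memory copy:

```python
import pandas as pd

# Load (cache) the data set into RAM once; this is the only disk read.
# "sales_history.parquet" and its columns are placeholders for your own data.
sales = pd.read_parquet("sales_history.parquet")

# Every subsequent question runs against the in-memory copy, so each
# iteration of ask -> answer -> ask again stays interactive.
revenue_by_region = sales.groupby("region")["revenue"].sum()
revenue_by_product = sales.groupby("product")["revenue"].sum()
top_reps = sales.groupby("sales_rep")["revenue"].sum().nlargest(10)
```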
 
But in analytics, the importance of speed extends far beyond the “wow factor” of immediate results.  The pre-processing required by the OLAP models described above exists primarily to provide adequate performance.  The only way these methods have been able to satisfy business users is to aggregate the measures, at each level of each attribute, before any analysis takes place.  The result is a constrained data set that does not provide the flexibility for true data exploration.

Now, with the power, capacity, and speed of in-memory computing, all calculations can be performed on the fly.  You can aggregate data interactively, however your needs demand; drill through and across all dimensions (not just the dozen-or-so modeled in your cube); and still experience rapid response times across large volumes of data.  The extreme performance of in-memory processing therefore not only provides the “wow factor”, but also gives analysts an effective way to gain a deep understanding of their data and their business, without the restrictions of the traditional, rigid OLAP process.
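
To make that concrete, here is a minimal sketch (in Python with pandas, using toy data and illustrative column names, not any specific in-memory analytics product) of what on-the-fly aggregation looks like once the detail-level data is held in memory:

```python
import pandas as pd

# Detail-level data held in memory (illustrative columns and values).
sales = pd.DataFrame({
    "region":   ["East", "East", "West", "West", "West"],
    "product":  ["A", "B", "A", "B", "B"],
    "channel":  ["web", "store", "web", "web", "store"],
    "month":    ["Jan", "Jan", "Jan", "Feb", "Feb"],
    "revenue":  [100.0, 250.0, 175.0, 300.0, 125.0],
    "quantity": [1, 3, 2, 4, 1],
})

# Nothing is pre-aggregated, so any question is just a different group-by:
# drill straight from region into product, skipping a pre-defined drill path...
region_product = sales.groupby(["region", "product"])["revenue"].sum()

# ...slice by an attribute that was never modeled in a cube...
by_channel = sales.groupby("channel")["revenue"].sum()

# ...or switch the aggregation itself, e.g. average instead of total revenue by month.
monthly = sales.groupby("month")["revenue"].agg(total="sum", average="mean")

print(region_product, by_channel, monthly, sep="\n\n")
```

Each of the questions raised earlier (an unmodeled attribute, an off-path drill, and a different aggregation) is just another expression against the same in-memory data.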

Conclusion

Many would argue that data has become the single most valuable commodity in the world. However, it is only valuable if it is accessible and can deliver insight.  BI practitioners have long had to work with inflexible analytic models, with a heavy reliance on IT.  In-memory computing provides a powerful, flexible platform which enables users to analyze data at scale, aggregate business measures however needed, and truly explore new dimensions.

 
