Thursday, March 13, 2014

Using Enterprise Bus Matrix for developing your BI solutions

Enterprise Bus Matrix

The enterprise bus matrix is used in the requirement gathering process while designing enterprise wide BI systems.Many organizations who are using an "Agile" approach to designing warehouse solutions use the Bus matrix to get their requirements specified.

How do you make the bus matrix ?

  • Invite your business users,stakeholder and an experienced facilitator to drive the conversation
  • Identify your key business needs.
  • Involve multiple departments using the system.
  • Identify all the data that will be used in the system.
Conformed dimensions are the heart of the bus matrix. Making sure that consistency is there across different business units makes key stakeholders looks at KPI's over a variety of business processes.
Facts should also be conformed.

The objective of using a a bus matrix approach is so that stovepipe data marts are not created.
The matrix defines processes that can be used. Dimensions represent the grain that can be defined.

An overview of the Bus Matrix
  • Conformed dimensions and conformed facts
  • Shared business process and dimensions
  • Planning,communication and expectation management.
  • Rows-Present processes,Columns-represents dimensions
  • After the core processes and dimensions are identified, you shade or “X” the matrix cells to indicate which columns are related to each row.

Matrix extensions

  • Opportunity matrix:
Replace dimension columns with business functions.
  • Analytics matrix:
In this case, reference the stable bus matrix rows but list the complex analytic applications as columns, shading the boxes to indicate which business processes are needed by each application to convey the prerequisite building blocks.
  • Strategic business initiatives:
List organization key initiatives or executive hot buttons as columns mapped to the underlying process metric rows.
  • Detailed implementation bus matrix:
Matrix rows are expanded to list individual fact tables or OLAP cubes, along with their specific granularity

Tuesday, February 18, 2014

Tableau reports

Using Tableau Desktop to create basic reports.

In this blog post I am looking at creating a very basic report using Tableau.

We will use one of the sample excel sheets to create this basic report. I will provide a brief walk through with screenshots.

Tableau Desktop

This is the initial data menu. You have a vast list of database types that you can choose from. Tableau gives its own set of sample excel sheets that will use for this example.

Select Sample-superstore-Excel


We see that the dimensions and the measures show up on the left hand side.
Tableau gives you a platform to create a wide variety off reports. You can create bar graphs,scatter plots and it has some amazing visualization tools for geography enabled data as well.

Lets say we're interested in finding the customer segment information. Drag and drop the customer segment from the dimension to the "Rows" tab. You should get a table with the following information.




Lets analyze the sales across different states for the consumer segments. Drag and drop state to the "rows" and then drag and drop sales on the "columns" tab. The final result should look something like this.



You can play around with the visualizations and get Maps as well. Click on "Show me" and change the visualization type to maps.

Here's a map type report of the same data set.

Let's see if OBI delivers the same reporting experience.

Signing off !!!

Tuesday, February 11, 2014

Types of Dimensions

In the previous blog we looked at classifying facts and measures into different categories,this blog post is related to dimensions.
We will not be covering the different ways to preserve history in a dimension rather we will look at ways in which a dimension table structure is broadly classified.

  • Conformed Dimension
Multiple keys are referenced from this dimension. These dimensions "conform" i.e they mean the same across different tables referencing them.For eg: Date dimension in a sales fact table must be the same as a date dimension connected to a product table


  • Junk Dimension
A junk dimension is a grouping of attributes that might not be required in the fact table but are required for storage purposes. Rather than having these attributes in the main fact table it is easier to store them in a separate dimension and access them whenever required.
Typical junk dimensions will have flag attributes,text description and can be unrelated to each other.

  • Degenerate Dimension
A degenerate dimension does not need a separate table to be modeled. It exists in the fact table and may often be the primary key of the fact table.
Common degenerate dimensions include transaction id numbers,invoice number,ticket numbers etc.

  • Role-playing Dimension
 Dimensions that can be used multiple times in different contexts are called role playing dimensions. A date dimension is a common role playing dimension. For eg: When different dates(Date of Birth,Date of Joining..) are part of a single fact table,they play a role playing dimension.


Next post we'll look at how we can model some "Slowly Changing Dimensions" .

Tuesday, February 4, 2014

Its all about the Facts

Types of FACT tables

Fact tables can be classified broadly into 3 types:

  • Transactional Fact Table
A transaction fact table gives you information "one row" at a time. This is a single specific event that happens in the transaction.This fact table has multiple dimensions associated with it. Example: Take an order receipt from your store. Every item that you purchase is recorded as a row in this transactional fact table.

  • Periodic Snapshot Fact table
A periodic snapshot fact table as the name suggests are accessed at specific periods in time. These contain measures which might be semi additive.For example: Lets say that the store you purchase your groceries from needs an aggregated report of the daily sales that the shop generates.In such scenarios we utilize the Periodic Snapshot Fact table. This is done in scenarios where you need point in time data. Accessing bank statements on a quarterly basis can also be used as an example for periodic snapshot fact tables.


  • Accumulating Fact Table

The accumulating fact table (also referred to as the accumulating snapshot fact table) is used to capture a series of events in a well defined process. For example: When a student enrolls into a University,there are a series of steps that have to take place for the student to be admitted. We can record these steps until the student is finally admitted to the university. There will be multiple date value columns to identify when different steps in the process were completed. We revisit the grain and keep updating it to show the different states.

On the same lines we can look at classifying measures as:

  • Additive: Can be added across the different rows( example: Sum of Sales quantity)
  • Non Additive: Measures that are stand alone and are only specific to a fact(example: Color of a product)
  • Semi Additive: Measures that can be added across some dimensions(example: Adding sales quantity for a specific year to report yearly).

We'll look at dimensions in a later post..Cheers !


Friday, January 24, 2014

Diving into BI


The Data Warehouse Toolkit,3rd edition by Ralph Kimball and Margy Ross will serve as a reference to this blog.  




The Basics

Business Intelligence is a process which comprises of many steps and goals :
  • Understanding various business processes within the system,this may include requirements from a wide variety of users.
  • Once processes have been identified they need to be mapped to specific data sets or entities.
  • Massive amounts of data are stored in a data warehouse,which is essentially the crux of BI.
  • The BI environment must be sustained and any application accessing the data warehouse must be able to read the data without any "hiccups". High quality and accessibility are also some features that a BI solution must have.
Datawarehouses are de-normalized so that data access is fast and aggregates are easier to obtain.

Facts 

As the title suggests this is the very "grain" of information required by business users. Multiple fact tables may exist within a system.
For example a transaction you make while purchasing an item is recorded as a fact in the data warehouse. The fact row might contain attributes such as transaction#,customer info,item purchased,sales amount,quantity sold etc.
If this was to be stored in a normal OLTP system we would store the data based on different themes i.e customer info would ideally go into a table which has all the customer information and item would go into a an item table.
Lets take a step forward and see how we store it in a OLAP system.
We de-normalize the data and store it in "FACT" tables. A fact table represents a business process which contains measures (In this case sales amount,quantity) and dimensions (Date of transaction,customer information,Item purchased)
Facts contain foreign keys to different dimensions present

Dimensions
Dimensions are different ways in which the data can be viewed. These are tables which contain themes present in the data. Viewing a fact from different points of view are what a dimension essentially achieves.
In the example above customers,items are different dimensions that will be created in the system.

Star Schema
An architecture wherein a central fact table references multiple dimensions is referred to a star schema.


Next post we'll be digging in to the different types of fact tables. Stay Tuned !