Monday, June 23, 2008

Dense or Sparse - Part 1

Dense or Sparse 1:

Undoubtedly, it’s the world of ASO. But, BSO is not dead. In the initial days of cube designing, one would always say to himself .
“This looks like dense, hmm, may be sparse …I am not sure”

After this topic of my blog, I’ll try to reduce the number of people questioning this one.

I have always been told by many that I explain things from basics. The same goes here too.
First, we’ll spend some time on dimensional model.
No, don’t close the window right now .I mean, only 2 terms of dimensional model.
I shall talk of
FACT
DIMENSION.

Ex: The sales of Honda cars, in Charlotte of North Carolina, in the month of Jan of year 2008 are 5000.

Request all to re look at the above sentence and pull out the dimension names

In the first go, I can make out the following
1. Honda
2. Charlotte
3. Jan

More scrutiny would give more details
4. Sales
5. 2008
6. 5000

Now, what are these numbers or names suggest .Lets try with the names of dimensions

1. Products (from Honda)
2. Place (from charlotte)
3. Time (from Jan)
4. Sales (let it be this way, for some time)
5. Time (from 2008 year)
6. 5000 (this is the number, which is of sales)

Now, let’s refine it further.
Products
Place
Time
Sales

Lets assume that -
There might be more products i.e. Honda, GM, merc
Place, In US, North Carolina is a state and charlotte is one of the cities
Time, this one has a year , quarter , month ..Etc

Now, the final outline can be made with the limited knowledge which we had gained from the above

-Time
-Q1
-Q2
-Q3
-Place
-States
-North carolina
-Charlotte
-Sales (Lets keep is the same , for some more time )

This looks near to our outline.

Fact table contents are the ones, which a user/analyst/decision maker is interested to look at, and understand the business of his company.
The fact table content, when seen against other dimensions ( in our case , time , place , product) gives a user more information about his company.

I.e. Sales in a place, in the month of XYZ for a product.

Now, we are clear with FACT and DIMENSION and what fills them in a dimensional model

Coming to the DENSE, SPARSE, this is our focus
Generally, Dense has contents of fact table and sparse has the rest of dimensions.
To elaborate, Sales is the one, which is a measure/ metric which a typical user would be interested to know of his company.

If you had looked at the definition given in the DBAG (our Bhagvathgeetha) , it says that it has the maximum probability of occurance .

Now ,its easy to interpret that , we might ask questions like

Sales of Honda in Jan
Sales of GM in New jersey
Sales of Honda in 2000

We would be posing questions like
Honda in jan in New Jersey…? And what …can’t even make a question from user’s perspective

Thus, ‘sales’ as a dimension member has max probability of occurrence.
Hence, it’s a dense member and other dimensions like Place, Time , Product are sparse .

I shall continue this post with topic
DENSE OR SPARSE 2..

Questions feed back , I invite

I invite blog topics too ,as this topic is also made on demand and need basis.
Hope it helps


Sandeep Reddy Enti

2 comments:

Anonymous said...

Excellent Work Sandeep... You have clearly explained how the business Analysts analyze the business in creating the Metadata and Its like a spoon feeding for the beginners.. I really appreaciate your efforts and will be waiting for the future posts...

Thanks,

Venkat.

Unknown said...

Hi Sandeep,
Thank you for the information you provide and for sharing it. But what would you say to the experienced overall Oracle professional with PL/SQL programming, modeling and knowledge of antiquated ETL/OLAP tools that want to make the move to Hyperion? ETL tools have really picked up in the last 5 years (I had a manager that insisted I use vi and PL/SQL for ETL).