Saturday, June 23, 2018

What are the index based rules and why they are used by cube designer?

In the previous blog, we discussed how to create a dimension build rules file and how it creates a dimension with few clicks from raw data.

In this blog, we will cover the basics of index based rules and the difference between them.
  • Good news is they are the similar except the way column operations(join, create columns, ...) are handled. Another difference is index based rules are supported for dimension build only and not for data load. 

  • Another good news is, if your rules are in regular format, there is no need to convert them to index based. If you are writing new rules, index based rules are easy to develop and understand.

Let's dive into the column operations for regular.
Regular rules support column operations at global level and they are performed sequentially.

Lets say there are five columns: Marker, Hardware, Year and Software

When we will open this in rules editor, it will create four fields. field1... field4

Marker column will be used to prefix like iPhone, iPad ...

To do this, we will first join fields field1 and field2.

Now data will look like this

Notice that fields are merged and we are happy. We will tag first as Hardware's generation 2, field2 as Year's generation 2. Let's look at field3 which we want to create a new generation of Software however we want "i"(Marker) to be prefixed to it  and there is no "i" now. To get out of this situation , we have to remove the operations we did earlier and create a copy of field1. 
We are lucky as we created only one operation. What if we created 50 operations and we have to start all over again. It would be frustrating experience.
As these operations are managed at global level and sequential chain is created, it will be very hard to make changes which will affect all other fields and operations.

This sequential chain is stored like below:
Fields: field1, field2, field3, field4
Operations: join(field1, field2) -> field1

These operations make fields depending on each other. If we change field2 out of 50 fields, all remaining 48 fields may/will be affected.
Complex, isn't it?

Now what if Essbase tells that there is a new way of managing these operations in OAC and these operations are managed at field level and each field will be independent.

Lets take same example:

Four fields are created. Field1, field2, field3, field4. If we join field1 and field2, it will create expression for the field1 as join(source-column1[Marker], source-cloumn2[Hardware]).
Notice that join operation is performed on source columns and fields are not touched.
This simple change will make you to perform dynamic operations on source data and keep fields independent. So now "i" can be used for both field1 and field3 as below:

field1: join(Marker, Hardware)
field2: Year
field3: join(Marker, Software)

Even if you created 50 operations  and want to create new one, you only have to think about specific field than all of them.

In fact, OAC provides an expression editor for each field to build complex operations.

Another difference between these two types are the field reference and ordering. In Regular rules, generation-2 field should appear before generation-3. This is why we need to move the fields.

Index based rules refer to the parent by a parent index than moving fields.
Properties as well can appear in any order and they refer to the generation/level/parent/child field by a refer index.
Probably this decoupling between fields and columns and index reference is the reason why they are called "index based".

Due to its simplicity and maintainability, cube designer preferred to use these rules than regular ones.

In next tutorial, we will create an index based load rule which will make things more clear.

Stay tuned...

No comments:

Post a Comment