Thursday, August 16, 2018

Loading data into OAC-Essbase from legacy systems using REST APIs

As promised in my previous blog, we will try to load data from a legacy on-premise system into OAC-Essbase.
As we already discussed, not all legacy systems generate data that can be directly imported into Essbase, and not all of them support JDBC. We may need to write a connector that understands both systems. After moving to OAC-Essbase this task becomes even more challenging, because Essbase lives in the cloud while the legacy system stays on-premise.

Well, there is good news! OAC-Essbase has streaming REST APIs to do exactly this.

One could write an exporter from the legacy system to a file, then upload and load it. That takes time, though. Can we load the data while exporting, in a producer-consumer pattern?

Yes, it's really easy.

Figure 1: Current sequential solution

Figure 2: New parallel solution

As we see in Figure 1, there are a lot of idle resources, and it can be significantly slower than the approach in Figure 2.
Essbase data load allows multiple streams at the same time, so you can divide and conquer the data as well. Dimension build is sequential, so currently only one stream per cube is possible.

Let's say we already have exporter code that can generate a stream of data from our on-premise system.

Let's not store this data while generating it; instead, send it directly to the cube. I am assuming that a rule file to process this data is already in the cube directory.
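As a sketch, the exporter side of the producer-consumer pattern can be a pair of generators: one yields records as the legacy system produces them, and one groups them into chunks sized for each POST. The record format below is just an illustration; your rule file defines the real one.

```python
def export_records(source):
    """Yield formatted records from the legacy source one at a time,
    so nothing needs to be written to disk."""
    for row in source:
        # Format each row the way the cube's rule file expects it.
        yield ",".join(str(field) for field in row) + "\n"


def batch(records, size=1000):
    """Group streamed records into chunks so that each POST to the
    data load stream carries a reasonable payload."""
    chunk = []
    for record in records:
        chunk.append(record)
        if len(chunk) == size:
            yield "".join(chunk)
            chunk = []
    if chunk:
        yield "".join(chunk)
```

Because both functions are generators, the data flows from the legacy system straight into the HTTP requests with only one chunk held in memory at a time.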

For data load, there are three steps:
1. Start a data load stream
POST https://<host>/essbase/rest/v1/applications/Sample/databases/Basic/dataload

This returns a unique, identifier-based URL to which we can keep sending the data, like below:


2. Keep sending the data
POST https://<host>/essbase/rest/v1/applications/Sample/databases/Basic/dataload/123456789

POST https://<host>/essbase/rest/v1/applications/Sample/databases/Basic/dataload/123456789

3. End stream
DELETE https://<host>/essbase/rest/v1/applications/Sample/databases/Basic/dataload/123456789

You can start as many processes as you like to do this in parallel.
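Putting the three steps together, here is a minimal Python sketch using only the standard library. The basic-auth credentials and the way the stream identifier comes back in the start response are assumptions for illustration; check the actual response payload and adjust the parsing.

```python
import base64
import json
import urllib.request

BASE = "https://<host>/essbase/rest/v1/applications/Sample/databases/Basic"


def dataload_url(base, stream_id=None):
    """Build the data load endpoint: the start URL, or the per-stream
    URL once the server has handed back a stream identifier."""
    url = f"{base}/dataload"
    return url if stream_id is None else f"{url}/{stream_id}"


def call(method, url, body=None, user="admin", password="secret"):
    """One authenticated REST call (basic auth assumed here);
    returns the response body as text."""
    req = urllib.request.Request(
        url, data=body.encode() if body else None, method=method
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode()


def stream_data_load(chunks):
    """Start a stream, push every chunk, then DELETE to end the stream."""
    # 1. Start the data load stream. The "streamId" field below is an
    #    assumed response shape -- adapt it to the actual payload.
    stream_id = json.loads(call("POST", dataload_url(BASE)))["streamId"]
    # 2. Keep sending the data chunks produced by the exporter.
    for chunk in chunks:
        call("POST", dataload_url(BASE, stream_id), body=chunk)
    # 3. End the stream.
    call("DELETE", dataload_url(BASE, stream_id))
```

To run several streams in parallel, each partition of the exported data can call `stream_data_load` from its own process or thread.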

The same can be done with a dimension build.
1. Start a dimension build stream
POST https://<host>/essbase/rest/v1/applications/Sample/databases/Basic/dimbuild

with a JSON body such as:

     {"restructureOption": "PRESERVE_ALL_DATA"}

Other restructure options are:

This returns a unique, identifier-based URL to which we can keep sending the data, like below:

2. Keep sending
It is always a good idea to load all the dimensions and then restructure, rather than building one dimension at a time, each followed by a restructure.

2.1 Start the dimension build for the first rule file

POST https://<host>/essbase/rest/v1/applications/Sample/databases/Basic/dimbuild/123456789/product

2.2 Start sending the data

POST https://<host>/essbase/rest/v1/applications/Sample/databases/Basic/dimbuild/123456789/product

"Product","Diet","","~","Diet Drinks","","","","","","",""

"Diet","200-50","","","Diet Root Beer","","","","","","","",""
"Diet","300-60","","","Diet Cream","","","","","","","",""

You can repeat this until the product dimension is done.

2.3 Repeat 2.2 for all remaining dimensions

3. End stream

DELETE https://<host>/essbase/rest/v1/applications/Sample/databases/Basic/dimbuild/123456789

This is where the restructure happens, once, for all the loads done in the 2.x steps.
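The dimension build flow mirrors the data load flow. As a sketch, the full sequence of REST calls can be laid out as ordered (method, URL) pairs; the stream id and rule file names below are placeholders:

```python
def dimbuild_calls(base, stream_id, rule_files):
    """Return the ordered (method, url) pairs for a streamed dimension
    build: start the stream, send data per rule file, then DELETE to
    trigger the single restructure at the end."""
    calls = [("POST", f"{base}/dimbuild")]  # start; body carries restructureOption
    for rule in rule_files:
        # One or more data POSTs per rule file, one rule file at a time.
        calls.append(("POST", f"{base}/dimbuild/{stream_id}/{rule}"))
    # Ending the stream is what kicks off the single restructure.
    calls.append(("DELETE", f"{base}/dimbuild/{stream_id}"))
    return calls
```

Listing the calls this way makes the key design point visible: however many rule files you send, the expensive restructure appears exactly once, at the DELETE.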

Data is sent over HTTPS, so there is no security concern. You can use your favorite language such as Java, JavaScript, or Python, or the command line using cURL.

Try it out and let us know if you have any questions... 
