Thursday, December 29, 2011

Parallelism of scenarios

Lets take a deep dive into the parallelism capability of ODI using StartScen and Asynchronous mode.
Here is a nice article from Christophe (http://blogs.oracle.com/dataintegration/entry/parallel_processing_in_odi) that illustrates how this can be achieved.
However, the important thing to understand here is that the job kick-off process is not truly parallel.

A truly parallel process should look like the following where the Start will spawn 3 parallel processes.

Whereas in ODI, the above behavior is mimicked by connecting jobs sequentially to each other and then executing them in Asynchronous mode.

Does the question arise - What's the difference as long as the jobs are executing in parallel ? A slight difference.
Consider a scenario when the scenario "StartEmpLoad" cannot start due to some reason (Scenario not available or Agent selected for the Scenario is down). In that case, the next (or did we say parallel) job "StartSalesLoad" will never be kicked off. Since both of them are connected with an "OK" and since the first job wasnt OK, the flow never reaches the next process. This error is a job kick-off error and is different from the data errors that would normally be encountered during a job execution (Duplicate keys, Data not found, etc).

Each StartScen in here kicks off a job and then moves on to the next StartScen. Being Async jobs, even though they dont wait for the results to come back, they do make sure that the job was kicked off successfully (vs completed successfully).

Wednesday, December 28, 2011

ODI Tools without Hardcodings

ODI provides us with a set of utilities that can assist us in day to day tasks such as Ftp, FileCopy, SendMail, SQLUnload, etc. These tools make ODI powerful.
These tools are available in the Toolbox (once you start creating a package).

A common problem of all the tools is the need of hardcoding file paths, user names, passwords, etc. This is the least desirable of any tool. As a developer you will want the flexibility to change the parameters whenever the need arises. Unfortunately, the way the tools are laid out this change is not very intuitive.
It seems that the values need to be changed and then the scenarios need to be recompiled.

Wait ... Did I say recompile the scenarios for changing a parameter ? Ohh. Such a basic problem.
C'Mon. Its better to write programs in C where the parameters can be passed from the Command line using argv[].

How did such an excellent tool miss on this one ?
Ok. Ok. Lets explore a couple of ways in which the life can be simplified.

  1. Use them as Sunopsis API - Each tool can be used as a Sunopsis API in the ODI Procedure. Its very easy to see the syntax of the corresponding API for each tool. Once you drag-drop the tool inside a package, in the Properties Explorer click on the "Command" tab. As you keep adding parameters on the "General" tab, the "Command" tab keeps getting updated with the values. ODIexperts specify an excellent way of invoking the ODI tools inside an ODI procedure - http://odiexperts.com/calling-odisqlunload-using-odi-procedure-with-no-hardcoded-password

  2. Use ODI variables - Define an ODI Refresh variable for each of the parameter that you want to parameterize. Before invoking the tool, refresh the ODI variable from a database configuration table. Use this ODI variable in the parameters section of the tool. The following snapshots will illustrates how this can be done.


  3. This method will mandate that the package contains several variables. This may make the package look too full with too many connections.

  4. Use Jython Variables - This method is an enhancement to the method mentioned above. The only difference being the use of Jython variables instead of ODI variables and a bit more :D . The real advantage of this approach is the less cluttered ODI package as all the variables being used are created in the ODI procedure.
    This will be a 2 step process:
    • Create and Refresh all the Jython variables in an ODI procedure.
    • Use those variables in parameters of ODI Tool inside the package.

    In the example that is provided here, the odiSQLUnload tool is used and the Jython variables are being used for the Connection information. The procedure that will contain the commands to declare and populate the jython variables will look as follows:


    Then the package will be created where this procedure will preceed the ODI tool OdiSqlUnload.

    The OdiSqlUnload tool's command should be configured to look similar to this:

    OdiSqlUnload "-FILE=/MyPath/sqlunloadfile.txt" "-DRIVER=<@=jvJDBC@>" "-URL=<@=jvURL@>" "-USER=<@=jvUser@>" "-PASS=<@=jvPass@>" "-FILE_FORMAT=VARIABLE" "-FIELD_SEP=," "-ROW_SEP=\r\n" "-DATE_FORMAT=yyyy/MM/dd HH:mm:ss" "-CHARSET_ENCODING=ISO8859_1" "-XML_CHARSET_ENCODING=ISO-8859-1" "-FETCH_SIZE=5000"
    select emp_no from emp
The above method illustrates an effective usage of Jython variables.
In a similar fashion other values can be refreshed from a DB configuration table and then used inside the ODI tools.

Hope this helps in laying a path towards cleaner usage of ODI tools without hardcoding.