Wednesday, December 5, 2012

Oracle Endeca Information Discovery – Partition Transformation

 

I have been working with the Oracle Endeca Information Discovery tool to join structured and unstructured data in interesting business solutions. I want to share my experience with the Partition transformation and show how it can be used to conditionally split a pipeline in an Integrator (CloverETL) graph.

clip_image001

The transformation has two properties called Ranges and Partition key. Setting the Partition key is easy: just pick the fields from a list. Setting the Ranges is harder to understand.

clip_image003

The table below shows an example of setting the Ranges property. Ranges use an awkward syntax: < and > mark a boundary value that is included in the range, while ( and ) mark one that is excluded. A comma separates the low end from the high end of a range, and a semicolon separates one range from the next. Sound confusing? It is. Fortunately, there is a much easier way using CTL2.

clip_image005
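To make the notation concrete, here is how I read a few Ranges values for a hypothetical partition key made of a single integer field. The values are made up for illustration and are not taken from the table above. The text to the right of each value is my explanation, not part of the syntax.

```
<1,9)          1 <= key < 9     (1 included, 9 excluded)
(1,9>          1 < key <= 9     (1 excluded, 9 included)
<9,)           key >= 9         (upper end left open)
<1,9);<9,)     two separate ranges, split by the semicolon
```

How each range maps to an output port is worth checking against the component reference for your version before relying on it.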

For the easy way, click the Partition attribute, which opens the CTL2 editor. You can refer to your input fields with a $ prefix; $DOC_TYPE, for example, is an input field defined on the input port. Then write a conditional statement that returns the number of the output port. In my example below, all rows where $DOC_TYPE equals “Operation Instructions” are sent to port 0. All others are sent to port 1.

clip_image007
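In case the screenshot is hard to read, the condition can be written out roughly as follows. This is a minimal sketch, not a copy of the screenshot. It assumes the Partition attribute expects a CTL2 function named getOutputPort() that returns the port number, which is how I remember the component's template. It also assumes DOC_TYPE is a string field on the input port. Depending on your Integrator version you may need to write the field as $in.0.DOC_TYPE instead of $DOC_TYPE.

```ctl
//#CTL2
// Sketch only: assumes a string input field named DOC_TYPE.

// Called for each record; the returned integer is the output port number.
function integer getOutputPort() {
    if ($DOC_TYPE == "Operation Instructions") {
        return 0; // operation instructions go to port 0
    }
    return 1; // every other record goes to port 1
}
```

More branches can be added the same way, one return value per output port connected to the component.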