Friday, July 6, 2018

OSB 11g Tips and Guidelines

This blog explains the best practices and guidelines to follow while designing and developing services on Oracle Service Bus (OSB) 11g running on WebLogic Server.

Processing of large files

When processing large files, two parameters need to be considered:
1) Heap Memory
2) CPU Utilization

Heap Memory
Loading large files into memory requires a sufficiently large heap. An insufficient heap size leads to “Out of memory” errors and might also crash the WebLogic server. To avoid these errors, we might need to add more memory to the server. But this is not always a feasible option, as server memory also has a limit. Instead, one of the following options can be tried.
a.    Content Streaming
b.    Process in chunks

With Content Streaming, the whole object is not loaded into the memory buffer, and hence we can consume large messages. But content streaming has certain limitations, precisely because the whole object is not loaded in memory. You can find these limitations at the following link and decide whether your use case can be implemented within them.

http://download.oracle.com/docs/cd/E13159_01/osb/docs10gr3/userguide/context.html#wp1110513

Process in chunks: Another option is to break the file into smaller chunks and then use OSB to read and process those chunks.
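The chunking idea can be sketched in plain, stand-alone Java. This is only an illustration under stated assumptions, not OSB code: the class name ChunkedFileReader, the method processInChunks and the line-based chunk size are all hypothetical, and a real file would be split by whatever record boundary makes sense for it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: splits line-oriented input into fixed-size chunks.
final class ChunkedFileReader {

    /**
     * Reads the source line by line and passes it to the handler in chunks of
     * at most chunkSize lines, so only one chunk is buffered at a time.
     * Returns the number of chunks delivered.
     */
    static int processInChunks(Reader source, int chunkSize, Consumer<List<String>> handler) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        int chunks = 0;
        List<String> chunk = new ArrayList<>(chunkSize);
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.add(line);
                if (chunk.size() == chunkSize) {
                    handler.accept(chunk);
                    chunks++;
                    chunk = new ArrayList<>(chunkSize); // start a fresh chunk
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (!chunk.isEmpty()) { // deliver the final, possibly smaller chunk
            handler.accept(chunk);
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        Reader input = new StringReader("r1\nr2\nr3\nr4\nr5");
        int delivered = processInChunks(input, 2, c -> System.out.println("chunk: " + c));
        System.out.println(delivered + " chunks delivered");
    }
}
```

In practice each chunk would be written out as its own smaller file or message for OSB to pick up, so the heap only ever needs to hold one chunk.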


CPU Utilization

When processing large files in OSB, passing the content as-is to the backend system is usually not an issue; transforming the large content in OSB is the pain point. Any transformation (be it XQuery or XSLT) takes a lot of CPU cycles, and transforming huge payloads can drive CPU utilization to 100%, resulting in an overall drop in performance. Hence, complex transformations and transformations of large XML documents are not recommended within OSB.
The following are a few options:
a. Move transformations on large payloads to the backend systems.
b. Try XML appliances, which are specifically meant for large transformations with little impact on CPU utilization.


Caching Mechanisms in OSB

When caching data in OSB, two points should always be considered:
1) How much data needs to be cached.
2) Which caching API should be used.
There are two caching mechanisms that we can consider:
1)    Java Caching System
2)    Oracle Coherence Caching

Java Caching System

This is a distributed caching system written in Java. It provides a means to manage cached data, thus speeding up processing. Beyond simply caching data in memory, it provides additional features like memory management, thread pool controls, etc. This is a simple caching scheme where static data is cached per server, and it can be easily integrated with OSB. But it is not suitable for caching large amounts of data, and it might fail under heavy load in OSB.

Oracle Coherence Caching

This is the Oracle-recommended caching mechanism, for which a Coherence cluster should be configured in the WebLogic server. It is a very robust and stable cache mechanism which can handle heavy load and large amounts of data. The cached data is available throughout the cluster.
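To make the "how much data" question concrete, here is a minimal per-JVM cache with a hard cap on the number of entries, written with the JDK only. It uses neither the Java Caching System API nor the Coherence API; the class name BoundedStaticCache and the least-recently-used eviction policy are assumptions made for illustration.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical per-server cache for small amounts of static data.
final class BoundedStaticCache<K, V> {

    private final Map<K, V> entries;

    BoundedStaticCache(final int maxEntries) {
        if (maxEntries <= 0) {
            throw new IllegalArgumentException("maxEntries must be positive");
        }
        // accessOrder = true keeps the least recently used entry first,
        // and removeEldestEntry evicts it once the cap is exceeded.
        this.entries = Collections.synchronizedMap(
                new LinkedHashMap<K, V>(16, 0.75f, true) {
                    @Override
                    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                        return size() > maxEntries;
                    }
                });
    }

    V get(K key) {
        return entries.get(key);
    }

    void put(K key, V value) {
        entries.put(key, value);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        BoundedStaticCache<String, String> countries = new BoundedStaticCache<>(2);
        countries.put("US", "United States");
        countries.put("IN", "India");
        countries.get("US");             // touch US so that IN becomes the eldest
        countries.put("DE", "Germany");  // exceeds the cap and evicts IN
        System.out.println("entries kept: " + countries.size());
    }
}
```

A cache like this lives in one server's heap only, which is the trade-off against a cluster-wide cache such as Coherence.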


Design and Coding Best Practices


XQuery Tuning

In OSB, transformations are done through XQuery/XSLT. Hence, it is very important to understand the best practices while performing transformations in XQuery.

a.    Avoid using wildcard XPaths (*) and recursive XPaths (double forward slash ‘//’). ‘//’ in an XPath searches the entire XML document, irrespective of where the node is located, which increases the processing time. Hence, avoid the use of // in XPath.

b.    In scenarios where you know that there can be only one instance of a node, use indexes for efficiency and faster processing. For example, instead of $body/Root/Child/Subchild, you can use $body/Root[1]/Child[1]/Subchild[1]. This minimizes the amount of payload parsing. But when the node can repeat (an array), do not use indexes.

c.    FLWOR (FOR, LET, WHERE, ORDER BY, RETURN) expressions: Instead of using multiple Assign actions to create intermediate, one-off variables, LET can be used to combine multiple assigns into one and improve performance.
For example: Assign 1: $body/Root/Child/Subchild/value/text() to variable Filter.
            Assign 2: $RespBody/RespRoot[@name=$Filter]/value/text() to variable RespValue.

The above two assigns can be combined into a single Assign to variable RespValue, as follows:
let $Filter := $body/Root/Child/Subchild/value/text() return $RespBody/RespRoot[@name=$Filter]/value/text()

d.    Avoid declaring namespaces at the top of the XQuery (above the query body). It can lead to different namespace prefixes at runtime. Instead, you can put the namespace inline on the XML tags within the XQuery.

e.    Reduce the number of source parameters passed to the XQuery. Instead, you can pass a single XML document which has all the parameters mapped.

f.     For simple, single-field updates, the Delete, Insert, Replace and Rename actions are more efficient than an Assign with the complete XML (if the XML is big). Assign generates/loads the entire XML even if the change is in a single field, whereas the other actions mentioned above deal only with the particular fields mentioned in the expression.
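The scope difference between ‘//’ and an explicit, indexed path (points a and b above) can be checked with the XPath 1.0 engine that ships with the JDK. This is a hedged illustration only: it does not use OSB's XQuery engine and says nothing about OSB timings. The sample document, its Audit element and the class name XPathScopeDemo are invented for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo: counts the nodes selected by an XPath expression.
final class XPathScopeDemo {

    // Invented sample: "value" appears in the wanted branch and in an unrelated one.
    static final String SAMPLE =
            "<Root>"
            + "<Child><Subchild><value>A</value></Subchild></Child>"
            + "<Audit><value>B</value></Audit>"
            + "</Root>";

    static int count(String xml, String path) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            Double n = (Double) xpath.evaluate(
                    "count(" + path + ")",
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NUMBER);
            return n.intValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad XPath: " + path, e);
        }
    }

    public static void main(String[] args) {
        // '//' has to consider every element in the document.
        System.out.println("//value matches: " + count(SAMPLE, "//value"));
        // The explicit, indexed path only walks the branch it names.
        System.out.println("explicit path matches: "
                + count(SAMPLE, "/Root[1]/Child[1]/Subchild[1]/value"));
    }
}
```

Besides the extra searching, the recursive form also picks up the unrelated value under Audit, so the explicit path is both narrower and safer.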


Proxy and Business Service settings

a.    At least Basic Authentication should be enabled for all secured services, especially the REST-based ones.

b.    For all synchronous protocol services, Read Timeout and Connection Timeout should be set, which ensures that transactions time out after the configured time if the target system is down. This ensures that the calling systems do not wait for a very long time.

c.    For one-way services, Retry Count and Retry Iteration Interval should be set, so that the process automatically retries a certain number of times if the target system is down.

d.    For REST-based proxy services, Content-Type and Accept should be set. They indicate the type of content that the service expects in the request and the type of response that will be sent, respectively.

e.    Make sure that the local protocol is used when you want to call a proxy service from another proxy service. This makes the call internal, so it does not have to go through the F5 load balancer again to hit the internal proxy.

f.     Create a constants XML for all static data so that the actual code need not be touched if the constant values change.

g.    When publishing or routing to multiple services, the table versions of those actions (Publish Table, Routing Table) are more efficient than if/then/else conditions.

h.    For services which need to be highly available, there is a good option of specifying multiple URIs. Thus, if one URI is not reachable, consumers can automatically use the other.

i.    “Alert” is a very useful feature which allows you to track the flow. But it is advisable to use alerts only during the development phase for testing. Please remove them from the actual code.

j.    Simple schema validations like mandatory/optional fields, enumerations, etc., can be done directly through the “Validate” action.
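The effect of a retry count and a retry interval (point c above) can be sketched as a small Java helper. This is not how OSB implements retries internally; the class RetryingCaller, the method callWithRetry and the convention of "one initial attempt plus retryCount retries" are assumptions made for illustration.

```java
import java.util.function.Supplier;

// Hypothetical helper: retries a call a fixed number of times with a fixed pause.
final class RetryingCaller {

    static <T> T callWithRetry(Supplier<T> target, int retryCount, long retryIntervalMillis) {
        if (retryCount < 0) {
            throw new IllegalArgumentException("retryCount must not be negative");
        }
        RuntimeException lastFailure = null;
        // One initial attempt plus up to retryCount retries.
        for (int attempt = 0; attempt <= retryCount; attempt++) {
            try {
                return target.get();
            } catch (RuntimeException e) {
                lastFailure = e;
                if (attempt < retryCount) {
                    pause(retryIntervalMillis); // wait before the next retry
                }
            }
        }
        throw lastFailure; // every attempt failed
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }

    public static void main(String[] args) {
        final int[] calls = {0};
        String reply = callWithRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IllegalStateException("target system is down");
            }
            return "delivered";
        }, 3, 100L);
        System.out.println(reply + " after " + calls[0] + " attempts");
    }
}
```

The same reasoning applies to the timeouts in point b: without an upper bound on each attempt, the retry loop would simply wait on a hung target.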

XQuery vs XSLT in OSB 

OSB provides two options to perform transformations: XQuery and XSLT.
There have been many debates and discussions on which one to use when. There is no concrete answer, but there are some points which help us to decide.

XSLT loads the complete XML document into memory, whereas XQuery loads only the mapped fields of the XML into memory. So for large XMLs where every field needs to be transformed, XSLT might perform better, as the complete XML document is loaded into memory anyway. But for large XMLs where only a few fields need to be mapped, XQuery is more suitable, as it loads only the fields that need to be mapped rather than the complete XML. For smaller XMLs, both approaches perform similarly.

From a development perspective, XQuery is easier to develop, as it is the native transformation method of OSB. OSB provides a graphical mapper for XQuery and a test console which allows you to test just the XQuery part in isolation.

Split Join

Split-join is another useful feature provided by OSB. A split-join, as the name suggests, splits the input into chunks and sends them to multiple flows/services concurrently. This is similar to the flow/flowN activities in SOA. In a split-join, routing is done concurrently to all the target systems, and all the responses are aggregated and returned as one single response. Implementing split-join can improve the overall performance of the service. Normally, messages in a payload are routed to target systems sequentially, so the overall completion time of the service is the sum of the response times of the individual calls. With split-join, because of concurrent processing, the overall response time is roughly the time taken by the longest call, which is less than that sum.
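The "sum of all calls" versus "longest call" argument can be reproduced with a thread pool in plain Java. This is a sketch of the timing idea only, not of OSB's split-join implementation; the class SplitJoinSketch, the simulated callTarget method and the delays are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical comparison of sequential routing and split-join style routing.
final class SplitJoinSketch {

    /** Stand-in for a backend call that takes delayMillis to answer. */
    static String callTarget(String name, long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return name + ":done";
    }

    /** Calls the targets one after another; total time is the sum of the delays. */
    static List<String> sequential(String[] names, long[] delays) {
        List<String> responses = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            responses.add(callTarget(names[i], delays[i]));
        }
        return responses;
    }

    /** Calls all targets concurrently and joins the responses in request order. */
    static List<String> parallel(String[] names, long[] delays) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, names.length));
        try {
            List<Callable<String>> tasks = new ArrayList<>();
            for (int i = 0; i < names.length; i++) {
                final String name = names[i];
                final long delay = delays[i];
                tasks.add(() -> callTarget(name, delay));
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> future : pool.invokeAll(tasks)) { // waits for all tasks
                responses.add(future.get());
            }
            return responses;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] names = {"billing", "inventory", "shipping"};
        long[] delays = {100L, 200L, 300L};

        long start = System.nanoTime();
        sequential(names, delays);
        System.out.println("sequential ms: " + (System.nanoTime() - start) / 1_000_000L);

        start = System.nanoTime();
        System.out.println(parallel(names, delays));
        System.out.println("parallel ms: " + (System.nanoTime() - start) / 1_000_000L);
    }
}
```

Note that the pool here is sized to the number of targets; with many targets that choice costs threads, so the degree of parallelism needs to be chosen deliberately.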

Split Join limitations:

1.    Split-joins do not handle transaction rollback in the case of exceptions.
2.    You cannot create a split-join with a WSDL file that includes policies, and you cannot call a WSDL-based business service that contains WSDL policies from a split-join.
3.    You cannot call REST-based services from a split-join. It always needs to be a WSDL-based service. If you need to invoke a REST-based service multiple times, you might need to create a SOAP wrapper and then call it from the split-join.
4.    The number of parallel requests should be decided after careful thought.

Adapters/Protocol

Throttling is a very important and useful, but under-used, property. Setting it on OSB business services helps to manage load and capacity.
Once throttling is enabled, the Maximum Concurrency setting defines how many requests the business service can process at the same time; this is the limit at which throttling kicks in.
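The idea of a concurrency limit can be sketched with a semaphore in plain Java. This only illustrates the concept; it is not OSB's throttling implementation. The class ConcurrencyThrottle is hypothetical, and what happens to requests above the limit is an assumption here: this sketch simply rejects them.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical throttle: allows at most maxConcurrency calls in flight.
final class ConcurrencyThrottle {

    private final Semaphore permits;

    ConcurrencyThrottle(int maxConcurrency) {
        if (maxConcurrency <= 0) {
            throw new IllegalArgumentException("maxConcurrency must be positive");
        }
        this.permits = new Semaphore(maxConcurrency);
    }

    /** Claims a slot if one is free; returns false when the limit is reached. */
    boolean tryEnter() {
        return permits.tryAcquire();
    }

    /** Frees a slot previously claimed with tryEnter. */
    void leave() {
        permits.release();
    }

    /** Runs the call inside a slot, rejecting it when no slot is free. */
    <T> T execute(Supplier<T> call) {
        if (!tryEnter()) {
            throw new IllegalStateException("Concurrency limit reached");
        }
        try {
            return call.get();
        } finally {
            leave();
        }
    }

    public static void main(String[] args) {
        ConcurrencyThrottle throttle = new ConcurrencyThrottle(2);
        System.out.println("first request admitted: " + throttle.tryEnter());
        System.out.println("second request admitted: " + throttle.tryEnter());
        System.out.println("third request admitted: " + throttle.tryEnter());
        throttle.leave();
        System.out.println("after one completes: " + throttle.tryEnter());
    }
}
```

The point of the limit is to protect the backend: a slow target sees a bounded number of concurrent calls regardless of how many requests arrive at the proxy.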

MQ Protocol

1.    Unlike SOA, OSB doesn’t proactively listen to MQ and consume each message as and when it arrives. In OSB, a polling interval needs to be set on the MQ proxy service. If the messages need to be read in near real time, the polling interval can be set as low as one second.
2.    The Connection Type for the MQ resource should be TCP mode if the queue resides on a different server, and Bindings mode if the queue resides on the same server as OSB.
3.    It is a good practice to explicitly set the CharacterSetID/Encoding/Format as per the expectations of the target systems, to avoid any default conversions.
4.    Similar to JMS, MQ also has useful properties like MessageID, CharacterSetID and ReplyToQueueName (for async processes).
5.    For OSB MQ proxies, the Request Message Type should be chosen carefully between Text and XML. If the input is XML with additional headers being passed, choosing XML will not pick up the message.
6.    OSB can accept or send MessageID/CorrelationID only in Base64 format. Hence, encoding/decoding should be done to use them anywhere else in the flow.
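The Base64 handling from point 6 can be sketched with java.util.Base64 from the JDK (Java 8 or later). The class MqIdCodec and the sample identifier bytes are hypothetical; how the decoded bytes are then used (for example as a hex string in logs) depends on the target system and is an assumption here.

```java
import java.util.Base64;

// Hypothetical helper for converting MQ message identifiers to and from Base64.
final class MqIdCodec {

    /** Encodes the raw identifier bytes as a Base64 string. */
    static String toBase64(byte[] rawId) {
        return Base64.getEncoder().encodeToString(rawId);
    }

    /** Decodes a Base64 identifier back into raw bytes. */
    static byte[] fromBase64(String encodedId) {
        return Base64.getDecoder().decode(encodedId);
    }

    /** Renders raw identifier bytes as upper-case hex, e.g. for logging. */
    static String toHex(byte[] rawId) {
        StringBuilder hex = new StringBuilder(rawId.length * 2);
        for (byte b : rawId) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // Invented identifier bytes, for illustration only.
        byte[] rawId = {0x41, 0x4D, 0x51, 0x20, 0x01, 0x02, (byte) 0xFE, (byte) 0xFF};
        String encoded = toBase64(rawId);
        System.out.println("Base64: " + encoded);
        System.out.println("Hex after decoding: " + toHex(fromBase64(encoded)));
    }
}
```

Keeping the conversion in one helper avoids mixing encoded and raw identifiers when correlating a request with its reply.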

JCA Protocol for DB calls

Timeouts are very important when making DB calls; hence, the QueryTimeout property needs to be explicitly added in the JCA file for any DB business service.

FTP protocol

OSB allows you to push files to and pull files from FTP locations in two ways: the FTP adapter of SOA or the FTP protocol of OSB.
1.    The FTP protocol, being native to OSB, is more effective in terms of performance and handles comparatively larger files.
2.    MFL transformation is the best way to convert the file content into proper XML data when the OSB FTP protocol is used.
3.    When the FTP protocol cannot achieve certain functionality, SOA’s FTP adapter can be used.

Deployments and Customization File

1.    OSB deployments are simpler than SOA deployments because OSB does not validate the existence of the target web service during deployment. This is a major issue in SOA, where directly dependent processes cannot be deployed if the target service is down. OSB saves us from this discomfort.
2.    URLs can be changed for different environments easily by executing a Customization File.
3.    The Customization File can also be used extensively for changing properties like timeouts, retries, etc., without changing the actual code.

Hope this blog is useful.