This blog explains the best practices and guidelines to follow when designing and developing services on Oracle Service Bus (OSB) 11g running on WebLogic Server.
Processing of large files
When processing large files, there are two parameters to consider:
1) Heap Memory
2) CPU Utilization
Heap Memory
Loading large files into memory requires sufficient heap size. Insufficient heap size leads to “Out of memory” errors and might also crash the WebLogic server. To avoid these errors, we might add more memory to the server, but this is not always feasible because server memory is also limited. Instead, one of the following options can be tried.
a. Content Streaming
b. Process in chunks
With content streaming, the whole object is not loaded into the memory buffer, so large messages can be consumed. But content streaming has certain limitations precisely because the whole object is not in memory. You can find these limitations at the following link and decide whether your use case can be implemented within them.
http://download.oracle.com/docs/cd/E13159_01/osb/docs10gr3/userguide/context.html#wp1110513
Process in chunks – Another option would be to break the file into smaller chunks and then use
OSB to read and process those chunks.
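The chunking step itself would normally happen outside OSB, before the file lands in the location OSB polls. The following is a minimal sketch of that idea, assuming a line-oriented file (for example, one record per line); the class name FileChunker and the callback-based design are illustrative assumptions, not part of any OSB API. A well-formed XML document cannot be split blindly by line, so this would need adapting for XML payloads.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper, not part of OSB: splits line-oriented input into chunks.
final class FileChunker {

    // Reads the source one line at a time and hands each chunk of at most
    // linesPerChunk lines to the sink, so only one chunk is held in memory.
    // Returns the number of chunks produced.
    static int split(Reader source, int linesPerChunk, Consumer<String> sink) {
        if (linesPerChunk <= 0) {
            throw new IllegalArgumentException("linesPerChunk must be positive");
        }
        int chunkCount = 0;
        int linesInChunk = 0;
        StringBuilder chunk = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                chunk.append(line).append('\n');
                linesInChunk++;
                if (linesInChunk == linesPerChunk) {
                    sink.accept(chunk.toString());
                    chunkCount++;
                    chunk.setLength(0);
                    linesInChunk = 0;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Emit the final, possibly shorter, chunk.
        if (linesInChunk > 0) {
            sink.accept(chunk.toString());
            chunkCount++;
        }
        return chunkCount;
    }
}
```

In practice the sink would write each chunk to its own file in the directory that the OSB proxy polls, so that OSB only ever sees payloads of a bounded size.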
CPU Utilization
When processing large files in OSB, just passing the content as-is to the backend system is not an issue. Transforming large content in OSB is the pain point. Any transformation (be it XQuery or XSLT) takes a lot of CPU cycles, and transforming huge payloads can consume 100% CPU, resulting in an overall drop in performance. Hence, complex transformations and transformations of large XML documents within OSB are not recommended.
Following are a few options:
a. Move transformations on large payloads to the backend systems.
b. Consider XML appliances, which are specifically meant for large transformations with little impact on CPU utilization.
Caching Mechanisms in OSB
When caching data in OSB, two points should always be considered –
1) How much data needs to be cached
2) Which caching API should be used
There are two caching mechanisms that we can consider –
1) Java Caching System
2) Oracle Coherence Caching
Java Caching System
This is a distributed caching system written in Java. It provides a means to manage cached data, thus speeding up processing. Beyond simply caching data in memory, it provides additional features like memory management, thread pool controls, etc. It is a simple caching scheme in which static data is cached per server, and it can be easily integrated with OSB. But it is not suitable for caching large amounts of data, and it might fail under heavy load in OSB.
Oracle Coherence Caching
This is the Oracle-recommended caching mechanism, in which the Coherence cluster is configured in the WebLogic server. It is a very robust and stable cache mechanism that can handle heavy load and large amounts of data. The cached data is available throughout the cluster.
Design and Coding Best Practices
XQuery Tuning
In OSB, transformations are done through XQuery/XSLT. Hence, it is very important to understand the best practices while performing transformations in XQuery.
a. Avoid wildcard XPaths (*) and recursive XPaths (double forward slash ‘//’). ‘//’ in an XPath searches the entire XML irrespective of where the node is located, which increases the processing time.
b. In scenarios where you know that there can be only one instance of a node, use indexes for efficiency and faster processing. For example, instead of $body/Root/Child/Subchild, you can use $body/Root[1]/Child[1]/Subchild[1]. This minimizes the amount of payload parsing. But when the node can repeat (an array), do not use indexes.
c. FLWOR (FOR, LET, WHERE, ORDER BY, RETURN) expressions: instead of using multiple Assign actions to create intermediate, one-off variables, LET can be used to combine multiple assigns into one and improve performance.
For example: Assign 1 copies $body/Root/Child/Subchild/value/text() to variable Filter.
Assign 2 copies $RespBody/RespRoot[@name=$Filter]/value/text() to variable RespValue.
The above two assigns can be combined into a single Assign to variable RespValue with the following expression:
let $Filter := $body/Root/Child/Subchild/value/text() return $RespBody/RespRoot[@name=$Filter]/value/text()
d. Avoid declaring namespaces at the top of the XQuery. This can lead to different namespace prefixes at runtime. Instead, put the namespace inline on the XQuery XML tags.
e. Reduce the number of source parameters passed to the XQuery. Instead, pass a single XML that has all the parameters mapped.
f. For simple, single-field updates, the Delete, Insert, Replace and Rename actions are more efficient than an Assign with the complete XML (if the XML is big). Assign generates/loads the entire XML even if the change is in a single field, whereas the other actions deal only with the particular fields mentioned in the expression.
Proxy and Business Service settings
a. At least Basic Authentication should be enabled for all secured services, especially the REST-based ones.
b. For all synchronous protocol services, Read Timeout and Connection Timeout should be set so that transactions time out after the configured time if the target system is down. This ensures that the calling systems do not wait for a very long time.
c. For one-way services, Retry Count and Retry Iteration Interval should be set so that the process automatically retries a certain number of times if the target system is down.
d. For REST-based proxy services, Content-Type and Accept should be set. They indicate the type of content that the service expects in the request and the type of response that will be sent, respectively.
e. Make sure that the local protocol is used when you want to call a proxy service from another proxy service. This makes the call internal, so it does not have to go through the F5 load balancer again to hit the internal proxy.
f. Create a constants XML for all static data so that the actual code need not be touched if the constant values change.
g. When publishing or routing to multiple services, the table versions of those actions (Publish Table, Routing Table) are more efficient than if/then/else conditions.
h. For services which need to be highly available, multiple URIs can be specified. If one URI is not reachable, consumers can automatically use the other.
i. “Alert” is a very useful feature for tracking the flow. But it is advisable to use alerts only during the development phase for testing. Remove them from the actual code.
j. Simple schema validations like mandatory/optional fields, enumerations, etc. can be done directly through the “Validate” action.
XQuery vs XSLT in OSB
OSB provides two options to perform transformations: XQuery and XSLT.
There have been many debates and discussions on which to use when. There is no concrete answer, but there are some points which help us to decide.
XSLT loads the complete XML document into memory, whereas XQuery loads only the mapped fields of the XML into memory. So for large XMLs where every field needs to be transformed, XSLT might perform better, as the complete XML document is loaded into memory. But for large XMLs where only a few fields need to be mapped, XQuery is more suitable, as it loads only the fields that need to be mapped rather than the complete XML. For smaller XMLs, the two approaches perform similarly.
From a development perspective, XQuery is easier to develop as it is the native transformation method of OSB. OSB provides a graphical mapper for XQuery and a test console that allows the XQuery part to be tested in isolation.
Split Join
Split-join is another useful feature provided by OSB. A split-join, as the name suggests, splits the input into chunks and sends them to multiple flows/services concurrently. This is similar to the flow/flowN activities in SOA. In a split-join, routing is done concurrently to all the target systems, and all the responses are aggregated and returned as one single response. Implementing a split-join can improve the overall performance of the service. Normally, messages in a payload are routed to target systems sequentially, so the overall completion time of the service is the sum of the response times of each call. With a split-join, because of concurrent processing, the overall response time is roughly the time taken by the longest call, which is less than the sum.
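The timing argument above can be made concrete with a small calculation. This is only an illustrative sketch: the class and method names are made up, and it ignores the overhead of splitting and aggregating and assumes every branch really runs in parallel.

```java
// Illustrative arithmetic only; not an OSB API.
final class SplitJoinTiming {

    // Sequential routing: each call waits for the previous one,
    // so the total is the sum of the individual response times.
    static long sequentialMillis(long[] callMillis) {
        long total = 0;
        for (long t : callMillis) {
            total += t;
        }
        return total;
    }

    // Split-join style routing: all calls start together,
    // so the total is bounded by the slowest call.
    static long parallelMillis(long[] callMillis) {
        long longest = 0;
        for (long t : callMillis) {
            longest = Math.max(longest, t);
        }
        return longest;
    }
}
```

For three calls that take 200 ms, 500 ms and 300 ms, the sequential estimate is 1000 ms while the concurrent estimate is 500 ms.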
Split Join limitations:
1. Split-joins do not handle transaction rollback in the case of exceptions.
2. You cannot create a split-join with a WSDL file that includes policies, and you cannot call a WSDL-based business service that contains WSDL policies from a split-join.
3. You cannot call REST-based services from a split-join. The target always needs to be a WSDL-based service. If you need to invoke a REST-based service multiple times, you might need to create a SOAP wrapper and then call it from the split-join.
4. The number of parallel requests should be chosen carefully, taking into account the load that the target systems can handle.
Adapters/Protocol
Throttling is a very important and useful, but under-used, property. Setting it on OSB business services helps to manage the load/capacity.
Once throttling is enabled, the Maximum Concurrency setting defines the number of requests that can be processed concurrently before the limit is reached.
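To illustrate what a concurrency limit means, here is a minimal Java sketch based on a semaphore. It only demonstrates the concept: ConcurrencyThrottle is a hypothetical class, it is not how OSB implements throttling, and it simply refuses excess requests, whereas OSB can hold them in a throttling queue.

```java
import java.util.concurrent.Semaphore;

// Conceptual illustration of a maximum-concurrency limit; not OSB code.
final class ConcurrencyThrottle {
    private final Semaphore permits;

    ConcurrencyThrottle(int maxConcurrency) {
        this.permits = new Semaphore(maxConcurrency);
    }

    // Returns true if the request may proceed, false if the limit is reached.
    boolean tryStart() {
        return permits.tryAcquire();
    }

    // Must be called once for every request that was allowed to start.
    void finish() {
        permits.release();
    }
}
```

With a limit of 2, the first two calls to tryStart() succeed and a third is refused until one of the running requests calls finish().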
MQ Protocol
1. Unlike SOA, OSB doesn’t proactively listen to MQ and consume the message as and when it arrives. In OSB, a polling interval needs to be set. If the messages need to be read in near real time, the polling interval can be set as low as one second.
2. The Connection Type for the MQ resource should be TCP mode if the queue resides on a different server, and Binding mode if the queue resides on the same server as OSB.
3. It is a good practice to explicitly set the CharacterSetID/Encoding/Format as per the expectations of the target systems, to avoid any default conversions.
4. Similar to JMS, MQ also has useful properties like MessageID, CharacterSetID and ReplyToQueueName (for async processes).
5. For OSB MQ proxies, the Request Message Type should be chosen carefully between Text and XML. If the input is an XML with additional headers being passed, choosing XML will not pick up the message.
6. OSB can accept or send MessageID/CorrelationID only in Base64 format. Hence, encoding/decoding must be done to use them anywhere in between.
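The Base64 conversion mentioned in point 6 could be done in a Java callout along the following lines. This is a sketch under assumptions: MqIdCodec is a hypothetical helper, and java.util.Base64 requires Java 8 or later. On older JVMs, javax.xml.bind.DatatypeConverter offers printBase64Binary and parseBase64Binary as an alternative.

```java
import java.util.Base64;

// Hypothetical helper for converting MQ identifiers; not part of OSB.
final class MqIdCodec {

    // Encodes the raw identifier bytes as a Base64 string.
    static String toBase64(byte[] id) {
        return Base64.getEncoder().encodeToString(id);
    }

    // Decodes a Base64 string back into the raw identifier bytes.
    static byte[] fromBase64(String encoded) {
        return Base64.getDecoder().decode(encoded);
    }

    // Renders the raw bytes as upper-case hex, a form often used in logs.
    static String toHex(byte[] id) {
        StringBuilder hex = new StringBuilder(id.length * 2);
        for (byte b : id) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }
}
```

For example, the three bytes 0x01 0x02 0x03 encode to the Base64 string "AQID" and render as the hex string "010203".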
JCA Protocol for DB calls
Timeouts are very important when making DB calls. Hence, the QueryTimeout property needs to be explicitly added to the JCA file for any DB business service.
FTP protocol
OSB can push to and pull from FTP locations in two ways: the FTP adapter of SOA or the FTP protocol of OSB.
1. The FTP protocol, being native to OSB, is more effective in terms of performance and in handling comparatively larger files.
2. MFL transformation is the best way to convert the file content into proper XML data when the OSB FTP protocol is used.
3. When the FTP protocol cannot achieve certain functionality, SOA’s FTP adapter can be used.
Deployments and Customization File
1. OSB deployments are simpler than SOA deployments because OSB does not validate the existence of the target web service during deployment. This is a major issue in SOA, where directly dependent processes cannot be deployed if the target service is down. OSB saves us from this discomfort.
Hope this blog is useful.