Thursday, February 20, 2020

Distributed Polling in SOA

In a clustered environment, the most common issue we see when polling a database for records is that the same record is retrieved twice. This issue usually does not show up in development, because most development environments are single-node rather than clustered.

But once we move our BPEL code with a DB poller to higher environments that are clustered (multiple nodes in an active-active setup), this issue becomes common.

To resolve this issue, set up distributed polling in the DB adapter. With this feature the polling query is issued as SELECT FOR UPDATE SKIP LOCKED, so the rows fetched by one node are locked and the other nodes skip them instead of retrieving the same records.
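To see why skip-locking prevents duplicate reads, here is a minimal Python sketch. It is not the adapter's code: the `Table` class and its method names are hypothetical stand-ins for a database table with row-level locks, and each call plays the role of one node's polling query.

```python
import threading


class Table:
    """Hypothetical in-memory stand-in for a table with row-level locks."""

    def __init__(self, row_ids):
        self.unprocessed = list(row_ids)
        self.locked = set()
        self._mutex = threading.Lock()

    def select_for_update_skip_locked(self, limit):
        """Lock and return up to `limit` unprocessed rows,
        skipping rows that another session already holds."""
        with self._mutex:
            claimed = [r for r in self.unprocessed if r not in self.locked][:limit]
            self.locked.update(claimed)
            return claimed

    def commit(self, rows):
        """Mark rows as processed and release their locks."""
        with self._mutex:
            for r in rows:
                self.unprocessed.remove(r)
                self.locked.discard(r)


table = Table(range(1, 11))
node_a = table.select_for_update_skip_locked(4)
node_b = table.select_for_update_skip_locked(4)
print(node_a, node_b)  # [1, 2, 3, 4] [5, 6, 7, 8]
```

The second node never sees rows 1 to 4 because they are locked, so it moves on to the next unlocked rows rather than waiting or reading duplicates.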

Let's see how distributed polling is done, and also discuss other JCA properties that help tune the adapter's processing.

Here are a few properties from the activation-spec of the jca file that need discussion -

<property name="PollingStrategy" value="LogicalDeletePollingStrategy"/>
<property name="PollingInterval" value="5"/>
<property name="MaxRaiseSize" value="2"/>
<property name="MaxTransactionSize" value="4"/>
<property name="NumberOfThreads" value="2"/>
<property name="ReturnSingleResultSet" value="false"/>
<property name="RowsPerPollingInterval" value="20"/>       

When distributed polling is checked, the above properties are set in the jca file, and the DB adapter is deployed, the following process takes place.
  1. Polling threads are created based on the NumberOfThreads configured. Each thread initiates a transaction and searches the database for matching rows, with SELECT FOR UPDATE SKIP LOCKED issued.
  2. If no matching records are found, the thread releases the transaction and sleeps until the PollingInterval (PI) duration has elapsed.
  3. When matching rows are found, each thread (which has already started a transaction and found rows) issues a FETCH for a certain number of rows from the database. This "certain number of rows" is defined by MaxTransactionSize (MTS).
  4. Once the rows are fetched, the thread does not send all the rows as-is to the destination. It loops over the fetched rows, groups them based on the MaxRaiseSize (MRS) set, and then sends each group to the destination.
  5. After sending to the destination, the thread compares the number of rows delivered with RowsPerPollingInterval (RPPI).
If rows delivered >= RPPI, the thread sleeps for the duration of PollingInterval and then wakes up.
If rows delivered <  RPPI, the thread continues the fetching, looping and delivering process.
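The steps above can be sketched as a small Python simulation of a single polling thread. This is only an illustration under stated assumptions, not the adapter's implementation: `poll_once` is a hypothetical function, the pending rows are a plain list, and the RPPI check is assumed to run once per fetched MTS batch.

```python
from typing import List


def poll_once(pending: List[int],
              max_transaction_size: int,
              max_raise_size: int,
              rows_per_polling_interval: int) -> List[List[int]]:
    """Simulate one wake-up of a single polling thread.

    Returns the groups of rows delivered before the thread would sleep.
    Rows that are fetched are removed from `pending`.
    """
    delivered_groups = []
    delivered = 0
    while pending:
        # FETCH at most MaxTransactionSize rows in one transaction.
        batch = pending[:max_transaction_size]
        del pending[:max_transaction_size]
        # Deliver the batch in groups of at most MaxRaiseSize rows.
        for i in range(0, len(batch), max_raise_size):
            group = batch[i:i + max_raise_size]
            delivered_groups.append(group)
            delivered += len(group)
        # Throttle check (assumed here to happen once per fetched batch).
        if delivered >= rows_per_polling_interval:
            break  # the thread would now sleep for PollingInterval
    return delivered_groups


# Values from the snippet above: MTS = 4, MRS = 2, RPPI = 20, 10 matching rows.
rows = list(range(1, 11))
print(poll_once(rows, 4, 2, 20))
# [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
```

With RPPI lowered to 4, the same call would stop after the first batch and leave 6 rows for the next polling interval, which is the throttling effect of RPPI.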

Let's take the values set in the above code snippet as an example and see how polling happens.
  1. 2 threads are created, as NumberOfThreads is set to 2.
  2. Each thread initiates a transaction and searches for matching rows. Say the DB cursor returns 10 matching rows.
  3. Each thread issues a fetch for only 4 rows, as MaxTransactionSize is set to 4.
  4. Once 4 rows are fetched, the thread loops over them and groups them into batches of 2 rows, based on MaxRaiseSize.
  5. 2 rows are delivered to the destination.
  6. The thread compares rows delivered (2) with RowsPerPollingInterval (20).
  7. As rows delivered < RPPI, the thread loops through the next batch and sends it to the destination.
Here's what the processing looks like -

[Figure: processing flow for the example above]

The RowsPerPollingInterval property is used to throttle the polling threads. If RPPI is not set, the polling threads continue to FETCH and process rows until no more are available. If RPPI is set, a smaller value means slower processing, because each thread goes to sleep after delivering fewer rows. Another way to view the RPPI value is through the following scenarios:

MTS = 10 and RPPI = 10, then each thread will only process one MTS batch before sleeping
MTS = 10 and RPPI = 20, then each thread will process two MTS batches before sleeping
MTS = 10 and RPPI = 30, then each thread will process three MTS batches before sleeping
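The three scenarios follow the same arithmetic, which a short Python sketch can make explicit. The helper name `batches_before_sleep` is hypothetical, and it assumes that enough matching rows are available and that the RPPI check happens after each full MTS batch.

```python
def batches_before_sleep(max_transaction_size: int,
                         rows_per_polling_interval: int) -> int:
    """Number of MTS batches a thread processes before rows delivered >= RPPI.

    Uses integer ceiling division, so a partial remainder still costs
    one more batch.
    """
    return -(-rows_per_polling_interval // max_transaction_size)


for rppi in (10, 20, 30):
    print(f"MTS = 10, RPPI = {rppi}: {batches_before_sleep(10, rppi)} batch(es)")
# MTS = 10, RPPI = 10: 1 batch(es)
# MTS = 10, RPPI = 20: 2 batch(es)
# MTS = 10, RPPI = 30: 3 batch(es)
```

Under the same assumptions, an RPPI that is not a multiple of MTS (say 25 with MTS = 10) would round up to 3 batches, since the thread only stops once the delivered count reaches RPPI.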

This way we can ensure distributed polling in the DB adapter.



