The Enigma of WebSphere Application Server Plug-in, Part-2: Parameters that you can tune


In the first part of this series, we learned how the WebSphere Application Server plugin works and which components are involved in its operation. In this second part, let us discuss the parameters that affect the operation of the plugin and the options for tuning them.

TCP timeout on operating system

This parameter controls how long a client waits before giving up on a request when it cannot communicate with the server over TCP/IP. For the plugin, it determines when the backend application server is marked as down. For example, if the TCP/IP timeout on your Linux operating system is 90 seconds and the plugin fails to communicate with an application server hosted on that machine, the plugin waits for that timeout to expire and then marks the server as down.

Because this parameter is set at the operating system level, changing it affects all other TCP/IP communication on that machine.

Connect timeout in plugin-cfg.xml

This setting allows the plugin to perform a non-blocking connection with the backend server, so that a connection attempt can be abandoned after a fixed time.

<Server CloneID="abcd123x" LoadBalanceWeight="2" Name="WASNClone1" ConnectTimeout="X">

  • A value of ‘0’ makes the plugin perform a blocking connect: it waits until the operating system's TCP/IP timeout expires.
  • A value greater than 0 specifies the number of seconds the plug-in waits for a successful connection. If a connection is not established within that interval, the plug-in marks the server unavailable and fails over to one of the other servers defined in the server group.
  • If no value is specified, the plugin waits until the TCP/IP timeout occurs.

If no clone responds within the specified connect timeout, the plugin returns an HTTP 500 error.
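The failover behaviour described above can be sketched in Python. This is only an illustration of the idea, not the plugin's actual implementation: `pick_server`, `tcp_connect`, and the `(name, host, port)` tuple shape are hypothetical names chosen for this example, and the caller is assumed to turn a result of `None` into the HTTP 500 response.

```python
import socket
from typing import Any, Callable, List, Optional, Sequence, Tuple

# Hypothetical shape for one clone: (clone name, host, port).
Server = Tuple[str, str, int]


def tcp_connect(host: str, port: int, timeout_s: float) -> socket.socket:
    """Open a TCP connection, giving up after timeout_s seconds."""
    return socket.create_connection((host, port), timeout=timeout_s)


def pick_server(
    servers: Sequence[Server],
    connect_timeout_s: float,
    connect: Callable[[str, int, float], Any] = tcp_connect,
) -> Tuple[Optional[str], Any, List[str]]:
    """Try each clone in order and return the first one that accepts a connection.

    Returns (clone name, connection, clones marked down). The clone name and
    connection are None when every clone failed, which the caller would
    report to the client as an HTTP 500 error.
    """
    marked_down: List[str] = []
    for name, host, port in servers:
        try:
            conn = connect(host, port, connect_timeout_s)
        except OSError:
            # Covers connect timeouts and refused connections alike.
            marked_down.append(name)
            continue
        return name, conn, marked_down
    return None, None, marked_down
```

The `connect` argument is injectable only so that the sketch can be exercised without a real network; by default it uses `socket.create_connection` with the given timeout.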

When a clone is marked down, the plugin retries the connection to it once the RetryInterval has elapsed.

Plugin-log entry when connect timeout occurs

ERROR: ws_common: websphereGetStream: Connect timeout fired

ERROR: ws_common: websphereExecute: Failed to create the stream

ERROR: ws_server: serverSetFailoverStatus: Marking WASNClone1 down

ERROR: ws_common: websphereHandleRequest: Failed to execute the transaction to 'WASNClone1'on host 'test.josephamrithraj.com'; will try another one

RetryInterval in plugin-cfg.xml

This parameter specifies how many seconds the plugin waits, after marking a clone as down, before it tries to connect to that clone again.

<ServerCluster Name="WASNCluster" RetryInterval="420">

  • The default value is 60 seconds; it applies when no value is specified.
  • A higher value keeps a marked-down clone offline for longer, even after it has recovered.
  • A small value can slow down responses, because the plugin retries the failed clones more often and requests have to wait on those attempts.
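The trade-off between a high and a low RetryInterval is easier to see with a small model of the bookkeeping involved. The sketch below is an assumption about how such tracking could look, not the plugin's code: the `DownList` class and its method names are made up for this example, and the clock is injectable so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Dict


class DownList:
    """Tracks clones marked down and when each becomes eligible for a retry."""

    def __init__(
        self,
        retry_interval_s: float = 60.0,  # default taken from the text above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.retry_interval_s = retry_interval_s
        self._clock = clock
        self._down_since: Dict[str, float] = {}

    def mark_down(self, clone: str) -> None:
        """Record the moment a clone failed."""
        self._down_since[clone] = self._clock()

    def mark_up(self, clone: str) -> None:
        """Forget a clone once a retry has succeeded."""
        self._down_since.pop(clone, None)

    def is_eligible(self, clone: str) -> bool:
        """True if the clone is up, or has been down for at least the retry interval."""
        since = self._down_since.get(clone)
        if since is None:
            return True
        return self._clock() - since >= self.retry_interval_s
```

With `retry_interval_s=420`, as in the configuration line above, a clone that recovers after one minute would still be skipped for six more minutes.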

Web Container thread pool maximum size in application server

This parameter determines how many concurrent requests can be serviced by the application server.

If the “Allow thread allocation beyond maximum thread size” option is not checked, the thread pool cannot grow beyond its configured maximum size, which is 50 by default. If it is checked, the application server allows a nearly unlimited number of threads, limited only by the system and other configurations/capacities. Use this option with caution: it can put unexpectedly high load on the application server’s JVM memory, database connections, CPU, EJB container, etc., and possibly render the system unstable.
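As a rough analogy only, the effect of a fixed maximum pool size can be observed with Python's own `ThreadPoolExecutor`. This says nothing about WebSphere internals; the function name `peak_concurrency` and its parameters are hypothetical, and the simulated "request" is just a short sleep.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(max_threads: int, request_count: int, work_s: float = 0.05) -> int:
    """Submit request_count simulated requests and return the highest number
    that were being processed at the same moment."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def handle_request() -> None:
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(work_s)  # stand-in for real request processing
        with lock:
            active -= 1

    # Leaving the with-block waits for every submitted request to finish.
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        for _ in range(request_count):
            pool.submit(handle_request)
    return peak
```

However many requests are submitted, the peak never exceeds `max_threads`; the remaining requests simply wait in the queue, which is the behaviour a capped pool gives you.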

Custom properties on web container thread pool

MaxKeepAliveRequests

Specifies the maximum number of requests which can be processed on a single keepAlive connection.

  • Defaults to 100 if not specified by the user.
  • Setting this property to a high value provides better performance.
  • A low value can help prevent denial of service attacks if a client tries to hold on to a KeepAlive connection indefinitely.
  • This custom property is ignored if MaxKeepAliveConnections is equal to zero.

MaxKeepAliveConnections

Enables reuse of HTTP connections that have already been established between the plug-in and the application server’s HttpTransport.

It provides a performance boost because it prevents each new HTTP request from creating a new connection.

The KeepAlive connections will be terminated by either the MaxKeepAliveRequests parameter or the ConnectionKeepAliveTimeout parameter.
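The two ways a KeepAlive connection can end may be illustrated with a short Python sketch. It is a simplified model under stated assumptions: `close_reason` is a hypothetical helper, the defaults of 100 requests and 5 seconds are the ones quoted in this article, and whether the real server closes at exactly the limit or just past it is not something this sketch claims to know.

```python
from typing import Iterable, Optional


def close_reason(
    idle_gaps_s: Iterable[float],
    max_keepalive_requests: int = 100,
    keepalive_timeout_s: float = 5.0,
) -> Optional[str]:
    """Replay one KeepAlive connection and say which setting closes it.

    idle_gaps_s holds the idle time, in seconds, before each request after
    the first one. Returns None if the connection is still open at the end.
    """
    served = 1  # the request that opened the connection
    if served >= max_keepalive_requests:
        return "MaxKeepAliveRequests"
    for gap in idle_gaps_s:
        if gap > keepalive_timeout_s:
            # The next request arrived too late; the connection was closed.
            return "ConnectionKeepAliveTimeout"
        served += 1
        if served >= max_keepalive_requests:
            return "MaxKeepAliveRequests"
    return None
```

For example, `close_reason([1, 2, 6])` reports `ConnectionKeepAliveTimeout`, because the third gap is longer than the 5-second default.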

MaxConnectBacklog

If the application server’s Web container receives more concurrent requests than it is configured for, the requests start queuing up at a TCP/IP level. The MaxConnectBacklog setting controls the number of such requests that get queued up before the plug-in is refused more connection requests. If this number is exceeded, the requests from the plug-in will not be able to connect to the HttpTransport port.

If not specified by the user, the default value of this parameter is 512.

For the MaxConnectBacklog parameter to work correctly, the corresponding OS level value of the backlog parameter should be equal to or greater than the one defined in WebSphere Application Server’s HttpTransport configuration.
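A quick way to check this relationship on Linux is to compare the configured value with the kernel's listen-backlog limit. The sketch below is hedged accordingly: it assumes a Linux host where the limit is exposed at `/proc/sys/net/core/somaxconn`, and the function names are made up for this example. On other operating systems the limit is configured elsewhere and the first function simply returns `None`.

```python
from pathlib import Path
from typing import Optional

# Linux-specific location of the kernel's listen-backlog limit.
SOMAXCONN_PATH = Path("/proc/sys/net/core/somaxconn")


def os_backlog_limit(path: Path = SOMAXCONN_PATH) -> Optional[int]:
    """Return the OS backlog limit, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(configured: int, os_limit: Optional[int]) -> int:
    """The OS caps the backlog, so the smaller of the two values wins."""
    if os_limit is None:
        return configured
    return min(configured, os_limit)
```

If `effective_backlog(512, os_backlog_limit())` comes back smaller than 512, the OS-level value needs to be raised before the MaxConnectBacklog default can take full effect.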

ConnectionKeepAliveTimeout

The maximum time to wait for the next request on a KeepAlive connection.

  • If the next request on this KeepAlive connection is not received within this time, the connection will be closed.
  • Default value is 5 seconds.
  • This custom property is ignored if MaxKeepAliveConnections is equal to zero.

ConnectionIOTimeout

This is the maximum time (in seconds) to wait when trying to read data during the request. This timeout determines how long to wait to read at least one byte of data.

  • The default value is 5 seconds.
  • You may have to increase it if you experience extremely slow network connections where two subsequent data packets come in spaced more than 5 seconds apart.
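The idea of waiting a bounded time for at least one byte can be shown with an ordinary socket read timeout in Python. This is a generic illustration of the concept, not the application server's implementation; `read_first_byte` is a hypothetical helper name.

```python
import socket
from typing import Optional


def read_first_byte(conn: socket.socket, io_timeout_s: float) -> Optional[bytes]:
    """Wait up to io_timeout_s seconds for at least one byte of data.

    Returns the byte read, b"" if the peer closed the connection, or None
    if nothing arrived before the timeout.
    """
    conn.settimeout(io_timeout_s)
    try:
        return conn.recv(1)
    except socket.timeout:
        return None
```

A caller that gets `None` back would treat the request as timed out, which is why a slow network with gaps longer than the configured value calls for a higher setting.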

3 thoughts on “The Enigma of WebSphere Application Server Plug-in, Part-2: Parameters that you can tune”

  1. koti says:

    Hi Joseph,
    This information was very helpful for me in understanding the plugin concept.
    Will you please post the remaining two parts in this four-part series?

  2. Stephen Dubicki says:

    I am experiencing the following issue; however, we see no network problems. There are 20 web servers and all are available each time this error occurs. Any suggestions?

    The UNICA application encountered the RC 107 exception below in the logs.

    [9/28/15 19:45:40:043 CDT] 00002e6d SystemOut O ERROR com.shc.unica.api.web.UnicaKPOSRequestHandler readHttpRequest IOException while getting POST data
    java.io.IOException: Async IO operation failed (1), reason: RC: 107 Transport endpoint is not connected

    Issue reported timings:

    • Sep 23rd 11:20 AM to 11:30 AM CT
    • Sep 28th 7:15 PM CT to Sep 29th 5:30 AM CT
    • Sep 29th 1:29 PM CT to 1:30 PM CT
    • Sep 29th 2:50 PM CT to 2:55 PM CT
    • Sep 30th 9:00 AM CT to 9:06 AM CT
