WebSphere: Plug-in Workload Management Failover

Posted By Sagar Patil

We run a 2-node WebSphere 6.x vertical cluster. On a number of occasions we have seen the system go down and come back up in less than 5 minutes.

Investigation:

WebSphere uses the session ID to route user sessions to the relevant JVM. The plug-in's polling interval keeps track of the status of each JVM (up/down/hung). In the situation we had, the HTTP plug-in should have directed user sessions to another JVM (JVM2 here), but I think it didn't. To make it do so, we need to configure the “ConnectTimeout” parameter, which forces the plug-in to look for another server.

“ConnectTimeout” makes the plug-in use a non-blocking connect.
Setting ConnectTimeout to zero (the default here) is the same as not specifying the attribute at all: the plug-in performs a blocking connect and waits until the operating system times out (on Linux it can take up to 5-10 minutes for the socket to time out).

ConnectTimeout

The ConnectTimeout attribute of a Server element enables the HTTP plug-in to perform non-blocking connections with a backend cluster member. Non-blocking connections are beneficial when the HTTP plug-in is unable to contact the destination to determine if the port is available or unavailable for a particular cluster member.

If no ConnectTimeout value is specified, the HTTP plug-in performs a blocking connect: it waits until an operating system TCP timeout occurs (as long as 2 minutes depending on the platform), and only then marks the cluster member unavailable. A value of 0 also causes the HTTP plug-in to perform a blocking connect. A value greater than 0 specifies the number of seconds you want the HTTP plug-in to wait for a successful connection. If a connection does not occur within that interval, the HTTP plug-in marks the cluster member unavailable and fails over to one of the other cluster members defined in the cluster.

Caution: In an environment with busy workload or a slow network connection, setting this value too low could make the HTTP plug-in mark a cluster member down falsely. Therefore, caution should be used whenever choosing a value for ConnectTimeout.
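
The behaviour described above can be illustrated with a small Python sketch. This is not the plug-in's code (the plug-in is a native web server module); it only mimics the idea of trying each cluster member with a bounded connect and moving on to the next one when the connect fails. The function name pick_server and its return shape are made up for this illustration.

```python
import socket


def pick_server(servers, connect_timeout):
    """Return ((host, port), marked_down) for the first reachable server.

    Illustration only, not the plug-in's implementation.
    connect_timeout > 0  -> wait at most that many seconds per server
    connect_timeout == 0 -> blocking connect, bounded only by the OS
    """
    # A timeout of None tells the socket to block until the OS gives up.
    timeout = connect_timeout if connect_timeout > 0 else None
    marked_down = []
    for host, port in servers:
        try:
            # Open and immediately close a TCP connection as a reachability probe.
            with socket.create_connection((host, port), timeout=timeout):
                return (host, port), marked_down
        except OSError:
            # Refused, unreachable or timed out: treat the member as down
            # and fail over to the next one in the list.
            marked_down.append((host, port))
    return None, marked_down
```

Calling pick_server([("server1.domain.com", 9091), ("server2.domain.com", 9091)], 10) would wait at most 10 seconds on server1 before trying server2 (server2.domain.com is a hypothetical second cluster member). With a timeout of 0 the same call could hang on a dead host for as long as the operating system allows, which is the situation this post is about.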

Set the “ConnectTimeout” attribute to an integer greater than zero to control how long the plug-in waits for a response when attempting to connect to a server. A setting of 15 means the plug-in times out after 15 seconds rather than after the 5-10 minutes the OS settings can take. The Server entry below uses 10 seconds.

<Server CloneID="10k66djk2" ConnectTimeout="10" ExtendedHandshake="false" LoadBalanceWeight="1000" MaxConnections="0" Name="Server1_WebSphere_Appserver" WaitForContinue="false">
    <Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
</Server>
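
To check which cluster members are still on a blocking connect, a short Python script along these lines could scan a copy of plugin-cfg.xml. It is only a sketch: the function name, the sample cluster and the second server (Server2_WebSphere_Appserver on server2.domain.com) are assumptions made for the example, and the sample is trimmed to just the attributes that matter here.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical sample; a real plugin-cfg.xml carries many more
# elements and attributes.
SAMPLE = """
<Config>
  <ServerCluster Name="SampleCluster">
    <Server Name="Server1_WebSphere_Appserver" ConnectTimeout="10">
      <Transport Hostname="server1.domain.com" Port="9091" Protocol="http"/>
    </Server>
    <Server Name="Server2_WebSphere_Appserver" ConnectTimeout="0">
      <Transport Hostname="server2.domain.com" Port="9091" Protocol="http"/>
    </Server>
  </ServerCluster>
</Config>
"""


def servers_with_blocking_connect(plugin_cfg_xml):
    """Return names of Server elements whose ConnectTimeout is 0 or missing."""
    root = ET.fromstring(plugin_cfg_xml.strip())
    flagged = []
    for server in root.iter("Server"):
        # Only look at Server definitions that carry a Transport child;
        # bare <Server Name="..."/> references elsewhere are skipped.
        if server.find("Transport") is None:
            continue
        # A missing attribute behaves like 0, i.e. a blocking connect.
        timeout = int(server.get("ConnectTimeout", "0"))
        if timeout <= 0:
            flagged.append(server.get("Name"))
    return flagged


if __name__ == "__main__":
    print(servers_with_blocking_connect(SAMPLE))
```

Any server name the script reports is one where the plug-in would still wait for the operating system timeout before failing over.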
