Let's find the definition from official documentation, here and here.
There are two fault detection processes running. The first is by the master, to ping all the other nodes in the cluster and verify that they are alive. And on the other end, each node pings to master to verify if its still alive or an election process needs to be initiated.
ping_timeout How long to wait for a ping response, defaults to 30s.
ping_retries How many ping failures / timeouts cause a node to be considered failed. Defaults to 3.
So this setting is use by master node and data node for detection if the node is okay or not and once ping, the duration time to wait is 30seconds (default) and if 3 times, a node is considered down/failed if it exceed 3 times. Okay, let's go into the code.
Within the class NodesFaultDetection, there is a inner class SendPingRequest. As it implement interface Runnable, the run method will be execute by an executor. Instead, object are read and write so emulate a ping behaviour, you can read the class PingRequest for more information. As you noticed, ping_timeout is pass to the super class of PingRequest.
The essence of logic is pretty much written in the statement transportService.sendRequest(final DiscoveryNode node, final String action, final TransportRequest request, final TransportRequestOptions options, TransportResponseHandler<T> handler). You would think it would be an ICMP ping, but it is not, there isn't isReachable() is called.
In the method, handleResponse(PingResponse response), we see that, the retry count is reset to 0 and then this SendPingRequest object is schedule again with the ping_interval you set earlier. In the method, handleException(TransportException exp). we see that the variable retryCount is increase by one, and if the current retry count exceed the default 3 times, then the node is considered dead and were removed. If the current retry count is less than the default 3 times, then another ping is send with the same ping_timeout.
Master fault detection is a little different than nodes fault detection. When the public method of MasterFaultDetection start is called and then method innerStart(), object MasterPinger is created.
this.masterPinger = new MasterPinger();
// start the ping process
threadPool.schedule(pingInterval, ThreadPool.Names.SAME, masterPinger);
So there is a periodic ping of default 1 second. When instance of MasterPinger is run, we noticed that it goes through the same process of sending the request using transport service. transportService.sendRequest(final DiscoveryNode node, final String action, final TransportRequest request, final TransportRequestOptions options, TransportResponseHandler<T> handler)
The logic of this request sending is same with NodesFaultDetection. What interesting is the method override in class BaseTransportResponseHandler. In handleResponse, so we see the the retry count is reset back to 0. Then another ping is scheduled.
In the override method handleException(TransportException exp) , so there are three exception check on the master if it no longer a master, or ping to a non master ping a master but does not exists on it. Now at the stage, retry count is increase by one. If current retry count greater than or equal to the default 3 times, then ping to node by this master is falied, this node consider failed. if current retry count less than the default three, another ping is sent.
That's it, if you think this analysis need improvement, please leave your comment below. Thank you.