Why did my HTTP request not time out? A quick tour inside the JDK and OkHttpClient for a better understanding of timeouts
So you have a JVM-based service (A) that communicates with another service (B) using some kind of HTTP client.
You know what you are doing, so you gathered metrics from your dependency, especially its response times.
Let's assume that the p99 of response times from service B is 200ms. Also, this service is fairly close to you, for example in the
same data center. You adjusted your timeouts accordingly: for example, you've set the connection timeout
to 50ms
and the socket timeout
to 250ms. Everything works fine, but you are really thorough, have great
observability in your service, and monitor metrics regularly. One day you notice something: some requests to service B take far longer than 250ms.
Wait, what? How is this even possible? You have set timeouts, and in the worst-case scenario your requests should time out after 250ms.
The connection timeout restricts how long we wait for the connection to be established, so its outcome is either an open connection or an unreachable host; there shouldn't be any problems on that side. Let's take a look at the socket timeout instead; maybe we can find something interesting.
“Talk is cheap, show me the code”
To get a better understanding of what's going on, we can write a simple socket server and a simple client. All of the code was
written in Java 11, on AdoptOpenJDK 11.0.4.
Let's start with the server.
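A minimal sketch of such a server (class and variable names are my own):

```java
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

public class SimpleServer {
    public static void main(String[] args) throws Exception {
        try (ServerSocket server = new ServerSocket(59090)) {
            byte[] message = {42}; // the single-byte message we will send 200 times
            while (true) {
                // wait for a client and send it the whole message
                try (Socket client = server.accept()) {
                    OutputStream out = client.getOutputStream();
                    for (int i = 0; i < 200; i++) {
                        out.write(message);
                        out.flush(); // push each byte out immediately
                    }
                }
            }
        }
    }
}
```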
It is not rocket science: we create a new ServerSocket
listening on port 59090
. Next, we
prepare the message that we will send; it is a single byte, and we will send it 200 times.
Then there is an infinite loop that waits for client connections; whenever a client connects, we send
it our message.
Now, let's take a look at the client.
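Again a minimal sketch (names are mine), including the time measurement that the rest of the article relies on:

```java
import java.io.InputStream;
import java.net.Socket;

public class SimpleClient {
    public static void main(String[] args) throws Exception {
        long start = System.currentTimeMillis();
        try (Socket socket = new Socket("localhost", 59090)) {
            socket.setSoTimeout(100); // the socket timeout we are examining
            InputStream in = socket.getInputStream();
            byte[] response = new byte[200];
            int offset = 0;
            int read;
            // keep reading while we still expect bytes and the stream is open
            while (offset < response.length
                    && (read = in.read(response, offset, response.length - offset)) != -1) {
                offset += read;
            }
            System.out.println("Read bytes: " + offset);
        } finally {
            System.out.println("Operation took: " + (System.currentTimeMillis() - start));
        }
    }
}
```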
Not much to see here: we create a new Socket
instance that connects to localhost on port 59090
, then we set
soTimeout
to 100ms
; this is the setting we will be examining. Next, there is a byte array prepared to store the response from our
server. With everything in place, we can start reading. We know that we should read 200 bytes
, so we keep reading while we
have collected fewer than 200 bytes and there is still something to read. At the end, to check that everything went well and we read the number of
bytes that was expected, we print our reading offset.
(The method InputStream.read(byte[] b, int off, int len)
returns the number of bytes that were read.)
Fantastic, we have written some code, so we can run it and measure how long it takes to read the whole message.
I ran it a few times and got an average of about 17ms
, leaving plenty of time to spare before we reach the timeout.
Now let's see what happens when the connection becomes slower. To simulate this, we add a sleep to our server just after we
accept a new connection.
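The only change is in the accept loop; the exact sleep duration is my pick (anything above the client's 100ms soTimeout will do):

```java
try (Socket client = server.accept()) {
    Thread.sleep(200); // assumed value, above the client's 100 ms soTimeout
    OutputStream out = client.getOutputStream();
    for (int i = 0; i < 200; i++) {
        out.write(message);
        out.flush();
    }
}
```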
After running our code again, we get the expected timeout exception :)
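The client dies with a stack trace along these lines (truncated; the "Read timed out" message comes straight from the JDK's native code, as we will see below):

```
Exception in thread "main" java.net.SocketTimeoutException: Read timed out
	at java.base/java.net.SocketInputStream.socketRead0(Native Method)
	...
```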
Everything works as expected. Now we can take a closer look at what is going on inside the JDK code, starting with what happens when we call socket.setSoTimeout(100).
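In JDK 11, the body of java.net.Socket.setSoTimeout looks essentially like this (condensed from the OpenJDK sources, so treat it as a close sketch rather than a verbatim copy):

```java
public synchronized void setSoTimeout(int timeout) throws SocketException {
    if (isClosed())
        throw new SocketException("Socket is closed");
    if (timeout < 0)
        throw new IllegalArgumentException("timeout can't be negative");
    // the value ends up as the SO_TIMEOUT option on the underlying SocketImpl
    getImpl().setOption(SocketOptions.SO_TIMEOUT, timeout);
}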
As we can see, the SocketOptions.SO_TIMEOUT
option is set on our java.net.SocketImpl
instance.
In this scenario, the java.net.PlainSocketImpl
implementation of the socket is used. When we look at our client code,
the InputStream
we get from the socket is a java.net.SocketInputStream
. When we call its
read(byte[] b, int off, int len)
method, underneath it uses Java_java_net_SocketInputStream_socketRead0, a native method written in C.
This method is long and you do not need to read it all (but I encourage you to 😉). Let's focus on the part that matters for the timeout.
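A condensed sketch of that fragment, based on the unix SocketInputStream.c in OpenJDK 11 (not verbatim; the variable names approximate the originals):

```c
/* condensed fragment of Java_java_net_SocketInputStream_socketRead0 */
if (timeout) {
    nread = NET_Timeout(env, fd, timeout, deadline); /* poll(2) under the hood */
    if (nread <= 0) {
        if (nread == 0) {
            /* the polling function reported no events before the deadline */
            JNU_ThrowByName(env, "java/net/SocketTimeoutException",
                            "Read timed out");
        }
        /* negative values indicate errors and are mapped to exceptions */
        return -1;
    }
}
```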
Here the NET_Timeout
function is called. When we look inside it, we can see that it calls the operating system's
polling function, which waits for events on a file descriptor: poll(2)
on Linux, select(2)
on BSD, and select
on Windows.
The timeout that we specified earlier is passed to each of those functions. These polling functions block
until either our timeout elapses or an event occurs on the file descriptor, and they return the number of descriptors that registered events.
In the next steps, errors are handled, and a SocketTimeoutException
is thrown if the polling function returned 0
, which means
the timeout elapsed with no data to read. If everything went well, the actual read happens.
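Condensed, the happy path boils down to this (again a sketch rather than the verbatim JDK source):

```c
/* read the bytes that the polling function reported as ready... */
nread = recv(fd, bufP, len, 0);
/* ...copy them into the byte[] that was passed to InputStream.read... */
(*env)->SetByteArrayRegion(env, data, off, nread, (jbyte *)bufP);
/* ...and return the number of bytes read */
return nread;
```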
Here the recv
function is called to read bytes from the socket; those bytes are then copied into the byte array
we passed to the InputStream.read
call, and the number of bytes read is returned.
Returning to our simple client
So now we know how the socket timeout works; it is pretty simple, and it does what most of us probably thought it does. But it still does not explain why some of the requests to service B took longer than the timeout, so let's dig deeper.
What happens when we put our sleep just after sending each byte?
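I read that as the sleep moving inside the sending loop, so the server now pauses after every byte it writes (the duration is again my assumption):

```java
for (int i = 0; i < 200; i++) {
    out.write(message);
    out.flush();
    Thread.sleep(150); // assumed per-byte pause, above the client's 100 ms soTimeout
}
```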
In our logs, we can see that we read a single byte and then got a SocketTimeoutException
, so our client read the first
part of the message and then hit the limit. That was expected, because the server took much longer to deliver the rest
than we allowed. I think we can test at least one more thing: we can play with our sleep time. Let's set it to 50ms
and run our code.
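With the print statements from the client sketch above, the execution ends like this (the total time is the number from this run; the byte count follows from the fact that no exception was thrown):

```
Read bytes: 200
Operation took: 10490
```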
Whaat? The operation took 10490ms
; how is that even possible? We set socketTimeout
to 100ms
and our read took
more than 10 seconds. Maybe there is some bug? Let's run it again.
Almost the same result. Can this be our original problem? Yes, it is! It is almost exactly how OkHttpClient
works.
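Here is a simplified sketch of the idea behind its fixed-length response body source (FixedLengthSource in the HTTP/1.x codec); this is not the verbatim library code, source and sink are Okio types, and the 8192-byte chunk size is arbitrary:

```java
import java.io.IOException;
import java.net.ProtocolException;
import okio.Buffer;
import okio.BufferedSource;

// Sketch of how a fixed-length body is drained from a socket-backed source.
class FixedLengthBodyReader {
    static long readFixedLengthBody(BufferedSource source, Buffer sink, long contentLength)
            throws IOException {
        long bytesRemaining = contentLength; // taken from the Content-Length header
        while (bytesRemaining > 0) {
            // each read() blocks for at most the socket timeout, but only for
            // this chunk; any byte arriving in time restarts the clock
            long read = source.read(sink, Math.min(bytesRemaining, 8192));
            if (read == -1) throw new ProtocolException("unexpected end of stream");
            bytesRemaining -= read;
        }
        return contentLength;
    }
}
```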
The client has to read the expected amount of data, and it will read it in parts until the whole message has been received. So, as long as a
single byte arrives on the socket before the timeout elapses, a SocketTimeoutException
will not be thrown.
But how does it know how many bytes it should read?
The answer to this question is actually pretty simple: an HTTP
response has a predefined structure. First of all, we have the status line
, which contains the status code; next we have the headers
, and finally the message body
. The headers and the body are separated by an empty line (a CRLF).
Also, one of the headers is Content-Length
, which indicates how long the message body will be. So while receiving a response
from another service via HTTP/1.1
, we can tell when to stop reading from the socket: after receiving the appropriate number
of bytes.
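For illustration, a minimal HTTP/1.1 response with a five-byte body could look like this (the values are made up):

```
HTTP/1.1 200 OK
Content-Type: text/plain
Content-Length: 5

hello
```

Once the client has consumed the empty line that ends the headers, it knows it needs exactly Content-Length more bytes from the socket.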
Summing up
After reading this short article, you should know how timeouts work inside the JDK
and why your request to another service
took longer than expected. The most important thing to remember is that when you set socketTimeout
, for
example on RestTemplate
while using Spring, you set a timeout for every single read from the socket; effectively, it is the
time allowed since the last piece of data before your connection is interrupted. Keep that in mind the next time you set timeouts
in your application, and adjust them accordingly. Fortunately, since version 3.12.0
of the OkHttp
client, you can specify
a call timeout, OkHttpClient.Builder.callTimeout()
, which limits the duration of the whole operation.
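For example (the 300ms call timeout is an arbitrary value of mine; the other two values mirror the timeouts from the beginning of the article):

```java
import java.util.concurrent.TimeUnit;
import okhttp3.OkHttpClient;

OkHttpClient client = new OkHttpClient.Builder()
        .connectTimeout(50, TimeUnit.MILLISECONDS)  // connection establishment
        .readTimeout(250, TimeUnit.MILLISECONDS)    // per-read (socket) timeout
        .callTimeout(300, TimeUnit.MILLISECONDS)    // cap on the whole call
        .build();
```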
That's all from me; thank you for reading my first post. Feel free to let me know how you liked it, and I encourage you to comment, ask questions, and share the knowledge with your coworkers.