Violating TCP

This is strictly a violation of the TCP specification by Marek Majkowski.

From the post:

I was asked to debug another weird issue on our network. Apparently every now and then a connection going through CloudFlare would time out with 522 HTTP error.

522 error on CloudFlare indicates a connection issue between our edge server and the origin server. Most often the blame is on the origin server side – the origin server is slow, offline or encountering high packet loss. Less often the problem is on our side.

In the case I was debugging it was neither. The internet connectivity between CloudFlare and origin was perfect. No packet loss, flat latency. So why did we see a 522 error?

The root cause of this issue was pretty complex. After a lot of debugging we identified an important symptom: sometimes, once in thousands of runs, our test program failed to establish a connection between two daemons on the same machine. To be precise, an NGINX instance was trying to establish a TCP connection to our internal acceleration service on localhost. This failed with a timeout error.

It’s unlikely that you will encounter this issue but Majkowski’s debugging of it is a great story.

It also illustrates how deep the foundations of an error, bug or vulnerability may lie.

Comments are closed.