[NLNOG] Curious problem with connections from Ziggo customers to Linode nodes in some data centers
Stefan van den Oord
stefan+nlnog at medicinemen.eu
Wed Aug 23 14:18:00 CEST 2023
Dear NLNOG community,
I’d like to present to you a problem that we’re experiencing. To us it is very strange, and we are out of ideas. Solutions would of course be much appreciated, but so would ways of diagnosing this and work-arounds.
Background: we’re a small Dutch company developing the Viduet platform: a platform to help chronically ill patients better manage their wellbeing together with their care providers. Our product is web-based and we also have a mobile app.
The problem: for almost two weeks we have been getting reports from users who sometimes (!) get connection timeouts in their browsers/apps when they connect to our web platform. We have narrowed down the potential sources of the issue and found a small setup that reproduces it:
We set up a clean Linode node in Frankfurt (smallest type, shared CPU) with Apache (just `apt install apache2`). Requesting the default Apache index.html using `curl -i http://172.104.202.142` (or http://172.104.202.142/large.html) causes timeouts in more than 1 in 10 tries for some users, who have in common that they are all customers of the Ziggo internet provider.
Doing the same with a Linode node in Paris has the same result.
Doing the same with a Linode node in London works as you would expect and does not show this strange behaviour.
Running `mtr -rwzbc100 172.104.202.142` (the Frankfurt node mentioned above) shows no packet loss, nothing out of the ordinary.
Running nginx instead of Apache makes no difference: same issue.
Forcing Apache to use HTTP/1.0 makes no difference: same issue (but frankly I don’t think it goes wrong at this level of the protocol stack).
There seems to be a relationship with the content length of the HTTP response. For shorter HTML files like the index.html on the Frankfurt node, it sometimes times out, and sometimes succeeds after a delay of a few seconds (and sometimes returns as fast as you’d expect). Using slightly larger HTML files it just times out.
I’m not sure whether this is relevant at all, but when using HTTPS instead of HTTP, the problem also manifests as TLS handshake errors.
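For anyone who wants to reproduce this from a Ziggo connection, the curl test above can be scripted roughly as follows. This is only a minimal Python sketch under stated assumptions, not a definitive test procedure: the function names (`fetch_once`, `summarize`, `probe`), the 50 attempts per URL and the 10-second timeout are illustrative choices. It fetches both the default page and large.html so that the apparent relationship with response size shows up in the failure rates.

```python
import http.client
import time
import urllib.request
from typing import Dict, List, Optional


def fetch_once(url: str, timeout: float = 10.0) -> Optional[float]:
    """Return elapsed seconds for one full GET, or None if it failed or timed out.

    Note: the timeout applies to each blocking socket operation,
    not to the request as a whole.
    """
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()  # read the whole body; failures seem size-dependent
    except (OSError, http.client.HTTPException):
        # URLError, timeouts and connection resets are OSError subclasses;
        # a truncated body raises an HTTPException subclass.
        return None
    return time.monotonic() - start


def summarize(samples: List[Optional[float]]) -> Dict[str, object]:
    """Aggregate per-request results into counts and a failure rate."""
    ok = [s for s in samples if s is not None]
    failures = len(samples) - len(ok)
    return {
        "attempts": len(samples),
        "failures": failures,
        "failure_rate": failures / len(samples) if samples else 0.0,
        "slowest_ok": max(ok) if ok else None,
    }


def probe(url: str, attempts: int = 50, timeout: float = 10.0) -> Dict[str, object]:
    """Fetch the URL repeatedly and summarize the outcome."""
    return summarize([fetch_once(url, timeout) for _ in range(attempts)])


if __name__ == "__main__":
    # Test URLs on the Frankfurt node described above.
    for url in ("http://172.104.202.142/", "http://172.104.202.142/large.html"):
        print(url, probe(url))
```

Running it from both an affected and an unaffected connection, and against the London node as a control, should make the difference in failure rate directly comparable.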
We have been in touch with Linode support and they found nothing out of the ordinary on their side. They point to the Ziggo network, saying:
> The reason that the issue only exists from to the Frankfurt data center and not London is likely because the problem is related to the particular route that the traffic takes from one place to the other. While the MTRs you shared look good, I did find evidence in another ticket of a particular hop within Ziggo's network showing issues, but we can't say for sure what the issue is. Here is the information about the hop in case that helps with your communication with Ziggo:
> AS33915 asd-tr0021-cr101-be64.core.as9143.net (213.51.64.193)
Again, any help is much appreciated!
Kind regards,
—
Stefan van den Oord
CTO @ Medicine Men B.V.
Not in the office on Wednesdays
Regulierenring 22
3981 LB Bunnik
The Netherlands
+31 85 1307020