November 17, 2012

Better routing, fewer bad apples

Another month, another update to http.debian.net. This time around most of the work was done outside the redirector's code base, as strange as it may sound.

The redirector relies heavily on the mirrors doing at least a couple of things right; for the rest it can and does compensate. When it needs to compensate, certain requests are redirected to automatically detected good mirrors, thus avoiding mirrors that might work fine for some parts of the day but cause headaches during the rest.

So, part of the work done since the last update was to prod more mirror administrators to upgrade to the latest version of ftpsync. This reduces the number of mirrors for which compensation is needed in order to avoid errors during installations and upgrades. Fortunately, no additional work is needed for the redirector to notice the upgrades, so the improvements are immediate.

However, not all mirrors comply with the bare minimum requirements. As stated in my previous blog post, running rsync once is not enough. When mirrors break these assumptions, they cause the "bad apple" effect. In this case the effect is temporary errors, as some people have experienced. The interesting part of those issues is that the affected population may change quickly, given the redirector's use of geolocation and the way it creates mirror subsets.

As interesting as the distribution of the effects may be, they are not really welcome. So I put together some code to attempt to detect the bad apples. This resulted in a list of mirrors that have now been disabled in the redirector and whose administrators are going to be contacted so that they comply with the minimum requirements. Given that detection is time-sensitive, there's no 100% guarantee that all of them have been identified so far. The code to detect them will have to be adapted and integrated into the redirector's code base so that it can proactively avoid this kind of issue.
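To give an idea of what such a check could look like, here is a minimal sketch in Python. It is not the detection code mentioned above, and it assumes a simplified model: each mirror is sampled at several points in time, and each sample records which files its index files reference and which files it can actually serve. The names `Sample`, `find_bad_apples` and the `.example` mirrors are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """One hypothetical observation of a mirror at a point in time."""
    mirror: str
    indexed: frozenset    # files referenced by the mirror's index files
    available: frozenset  # files the mirror could actually serve


def find_bad_apples(samples):
    """Map each inconsistent mirror to the files it indexed but could not serve.

    A mirror is flagged if ANY of its samples shows an index that points at
    a missing file, since the inconsistency may only last for a short while.
    """
    bad = {}
    for sample in samples:
        missing = sample.indexed - sample.available
        if missing:
            bad.setdefault(sample.mirror, set()).update(missing)
    return bad


# Illustrative data only: mirror-b is consistent in its first sample
# but serves an index ahead of its packages in the second one.
samples = [
    Sample("mirror-a.example", frozenset({"foo_1.0.deb"}), frozenset({"foo_1.0.deb"})),
    Sample("mirror-b.example", frozenset({"foo_1.0.deb"}), frozenset({"foo_1.0.deb"})),
    Sample("mirror-b.example", frozenset({"foo_1.1.deb"}), frozenset({"foo_1.0.deb"})),
]
bad_apples = find_bad_apples(samples)
```

Because a mirror only needs to fail one sample to be flagged, how often and when the samples are taken matters a lot, which is the time-sensitivity mentioned above.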

Last but not least, the redirector is now using a database of AS peers for better (re-)routing. This is the next step towards decision making based more on network location and topology than on geographic location. This first use of a peers database is limited to IPv4 and is based on a recent routing table dump and on feedback provided by interested people. If you are a mirror or network administrator, or you are familiar with the topology of your network, please drop me an email so that the redirector can make better use of your peering agreements.

N.b. in the case of the database, the term peer may also include transit providers. It is used to refer to and establish a relationship between two AS(N)s.
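As an illustration of how a peers database can influence mirror selection, here is a minimal sketch in Python. It is not the redirector's actual code; it assumes the database can be represented as a mapping from an AS number to the set of AS numbers it has a relationship with, and that each candidate mirror's AS number is already known. The function name `rank_mirrors`, the `.example` mirrors and the AS numbers (taken from the range reserved for documentation) are all hypothetical.

```python
def rank_mirrors(client_asn, mirror_asns, peers):
    """Order candidate mirrors by how close they are to the client's AS.

    client_asn  -- AS number of the client
    mirror_asns -- dict mapping mirror name to the mirror's AS number
    peers       -- dict mapping an AS number to a set of related AS numbers
    """
    def distance(mirror):
        asn = mirror_asns[mirror]
        if asn == client_asn:
            return 0  # same AS as the client
        # Treat the relationship as symmetric: look it up in both directions.
        if asn in peers.get(client_asn, set()) or client_asn in peers.get(asn, set()):
            return 1  # directly related AS (peer or transit provider)
        return 2      # no known relationship; other criteria would decide

    # Sort by distance, then by name to keep the order deterministic.
    return sorted(mirror_asns, key=lambda m: (distance(m), m))


# Illustrative data only.
mirror_asns = {
    "mirror-a.example": 64500,
    "mirror-b.example": 64501,
    "mirror-c.example": 64502,
}
peers = {64496: {64501}}

ranked = rank_mirrors(64496, mirror_asns, peers)
```

In this example the client's AS has a known relationship with the AS of mirror-b only, so that mirror is ranked first; the other two are tied and would be left to other criteria, such as geographic location.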


Feedback is, as always, welcome. I read each and every email, but it may take me some time to get to it or to reply.