[NLNOG] Internet operations during pandemics
Paul de Weerd
weerd-nlnog at weirdnet.nl
Thu Mar 19 17:26:52 CET 2020
In light of Job's presentation and Marco's comment regarding
maintenance, I would like to share what ColoClue was planning with
regard to maintenance, what changed there and why we made these
ColoClue (AS 8283) is a small ISP structured as a not-for-profit
association. Our focus is on knowledge sharing and learning about
internet hosting technologies in a pretty broad sense.
In 2009 we connected every colocated server to Ipoman Power
Distribution Units (PDUs). Unfortunately, not too long ago the first
of these PDUs failed with bulging capacitors, so we started looking
for replacements. As the failure rate increased, the priority of this
project increased and we sped up sourcing new PDUs.
To replace our aging / failing PDUs, we ordered a set of new ones.
These were delivered around mid February, after which we started
planning a maintenance window to replace the old PDUs. The first round
of replacements was scheduled for 7 March. After that went fine, the
second and third maintenance windows were scheduled for 21 and 28
March.
Our new PDUs are physically taller than the old ones (1.5U versus 1U),
so it is not a simple matter of bringing down connected hardware and
replacing the devices. We have to reorganise the cabinets too. We
asked our members for volunteers to help out with the maintenance to
speed things up and were all ready to go this Saturday until...
COVID-19 stopped us in our tracks.
We weighed the pros and cons of cancelling or going ahead with our
plans. In the end, we decided to go ahead with the maintenance anyway.
The risk of failing PDUs was hanging over our machines like a Sword of
Damocles. Should a PDU fail during a further lock-down, we wouldn't be
able to replace it, which may impact our users who potentially use
their machines to work from home (in fact, I do this myself -
However, we've changed our approach: instead of gathering a bunch of
volunteers in the datacenter, only two people will go, with remote
support for administrative work and remote checking of machines that
are brought back online. The two people who are going have both been
in close contact in recent days anyway, which further reduces the risk
of spreading the virus.
Of course, this isn't perfect. The Dutch government may order a
lock-down before we can complete the maintenance. Either of the two
people who will be doing the work could fall ill. A PDU could fail
before we replace it. We'll need to figure out how to deal with those
cases if and when they happen. For now, we believe we have a
solid strategy that addresses our current operational risk and the
Paul 'WEiRD' de Weerd
On Wed, Mar 18, 2020 at 06:12:56PM +0000, Job Snijders wrote:
| Dear all,
| I threw together a slidedeck today on the potential impact and second
| order effects of COVID-19 on Internet network operations.
| I hope we together over time can add and extend projections in the deck
| on what will happen and how we can mitigate the negative effects on
| Internet operations.
| We have to answer questions such as:
| 1) what problems already exist today because of a few weeks of C19?
| 2) What problems are still coming? Will those be localized or global?
| 3) What possible workarounds can we plan for those problems?
| I would appreciate feedback, comments, corrections or whatever you want
| to tell me. None of us have been in this situation before, so my guess
| is as good as yours.
| Kind regards,
| NLNOG mailing list
| NLNOG at nlnog.net